
Code Integration

Open Kinect Repo Policy

In an effort to keep up with the additions from the community while keeping a clean and usable source history in git, we've established a policy on merges to the main repository. This integration policy will hopefully make it easy for us to find bugs and figure out where new features come in, while also keeping conflicts with new and ongoing development to a minimum.


Git Basics

For a good overview of git, check out

Workflows

These are terse descriptions of our expected workflow. For a plain English description of how we use git, check out the Development Story portion below.

Developer workflow

  • Developer clones repository
  • Developer makes a new branch for the feature. The master branch should be reserved for tracking upstream unless the developer knows what they are doing with git.
    • git checkout -b [new feature branch name] master
  • Developer makes changes to feature branch, signing off on every commit.
  • Developer submits feature branch via pull request or patch filed as project issue.
  • Once patch is merged to main repo, developer updates from remote repo and merges main repo into their master.
  • Developer then makes new branch off master, continues on new feature.
  • If developer has worked on another feature while waiting for the submission to be merged, the needed commits can be cherry picked to the new branch.
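
The first few steps above can be tried in a throwaway sandbox without touching any real repo. This is a minimal sketch: the directory names, identities and file names are made up for illustration, and it assumes "signing off" means git's standard `-s` (`--signoff`) trailer. The Contributing Code page remains the authority on what the project actually requires.

```shell
set -e
sandbox=$(mktemp -d)   # throwaway directory; nothing here touches a real repo
cd "$sandbox"

# Hypothetical stand-in for the main repo, with one commit on master.
git init -q main-repo
cd main-repo
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Main Tainer"
git config user.email maintainer@example.com
echo "hello" > README
git add README
git commit -q -m "Initial commit"
cd ..

# Developer clones the repository and works on a feature branch.
git clone -q main-repo dev-clone
cd dev-clone
git config user.name "Example Dev"
git config user.email dev@example.com
git checkout -q -b my-feature master
echo "new feature" > feature.txt
git add feature.txt
git commit -q -s -m "Add feature"   # -s appends a Signed-off-by line

# Show the sign-off trailer and how far the feature branch is ahead of master.
signoff=$(git log -1 --format=%B my-feature | grep '^Signed-off-by:')
echo "$signoff"
ahead=$(git rev-list --count master..my-feature)
echo "my-feature is $ahead commit(s) ahead of master"
```

Because the work happens on my-feature, the local master still points at the same commit as the repo it was cloned from.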

Integration workflow

Whenever an integrator brings in a new commit, it should be rebased to the head of the master branch on the main repo. However, due to the volume of pull requests and patches we're receiving, pull requests will not always arrive already rebased. This workflow makes sure we maintain linearity in the main repo while also not changing code out from under developers.

  • Integrator receives pull request or patch
    • If pull request, integrator makes sure it is rebased.
      • If not rebased, and developer is in IRC channel, talk to developer there to have them properly rebase. If developer is not in IRC channel, integrator pulls to local repo and rebases. If conflicts arise, notify developer via pull request comment.
    • If patch, integrate on top. If conflicted, contact developer via github issue comments.
  • Integrator merges patch into main branch. Should always come in at head if rebase was successful.
  • Integrator pushes to main repo
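
As a rough illustration of the integrator's side, here is a sketch in a local sandbox. It assumes the pull request has already been fetched into a local branch (called `contribution` here, a made-up name) and uses `git merge --ff-only` as one way to make sure the merge comes in at head. This is not necessarily the exact set of commands our integrators run.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Integrator"
git config user.email integrator@example.com

# commit <name>: add a file called <name>.txt and commit it with message <name>.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b contribution master   # stand-in for a fetched pull request
commit D
git checkout -q master
commit B                                 # master moved on in the meantime

# Rebase the contribution to the head of master. A conflict would stop here,
# and that is the point where the developer gets notified.
git checkout -q contribution
git rebase -q master

# Fast-forward only: refuses to create a merge commit, so history stays linear.
git checkout -q master
git merge -q --ff-only contribution

history=$(git log --reverse --format=%s master | tr '\n' ' ')
merges=$(git rev-list --merges --count master)
echo "history: $history"
echo "merge commits on master: $merges"
```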

Development Story

Introduction

Before we begin, it's well worth getting up to speed on git at

http://progit.org

I'll be covering git operations here, but not going into too much detail. The Pro Git book site covers everything in hideous, yet readable and understandable detail.

WHY ARE YOU MAKING THIS SO HARD I JUST WANT TO CODE

So, you've decided you want to develop on an OpenKinect project. Yay! We need all the help we can get. As project maintainers, we want to make sure that everyone's help is both attributed and committed correctly, as well as kept in line with everyone else who's trying to help. That's why we've created our development and integration workflows, listed above. We realize these lists are somewhat opaque to people not familiar with git or development using distributed systems, so this portion of the document is a plain-English explanation of how development of new features should work on OpenKinect repositories.

To see what we're trying to avoid doing, check out the example and explanation at

http://blog.xebia.com/2010/09/20/git-workflow/

or even worse

http://blog.spearce.org/2007/07/difficult-gitk-graphs.html

This tutorial will also assume that you've created a github account and are using github for your repo storage. If you aren't, it is assumed you know what you're doing, so you can skip the github steps.

Starting Development

The first thing you'll do as a developer is make your own fork of our repository. For those familiar with other code versioning systems, this isn't as serious as it sounds. Forking is a core practice of git, since everyone has their own repo.

To fork on github, sign into your account, go to

https://www.github.com/OpenKinect/libfreenect

and hit the "Fork Repo" button. This will cause the current state of the repo to be copied to your account, and you will now have a libfreenect repo available to work on. Once this is done, open up a terminal and run:

git clone git@github.com:[your_account_name]/libfreenect.git

This will bring the git repo to your file system, and set up your repo as a remote named "origin". Unlike centralized versioning systems, git allows you to set up multiple repositories you can sync with. In our case, you'll want to sync with the main OpenKinect repo too. So, while your "origin" will be your fork of the OpenKinect repo, you'll want to add the main OpenKinect repo as another remote. This can be done by running:

git remote add upstream git://github.com/OpenKinect/libfreenect.git

in the repo clone you just made. Now if you run

git remote show

You'll see you have two remotes

  • origin - Your own fork of our main repo
  • upstream - The main repo

You will have read/write access to the origin remote, but only read access to the upstream remote. This means you can push changes to your own copy of the repo, but can only read changes from the main repo. Changes to the main repo have to be made by the repo integrator, which we'll talk about later.

Keeping up to date with the main repo

Now then, you'll want to have a branch in your repo that's kept up with the latest changes from the upstream remote. This is how you will be able to integrate changes made by other people into your work, and what you will need to base your changes off of when you submit changes to us. We'll use the "master" branch of your repo for this. So the latest commit on your master branch should always match the latest commit on the upstream master branch. If it happens that at some point it doesn't, we'll talk about how to fix that later.

Since your current master is cloned from your repo, which was forked from the upstream repo, your master and the upstream master should still match in terms of latest commit. However, to make sure, run

git remote update

You should see git try to get updates from origin and upstream. This means it's checking both of those remote repositories for any changes, and downloading them to their respective remote sandboxes. Doing a remote update doesn't change any of your local branches; it just changes what your repo knows about where the remote repos are and what updates they have. If you see there are changes with the remote repository, you can do

git checkout master
git merge upstream/master

This tells your repository to merge what's in the upstream's master branch into your local master branch. This will keep you up to date with everyone else's changes.
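
If you'd like to see the update step work before trying it on your real fork, the sketch below replays it entirely on your local disk. The `main`, `fork.git` and `work` directories are hypothetical stand-ins for the main repo, your github fork and your clone. Only the last three git commands are the ones from the text above.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the main repo, with one commit on master.
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
git config user.name "Main Tainer"
git config user.email maintainer@example.com
echo one > file.txt
git add file.txt
git commit -q -m "A"
cd ..

# Stand-in for your fork (bare, like a hosted repo), then your working clone.
git clone -q --bare main fork.git
git clone -q fork.git work
cd work
git remote add upstream "$sandbox/main"

# Someone else's change lands in the main repo.
(cd ../main && echo two >> file.txt && git commit -q -am "B")

# Fetch from every remote, then bring local master up to date.
git remote update >/dev/null 2>&1
git checkout -q master
git merge -q upstream/master

local_head=$(git rev-parse master)
upstream_head=$(git rev-parse upstream/master)
echo "master:          $local_head"
echo "upstream/master: $upstream_head"
```

Since nothing was committed directly on the local master, the merge is a plain fast-forward and both names end up at the same commit.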

Making your own changes

Once you've made sure you've got a place that's up to date with the upstream, it's time to start making changes. First off, you'll want to make a feature branch off of your master, which now matches upstream:

git checkout -b [feature-branch-name] master

I'm not going to cover the process of how commits work here, as that's covered quite well in other basic git tutorials. Also, please make sure to read about how to sign off on your commits at the Contributing Code page.

However, there's one thing we require that probably doesn't come 'til later chapters in the git book. We always want your changes to be on top of the upstream changes. So, for instance, let's say you're ready to start a new feature, and the commit chain of the upstream repo looks like

A -> B -> C

(I'm using letters here to represent the SHA1 commit hashes)

So, C is the current latest change in the upstream repo. You start making commits on top of this in your feature branch, and get something that looks like

A -> B -> C -> D -> E

So upstream was at C, and you added commits D and E. However, while you're doing your work, other people have submitted their finished work, and it's come into the upstream. So the next time you do

git remote update
git checkout master
git merge upstream/master

Upstream now looks like

A -> B -> C -> F -> G

This means your branch and upstream have diverged: each has two commits the other lacks. You have commits D and E, it has commits F and G. Since the upstream repo should be the reference repo, we want to get your commits up on top of what's on the upstream repo. This is done by an operation called "rebasing", because we want to change the "base" of your D and E commits from commit C of the upstream repo to commit G. With your feature branch currently checked out, this is done via the command

git rebase master

This peels your commits off of your branch, finds the common ancestor (commit C), so it starts at

A -> B -> C

merges in commits from the upstream, so we have

A -> B -> C -> F -> G

then starts replaying your commits on top of that, so we get

A -> B -> C -> F -> G -> D

Now, if there's a conflict in D, it'll stop at this point, ask you to fix it, and tell you how to continue or stop the process. Otherwise, it'll keep going, and we end up with something like:

A -> B -> C -> F -> G -> H (formerly D) -> I (formerly E)

D is pretty much equal to H, and E is pretty much equal to I. However, since they're at different places than they started, git calls them by different IDs now. So while the contents of the commits are the same, their positions in history are different, so they get new and different IDs.

Anyways, what we end up with is what the integrator would like to see when getting your commits.
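
The letter diagrams above can be reproduced in a scratch repo. In this sketch each "commit" just adds a file named after its letter (a made-up convention so that nothing conflicts), and the commit messages stand in for the letters. Note that after the rebase the messages still read D and E. It's the commit IDs that change, which is what the renaming to H and I in the diagram represents.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email dev@example.com

# commit <name>: add a file called <name>.txt and commit it with message <name>.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B; commit C          # master:  A -> B -> C
git checkout -q -b feature master
commit D; commit E                    # feature: A -> B -> C -> D -> E
old_tip=$(git rev-parse feature)

git checkout -q master
commit F; commit G                    # master:  A -> B -> C -> F -> G

git checkout -q feature
git rebase -q master                  # replay D and E on top of G

new_tip=$(git rev-parse feature)
history=$(git log --reverse --format=%s feature | tr '\n' ' ')
echo "history: $history"
echo "old tip: $old_tip"
echo "new tip: $new_tip"
```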

Getting your changes integrated

Check out Contributing Code to see how to get your changes integrated.

Dealing with our integration process

Once you've submitted your code to us, we'll try to integrate it as soon as possible. However, it's quite rare that we get one integration request at a time. Usually, we integrate 3+ patches in a go, which means your code, while being rebased up to the latest of what you know the repo to be, may not be current by the time it comes in, because we're making changes quickly. So, you submitted

A -> B -> C -> F -> G -> H -> I

to us, but we're integrating patches before yours, and we now have

A -> B -> C -> F -> G -> J -> K -> L

At this point as integrators, we have two choices:

  • We could kick your patch request back and wait for you to send us a merge with an update
  • We can rebase your stuff ourselves.

As of this writing, we're taking a workflow that looks like this:

  • We'll try and rebase your stuff
  • If it conflicts, we kick the pull request back to you to fix or tell us to fix, because we don't want to change commits with your name signed to them without your approval.
  • If it doesn't conflict, we merge the newly rebased code.

This means that when your stuff is merged, it may come in with ids you aren't expecting. In the above case, to review, you sent us

A -> B -> C -> F -> G -> H -> I

We were already merging and had

A -> B -> C -> F -> G -> J -> K -> L

Now we've taken your H and I and put them on the end of that and due to how rebasing works we have

A -> B -> C -> F -> G -> J -> K -> L -> M (formerly H) -> N (formerly I)

So, when you do a

git remote update
git checkout master
git merge upstream/master

next, your master will look like

A -> B -> C -> F -> G -> J -> K -> L -> M -> N

However, integration isn't instantaneous, and you may want to build new features on top of your submitted work while waiting for us to integrate it. So you may've gone off and made a new branch that looks like

A -> B -> C -> F -> G -> H -> I -> O -> P -> Q

Of course, H and I are in the main repo now, but now they're called M and N instead. Git can figure this out for itself and will make sure not to reapply patches that are already applied. So, if you rebase, you'll get:

A -> B -> C -> F -> G -> J -> K -> L -> M -> N -> O -> P -> Q
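
Here's a small sandbox sketch of that last point. To keep it short, the integrator's rebase of H and I is imitated with `git cherry-pick`, which is an assumption made for the demo and not necessarily what happens on the main repo. The branch names are made up, and only one follow-up commit (O) is used. Depending on your git version, the final rebase may print a warning about skipping previously applied commits.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email dev@example.com

# commit <name>: add a file called <name>.txt and commit it with message <name>.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G                                  # master stands in for upstream here
git checkout -q -b feature master
commit H; commit I                        # the work you submitted
git checkout -q -b next-feature feature
commit O                                  # new work built on top of it

# Meanwhile J lands first, then copies of H and I with new IDs (M and N).
git checkout -q master
commit J
git cherry-pick master..feature >/dev/null

# Rebase the follow-up branch. H and I are recognised as already applied.
git checkout -q next-feature
git rebase -q master

history=$(git log --reverse --format=%s next-feature | tr '\n' ' ')
echo "history: $history"
```

H and I each appear once in the resulting history, as the copies that came in through master.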

I accidentally committed to my upstream tracking branch, what do I do?

By "upstream tracking branch" here, we mean whatever branch you're trying to keep synchronized with the upstream master. In the case of this document, that's your local master.

First off, don't panic. Git makes it pretty hard to lose changes that are already committed. We just need to get your master branch looking like the upstream master branch again. To start, make a new branch off of your current master.

git branch [name of fix branch] master

Next, you'll want to do what git calls a "reset". This resets the latest commit your branch points to, to another commit. Since we want this to look like the master of the upstream, we do

git checkout master
git reset upstream/master

This will make the head of our local master look like the upstream master. You may see some lines like

M c/lib/cameras.h

This means that a commit that was not on the upstream master branch was removed from master, but git left its changes to the file in your working tree. They are now just unstaged changes. If you want to completely remove all changes to files, do

git reset --hard upstream/master

This will make sure any committed files that had changes before the reset are now in the state that matches upstream/master. Your changes will still be in the [name of fix branch] branch you made earlier, so you haven't lost anything. Now you can check out the branch you made by doing

git checkout [name of fix branch]

and can then continue on with your work, with a cleaned master branch.
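
The whole recovery can be rehearsed in a sandbox first. In this sketch the local `main` directory stands in for the upstream repo, and `fix-branch` and the `oops` commit are made-up names. It goes straight to the `--hard` variant of the reset.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# commit <name>: add a file called <name>.txt and commit it with message <name>.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Stand-in for the upstream repo.
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
git config user.name "Main Tainer"
git config user.email maintainer@example.com
commit A
cd ..

# Clone it with the remote named "upstream" to match this document.
git clone -q -o upstream main work
cd work
git config user.name "Example Dev"
git config user.email dev@example.com

commit oops                          # the accidental commit on master

git branch fix-branch master         # keep the work on its own branch
git checkout -q master
git reset -q --hard upstream/master  # move master back to match upstream

master_head=$(git rev-parse master)
upstream_head=$(git rev-parse upstream/master)
saved=$(git log -1 --format=%s fix-branch)
echo "master matches upstream: $master_head"
echo "fix-branch still has commit: $saved"
```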

Are there other ways to do all of this?

Yes. This document takes a rather paranoid view of the "git pull" operation, instead preferring to set remotes and always do merges locally. Another option for updating your feature branches to be on top of the upstream would be to do

git pull --rebase git://github.com/OpenKinect/libfreenect.git master

Which will rebase your changes to the top of whatever is on the remote without the need for an upstream tracking branch.
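
For a feel of what that does, here is the same operation against a local stand-in for the main repo. The sandbox path replaces the github URL, and the names are made up. The feature branch ends up on top of the new upstream commit without any upstream remote being configured.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# commit <name>: add a file called <name>.txt and commit it with message <name>.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# Stand-in for the main repo.
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
git config user.name "Main Tainer"
git config user.email maintainer@example.com
commit A
cd ..

git clone -q main work
cd work
git config user.name "Example Dev"
git config user.email dev@example.com
git checkout -q -b feature master
commit D                              # your work

(cd ../main && commit B)              # the main repo moves on

# Fetch master from the given location and rebase the current branch onto it.
git pull -q --rebase "$sandbox/main" master

history=$(git log --reverse --format=%s feature | tr '\n' ' ')
echo "history: $history"
```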

There are probably even more ways of dealing with this, so please feel free to add to this document as is needed.