Using Git for core development
Posted: September 13, 2007 by Ted Kulp
I'll warn everyone now that this is a fairly geeky post. If you're not into core development, source code management or *gasp* the command line... feel free to run screaming. Still here? Excellent. Come, let's geek out a bit.

What is Git?

Git is a source code management system. Though, if you read its description, it doesn't say that. Don't listen to them! It's a subversion-like system for managing source code. It was written by Linus Torvalds for the Linux kernel after the kernel developers' falling-out with BitKeeper. It is similar to BitKeeper in some ways, but it's quite distinctive as well.

Git is very unix-y still, just like CVS and Subversion are by default. It's still too new to have a lot of fluffy GUIs like TortoiseSVN. However, for a regular console jockey, this isn't an issue.

Uhh... Why?

But you already have subversion. It works well. You've been using it since CMSMS first started. Why would you want to use something else?

I totally agree. Subversion is still the right tool for the CMSMS universe. It offers reliability and the lowest barrier to entry. However, I don't necessarily think it's the best tool for myself as a developer, and I have several reasons for this.
- I can branch as much as I want. Branching and merging is not painful. Not nearly as painful as it is in svn. In fact, branching is so painful in subversion that I barely use it... and in a large scale system with a lot of users, that's not a good thing. Committing everything to trunk even if it's broken is just wrong.
- It pretty much seamlessly integrates with svn. There is an added piece in git (git-svn) that allows you to push to and pull from an upstream repository. This basically means you can use git on your local machine and not screw it up for everyone else. You branch/merge/etc to your heart's content, and then push it all up to the subversion server when you're done.
- It's distributed. The more people that use this, the more people I don't have to give subversion commit access to. It's very painless for me to get patches via email and merge them in and commit to subversion. It means I can watch the patches coming in and make sure that we're not allowing junk to get into the core. And end users can screw around with the code as much as they want... it never has to touch the main repository.
- It's disconnected. I can branch and merge as much as I want without being online. For a person like me who lives on a laptop and codes whenever they get a free moment, it's essential. No more waiting to get online to switch from trunk to 1.2 or another branch, etc. I can even diff against another version without touching the internet... this is huge.
- The history is totally pulled off to everyone's machine. The more people that use git, the more backups we have of our project history. Every time you clone, you have the whole history on your local machine. And distributed backups are the best kind.
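If you want to see the "disconnected" and "whole history" points for yourself before touching the real repository, here's a minimal sandbox sketch. It assumes nothing but a local git install. The branch name, file name and identity are made up for illustration, and it's meant to be saved as a script rather than pasted into your login shell (it changes directory and exits on the first error).

```shell
# Sandbox sketch: everything runs against a throwaway local repository,
# so no network access is needed. All names below are hypothetical.
set -e

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .

# Identity used for the sandbox commits only.
git config user.name "Sandbox User"
git config user.email "sandbox@example.invalid"

# First commit, on whatever the default branch happens to be called.
echo "version 1" > file.txt
git add file.txt
git commit -q -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)

# Branch, change something, and commit, all locally.
git checkout -q -b experiment
echo "version 2" > file.txt
git commit -q -a -m "Try something on a branch"

# Switch back and diff against the other branch without talking to a server.
git checkout -q "$base"
changed=$(git diff --name-only "$base" experiment)
echo "files that differ: $changed"

# The complete history is on disk: count the commits reachable from any branch.
total=$(git rev-list --all --count)
echo "commits stored locally: $total"
```

Nothing here ever contacts a server, which is the whole point: switching branches, diffing and walking history are all local operations.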
    git svn init -T trunk -t tags -b branches http://svn.cmsmadesimple.org/svn/cmsmadesimple cms-git

This will create a cms-git directory and have all the proper pointers to the svn repository in it. It's also empty still. Now you need to pull down some data. Normally you would do:
    cd cms-git
    git svn fetch

That would pull the whole repository locally. Branches, tags, etc. However, git-svn doesn't seem to like a repository move I did way back at revision 2719. Instead, you should pull the data for revision 3000 and above. Actually, I recommend 4000 if you're not a purist... it's more than enough history for anyone's needs.
    cd cms-git
    git svn fetch -r 4000:HEAD

Then you wait. And wait. When it's done, you'll have a nice snapshot of the CMSMS core development.
    git branch

If you do this, you'll only see master listed. Without getting into explaining git entirely (and there are much better texts for this), let's just say that's a local branch. That master branch automatically points to the trunk of your subversion repository. Let's say you want to work on 1.2 instead. You do something like:
    git checkout -b 1.2 branches/1.2.x

This will do 2 things. First, it creates a local branch pointing to the 1.2.x branch in subversion. It will then "checkout" that code into the local directory. So now you have an up to date version of 1.2.x ready to be developed on.

What next? Let's say you're going to work on a new feature or bug fix. The best way to handle this would be to make a new local branch and work in that. That way, you can work on several changes simultaneously without making a big mess. In this example, we'll say there's a bug in the admin panel login procedure. It's bug #1234 in the bug tracker.
    git checkout -b bug_1234_login_problem 1.2

You're now making a copy of the 1.2 local branch and making a topic branch specifically for bug 1234. You can change things to your heart's content. If you commit, your changes live in that branch and don't pollute anything else. You can commit as much as you want and no one has to see it. If you decide you hate everything you've coded, you can reset or toss the branch away or whatever you'd like. No one has to know that your code was absolutely dreadful... and there's no reason to ever push broken code back up to the svn repository.

OK, you've fixed the bug. It was a one-liner in admin/login.php. Now commit it.
    git commit -a -s

Just like subversion, an editor will pop up and you can explain what you did. You'll notice that an extra line was added that shows "Signed-off-by: Ted Kulp <firstname.lastname@example.org>" or whatever your name and email are. The -s did this, and it allows you to have an audit log if you're passing patches around. It's a good habit to get into and I highly recommend doing it for every commit.

OK, your branch is now good to go. What I would do now is merge this branch back into your "1.2" branch. This way, you can be sure that it's being applied cleanly before you either push it back to the svn repository or send it out via email to a maintainer.
    git checkout 1.2
    git svn rebase
    git merge bug_1234_login_problem

What you've essentially done is go back to the main 1.2 branch (which should match svn), update it with the latest changes from svn (if there are any) and then merge in your changes from your topic branch. If the merge caused any conflicts, you can easily fix them now and know that when you apply these changes to the upstream repository, they'll apply cleanly.

Now, if you do have commit access to the repository, you can just do the following.
    git svn dcommit

This will apply any changes to the repository and anyone using svn will be none the wiser. However (and here's the beauty), if you don't have commit access, then you can easily send out a patch via email to the maintainer.
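Before mailing anything, it can help to see what a patch series actually looks like on disk. Here's a minimal sandbox sketch, assuming only a local git install. The repository, the bug_fix branch, the file and the identity are all made up for illustration, there's no subversion server involved, and the send-email step is left out because it depends on your mail setup. Save it as a script rather than pasting it into your login shell.

```shell
# Sandbox sketch: generate a patch series from a topic branch in a
# throwaway repository. All names below are hypothetical.
set -e

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git config user.name "Sandbox User"
git config user.email "sandbox@example.invalid"

# A base commit standing in for the upstream state.
echo "original" > login.txt
git add login.txt
git commit -q -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)

# A topic branch with one signed-off fix on it.
git checkout -q -b bug_fix
echo "fixed" > login.txt
git commit -q -a -s -m "Fix the login check"

# One numbered patch file per commit that is on bug_fix but not on $base.
git format-patch -M -n -o patches/ "$base" > /dev/null
patch_count=$(ls patches | wc -l | tr -d ' ')
echo "patch files written: $patch_count"

# Each patch is a plain-text mail message that carries the sign-off line.
signoffs=$(grep -c "^Signed-off-by: Sandbox User" patches/*.patch)
echo "sign-off lines found: $signoffs"
```

Each file in patches/ is an ordinary text file, so you can read it over before it goes anywhere.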
    git format-patch -M -n -o patches/ origin
    git send-email --to email@example.com patches
    rm -fr patches

Now those patches are sent directly to the maintainer for easy integration into the source code. It's a beautiful thing.

Conclusion

Git isn't for everyone. It's not even for the masses yet. It's a specialized tool that requires a certain mindset to even use. But once you "get it" you wonder what you did before. Also, this isn't the end-all of tutorials for git. Look below for some great links on getting started with it.

If anyone is going to do any major core development, I'd like you to at least examine this option. It allows us to not give commit access to the free world and allows for great amounts of experimentation by the end user without interrupting other users. It's a very viable solution, so please at least give it a look and see for yourself. Enjoy!

Important links: