Source Control Management Systems

TL;DR: An SCM lets you make significant code and structural changes without fear, experiment with features safely, and much more.

This post is a bit different from the other posts on this blog. It is aimed mainly at developers who are just starting out, typically single developers working on even the smallest project. But the ideas apply to everyone, and experienced developers might learn a thing or two, so it’s worthwhile for them to keep reading as well.

The need

Developers, especially at the beginning of their journey, often don’t understand what an SCM is or don’t realize its importance. (SCM is an abbreviation for Source Control Management software, things like Subversion, Mercurial, GIT, Perforce and so on; it is sometimes called “Revision Control”.) When I began programming, my most advanced “version control” practice was copying the whole project to a side folder, making changes, and then DIFFing the copied folder against the current working folder. Backup meant sending myself the project directory by email.

If you don’t know the basics of SCM systems, take a little time and learn them (links at the bottom of this post). Everyone loves GIT; I love GIT and Mercurial, although GIT is much more in fashion these days. But I think Subversion (SVN) is much more straightforward to use when you’re a single developer. Its Windows tooling, if you’re using Windows, is no doubt better than the others’, and I’ve used at least four different SCMs so far.

Even the simplest “Hello, world” program these days includes external libraries or packages besides your own code. Adding a package throws in multiple server and client files, in multiple locations, in addition to configuration changes scattered across XML or JSON files. Often you want to add a feature package, test it, and if it doesn’t work well, just remove it. Tracking which files and configuration snippets a package adds, without an SCM of some kind, is a real hassle. You basically have to copy your whole project aside, add the package, save everything and DIFF the old copy of the project against the new one. Then, if you do want to wipe the package out by manually restoring everything, you find that it is tedious and error-prone work. Yes, development frameworks do allow you to uninstall a package, but there are always leftovers.
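To make that manual copy-and-DIFF routine concrete, here is a minimal Python sketch of the bookkeeping an SCM does for you: it records a content hash per file before and after a change and reports what was added, removed or modified. The function names `snapshot` and `compare`, and the file names in the demo, are hypothetical and chosen only for illustration; a real SCM does this far more efficiently and keeps the history too.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 digest of its content."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            relative_path = os.path.relpath(full_path, root)
            with open(full_path, "rb") as handle:
                digests[relative_path] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def compare(before, after):
    """Return sorted lists of added, removed and modified relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    modified = sorted(
        path for path in before.keys() & after.keys() if before[path] != after[path]
    )
    return added, removed, modified


if __name__ == "__main__":
    # Simulate "adding a package" to a throwaway project directory.
    with tempfile.TemporaryDirectory() as project:
        config_path = os.path.join(project, "config.json")  # hypothetical file
        with open(config_path, "w") as handle:
            handle.write('{"packages": []}')
        before = snapshot(project)

        os.mkdir(os.path.join(project, "lib"))
        with open(os.path.join(project, "lib", "some_package.js"), "w") as handle:
            handle.write("// package code")
        with open(config_path, "w") as handle:
            handle.write('{"packages": ["some_package"]}')

        added, removed, modified = compare(before, snapshot(project))
        print("added:", added)
        print("removed:", removed)
        print("modified:", modified)
```

With an SCM in place you get the same report from its status and diff views, and undoing the package is a single revert instead of a manual restore.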

Sometimes you suspect that some files are not really needed, or are auto-generated, and you want to clean up your project. With an SCM it’s easy. Just delete these files without fear and test the results. No good? Just revert and the files are back in place.

Another often overlooked benefit of an SCM is that it lets you track changes to image files. (Not user-generated image files; I’m talking about images that are actually part of your app.) You can apply image compression and easily diff the “before” and “after” to watch for any loss of detail, then easily revert if needed and try again with a different compression tool. If you hook a good file diff tool (like Beyond Compare) to your SCM, you’ll be able to do all sorts of file comparisons, including comparisons of binary files of certain types (images, PDF, DOC and more).

My SCM usage workflow with a new project

My usual workflow with small projects is this (it still holds for bigger ones, with or without other team members, although there I apply additional SCM practices like branching, which is beyond the scope of this post):

  1. I start with a fresh, stock boilerplate / scaffolded project. In .NET it’s the File -> New Project wizard, which at the end creates a whole project structure with config files, server packages, code files, sample CSS etc. In NodeJS+Express it’s any express app command and with Yeoman it’s yo generator. Whatever your environment is, it probably has some baseline project to begin with.

  2. Then, immediately and before anything else, I create a repository (I use GIT but, as I said before, Mercurial or Subversion is perfectly fine as well) and do a first commit with the message: “Initial commit”.

  3. At this stage I build the initial code (if needed) and make sure it’s working and displaying some meaningful output. Depending on the project and the sophistication of the scaffold, this can be anything from nothing to a full sample HTML page. What’s important here is to make sure there are no errors whatsoever, because if there are, you should recheck your project initialization procedure for mistakes. It has even happened to me more than once that the bug was actually in the tool I used.

    This step is critical. If an error shows up at this stage and you followed the project initialization / scaffold instructions carefully, you can be fairly sure that the error is not the result of something you did wrong but rather of the tool you used or the files it generates.

  4. At this point you should go ahead and iteratively add your code, validating your project’s functionality often. Whenever you have a new functional chunk, commit it (again, there is other important stuff like testing, but we’re not covering it here). In particular, make sure your commits are atomic – that is, each commit should change only one thing in your project. If, while coding, you’re fixing code indents, whitespace etc. – commit those fixes in a commit of their own, without any functional changes. If you rename variables or do any other refactoring – commit it on its own. If you’re adding a package – first commit the package addition itself, and only then commit the new code that uses the package.
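One way to keep formatting fixes out of functional commits is to check, before committing, whether a file changed only in whitespace. The Python sketch below is a rough heuristic under stated assumptions: the helper names `normalize` and `is_whitespace_only_change` are hypothetical, the check treats re-indentation, trailing spaces and blank lines as non-functional, and it is therefore not suitable as-is for languages where indentation is significant.

```python
def normalize(text):
    """Collapse runs of whitespace inside each line and drop blank lines."""
    lines = (" ".join(line.split()) for line in text.splitlines())
    return [line for line in lines if line]


def is_whitespace_only_change(old_text, new_text):
    """True when the two versions differ, but only in indentation, spacing or blank lines."""
    return old_text != new_text and normalize(old_text) == normalize(new_text)


if __name__ == "__main__":
    committed = "function f() {\n  return 1;\n}\n"
    reindented = "function f() {\n    return 1;\n}\n\n"
    edited = "function f() {\n  return 2;\n}\n"

    # A re-indented file can go into its own formatting-only commit.
    print(is_whitespace_only_change(committed, reindented))
    # A file whose behaviour changed belongs in a functional commit.
    print(is_whitespace_only_change(committed, edited))
```

If a file fails this check after what was meant to be a formatting pass, that is a hint that a functional change slipped in and should be split into its own commit.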

Working this way is awesome. I’ve been using SCMs for 10 years now and can’t imagine how I lived before. Creating copies of root directories? Sending ZIP files to myself? OMG.

And it’s so easy to jump on and use an SCM.

There are even more scenarios where an SCM really shines. When I code a feature, very often I experiment and try things out. I quickly create temporary code, probing code and debug statements, I comment out blocks here and there, and sometimes I don’t even bother naming variables properly if all I want is some proof that a technique I’m using really works. Since I keep my commits atomic (isolated), these experiments don’t get mixed up with other stuff. Then at the end, when I’m happy with the functionality, I just use GIT to show me the code differences between now and the last “stable”/“committed” version and I start cleaning up the junk: renaming variables and class/method names, deleting all debug statements etc. What I commit to GIT is a very clean version of the code.
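The clean-up pass described above can be partly automated. The following Python sketch compares the last committed text of a file with the working text and lists the added or changed lines that still contain debug leftovers. It works on two plain strings so that it runs without a repository; in practice the SCM’s own diff gives you the changed lines. The marker list and the function names `added_lines` and `leftover_debug_lines` are assumptions made for illustration only.

```python
import difflib

# Hypothetical markers; adjust to whatever your own probing code looks like.
DEBUG_MARKERS = ("console.log", "debugger", "print(", "TODO")


def added_lines(committed_text, working_text):
    """Return (line number in working copy, text) for lines added or changed since the commit."""
    old = committed_text.splitlines()
    new = working_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    result = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            for index in range(j1, j2):
                result.append((index + 1, new[index]))
    return result


def leftover_debug_lines(committed_text, working_text, markers=DEBUG_MARKERS):
    """Keep only the new lines that contain one of the debug markers."""
    return [
        (number, line)
        for number, line in added_lines(committed_text, working_text)
        if any(marker in line for marker in markers)
    ]


if __name__ == "__main__":
    committed = "function total(items) {\n  return items.length;\n}\n"
    working = (
        "function total(items) {\n"
        "  console.log(items);\n"
        "  return items.length;\n"
        "}\n"
    )
    for number, line in leftover_debug_lines(committed, working):
        print(f"line {number}: {line.strip()}")
```

Only lines that are new relative to the committed version are checked, so debug statements you deliberately committed earlier are not reported again.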

I could continue with more examples, but I don’t want this post to get too long. Hopefully I’ve covered the bigger advantages of using an SCM.

If you’re looking for a recommendation for a starter SCM, I wouldn’t necessarily recommend GIT, although as I said I use it, and many people will argue with me on that. Subversion is much more useful for single-developer scenarios, and its Windows tooling is superb, much better than GIT’s and Mercurial’s, in my opinion. Note: eventually you will progress to using the command line for SCM actions, but really don’t worry about it now.

Links

Two useful links to begin with:

  1. A Visual Guide to Version Control – explains the ideas regardless of any specific SCM or platform. At least skim through it quickly. It’s nicely worded and uses many graphical illustrations to explain the various points.

  2. The 10 commandments of good source control management – true for every environment but mostly focussed on Subversion, Windows and .NET so if that’s your environment I wouldn’t skip this one.

Good luck!

The logos at the beginning of this post are courtesy of: GIT: Jason Long; Mercurial: Matt Mackall; Subversion: Tyrus Christiana

Why You Should Use a Source Control Management System for Even the Simplest, Single-Developer Project