Basics of Source Control


Code For Cash

In most development organizations, source control plays a critical role in just about everything. Far from simply providing a place for code to be stored securely, source control in modern organizations underpins software build processes, automated testing, and deployment, and may even have legal implications. Source control has become a critical part of all software development and is one of those things you are expected to understand.

In addition to being a critical underlying technology for software operations in general, proper use of source control can provide an organization with significant advantages when it comes to the ability to maintain software that is already in use. It makes it far easier to respond to critical bugs, even if your team was in the midst of developing major new features that aren’t ready for prime time yet. Source control also facilitates software teams – without it, it quickly becomes very difficult to merge code written by multiple developers. Good source control systems make sharing and modifying code possible in a large team. Additionally, controls on such systems make it much easier to control the code that is being released, which not only helps with stability, but also makes releases easier.

As always, we’re taking many of our definitions from Wikipedia because they say it better than we can, and then explaining things in basic English. We’re going to organize this outline based on the typical process used to make a feature branch, do some work, and then get those changes out into the master branch, so that we aren’t spending half an hour boring you with definitions. We’re also leaning a bit towards Git, as that is one of the most common distributed source control systems, although most of this will apply to most version control systems.

Source control is a critically important part of any developer’s workflow. Like many tools, such as development environments, package managers, and the like, you can get by without it. However, it’s generally extremely painful to try to do so. Good practices around source control make it far easier for both you and your team to do your jobs well. There is no honor in failing to use good tools to do a good job; you aren’t proving anything by doing it the hard way. In fact, you may be doing things poorly by doing them the hard way.

Episode Breakdown

The Code Repository

Code is stored in a source control repository that can be thought of as a directed acyclic graph. The “trunk”, release branch, or master, reflects the version of the code that is currently in use in production. We’re going to call it the master branch. A commit to master tends to do things like trigger automated builds, unit tests, and deployments. This needs to be controlled carefully to make sure code is vetted before being released. The current version of master, or its head, is the most recent commit to master.

Branches come into play when you need to snapshot what is currently in production and make changes to that snapshot that will eventually end up in master. There will be a number of other branches that branch off of a particular commit to master (not necessarily the head). These are often referred to as “feature branches”. If you aren’t doing frequent deployments of the content of master or you need to maintain old versions of the code, these will often end up living in feature branches off of a prior commit. Some version control systems simply allow you to tag or label a commit as well. This is often used to mark versions that are in use.

Commits tend to be atomic and expressed in terms of changes to all the files in the repository. This seems counter-intuitive until you realize that most changes involve multiple files and that files are interdependent. Code is often moved between files as well.
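To make the graph idea concrete, here is a minimal sketch in Python. It is a toy model, not how Git or any other real system stores data; the class names, commit ids, and messages are all made up for illustration. It shows commits pointing back at their parents, branches and tags as names attached to a commit, and a feature branch created off of an older commit rather than the head.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """One node in the history graph; parents point back in time."""
    id: str
    message: str
    parents: tuple = ()  # no parents = first commit, two parents = merge commit


@dataclass
class Repository:
    commits: dict = field(default_factory=dict)   # commit id -> Commit
    branches: dict = field(default_factory=dict)  # branch name -> head commit id
    tags: dict = field(default_factory=dict)      # tag name -> commit id

    def commit(self, branch, commit_id, message, extra_parents=()):
        """Add a commit on top of the branch head and move the head forward."""
        head = self.branches.get(branch)
        parents = (() if head is None else (head,)) + tuple(extra_parents)
        self.commits[commit_id] = Commit(commit_id, message, parents)
        self.branches[branch] = commit_id

    def branch(self, name, start):
        """A new branch is just a new name pointing at an existing commit."""
        self.branches[name] = start

    def history(self, branch):
        """Every commit id reachable from the branch head, following parents."""
        seen, stack = [], [self.branches[branch]]
        while stack:
            cid = stack.pop()
            if cid not in seen:
                seen.append(cid)
                stack.extend(self.commits[cid].parents)
        return seen


repo = Repository()
repo.commit("master", "c1", "initial release")
repo.commit("master", "c2", "hotfix")
repo.tags["v1.0"] = "c1"          # label the version that shipped
repo.branch("feature", "c1")      # branch off an older commit, not the head
repo.commit("feature", "c3", "new feature work")

print(repo.history("master"))     # ['c2', 'c1']
print(repo.history("feature"))    # ['c3', 'c1']
```

Note that the feature branch never sees the hotfix in c2, because c2 is not reachable from its head. That gap is exactly why master eventually has to be merged down into the feature branch.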

The feature branch

To begin, you’ll make a new branch for your chunk of work. You may do this off of the head of master if you are doing new work. For patches, you may do this off of a previous commit on master or another branch entirely.

Creating the branch can be done locally or within the source control system. If you are doing branch per feature and have additional per-branch configuration that needs to happen, this will often occur at the source control repository, rather than locally. If the feature branch itself is going to be tested by QA, then you may create additional branches off of it for smaller sets of work and use a PR system to merge back into the feature branch (more on this in a second).

The feature branch will eventually merge back into master. Once a feature branch is completed, you will do a pull request back into master. Pull requests allow your code to be reviewed by your team and act as a gate control on stuff entering the master branch (and/or triggering builds, releases, etc.). You’ll tend to work on and test the feature branch in isolation because it makes it easier to reason about things when they don’t work – you know you are the one that screwed up, rather than some other person.

Master may be merged down into the feature branch. This often happens before you merge the feature back into master to avoid nasty conflicts and to test how changes that happened to master in the intervening time may interact with changes to the feature. This is often accomplished by making a branch off the head of the feature branch, merging master into it, then doing a PR back into the feature branch. The altered feature branch will then be PR’ed into master. This allows multiple levels of code review and control and can keep changes from being automatically deployed before they need to be.

Doing the work.

With the feature branch pulled down to your local environment, you’ll start working. If commits to the feature branch result in code being deployed, you probably will want to work in a branch off of the feature branch. We’ll call this your “working” branch. During your work, you’ll periodically commit to your working branch. This makes it easy to roll back changes locally if you need to do so. Your working branch will be for doing some testable subset of the total work you are doing. If it’s a small thing, the working branch may be for all of it, but if it’s big, you might do multiple working branches during the course of finishing things.

Once done with a set of work in your working branch, you’ll push this up to the source control system and do a pull request into the feature branch. It’s probably not a bad idea to regularly push your working branch up to the source control system so that you have backups. If other people are working on their own working branches off of the feature branch, you may need to periodically update the feature branch locally and merge it into your working branch before doing a pull request from working to feature. You issue a pull request to tell other people that your code is ready to be merged into the feature branch. This allows for code review and also acts as a brake on any automated processes that are triggered from a commit to the feature branch itself.

This is a good place to insert integration testing and code quality control. You may not want to write big integration tests on volatile working branches, and if you use them, you want them to apply before stuff goes into master. This tends to mean putting them on feature branches. You should also be managing your testing with source control, and it’s best for tests to travel with the code under test, just for sanity’s sake.

The Feature Goes to Master

Once the feature branch has been deployed somewhere and tested, you need to get its changes out to the master branch. Unless your feature branch is extremely short-lived (or you are the only person on your team), it’s likely that your master branch has changed since your feature branch was created. This means that you are going to have to do some merging. You also don’t want to have master in a broken state while this is happening.

You have to merge master in. Here’s what this might look like. You pull the latest version of your feature branch locally. You pull the latest version of master locally. You make a branch off of your feature branch for the merge and switch to it. You merge master into this branch, and fix all the merge conflicts you get. You then push that branch up and do a PR from it back to the feature branch.
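Since we lean towards Git, here is a rough sketch of what those steps could look like as Git commands. The Python function only builds the command list and does not run anything. The remote name "origin" and the branch names are assumptions for the example, so substitute whatever your team uses.

```python
def merge_master_into_feature(feature, merge_branch, master="master", remote="origin"):
    """Return the Git commands for the steps above, in order, without running them."""
    return [
        ["git", "checkout", feature],
        ["git", "pull", remote, feature],                   # latest feature branch
        ["git", "checkout", master],
        ["git", "pull", remote, master],                    # latest master
        ["git", "checkout", "-b", merge_branch, feature],   # new branch for the merge
        ["git", "merge", master],                           # fix any conflicts here
        ["git", "push", "-u", remote, merge_branch],        # then open a PR into feature
    ]


# Hypothetical branch names, purely for illustration.
for command in merge_master_into_feature("feature/login", "merge/master-into-login"):
    print(" ".join(command))
```

Opening the pull request itself happens in whatever hosting tool you use, so it is left out of the command list.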

Regarding Merge conflicts

When you are merging branches, you’ll eventually get merge conflicts of some sort. This occurs when someone else has made a change to the source branch very close to where you’ve made a change in your branch. Typically, tools will allow you to compare the changes and select one or both to include in the new version. You will need to run your application and all of your tests after any merge, but especially after one with conflicts, as it’s easy to cause a build error when resolving conflicts.
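To show where conflicts come from, here is a deliberately simplified three-way merge in Python. It compares both edited versions against their common starting point one line at a time. The big assumption is that all three versions have the same number of lines, so it only handles in-place edits; real merge tools also deal with inserted and deleted lines and work on whole regions of a file.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: all three versions have the same number of
    lines, so only in-place edits are handled (no insertions or deletions).
    Returns (merged_lines, conflicts), where conflicts lists line indexes.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles same-length versions")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither side changed it)
            merged.append(o)
        elif o == b:      # only their side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both changed it differently: a person has to choose
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts


base = ["a = 1", "b = 2", "c = 3"]
ours = ["a = 1", "b = 20", "c = 30"]
theirs = ["a = 10", "b = 2", "c = 33"]

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # ['a = 10', 'b = 20', None]
print(conflicts)  # [2]
```

The first two lines merge cleanly because only one side touched each of them. The third line was changed on both sides, so the tool cannot pick a winner, and that is the part you resolve by hand and then retest.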

Merge conflicts need to be handled carefully with regard to the rest of your team as well. Make sure that conflicts are noted and the owners of the conflicting code are consulted. Be especially alert for regressions in places that call into code that had conflicts; it’s easy to get burned with this.

Some general rules with source control for yourself.

Prefer short-lived feature branches to long-lived feature branches. Have processes that gate entry into any branch that deploys to a machine used in production or testing. Delete feature branches once they have been abandoned for a specific period of time to avoid clutter and confusion.

Only commit completed, logical units of work to a feature branch. Don’t include multiple things unless they are very small. Don’t include partial fixes. Don’t commit generated code; it just adds noise.

Commit frequently and push to a working branch in the source control system. This gives you backups and easy rollbacks. Don’t leave the office without checking in and pushing up to something that is backed up. That way a coworker can take over if they need to.

Write commit messages and make sure they make sense. Don’t commit large binaries, such as packages, because they increase noise. Run static code analysis, tests, and other sanity checks before submitting a pull request. Never release anything to clients that isn’t in source control and tagged. You probably shouldn’t share a working branch with your teammates.

Source control rules for teams.

Watch for stale branches. Long-lived branches that haven’t been touched in a while indicate that they are either abandoned, or someone isn’t checking in. Clutter invites mistakes.
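Watching for stale branches is easy to automate. Here is a small sketch, assuming you have already collected the date of the most recent commit on each branch from your source control system. The 30 day cutoff, the protected branch list, and the branch names are all made-up values for the example.

```python
from datetime import datetime, timedelta


def stale_branches(last_commit_by_branch, now, max_age=timedelta(days=30), protected=("master",)):
    """Return the names of branches whose newest commit is older than max_age."""
    return sorted(
        name
        for name, last_commit in last_commit_by_branch.items()
        if name not in protected and now - last_commit > max_age
    )


# Hypothetical data; in practice this would come from the repository.
now = datetime(2024, 6, 1)
last_commits = {
    "master": datetime(2024, 1, 1),
    "feature/search": datetime(2024, 5, 28),
    "feature/old-report": datetime(2024, 3, 1),
}
print(stale_branches(last_commits, now))  # ['feature/old-report']
```

A report like this is a conversation starter rather than a delete list. The branch owner still needs to say whether the work was abandoned or just never pushed.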

Back up your source control system regularly. Daily at the very least. Use changes in your source control system to trigger builds and deployments rather than doing it manually. Have a consistent workflow within your team and enforce it. Have a means of locally checking code quality and correctness before pushing into the build pipeline. Enforce standards on commit messages so that you can tell what’s going on. Have separate repositories for separate things. Don’t spread the same thing across multiple repositories. If you want to see weird errors, have independent parts of the system changing separately. The unpredictable will happen frequently if you do that. Make every developer capable of stopping a release due to code quality.
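Commit message standards are one of the easier rules to check automatically. The sketch below uses made-up rules (a 72 character subject limit, a short list of vague subjects, and a blank line before the body) only to show the shape of such a check. Your team’s actual standard will differ.

```python
# Hypothetical list of subjects that say nothing about the change.
VAGUE_SUBJECTS = {"fix", "fixes", "wip", "stuff", "changes", "update"}


def check_commit_message(message, max_subject_length=72):
    """Return a list of problems; an empty list means the message passes."""
    lines = message.strip().splitlines()
    if not lines:
        return ["message is empty"]
    problems = []
    subject = lines[0].strip()
    if len(subject) > max_subject_length:
        problems.append("subject line is too long")
    if subject.lower().rstrip(".") in VAGUE_SUBJECTS:
        problems.append("subject line does not say what changed")
    if len(lines) > 1 and lines[1].strip():
        problems.append("leave a blank line between subject and body")
    return problems


print(check_commit_message("wip"))
print(check_commit_message("Fix null check in login handler\n\nThe handler crashed on empty input."))
```

A check like this can run locally before a push or as part of the build pipeline, which fits the earlier point about checking quality before code enters the pipeline.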

Use source control for more than code.

In addition to your code, you have a number of software assets that can be source controlled. Documentation is a big one and should ship with your code. This helps ensure that the docs match the version that the client has. This gets even more important if you have multiple languages in the mix.

Non-sensitive configuration information should often be source controlled as well. This includes everything from package management files to Docker configs. Obviously, don’t include anything with credentials, keys, and the like in there. Keeping configuration in source control allows you to recreate environments as they were at a specific time, configured as they were configured at that time. This is really nice when you have cloud environments like Amazon in the mix.
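One way to keep credentials out of the repository is to scan configuration files before they are committed. Here is a minimal sketch; the two patterns are illustrative assumptions and will miss plenty of real secrets, so treat it as a starting point and not as a complete scanner.

```python
import re

# Hypothetical patterns for illustration; a real scanner needs a broader set.
SECRET_PATTERNS = [
    re.compile(r"(?i)(password|passwd|secret|api_key|token)\s*[:=]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def find_secrets(text):
    """Return the 1-based line numbers that look like they hold a credential."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]


# Made-up config content for the example.
config = "image: app:1.2\nport: 8080\npassword: hunter2\n"
print(find_secrets(config))  # [3]
```

Anything the scan flags should move to wherever your team keeps sensitive settings, with only a non-sensitive reference left in the committed file.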

Tests should also be source controlled, and live alongside the code under test. This is primarily because test code is code itself and needs many of the same tools and tactics. This also makes it easier for developers to run some level of testing locally where possible, rather than waiting on the QA team to do it.

Sample application data is often handy to keep in source control as well. In addition to being used by tests, sample data also allows developers to quickly spin up a working system from scratch. This makes it easier to completely wipe a database if you screw up, and then recreate it.

Book Club

The Healthy Programmer: Get Fit, Feel Better, and Keep Coding

Joe Kutner

Chapter 6 addresses a common problem among developers, back pain. It starts off talking about the invention of the harpsichord and how musicians had to develop proper techniques for playing the keyboard without pain. Since then back pain has become the number 2 reason for doctor visits. The first section talks about Unit Testing Your Core Muscles. In it author Joe Kutner discusses the Kraus-Weber (K-W) Test of Minimum Muscular Fitness. This measures strength and flexibility of large muscle groups. Kutner points out that in clinical trials people with back pain were unable to pass the tests, then after time exercising were able to pass the tests and experienced reduced or no back pain. However, the people who did not continue to exercise had a return of back pain years later. He goes on in the next section to describe the anatomy of the back. Well, it’s a rough description. He does attempt to explain the concept of referred pain. Basically the nerves that innervate the muscles of the back also contain sensory innervation for the back, but not always in the same location as the muscles. In the third section of this chapter Kutner describes several exercises that President Kennedy used in the White House to help his back pain. In the final section before the retrospective he talks about developing better ergonomics and posture. While posture is dependent on the individual, he gives a few common rules to follow. These include supporting your body, distributing weight evenly, and keeping your feet on the ground. Throughout the chapter he emphasizes exercising as a way to strengthen your core and prevent back pain.

Tricks of the Trade

A long enough feedback loop is indistinguishable from no feedback loop; a short enough feedback loop is indistinguishable from spontaneous correct action in the first place.

Editor’s Notes:

>Beej is working on setting up his audio recording environment with his new microphone.
