There’s an old joke that goes something like this:
Proposition one: all programs have bugs.
Proposition two: all programs can be shortened by one line.
Conclusion: every program can be reduced to one line of buggy code.
Corny, I know ;-)
But hey, there is a point in the life of every piece of software when the entire system consists of one line of code. That time is at the very beginning of a project, when one has just typed the first bit of code into one’s text editor.
You might be wondering: why are we talking about projects that contain only one line of code? How could automation possibly help there? And wouldn’t it be overkill to set up tooling to support a trivially small, new project?
I’ll explain how two automated tools can help you maintain a project, even at the point where you’ve just typed your first line of code. These tools are code review and static analysis.
Start By Establishing A Culture Of Code Review
Recently I sat and talked with Erik Kastner about his thoughts on code review and testing. Erik’s a thoughtful, experienced guy and after working with him for a couple of years I have found that his opinions have become very important to me. Erik says great stuff like “code review is just reading someone else’s code and understanding it before it ships.”
That’s one of the things I like about Kastner — he does more than just propound his methodology. When Erik talks about how he thinks software engineering should or shouldn’t work, he always qualifies his statements. And there can be some pretty surprising insights wrapped up in those qualifications.
So let’s look at this statement again:
Code review means reading and understanding someone else’s code.
This implies that if you and I are working on a project together, you’re going to read my diffs before I commit or merge them into trunk. We might be doing the review in FishEye or within a GitHub pull request, or you might just be looking at my commits in our SCM.
But I expect you to do more than just read my changesets. I also expect you to fully comprehend how the diffs I’m showing you are going to change the behavior of the system. Like many of Kastner’s qualifications to software methodologies, this is a subtle but large distinction.
Ideally every changeset I write gets reviewed by someone else before it goes to production. This is the practice at a lot of large, successful organizations like Google and the JPL. Having a human review every changeset does impose an upper limit on how fast you can deploy code to production. For a new, relatively small project, you might feel that reviewing every changeset is too heavyweight. And you might be right. But keep in mind that it’s a lot easier to put this kind of process in place at the beginning than it is to wait until your application is mature — and you’re definitely going to want a code review process in place at that point.
Now consider the case where I have written the following one line of code and I ask you to review it. This is a trivial case of course, but I hope it’s still illustrative of why you should spend the time to set up these tools before you write a single line of code. Anyway, here’s my changeset, would you review it before I push it to prod?
<?php echo "hello world''
Did you catch both of the errors in my code? Probably you did. And I’m sure you noticed the missing semicolon immediately. But did it take you just a moment longer to realize there was something wrong with that closing double quote? If it did, then you were experiencing a trivial increase in cognitive load.
As our application gets larger and my changesets grow in complexity, you’re going to have to endure a greater and greater amount of cognitive load every time you review and debug one of my changesets. That’s not great. You’re a good hacker and our project is going to win because you’re using your whole brain to think about solving hard problems. It’s too bad that instead our new code review process is causing you to fill your brain with thoughts about whether or not I got my punctuation right.
Besides, checking other people’s syntax is boring drudge work and drudgery is evil. So it’s actually really important that we take a little bit of time at the beginning of our project to make sure that code review imposes as little unnecessary cognitive cost as possible.
Both of the errors I made above actually cause the PHP interpreter to barf. If the interpreter can reject them, there must be a way to catch those errors programmatically. And of course there are several open source tools to help us do exactly that. But the simplest option is to just use the PHP interpreter’s built-in syntax checker:
php -l index.php
Parse error: syntax error, unexpected $end, expecting T_VARIABLE or T_DOLLAR_OPEN_CURLY_BRACES or T_CURLY_OPEN in foo.php on line 3
Errors parsing index.php
Great. Just by running php -l on my code before you review it, you can now avoid winding up as a human syntax-checker. This saves us both time and frustration as we continue to work on our project. Even better, I could run the syntax check on my own code before I send it over to you for review.
It’s worthwhile for us to informally agree that we won’t bother reviewing any code that doesn’t pass a syntax check.
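To make that agreement cheap to keep, the check can be wrapped in a small shell function that lints every file in a changeset in one go. This is only a sketch: the function name lint_changeset and the PHP_BIN override are made up for illustration, and it assumes php is on the PATH.

```shell
#!/bin/sh
# lint_changeset: run `php -l` on each file passed as an argument.
# Hypothetical helper, not part of PHP. PHP_BIN is an assumed override
# for the interpreter path and defaults to plain `php`.
lint_changeset() {
    failed=0
    for file in "$@"; do
        # `php -l` only parses the file; it exits non-zero on a syntax error.
        if ! "${PHP_BIN:-php}" -l "$file"; then
            failed=1
        fi
    done
    # 0 if every file parsed cleanly, 1 if at least one did not.
    return "$failed"
}
```

I might then run something like lint_changeset index.php && echo "ready for review" before asking you to look at anything.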
Is It Worth Automating Static Analysis At This Point?
So we’ve made an agreement to always run static analysis on our code before asking someone else to review it. This implies that any code we deploy to production will have been run through static analysis at least once. Even though our project and our team are small, we’ve managed to put in place some important cornerstones on which we can build a healthy engineering culture.
We could codify our new agreement by writing it down in our wiki (if we have one). Another way to codify our contract would be to set up a CI server and configure it to fail the build if anyone commits a file that doesn’t pass the syntax check. Yet another way to do this would be for each of us to run watchr on our laptops, and configure it to throw up a Growl alert whenever the syntax check fails. We can pick one of these automated solutions, spend a couple of hours setting it up and get its benefit throughout the life of our project. So that seems like a worthwhile thing to do, even though so far we only have one line of code.
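For the CI option, the build step mostly needs a command that exits non-zero when any file fails the syntax check. Here is one rough way that could look, assuming a POSIX shell and php on the build machine; lint_tree and PHP_BIN are names I've invented for the example, and how the step gets wired into a particular CI server is left out.

```shell
#!/bin/sh
# lint_tree: syntax-check every .php file under a directory (default: .).
# Hypothetical helper; PHP_BIN is an assumed override for the interpreter.
# Paths containing newlines are not handled.
lint_tree() {
    find "${1:-.}" -type f -name '*.php' | (
        failed=0
        while IFS= read -r file; do
            # Redirect stdin so the interpreter cannot consume the file list.
            "${PHP_BIN:-php}" -l "$file" < /dev/null || failed=1
        done
        # The subshell's exit status becomes the function's status.
        exit "$failed"
    )
}
```

A build script could call lint_tree . || exit 1, so a single unparseable file is enough to turn the build red.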