Dreadful news, everyone. There’s an insidious new computer virus on the loose that has only one target on its kill list - your precious, precious source code. As it rampages its way across the world, it scrambles anything that looks like a programming language - and won’t terminate until every line, token and delimiter is muddled up. (Yes, that includes Brainf*ck, and isn’t even fooled by Shakespeare.)

But also… good news, everyone! This possibly definitely not made up virus has a significant flaw - a critical bug in its code means it is oblivious to comments, and is incapable of editing them. How foolish of its creators!

OK, obviously that’s all lies - as far as I know there is no such virus on the loose. It just so happens that the older I get, the more I appreciate the value of the very comments that such a dastardly piece of work would mercifully leave behind. Good comments can and do make the difference between a good day and a bad day (especially if I was the fiend who neglected to write them!).

“Good comments” are hard to define, and if you’ve spent any time at all in software engineering, chances are you’ve heard all sorts of different definitions of what constitutes “well-commented” code from blogs, books and various ends of the spectrum - from “every line needs a comment” (srsly) through to “comments are a scourge on the universe and must be autonomously eliminated en masse, wait - that gives me a devilish idea…”

Anyway, this is how I’ve ended up adding comments to the code I write or work with:

Firstly, I assume that the reader (even if it’s me) does not know (or remember…) much, if anything, about the code itself. My aim is to provide just enough information in the comment for someone to write code that would produce the same or similar result as what’s already there - plus any context or gotchas that might be useful to someone reimplementing it. Say, because a horrible virus mangled it.

Secondly, because code is for computers (mostly) and comments are for humans (mostly), I’ve adopted a more colloquial style - as if I’m sitting next to the reader, guiding them through the code. I value readable, friendly, clear, and glanceable comments; scanning through the comments should be as valuable as reading the code itself, but orders of magnitude quicker.

My final test is this: do the comments contain sufficient instruction and context to allow me to write the code again and be reasonably sure of the same result? Would the same edge cases and bizarre or business-critical legacy behaviours be retained?

(Obviously, software, situations, and people differ, and this is merely the style I try to adopt when given the freedom to do so. Your project or employer may have very different - quite likely better, regulation-mandated, or suspiciously contradictory - standards. Always defer to them first.)

Go does it best

I think Go strikes the perfect balance when it comes to comments.

For the uninitiated, Go has very few formal conventions when it comes to comments - there are no byzantine XML schemas to write against, nor oceans of @-tags that the linter insists must be adhered to for that document generation tool you never look at the output of.

Nope, none of that.

To document a thing in Go, all you need is // Thing does blah blah or // Thing is bleh bleh above the thing, and GoDoc works out the rest (if you even care for GoDoc at all). It’s quick, easy, concise, and it makes commenting human again.
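As a minimal sketch of that convention (Rect and Area are made-up names for illustration, not taken from any real project), a documented type and method need nothing more than a plain sentence that starts with the name of the thing:

```go
package main

import "fmt"

// Rect is an axis-aligned rectangle measured in arbitrary units.
type Rect struct {
	Width, Height float64
}

// Area returns the area of r, or zero if either side is negative, because a
// rectangle with a negative side is treated as empty rather than as an error.
func (r Rect) Area() float64 {
	if r.Width < 0 || r.Height < 0 {
		return 0
	}
	return r.Width * r.Height
}

func main() {
	fmt.Println(Rect{Width: 2, Height: 3}.Area()) // prints 6
}
```

No tags, no schema - just a sentence or two a human can read, which the documentation tooling picks up as-is.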

Examples

Diving into some of my old (and regretfully unmaintained) Go code on GitHub, I tried to find places where I had fallen foul of my own advice in the past, and landed upon ocrpdf, a tool for converting document scans into searchable PDFs.

I found a seemingly important function that made me scratch my head. Scanning through, I couldn’t understand what it did or why it did it. It had the following succinct, context-free comment:

// AddWords adds the specified words to the page.
func (d *Document) AddWords(words []Word) {

As I spent more time looking at the function, I was finally able to work out what it does, and it turned out to be critical to the functionality of the tool.

Today, I would rewrite it like this:

// AddWords adds each of the given words to the document as cells with the same
// approximate position and size as the original word in the scanned image, allowing
// them to be selected/highlighted like normal text.
func (d *Document) AddWords(words []Word) {

Suddenly, the rest of the function made sense, and I could take a pretty good stab at rewriting it from scratch if I had to. Had I spent a few extra moments when I first wrote the code, I could have saved myself much more time later on.

Delving deeper into this function, I also sniffed out this:

// Scaling factors
sx, sy := 1.0, 1.0

This terse-yet-accurate comment is what I’d describe as ‘mechanical’ rather than ‘human’ - it supports the code, rather than the reader. It doesn’t pass my test, so I’d choose to rewrite it like this:

// Default to 100% scaling on both the X and Y axes.
sx, sy := 1.0, 1.0

Suddenly we have a comment that not only describes what is going on, but helps document the process that this AddWords function is performing.

The rest of the function wasn’t much better - again, today, I’d be a lot more verbose, as in the corrected snippet below:

// Get coordinates of the word's bounding box within the scanned image.
x, y := float64(word.Left), float64(word.Top)
w, h := float64(word.Width), float64(word.Height)

// Get expected dimensions of the text at the current font size.
sw := pdf.GetStringWidth(word.Text)
_, sh := pdf.GetFontSize()

if sw == 0 {
    // Use the original width of the word if the text is empty (or unknown)
    sw = w
}

// Determine size and shape of cell depending on the scaling mode selected by
// the user.
switch d.textScaling {
case ContainTextScaling:
    // Scale cell until it fits within the bounding box of the original
    // word, but without stretching the text.
    if sw*h > sh*w {
        // Use full width of the box if the text is wide/short,
        // leaving a gap underneath the cell.
        sx = w / sw
        sy = sx
    } else {
        // Otherwise use the full height of the box if the text is
        // thin/tall, leaving a gap on the right of the cell.
        sx = h / sh
        sy = sx
    }
case MatchTextScaling:
    // Scale cell to match the bounding box of the original word
    // exactly, even if the text is stretched as a result.
    sx = w / sw
    sy = h / sh
}
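To check that the comments really do carry enough to rebuild the logic, here is a rough standalone sketch of the scaling arithmetic written from those comments alone. The function name scaleFactors and the boolean contain flag are my own stand-ins for illustration; they are not part of ocrpdf.

```go
package main

import "fmt"

// scaleFactors returns the horizontal and vertical scaling needed to place
// text whose natural size is sw x sh into a bounding box of size w x h.
//
// When contain is true, both axes share one factor so the text is never
// stretched; it fills the box in one direction and leaves a gap in the other.
// When contain is false, each axis is scaled independently so the text
// matches the box exactly, even if that distorts it.
func scaleFactors(w, h, sw, sh float64, contain bool) (sx, sy float64) {
	if !contain {
		return w / sw, h / sh
	}
	if sw*h > sh*w {
		// Text is proportionally wider than the box: fit to the width.
		sx = w / sw
	} else {
		// Text is proportionally taller than the box: fit to the height.
		sx = h / sh
	}
	return sx, sx
}

func main() {
	// Text that is naturally 50x20 inside a 100x20 box.
	fmt.Println(scaleFactors(100, 20, 50, 20, true))  // prints 1 1
	fmt.Println(scaleFactors(100, 20, 50, 20, false)) // prints 2 1

	// Text that is naturally 200x20 inside a 100x40 box.
	fmt.Println(scaleFactors(100, 40, 200, 20, true)) // prints 0.5 0.5
}
```

In the last case the text shrinks to 100x10 inside a 100x40 box, which is the "gap underneath the cell" that the comment warns about.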

You get the idea.

tl;dr: Write comments as if there’s an evil virus rummaging through your source files, messing up all your code. What comments would you wish you had left?