Git log as archaeology

The source file you’re looking at is a summary. The history is the full document. Most of the time you don’t care – you’re working on the current shape of the code and the summary is enough. But sometimes the current shape stops answering.

I reach for git history during RCAs, bug hunts, and questions the code can’t answer from its current form. Why is this file organised this way? Who introduced this assumption? When did this fallback stop being a fallback and start being the main path? The commit log knows. The current source doesn’t.

Git is the only archaeology you have. The people who had the original conversation are gone, or the conversation was never had out loud.

Reading one file backwards

The pattern of discovery is always the same shape. You have a question the current code can’t answer. You find a commit that touched the relevant line. The commit message is terse – it wasn’t written for you – so you read the diff. The diff is almost always more informative than the current source.

Three commands earn their place in this practice.

git log -S "string" – the pickaxe. Finds commits where the number of occurrences of a string changed, that is, commits that added or removed it. Most people use it for “when did this function first appear,” but it’s more useful than that: it finds when a concept entered the codebase. Search for a config key, an error message, a magic number – the commit that introduced it will usually carry the reasoning in the message or the diff.

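git log -S "EnableBackgroundRetry" --all --source -p

-p shows the diff inline. --all searches every ref rather than just the current branch, and --source labels each hit with the ref it was reached from, which only tells you something once more than one ref is being searched. For a design question, this is usually faster than grep plus git blame.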
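To see the pickaxe's behaviour in isolation, here is a minimal sketch in a throwaway repository. The file name config.txt, the commit messages, and the identity are invented for illustration; only the git flags are real.

```shell
set -e
# Throwaway repo; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "timeout=30" > config.txt
git add config.txt
git commit -q -m "Add config"

echo "EnableBackgroundRetry=true" >> config.txt
git commit -q -am "Retry in background after upstream timeouts"

echo "timeout=60" >> config.txt
git commit -q -am "Raise timeout"

# -S reports only the commit that changed how often the string occurs.
# The later "Raise timeout" commit touches the same file but is skipped.
pickaxe_hits=$(git log -S "EnableBackgroundRetry" --format=%s)
echo "$pickaxe_hits"
```

If you want every commit whose diff mentions the string on a changed line, even when the count stays the same, -G with a regex is the neighbouring option.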

git log --follow path/to/file – tracks a file across renames. This matters more than people think. A reorganisation can hide history from any tool that treats the current path as identity. I’ve spent time convinced a file had barely been touched only to find it had been renamed months earlier and heavily modified before the rename. --follow catches this. Tools that build their own history graph usually don’t.
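A small sketch of the rename trap, again in a scratch repository. The file names old_name.txt and new_name.txt and the commit messages are made up; the point is only the difference in commit counts.

```shell
set -e
# Throwaway repo; file names and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

printf 'line one\nline two\nline three\n' > old_name.txt
git add old_name.txt
git commit -q -m "Write the original logic"

git mv old_name.txt new_name.txt
git commit -q -m "Reorganise layout"

printf 'line four\n' >> new_name.txt
git commit -q -am "Small tweak"

# Path-limited log stops at the rename; --follow keeps walking past it.
without_follow=$(git log --oneline -- new_name.txt | wc -l | tr -d ' ')
with_follow=$(git log --oneline --follow -- new_name.txt | wc -l | tr -d ' ')
echo "without --follow: $without_follow, with --follow: $with_follow"
```

Here the plain log reports two commits and --follow reports three. Note that --follow only works when you give it a single file.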

git blame -C -w – the two flags that matter. -C detects lines moved or copied within a file and from other files changed in the same commit (pass it up to three times to search more commits for the source). -w ignores whitespace-only changes. Without these, a reformat buries the real author and you end up blaming whoever ran gofmt. An extract-method refactor is worse – every extracted line points to the refactorer, and the person who wrote the original logic is invisible. With -C -w, the blame walks back through both.
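A sketch of the extract-refactor case, assuming two invented authors (Alice writes the logic, Bob moves it into a new file) and invented file contents. The moved lines are deliberately long, because blame ignores moved chunks with too few alphanumeric characters.

```shell
set -e
# Throwaway repo; authors, files, and contents are all hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Committer"
git config user.email "committer@example.com"

printf '%s\n' \
  'retry_with_backoff: wait 100ms, double on every failure, cap at 30s' \
  'retry_with_backoff: give up after 5 attempts and surface the last error' \
  'unrelated helper that stays behind in the original file' > a.txt
git add a.txt
GIT_AUTHOR_NAME=Alice GIT_AUTHOR_EMAIL=alice@example.com \
  git commit -q -m "Add retry logic"

# Bob extracts the two retry lines into a new file in a single commit.
grep retry_with_backoff a.txt > b.txt
grep -v retry_with_backoff a.txt > a.tmp
mv a.tmp a.txt
git add a.txt b.txt
GIT_AUTHOR_NAME=Bob GIT_AUTHOR_EMAIL=bob@example.com \
  git commit -q -m "Extract retry logic"

# Plain blame credits the person who moved the lines;
# -C traces them back to the file they came from.
plain=$(git blame --line-porcelain b.txt | sed -n 's/^author //p' | sort -u)
with_c=$(git blame -C -w --line-porcelain b.txt | sed -n 's/^author //p' | sort -u)
echo "plain blame: $plain"
echo "blame -C -w: $with_c"
```

A single -C is enough here because a.txt was modified in the same commit that created b.txt. When the source file was left untouched, that is when the extra -C repetitions start to matter.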

The observation underneath the commands: commit messages are almost always terse, and diffs are almost always more informative than the current source. The summary gets dated as the code moves on. The diff doesn’t – it’s a fixed record of a specific change, the moment someone decided this.

What merge commits don’t say out loud

A different kind of archaeology. Not one file across time, but the team across moments of almost-breaking.

main is clean now. The current source never shows you the six reverts it took to get there, or the branch that merged and was backed out the next day, or the week the build was red. Merge commits are the only place that record survives.

A few things they quietly carry:

Reverted merges. “We tried this and backed out.” That decision never appears in a source file. It lives only in the history as a merge followed by its revert, usually within days. When I’m designing something that resembles a previous attempt, the revert is the first thing I want to find. It tells me what the team learned the hard way and what shape of the problem will bite again.
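One way to surface these, sketched in a scratch repository with an invented feature branch. It relies on the default message that git revert writes (a subject starting with Revert), so it will miss reverts whose messages were rewritten by hand.

```shell
set -e
# Throwaway repo; branch name, file, and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "new cache layer" >> app.txt
git commit -q -am "Add cache layer"

git checkout -q -
git merge -q --no-ff -m "Merge branch 'feature'" feature

# Back the merge out again; -m 1 keeps the mainline side.
git revert --no-edit -m 1 HEAD > /dev/null

# Search subjects for the default revert wording.
reverts=$(git log --grep='^Revert' --format=%s)
merges=$(git log --merges --format=%s)
echo "$reverts"
```

On a real repository I would add --format='%h %ad %s' --date=short to see how quickly each revert followed its merge.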

Ugly conflict resolutions. Two engineers working on the same surface without coordinating. The code in main is clean now – whoever resolved the conflict made it clean – but the merge commit shows you the collision. Both sides, the chosen resolution, and often a subtle drift where the resolution didn’t quite preserve the intent of one branch. When bugs cluster in a region, check whether that region was recently the site of a contested merge.
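To look at the collision itself, git show --cc on the merge commit prints a combined diff against both parents. The sketch below manufactures a conflict over an invented settings.txt and resolves it with a value neither side wrote, which is exactly the kind of drift worth spotting.

```shell
set -e
# Throwaway repo; file, values, and branch name are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "timeout=30" > settings.txt
git add settings.txt
git commit -q -m "Initial settings"

git checkout -q -b feature
echo "timeout=60" > settings.txt
git commit -q -am "Feature: raise timeout"

git checkout -q -
echo "timeout=45" > settings.txt
git commit -q -am "Mainline: tune timeout"

# The merge conflicts; whoever resolves it picks a third value.
git merge feature > /dev/null 2>&1 || true
echo "timeout=50" > settings.txt
git add settings.txt
git commit -q -m "Merge branch 'feature'"

# Combined diff: one column of +/- markers per parent.
# Lines marked ++ exist in the result but in neither parent.
resolution=$(git show --cc --format= HEAD)
echo "$resolution"
```

Across a whole history, git log --merges --cc -p does the same for every merge. Newer Git releases (2.36 and later, as far as I know) also offer --remerge-diff, which re-runs the merge and shows how the committed result differs from it.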

Long-lived branches that never cleanly landed. A feature the team couldn’t close. You see the merge finally happen, but the shape is wrong – partial-land, a whole subsystem disabled, a stub where the hard part was. The diff on the merge tells you which piece got cut to get the rest over the line.

Squash-only workflows that flatten authorship. If every PR is squashed into a single commit, git shortlog may reflect who merged, not who wrote, depending on how the squash records the author. Before drawing conclusions about ownership, check the merge strategy. A quick shortlog with one dominant name might be one engineer writing everything, or it might be one engineer merging everyone else’s work. Different situations, same output.
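A quick check before trusting the names: compare the author view with the committer view. The sketch below fakes three squashed PRs with invented people (Alice and Bob write, Merger lands everything). Your hosting tool may record these fields differently, so treat it as a way to look, not a description of any particular platform.

```shell
set -e
# Throwaway repo; all identities and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q

i=0
for author in Alice Bob Alice; do
  i=$((i + 1))
  echo "change $i" >> app.txt
  git add app.txt
  GIT_AUTHOR_NAME="$author" GIT_AUTHOR_EMAIL="$author@example.com" \
  GIT_COMMITTER_NAME=Merger GIT_COMMITTER_EMAIL=merger@example.com \
    git commit -q -m "Squashed PR $i"
done

# shortlog groups by author by default; -c groups by committer instead.
# HEAD is passed explicitly because shortlog reads stdin when it has no revision.
by_author=$(git shortlog -sn HEAD | awk '{print $2}' | xargs)
by_committer=$(git shortlog -sn -c HEAD | awk '{print $2}' | xargs)
echo "by author:    $by_author"
echo "by committer: $by_committer"
```

If the two lists match and one name dominates both, the history alone can’t tell you who wrote what, and the PR records are the next place to look.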

The observation underneath: merge commits are where team dysfunction becomes visible. Not individual bugs – collective patterns. The shape of how a team merges tells you more about the team than any individual commit does.

Three lenses

Three ways of reading, for three kinds of questions:

The current source shows you what.
git log on a file shows you what used to be and why it changed.
git log on merges shows you what almost happened and what the team argued about.

When the code stops answering, read backwards. The diary nobody asked for is the only diary you have.