Longer Form Thoughts on Naming and a Potential Alternative Emphasis

There is a popular quote that occasionally gets tossed around among software developers:

There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton

When it is invoked to emphasize the latter of the two hard things (naming), it is often in response to the frustration of trying to mentally evaluate source code or read documentation that someone else wrote.

Naming is hard

Here are two common issues with names that I have seen lead to frustration:

  • Names are sometimes too generic or abstract (i.e., they don’t provide enough information). For example, a variable is named e instead of element or error or theFifthLetterInTheAlphabetSinceIAlreadyUsedTheFirstFour.
  • Names are sometimes misleading (i.e., they provide inaccurate information). For example, Model is an overloaded term in software development (and really in modern English). Everyone has their own understanding of what a model is, so there is little chance of consistency in its usage. And even if a project comes up with a more formal internal definition, there is always the possibility that over time the meaning will need to change or the term will get reused with different connotations.
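To make the first issue concrete, here is a small hypothetical sketch in Python (the function names and the sample data are invented for illustration). Both functions do exactly the same thing; only the second tells the reader whether e stands for an element or an error.

```python
def first_failure_terse(es):
    # `es` and `e`: elements? errors? events? The reader has to go and
    # check the callers to find out.
    for e in es:
        if e is not None:
            return e
    return None


def first_failure(errors):
    # Identical logic; the names now say what the values are.
    for error in errors:
        if error is not None:
            return error
    return None


# Hypothetical sample data: one slot per task, None meaning "no error".
results = [None, "timeout", "connection refused"]
print(first_failure_terse(results) == first_failure(results))
```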

Also, discussions about how to name things often devolve into bikeshedding.

Naming is easy

The computer does not care what you name things as long as the names are unique.
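A minimal sketch of that point, using made-up names: the two Python functions below differ only in their identifiers, and the interpreter treats them identically.

```python
def total_price(prices, tax_rate):
    # Descriptive names, chosen for the human reader.
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate)


def f1(a1, a2):
    # The same computation with arbitrary (but unique) names.
    v1 = sum(a1)
    return v1 * (1 + a2)


# Both calls perform the same arithmetic, so the results are equal.
print(total_price([10, 20], 0.5) == f1([10, 20], 0.5))
```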

Throughout history, disciplines such as philosophy and math have used abstract symbols, such as the letters of the alphabet, as shortcuts for creating identifiers when communicating ideas. Today we write computer programs that generate unique names automatically.
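As a rough illustration of generated names, here are two common approaches sketched in Python. The helper names gensym and global_name are my own choices for this example; itertools.count and uuid.uuid4 are from the standard library.

```python
import itertools
import uuid

# A process-wide counter: each call to next() yields a fresh integer.
_counter = itertools.count(1)


def gensym(prefix="g"):
    """Return a name that is unique within this process: g1, g2, ..."""
    return f"{prefix}{next(_counter)}"


def global_name():
    """Return a random name that is unique with overwhelming probability."""
    return f"n_{uuid.uuid4().hex}"


print(gensym(), gensym(), gensym("tmp"))
print(global_name())
```

The counter-based version is readable but only unique locally; the UUID-based version needs no coordination between machines, at the cost of names no human would want to type.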

Even if it takes some time, our brains have an impressive capacity to learn new names/symbols, and to associate new definitions with already known names/symbols.

Any specific word (or concept name) in a human language is arbitrary. When coming up with a new word we can cleverly combine Greek or Latin roots, or we can string together a few random letters. Either works pretty well. (And as for those Greek or Latin roots, or whatever their predecessors were, they too are just approximately random strings of letters or sounds.)

An interesting experiment would be to take a code base that uses completely arbitrary abstraction names and see how long it takes for someone to be able to work with that code. Not only would this potentially illustrate how unimportant names really are, it would likely also uncover problematic abstractions and less ideal patterns and practices.

Naming is impossible

The problems I described above (names not providing enough information, or the reuse of names being potentially confusing or misleading) are impossible to overcome, especially by just trying to improve the way things are named.

There will always be a need to learn new name/definition combinations, and there will always be the possibility that a new meaning or definition gets attached to an existing name/symbol, making any given use of it less clear. There will always be some frustration.

An alternative emphasis

Rather than trying to get better at naming things, perhaps we should focus more on improving how we define things (i.e., how we author abstractions), and come up with more powerful approaches for learning new definitions and disambiguating between definitions.

Type systems, for example, are a more formal tool that diminishes the importance of naming.
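Here is a minimal sketch of how types can carry meaning that names do not, written in Python with type hints. The Meters and Seconds types and the speed function are hypothetical, and note that Python does not enforce hints at runtime; the assumption is that a static checker such as mypy is run over the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meters:
    value: float


@dataclass(frozen=True)
class Seconds:
    value: float


def speed(d: Meters, t: Seconds) -> float:
    # The parameter names say almost nothing; the types say what is expected.
    return d.value / t.value


x = Meters(100.0)
y = Seconds(9.58)
print(speed(x, y))

# Swapping the arguments, speed(y, x), would still run in plain Python,
# but a static type checker such as mypy would report it as an error.
```

Even with variables called x and y, the definitions of the types tell a reader (and the tooling) what each value is for.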

Computers provide a rich medium for authoring and relating ideas. We can leverage, and have already started to leverage, graphics, audio, etc. to create richer, more engaging, and more interactive representations of our ideas and abstractions.

A couple of straightforward examples are testing tools and IDEs. (Both provide an interactive means for exploring and understanding abstractions.)

Post publication thoughts

I think that ideally we can get to a place where, thanks to our tools, abstraction names are completely arbitrary and generating names is an automated process.

Tools could instead allow for human annotations. And every copy of a code base could potentially have its own set of human language names and annotations. Developers could share annotations, but any given developer could customize their own annotations.

One thing this could facilitate is international collaboration (e.g. developers could annotate code in their primary language).
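To sketch what such annotations might look like, here is a hypothetical Python example. The identifiers a1, a2, a3 stand in for generated names, the dictionaries are invented sample data, and display_name is a made-up helper; collections.ChainMap is from the standard library and looks keys up in the first mapping that contains them.

```python
from collections import ChainMap

# Annotations shared by the team, keyed by generated identifier.
shared_annotations = {"a1": "order", "a2": "customer"}

# One developer's personal overrides, here in Spanish.
personal_annotations = {"a1": "pedido"}

# Personal annotations take priority; shared ones fill the gaps.
my_view = ChainMap(personal_annotations, shared_annotations)


def display_name(identifier, annotations):
    # Fall back to the generated identifier when nobody has annotated it.
    return annotations.get(identifier, identifier)


for identifier in ["a1", "a2", "a3"]:
    print(identifier, "->", display_name(identifier, my_view))
```

The underlying code never changes; each developer just sees it through their own layer of names.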

But even more important than naming and annotations is for our tools to make it easy to quickly understand the formal definition of an abstraction and, for local abstractions, to make it easy to review how they are currently being used. (Confusing or unexpected use of an abstraction could also be an indicator of less ideal patterns, practices, or architecture. A simple example is a method so complex that it’s hard to infer from the context the role a local variable is intended to play.)

Naming shouldn’t hamper creativity

Sometimes (often?) as developers we play the role of inventor or ideator, and our new ideas may not lend themselves to being succinctly identified with even a combination of existing names/symbols. Hopefully we give ourselves some leeway to also invent new names that may not have much, if any, intrinsic meaning.

