Against Librarization

Some programmers, me included, can hardly resist the inner voice that lures us toward immediate code modularization. Our analytical brains are wired to extrapolate whatever shallow quirks we devise to solve a problem; code that came to light as an unassuming couple of auxiliary functions then becomes an overarching representation of imprecise thought models. I took the liberty of naming this process librarization,1 i.e., the excessive appeal to non-overlapping fictional problem domains in order to justify the design and implementation of new software modules.

I hold the view that librarization is the main culprit behind increasing software complexity, and the greatest foe of sane code reuse. However, it is a hard behavior to eliminate, mostly because we underestimate our own tendency to take part in it.

This essay sheds no more than a feeble light on the subject, with the limited help of my personal experience dealing with accidental complexity. It does not aspire to be a comprehensive treatise on the problem and its solutions.

The war on code reuse

Let me start with an example that gets on my nerves. Some years ago, I was assigned to remove all OpenSSL API code from a C++ distributed objects library. We decided to use Boost.Asio as a substitute (which, ironically, depends on OpenSSL). I developed a very compelling heuristic for unnecessary code complexity back then. The physical copy of The Feynman Lectures on Physics on my bookshelf consists of three volumes, with 1,552 pages total. This PDF version of the Boost.Asio documentation, in turn, has 1,305 pages. Add to that the C++17 draft page count, 1,608 (that was the easiest C++ draft to find, sorry).

You will certainly have to invest more reading time to figure out how Boost.Asio operates in conjunction with C++ than to understand the workings of the whole Universe.

And that barely scratches the surface. The picture is incomplete because those are only dependencies. Besides, I haven't even mentioned lines of code. We cannot handle that level of complexity anymore.
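As a quick sanity check of that comparison, the page counts quoted above can be added up in a few lines of Python. The variable names are mine; the figures are the ones given in the text.

```python
# Page counts quoted in the text above.
feynman_lectures_pages = 1552  # three printed volumes
boost_asio_docs_pages = 1305   # PDF documentation
cpp17_draft_pages = 1608       # C++17 draft

# Reading needed before Boost.Asio makes sense: its docs plus the language draft.
asio_plus_cpp_pages = boost_asio_docs_pages + cpp17_draft_pages
extra_pages = asio_plus_cpp_pages - feynman_lectures_pages
ratio = asio_plus_cpp_pages / feynman_lectures_pages

print(f"Boost.Asio + C++17 draft: {asio_plus_cpp_pages} pages")
print(f"Extra pages over Feynman: {extra_pages}")
print(f"Ratio: {ratio:.2f}")
```

That is 2,913 pages against 1,552, a little under twice the reading.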

There is a reason why we are consistently delegating crucial parts of our software infrastructure to third-party firms. We've collectively accepted, without realizing it, that code reuse is a liability. If something doesn't work, you contact customer service in order to get it fixed. If it still doesn't work, you may simply ask for your money back.

Our cherished home-brew libraries, though, are deemed high-value assets. I wonder how long it will take for us to realize that they suffer from the same affliction.

The inner dynamics of librarization

I will try to describe2 the process dynamics in a broad manner, from its onset as a manifest cognitive dissonance blooper to its propagation and augmentation as a fractal event.

The process usually begins with a heuristic search for potentially reusable snippets inside ad hoc code, a step that involves the refinement of the model to a set of axiomatic hypotheses, usually backed up by no formal reasoning whatsoever. Once an ill-defined “model inconsistency” is observed in, say, an auxiliary function, it is a good enough excuse to turn it into a much bigger function to be used only in a couple of places.

Every coincidental case of code reuse involving that function requires additional changes and, at this point, we're already past the mistake. Reducing dependencies is very much like fighting the second law of thermodynamics. Unfortunately, common sense tells the programmer that a function that big shouldn't be a single function at all. The function is then split into a handful of new functions, the lucky ones escaping to a module of their own.
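A minimal, entirely hypothetical sketch of that trajectory in Python may make it more concrete. Neither function comes from a real code base: `trim_header` stands in for the unassuming auxiliary function, and `normalize_field` for what it tends to become once each coincidental reuse adds a parameter.

```python
from typing import Optional


def trim_header(line: str) -> str:
    """The original ad hoc helper: strips whitespace around a header value."""
    return line.strip()


def normalize_field(
    value: str,
    *,
    strip: bool = True,
    lowercase: bool = False,
    collapse_spaces: bool = False,
    max_length: Optional[int] = None,
) -> str:
    """The 'librarized' version: every coincidental reuse added an option."""
    if strip:
        value = value.strip()
    if lowercase:
        value = value.lower()
    if collapse_spaces:
        value = " ".join(value.split())
    if max_length is not None:
        value = value[:max_length]
    return value


# For the original use case, both functions do exactly the same job.
assert trim_header("  text/html ") == "text/html"
assert normalize_field("  text/html ") == "text/html"

# The extra options only serve one or two other (hypothetical) call sites.
print(normalize_field("  Content   Type ", lowercase=True, collapse_spaces=True))
```

The second function is the one that "shouldn't be a single function at all", which is where the splitting into new functions and modules starts.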

The refinement then unfolds in a somewhat fractal manner. The newly created modules will individually go through the same refinement cycle, giving rise to new appendages, in a transformation that resembles the growth of a tumor.3

Libraries and wrappers pop up from such a process with astounding frequency, because no model is good enough for the day. In reality, very few of those will face widespread use.

Conclusion

Much to the exasperation of my work colleagues, I pontificate every other day on how code quality is flimsy and depends so much more on our psyche than on our technical knowledge. In this particular case, excessive self-confidence hits back hard when you miss a deadline due to the unnecessary levels of indirection you brought into the game. The biggest problem is that this rarely happens: most of the time, you do the wrong thing and escape unscathed.

I am certainly not the first one to mention the problem: Don't Let Architecture Astronauts Scare You comes to mind as an excellent description of the librarization process from a third-person point of view which, unfortunately, lends it a rather evasive tone. The architecture astronaut is the civilized individual inside the city gates, the overly creative force in an organization and, therefore, extremely unproductive.4

With the help of the ideas mentioned earlier in the text, I developed some rules of thumb to help me avoid librarization. Let me list some of those.

Consider the effects of tentative redesigns on distribution and networking. If a redesign does not help you separate a program into different computing units, you are starting off on the wrong foot. Any approach other than that will increase the complexity of the monolith.

Not every hack is a great idea in its infancy. Don't overdevelop one just to come to terms with your own hubris.

Keep in mind that time spent understanding code is, at best, linear in relation to its size. Every new line of code makes life harder for the next maintainer. Modularity comes easier with smaller code.

Modules are not meant to be comprehensively understood. Good software modules allow their users to read the documentation of a couple of necessary features and use them. If your module forces its users to understand the model, you are bringing unnecessary complexity to the table.
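As a hypothetical illustration of that last rule, the Python sketch below contrasts two ways of exposing the same feature. All names (`slugify`, `Tokenizer`, `Joiner`) are invented for the example. In the first, the caller reads one docstring and moves on; in the second, the caller has to learn how the module's internal model fits together before producing anything.

```python
import re
from typing import List


# Option A: one documented entry point; the model stays hidden.
def slugify(title: str) -> str:
    """Return a lowercase, hyphen-separated version of `title`."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Option B: the same feature, but the caller must assemble the model.
class Tokenizer:
    def __init__(self, pattern: str) -> None:
        self._pattern = re.compile(pattern)

    def tokens(self, text: str) -> List[str]:
        return self._pattern.findall(text)


class Joiner:
    def __init__(self, separator: str) -> None:
        self._separator = separator

    def join(self, tokens: List[str]) -> str:
        return self._separator.join(tokens)


title = "Against Librarization"

# Option A in use: one call.
slug_a = slugify(title)

# Option B in use: the caller needs to know about tokenizing, the right
# pattern, lowercasing first, and joining, in that order.
tokenizer = Tokenizer(r"[a-z0-9]+")
slug_b = Joiner("-").join(tokenizer.tokens(title.lower()))

print(slug_a)
print(slug_b)
```

Both calls produce the same slug; the difference is how much of the model the user had to understand to get there.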


Footnotes

  1. The intention behind this neologism is to avoid any use of this essay as an unfounded criticism of modular programming.
  2. First of all, a disclaimer: I am not attempting here to falsify any arguments for the benefits of good modularization. Also, I am taking for granted that every seasoned programmer will have found themselves involved in such a process at some point, either as its main character or as an anecdotal observer, and will then understand my point; your mileage may vary. Finally, if it is not clear yet, I see very little use in dissecting the advantages which may derive from librarization. The proposition unfolds from an inherently bad head start; hence, any possible good outcome is the result of sheer luck.
  3. I really tried to avoid the reference, but couldn't find any better examples. Any other analogy referring to undesired, uncontrolled growth patterns would do.
  4. The text is now 17 years old and, in my formative years as a programmer, I strongly identified with the archetypal architecture astronaut. This is the main reason why I am writing this essay.