Czar Pino

EST. 2017

On code quality

WARNING: This is a scratch/draft note for an internal talk I did on code quality. I find it helpful to write my thoughts even if somewhat disorderly in order to better articulate ideas. It is quite rough as drafts go and I will be working to refine this in the future.

Code quality is a nebulous subject in software development. There are no clear and well-defined criteria for classifying and quantifying it. Most attempts to do so are approximations, i.e. people agree by consensus that they are good enough representations.

Code quality as a subject is also almost never discussed directly. Similar but not equivalent topics such as reusability, readability, testability, or even productivity typically get the attention instead. While these subjects do indeed relate closely, they are not a good enough representation of the breadth of code quality.

As developers, we typically imagine quality code as clean, readable code, ideally with unit and functional tests. It employs the right combination of design patterns and observes the SOLID principles for OOP. Modules are highly cohesive and loosely coupled. These are the by-the-book characteristics of quality code.

From a technical perspective, this is typically a no-brainer conversation: (1) code quality is important and (2) we should do it (i.e. whatever improves code quality). In most cases, however, the technology serves a business. And at a level above the technology (namely the business), code quality, as imagined by a developer, is a vaguely relevant concern at best. It’s not important, or so it appears at least. Up to a point, a lack of reusability, readability, or testability typically does not have any immediate or directly observable impact on a business.

Why might we care about the ignorant opinion of business people about code quality? Quite simply because technology serves the business. The business is supposed to derive value from the product. And if we truly build something of value to the business, then the quality of the underlying technology we build transitively becomes the quality of the business as well. This transitive importance of quality also applies to other domains, such as scientific research or a non-profit endeavor.

What commonly gets lost in the study of good engineering practice is the purpose. Why are the SOLID principles significant? Why does it matter if modules are not cohesive or are tightly coupled? Because keeping software quality high takes non-trivial effort, the means often get confused for the ends. SOLID principles, testability, coupling and cohesion are all means to one end: software quality.

So again we ask, what defines a quality codebase? If we bring these development principles all together and generalize what they are trying to do, it all simply has to do with the programmer. Does it make the dev more effective and efficient? How easy is it to introduce changes and how quickly can they be made? Code quality is evaluated in terms of how effectively (correctness) and efficiently (speed) it allows the dev to introduce changes.

Is this definition transitively meaningful to the business? Yes! How correctly and quickly a business can build or change its product is critical to effectively capturing the market, closing a deal, or getting investment.

Why is code quality important

I think this is self-evident in its definition. High code quality allows the rapid development of bug-free features. Again, it improves speed and correctness. Put simply, code quality is the characteristic of a codebase that improves productivity and reduces (human) errors.

How can it be quantified

So, we have a basic idea of why code quality is important. And, we have established a high quality codebase is beneficial to the business it serves. Naturally, we want to improve the quality in order to further benefit the business. How do we improve?

There is a business adage, not verbatim, that says: If you can’t measure it, you can’t improve it. People in the industry (software dev) have devised several ways to quantify code quality. At best, however, they remain approximations even today. This is because code quality, chiefly desired for its effect on productivity and (human) errors, is mainly a qualitative characteristic and is not straightforwardly quantifiable. In fact, Martin Fowler has written that productivity in software development simply cannot be measured. And since productivity is a defining characteristic of code quality, code quality is just as difficult to quantify.

That said, there are a few quantifiable metrics which, although they make no guarantee of correctness or improved productivity, approximate quality reasonably well in practice. Bear in mind, however, that metrics are only useful if you know what to do with the answers you get.

Test coverage (Unit)

Test coverage is a staple in many teams as a metric of code quality, and for good reason. Its use, however, is often misunderstood. It is easy to assume that high coverage is good and indicative of high quality. In truth, test coverage is only useful for revealing weaknesses in a codebase. Basically, it reveals areas of the codebase that are not being tested.

In practice, low coverage is a strong indicator of low codebase quality. Untested areas are possible points of failure and there is no certainty in their behavior. No tests will fail to signal that they no longer behave as intended. The more such areas there are, the greater the likelihood of a poor quality codebase.

Conversely, high coverage is only a weak indicator of good quality. This is important to note. The quality of the code cannot be determined merely by the existence of a test that covers it. We need to examine both the test and the code itself to verify. Put simply, high coverage is desirable only because low coverage is not.
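To make this concrete, here is a minimal sketch in PHP. The discount() function and both test functions are hypothetical, made up purely for illustration. Both tests execute every line of discount(), so a coverage tool would report the same coverage for either one, yet only the second would ever notice the bug.

```php
<?php

// Hypothetical function with a bug: the discount is added instead of subtracted.
function discount(float $price, float $percent): float
{
    return $price + ($price * $percent / 100);
}

// Executes every line of discount() but checks nothing, so it always passes.
function test_discount_runs(): bool
{
    discount(100.0, 10.0);

    return true;
}

// Executes the same lines and checks the result, so it exposes the bug.
function test_discount_value(): bool
{
    return discount(100.0, 10.0) === 90.0;
}
```

The coverage number is identical in both cases. Only reading the tests themselves tells you which one is worth having.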

What are common reasons for poor coverage? There are many, most of which are symptoms of other issues that hint at code quality.

Typically, there are just 2 root causes: (1) CC is not sufficiently valuable to the team/management/developers or (2) it is difficult to write tests. Poor coverage should instigate an evaluation of the codebase for improvements. This is what makes test coverage an interesting metric.

Cyclomatic complexity

Cyclomatic complexity refers to the number of linearly independent execution paths through a program. To quote from Wikipedia:

For instance, if the source code contained no control flow statements (conditionals or decision points), the complexity would be 1, since there would be only a single path through the code. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
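The counting rule in the quote can be illustrated with a minimal sketch in PHP. The three functions are hypothetical examples, and the complexity values in the comments follow the rule as quoted (decision points plus one). Actual tools may count some constructs slightly differently.

```php
<?php

// Complexity 1: no decision points, a single path.
function area(float $w, float $h): float
{
    return $w * $h;
}

// Complexity 2: one single-condition if, so two paths.
function absolute(int $n): int
{
    if ($n < 0) {
        return -$n;
    }

    return $n;
}

// Complexity 3: one if with two conditions, each condition being a decision point.
function in_range(int $n, int $min, int $max): bool
{
    if ($n >= $min && $n <= $max) {
        return true;
    }

    return false;
}
```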

In essence, cyclomatic complexity as a metric aims to measure program complexity. High complexity implementations are undesirable in that they require a proportionally high cognitive load to analyze and understand. This is typically the kind of code that is scary to modify months after it was written. It is hard to understand, prone to error, and poses a huge risk to system stability and maintainability.

High complexity is a strong indicator of low quality. The design principle KISS may come to mind here. A complex program probably did not keep it simple or otherwise did not employ the necessary abstractions.

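// Deliberately convoluted: sorting and searching are inlined into one function.
// Works on numeric, zero-indexed arrays only. Declare it inside a namespace,
// since PHP's global in_array() cannot be redeclared.
function in_array($e, $array)
{
    if (count($array) === 0) {
        return false;
    }

    // sort (the swap uses arithmetic instead of a temporary variable)
    for ($i = 0; $i < count($array); $i ++) {
        for ($j = $i + 1; $j < count($array); $j ++) {
            if ($array[$i] > $array[$j]) {
                $array[$i] += $array[$j];
                $array[$j] = $array[$i] - $array[$j];
                $array[$i] = $array[$i] - $array[$j];
            }
        }
    }

    // search (binary search packed into the for-loop header, empty body)
    for (
        $a = 0, $b = count($array) - 1, $mid = (int)(($b - $a) / 2) + $a;
        $a < $b && $e != $array[$mid];
        $a = $e < $array[$mid] ? $a : $mid + 1,
        $b = $e < $array[$mid] ? $mid - 1 : $b,
        $mid = (int)(($b - $a) / 2) + $a
    );

    return $e === $array[$mid];
}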

Abstractions help reduce complexity and are generally useful when carefully managed.

// _sort() and _search() are helper functions assumed to be defined elsewhere.
function in_array($e, $array)
{
    $x = _search($e, _sort($array));

    return $e === $x;
}

Cyclomatic complexity is useful for spotting areas of the codebase that need review and would possibly benefit from abstractions.
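For illustration, the _sort() and _search() helpers used in the shorter version could look something like the sketch below. Their signatures and return values are assumptions on my part (the shorter version only shows how they are called), and the built-in sort() stands in for the hand-written loop. Each helper stays small and can be read and tested on its own.

```php
<?php

// Hypothetical helper: returns a sorted, re-indexed copy of the array.
function _sort(array $array): array
{
    sort($array);

    return $array;
}

// Hypothetical helper: binary search over a sorted, zero-indexed array.
// Returns the matching element, or null when nothing matches.
function _search($e, array $sorted)
{
    $low = 0;
    $high = count($sorted) - 1;

    while ($low <= $high) {
        $mid = intdiv($low + $high, 2);

        if ($sorted[$mid] == $e) {
            return $sorted[$mid];
        }

        if ($sorted[$mid] < $e) {
            $low = $mid + 1;
        } else {
            $high = $mid - 1;
        }
    }

    return null;
}
```

One caveat of this sketch: because null means "not found", a search for null itself cannot be told apart from a miss.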

Coding standards

Another popular metric for code quality is adherence to coding standards. Coding standards are a set of guidelines that must be followed when writing a program. They do not necessarily dictate how implementation must be done but rather how to format code, although they do occasionally lay out rules concerning implementation as well.

Adherence to coding standards is typically evaluated using static code analysis tools which inspect source code and check it against configured rules. A high number of violations is a strong indicator of a low quality codebase that is messy and has poor readability.
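As a rough idea of what such a check involves, here is a toy sketch in PHP. The find_violations() function and its two rules (a maximum line length and no tab characters) are made up for illustration. Real static analysis tools are far more thorough and configurable than this.

```php
<?php

// Hypothetical checker with two made-up rules: a maximum line length and no tabs.
// Returns a list of violation messages, one per offending line and rule.
function find_violations(string $source, int $maxLength = 80): array
{
    $violations = [];

    foreach (explode("\n", $source) as $index => $line) {
        $lineNumber = $index + 1;

        if (strlen($line) > $maxLength) {
            $violations[] = "line $lineNumber: longer than $maxLength characters";
        }

        if (strpos($line, "\t") !== false) {
            $violations[] = "line $lineNumber: contains a tab character";
        }
    }

    return $violations;
}
```

The number of messages returned for a file, summed over the codebase, is the kind of violation count the metric refers to.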

Adherence to coding standards is arguably one of the most important indicators of code quality. It has an enormous impact on readability (which in turn is crucial to maintainability) and it is not as hard to comply with as other metrics. It can even be said that a failure to establish a coding standard tells of more than just a poor quality codebase.

Qualitative analysis

So far we’ve been talking about quantitative approaches to evaluating code quality. There are also qualitative approaches. In fact, qualitative approaches are in a sense more reliable and sensible than the quantitative methods just discussed, and are a better indicator of how good a piece of code actually is.

Broadly, there are 4 qualities to look at in a program: testability, extensibility, reusability, and readability. These four cover a lot of relevant characteristics of good code. In most cases they are a very practical choice of attributes to look at when evaluating code, and you will find them repeatedly mentioned in discussions of quality.


Testability refers to the ability of a piece of code to lend itself to unit testing. Testability is a desirable quality largely because it allows automated and reproducible verification of whether or not a piece of code behaves correctly. Units that do not behave correctly will naturally cause all other components that rely on them to behave incorrectly. Unit tests make incorrect behavior easily detectable, and testability makes writing unit tests possible.

Testability and writing good tests is a huge subject, so I want to focus on one key aspect of testing which I think would improve most unit tests I’ve seen: the isolation of the system under test (SUT). When you are testing, you want to isolate the behavior you want to observe/check/verify from extraneous factors. You want to make sure that when a test fails, it squarely means the SUT itself is not behaving correctly. Isolation ensures the failure is not caused by any factor other than the SUT itself. Otherwise, testing the unit becomes meaningless.

In that context, if there is just one thing to remember when designing testable code, it would be dependency injection (DI). DI allows a unit’s dependencies to be provided by the calling module, which makes it possible to use test doubles to isolate the SUT when it depends on other components/units. This is the reason why occurrences of new Class() or Static::method outside the DI container or controller are considered code smells.

When evaluating testability, think DI!
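Here is a minimal sketch of that idea in PHP. All the names (Clock, SystemClock, FixedClock, Greeter) are hypothetical. Greeter is the SUT, and because its clock is injected through the constructor, a test can hand it a FixedClock and check both branches without depending on the actual time of day.

```php
<?php

// Hypothetical dependency: something that reports the current hour (0-23).
interface Clock
{
    public function hour(): int;
}

// Real implementation, used in production code.
class SystemClock implements Clock
{
    public function hour(): int
    {
        return (int) date('G');
    }
}

// Test double: always reports the hour it was given.
class FixedClock implements Clock
{
    private $hour;

    public function __construct(int $hour)
    {
        $this->hour = $hour;
    }

    public function hour(): int
    {
        return $this->hour;
    }
}

// The SUT. Its dependency is injected rather than created inside with new SystemClock().
class Greeter
{
    private $clock;

    public function __construct(Clock $clock)
    {
        $this->clock = $clock;
    }

    public function greet(): string
    {
        return $this->clock->hour() < 12 ? 'Good morning' : 'Good afternoon';
    }
}
```

Had Greeter called new SystemClock() internally, the result of greet() would depend on when the test runs, and a failure would no longer point squarely at the SUT.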


Extensibility refers to the ability of a program to lend itself to modification. Extensibility is important for the continued development of software. A poorly extensible program is hard to modify and will be unable to fulfill the evolving needs of the business that relies on it.

When evaluating extensibility, there are a lot of things to keep in mind. Almost every design principle written is concerned with program extensibility. Fortunately, several of these design principles overlap and overshadow one another. I have found it most productive to stick to fewer than a handful of points, and most other principles just reveal themselves as needed.

An important cognitive hack I use when thinking about design principles is to use a concretion: an example or a rule of thumb. What is an example of that? What does that mean concretely?

The human brain is so good at discovering patterns — there are many discussions about this — that once you have one concrete mental model of a design principle, your brain automatically processes for resemblance. This way you do not need to check against every single design principle — such an unproductive endeavor.


Reusability is the ability of a program to lend itself for re-use. Reusable modules are important because of their ability to serve as components for larger and more complex modules.

As with extensibility, there are several principles and practices that are concerned with reusability. To pare down complexity, I similarly look at a few things to evaluate an acceptable level of reusability.

While reusability is desirable, striving for too much of it is a bad idea. Generally, keeping things DRY and making sure modules are extensible naturally results in reusable modules.


Readability is the ability of code to lend itself for reading and understanding. On average, 50% of software development is reading and as a project gets bigger there will be more to read.

There are a lot of approaches to improve readability, but the one that has generally worked in my case is simple adherence to a coding standard. A single sensible coding standard usually does it. That is all.

Technical debt

Technical debt is a clever metaphor for compromises in code quality made for a short term productivity boost. It has been talked about extensively by Martin Fowler and is a popular tool for helping non-technical people understand the code-quality-to-productivity tradeoff.

Up to a point, technical debt may even be necessary, such as when time to market (TtM) is crucial, as it is when starting up a product/project. In the long run, however, it is unsustainable and will eventually lead to huge productivity problems.

Technical debt should be gradually reduced to a minimum by refactoring. Keep things DRY, decouple logic, modularize, fix coding standards (CS) violations, increase test coverage, etc.

Thoughts on over-engineering

Code that solves problems you don’t have.

You have heard of over-engineering before. While giving careful thought to software design is good, too much of it is not. Over-engineering is, in a way, actually worse than technical debt or poor quality because productivity is traded for unneeded complexity.

In the quest for too much quality, we end up becoming unproductive.

Published by Czar Pino on Monday January 29, 2018
