Symptoms of lacking software quality

What makes up good software quality is different from person to person, making it somewhat subjective and hard to define measurably.

It is discouraging to work on projects where you feel that every time the team fixes a bug, it introduces two more. Having a hard time tracking down the bug and understanding how it hit production in the first place makes matters worse. Being unable to confidently give assurance that it will not happen again is downright frustrating.

Having been through a few different projects myself, I noticed that some symptoms seem to reappear.

Symptom 1 - Few or no test cases

As a developer, having automatic tests makes you feel more confident making changes to the codebase, knowing there is a safety net that reduces the risk of introducing bugs. A codebase with few automatic test cases can indicate that the task of writing tests is complicated. Test complexity is usually proportional to the complexity of the code subject to testing.

Code complexity should NOT be confused with complex business logic. You can have straightforward code with very complex business logic. In my experience, code complexity increases over time, mostly because the code tends to become more and more tightly coupled. Coupling is just what happens by default without careful consideration of every new feature and bug fix.

Demanding a high degree of test coverage of an overcomplicated codebase is going to be VERY time consuming and very likely not worth the time investment. Instead, focus on how to simplify (decouple) the code.

I find it an excellent strategy to concentrate the side effects, e.g., in specific functions and namespaces. Simple code will beg for test cases.
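A minimal sketch of what I mean, assuming a made-up order example (the names order-total, apply-discount, and print-receipt! are hypothetical and only serve as illustration). The calculations are pure functions, and the single side effect lives in one small function marked with a trailing "!":

```clojure
;; Pure: the result depends only on the arguments, so testing is trivial.
(defn order-total
  "Sums price * quantity over all order lines."
  [lines]
  (reduce + (map (fn [{:keys [price qty]}] (* price qty)) lines)))

;; Pure as well.
(defn apply-discount
  "Returns total reduced by percent (a number between 0 and 100)."
  [total percent]
  (- total (/ (* total percent) 100)))

;; The only side effect (printing) is concentrated here.
(defn print-receipt!
  "Prints the discounted total for the given order lines."
  [lines percent]
  (println "Total:" (-> lines order-total (apply-discount percent))))
```

The two pure functions can be tested with plain equality checks, without mocking anything. Only print-receipt! needs special treatment, and it is small enough to verify by reading it.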

Symptom 2 - Loads of build/runtime warnings

On the one hand, a warning in itself is not a problem. It is just a warning. But warnings are noise in which a new (maybe important) warning can hide. On the other hand, suppressing all warnings isn’t a solution either, as you might miss when that critical warning suddenly shows up.

We should not treat warnings superficially - say, solely by running the software and concluding “hey - it still seems to work”. A warning is often about some special case that will trigger unwanted behavior. Understand why the warning is there. Evaluate if it could have an impact either now or in the future. Make keeping a warning around a well-considered choice and document your reasoning. Such documentation could be as simple as a comment by the line of code that introduced the warning. The comment should include the warning itself, so it is easy to search for (e.g., in the future, when you have forgotten or a new team member doesn’t understand why).
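As a sketch of such a comment, assuming a Clojure project with reflection warnings enabled: the function display-name and the reasoning in the comment are hypothetical, the file location in the warning is left out, and the exact wording of the warning may differ between compiler versions.

```clojure
;; Ask the compiler to report reflective interop calls.
(set! *warn-on-reflection* true)

(defn display-name
  "Returns the name of x, where x is anything with a zero-argument
  getName method (for example a java.io.File or a Thread)."
  [x]
  ;; KNOWN WARNING, kept deliberately:
  ;;   Reflection warning, ... - reference to field getName can't be resolved.
  ;; Reason: File and Thread share no interface declaring getName, so a
  ;; type hint would exclude one of them. Only used for log output, so the
  ;; reflection cost is acceptable. Revisit if this ends up in a hot path.
  (.getName x))
```

Searching the codebase for the warning text now leads straight to the explanation.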

This approach is not about having a religious “zero warnings” policy. Warnings describe potential problems, and ignoring them will deprive you of the opportunity to make a deliberate choice. Furthermore, having too many warnings will hide new warnings in the noise. Find the right balance.

Symptom 3 - Sparse code documentation

Lots of languages allow for documentation as part of the code. Simply demanding this documentation to be present does not solve the problem. I’ve seen lots of documentation that repeats a function name or repeats what the function does line by line. You don’t want that. Documentation is HARD.

It is worth repeating… Documentation is HARD.

Good documentation isn’t just a long term investment, “only” useful for developers somewhere in a distant future. For writing documentation like JavaDoc, Clojure doc-strings, etc., I’ve experienced that I need to understand things more thoroughly to articulate the meaning for somebody else. By immersing myself, I found a better solution or identified a problem while writing documentation. It is almost like “rubber ducking”, except you do it when you think you got it all figured out.
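To illustrate the difference, here is a small sketch with a hypothetical function parse-duration. A doc-string like "Parses a duration." would only repeat the function name. The doc-string below instead tries to answer what a caller cannot see from the name: the accepted format, the unit of the result, and what happens on bad input.

```clojure
(defn parse-duration
  "Converts a duration string such as \"90s\", \"5m\" or \"2h\" into whole
  seconds. Returns nil when s does not match that format, so callers
  must handle nil instead of catching an exception."
  [s]
  ;; re-matches returns nil on a mismatch, which makes when-let return nil.
  (when-let [[_ n unit] (re-matches #"(\d+)([smh])" s)]
    (* (Long/parseLong n) ({"s" 1, "m" 60, "h" 3600} unit))))
```

Writing the second sentence forced a decision about invalid input, which is exactly the kind of problem that tends to surface while documenting.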

Symptom 4 - Poor commit hygiene

“Fix bug” and “Fix bug for realzies this time” are just outright useless commit messages. Poor commit hygiene signals that reviewing and bug-hunting are low priorities. Well-formed commits allow for a smoother review process. Bug hunters are also much better off when the commit history provides them with context. Temporarily fixing a bug might be as easy as reverting a commit, which would not be possible with poor commit hygiene, such as squeezing multiple changes into a single commit.

You help your team and yourself when providing quality commit messages. I will use this opportunity to direct your attention to Chris’ excellent guide: How to Write a Git Commit Message.

Remember that the commit history IS NOT ABOUT YOU (developers come and go), but about how the software evolved. Six months from now, nobody cares that you corrected code based on review feedback. Imagine having baked a great batch of cookies; nobody is interested in the two failed attempts that got thrown away. What is interesting is the “recipe” to make those cookies (assuming you came prepared).

I strongly encourage sanitizing your commits via “rewrites” in a feature branch because it allows describing how the software evolved. Rewriting commits leaves behind all detours and dead ends (usually as comments in the PR). “Lessons learned” and other things worth remembering belong in the documentation or as comments in the code - never in a commit message or, worse, only vaguely derivable from the commit history.

Symptom 5 - Poor review process (LGTM)

A good review process can help to reduce “poor commit hygiene.” But the review process itself adds so much more value:

  • the debate on a solution is a learning tool for submitter, reviewer, and readers snooping in,
  • it can assist with finding bugs early, and
  • it increases code quality, e.g., with better function naming and more understandable documentation.

I’ve seen loads of LGTM (read: Looks Good To Me) approvals on PRs, and when that is the norm, it signals the team doesn’t take the review process seriously enough.

Angie Jones (@techgirl1908) did a superb blog post (The 10 commandments of navigating code reviews) on how to embrace code reviews.

Conclusion

Clinging to easy-to-measure metrics like test and doc coverage, enforcing linting or rules about commit messages (length and line breaks), etc., cannot and will never guarantee high-quality software on its own.

The common denominator of all the symptoms above is that they usually appear when we cut corners, not taking the required time.

The actual code base is only a small piece in the big puzzle of software quality. A change in processes like planning and review, communication, documentation, available tools, tooling usage, culture, and prioritizing time can significantly affect the quality of software, both for better and worse.

Avoid symptom treatment. Look for the real problem.

2020 in retrospect

This post is about me. By me, for me.

For a long time I did not dare to be the person that I am. Rather, I tried to be the person that I led everyone around me to believe I was. Ashamed to admit that I had pretended and to correct course, I kept fleeing. To maintain the illusion I found myself always looking for the smallest signs of dissatisfaction from others, and adjusting my behaviour accordingly to go unnoticed. Always afraid of the conflict that standing up for myself would bring. Afraid of being put down and having my feelings reasoned with logic. Logic that would conclude that my dreams and desires were wrong - that I was “wrong”. Defective.

Finding and walking the path that is right for me isn’t something I just started doing from one day to the next. But slowly I have been leaving my fear and guilt, and with that my depression, behind me. It takes continuous effort, awareness and hard work, because it is too easy to slip into old thought patterns and suppress myself.

Though I stumble now and again, I’ve felt notable improvements in 2020 and most days are good. It is almost like regaining lost abilities.

I might seem very confident when I loudly express that I don’t care about what other people think (of me), and that there should be room for quirks. To be honest, I need to be convincing, mostly because I need to repeatedly convince myself. A steady reminder for me to stay on course and not neglect again:

  • Help others, but not at the cost of myself.
  • Learn from the past without dwelling in it.
  • Don’t lie, especially to myself.

Realtime DB synchronisation to frontend

I’ve set out to solve: How to synchronize “low volume” parts of a database to a frontend in realtime exclusively for reads. In this scenario, “low volume” means few entries with a low update frequency. Let’s say we’re talking about less than a thousand entries affected by fewer than ten updates every minute across all entries.

The backend database will use an “insert-only” pattern, causing a single entry to take up an extra row (relational DB) or node (graph DB) for every change. The “insert-only” pattern also hinders using internal database ids (like auto increment) for referencing an entry since every “update” will generate new ids. Instead, use an application-managed entry id like a UUID or, for a human-recognizable id, a slug.
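A minimal sketch of the idea, assuming the “table” is simulated with a vector inside a Clojure atom (the names rows, insert-version!, and current-version are hypothetical). Every change appends a row and gets a fresh "row-id", which plays the role of an auto-increment id, while the slug stays stable across versions:

```clojure
(def rows
  "Append-only table simulated with a vector in an atom."
  (atom []))

(defn insert-version!
  "Appends a new version of entry. Existing rows are never modified.
  The \"row-id\" mimics an auto-increment column."
  [entry]
  (swap! rows (fn [rs] (conj rs (assoc entry "row-id" (inc (count rs)))))))

(defn current-version
  "Returns the newest row for the given slug, or nil if there is none."
  [slug]
  (last (filter #(= slug (get % "slug")) @rows)))

;; The initial insert and one "update" of the same entry:
(insert-version! {"slug" "car", "color" "blue"})
(insert-version! {"slug" "car", "color" "marine"})
```

After the second call the table holds two rows for "car" with different row ids, which is why only the slug is usable as a reference to the entry.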

Watch this video for a bit of background on why I find an “insert-only” pattern interesting.

The frontend is only concerned about the newest version of an entry, throwing away the old version on an update. When it comes to the availability of entries in the frontend, we can compare it to a “database view,” from which data can only be Read. The rest of the CRUD operations (Create, Update, and Delete) must go to the backend, from where changes will propagate to all connected frontends.

When we only care for “low volume” data, disregard “historical data,” and keep the data flow one-way, the requirements for the frontend “database” storage naturally become easy to meet.

The following example assumes that the backend database is a graph database, but the same pattern should work with a relational database. The frontend is assumed to use a state management pattern like Redux or Vuex, and data changes will be transferred using a WebSocket.

Initial state

Overview of the data of a specific node type in the Graph database:

Node  Properties                          Relation(s)
A     {"slug": "car", "color": "blue"}
B     {"slug": "fish", "color": "red"}
C     {"slug": "car", "color": "marine"}  REPLACES->A

Which should result in something like the following in the UI state:

{
  "car":  {"slug": "car",  "color": "marine"},
  "fish": {"slug": "fish", "color": "red"}
}

Create

Now a new entry gets added:

Node  Properties                          Relation(s)
A     {"slug": "car", "color": "blue"}
B     {"slug": "fish", "color": "red"}
C     {"slug": "car", "color": "marine"}  REPLACES->A
D     {"slug": "bike", "color": "cyan"}

Which should result in something like the following in the UI state:

{
  "car":  {"slug": "car",  "color": "marine"},
  "fish": {"slug": "fish", "color": "red"},
  "bike": {"slug": "bike", "color": "cyan"}
}

Update

Now the “car” node gets updated:

Node  Properties                          Relation(s)
A     {"slug": "car", "color": "blue"}
B     {"slug": "fish", "color": "red"}
C     {"slug": "car", "color": "marine"}  REPLACES->A
D     {"slug": "bike", "color": "cyan"}
E     {"slug": "car", "color": "azure"}   REPLACES->C

Which should result in something like the following in the UI state:

{
  "car":  {"slug": "car",  "color": "azure"},
  "fish": {"slug": "fish", "color": "red"},
  "bike": {"slug": "bike", "color": "cyan"}
}

Delete

“Archive” might be a better term than “delete,” but delete is commonly known from CRUD operations. Now both the “fish” and “car” nodes get deleted:

Node  Properties                          Relation(s)
A     {"slug": "car", "color": "blue"}
B     {"slug": "fish", "color": "red"}    <-DELETED
C     {"slug": "car", "color": "marine"}  REPLACES->A
D     {"slug": "bike", "color": "cyan"}
E     {"slug": "car", "color": "azure"}   REPLACES->C, <-DELETED

Causing new state in UI:

{
  "bike": {"slug": "bike", "color": "cyan"}
}

Update external id

The external id “bike” is replaced with “ball,” and changing properties at the same time works just the same:

Node  Properties                          Relation(s)
A     {"slug": "car", "color": "blue"}
B     {"slug": "fish", "color": "red"}    <-DELETED
C     {"slug": "car", "color": "marine"}  REPLACES->A
D     {"slug": "bike", "color": "cyan"}
E     {"slug": "car", "color": "azure"}   REPLACES->C, <-DELETED
F     {"slug": "ball", "color": "navy"}   REPLACES->D

Causing new state in UI:

{
  "ball": {"slug": "ball", "color": "navy"}
}
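The rule behind all the tables above can be sketched in a few lines of Clojure: a node is visible in the UI state when no other node replaces it and it is not deleted. The in-memory representation below (the keys :node, :props, :replaces, and :deleted?) is my own assumption for illustration and not how a graph database would store it.

```clojure
(def nodes
  "The last table above, written as plain data."
  [{:node "A" :props {"slug" "car",  "color" "blue"}}
   {:node "B" :props {"slug" "fish", "color" "red"}    :deleted? true}
   {:node "C" :props {"slug" "car",  "color" "marine"} :replaces "A"}
   {:node "D" :props {"slug" "bike", "color" "cyan"}}
   {:node "E" :props {"slug" "car",  "color" "azure"}  :replaces "C" :deleted? true}
   {:node "F" :props {"slug" "ball", "color" "navy"}   :replaces "D"}])

(defn ui-state-from
  "Keeps the nodes that are neither replaced by another node nor
  deleted, and returns their properties keyed by slug."
  [nodes]
  (let [replaced (set (keep :replaces nodes))]
    (into {}
          (comp (remove #(contains? replaced (:node %)))
                (remove :deleted?)
                (map (fn [{:keys [props]}] [(get props "slug") props])))
          nodes)))

(def derived-state (ui-state-from nodes))
```

With the six nodes above, only F survives both filters, which matches the UI state shown for the last step.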

Syncing the frontend

Now that we have an overview of the data operations, let’s look at the data flow between the backend and the frontends. A single event representing a change to a single entry can handle all the cases above. The event should carry the new data along with the id of the entry that is now obsolete.

Create

{
  "event":   "changed",
  "new":     {"slug" : "bike", "color" : "cyan"},
  "replace": null
}

Update

{
  "event":   "changed",
  "new":     {"slug" : "car", "color" : "azure"},
  "replace": "car"
}

Delete

The deletion of two entries will trigger two changed events.

{
  "event":   "changed",
  "new":     null,
  "replace": "fish"
}

{
  "event":   "changed",
  "new":     null,
  "replace": "car"
}

Update external id

{
  "event":   "changed",
  "new":     {"slug" : "ball", "color" : "navy"},
  "replace": "bike"
}
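All four events above share one shape, so the backend could build them with a single function. The following is only a sketch under that assumption. The name changed-event and its two-argument signature are hypothetical:

```clojure
(defn changed-event
  "Builds the event for a change from old-entry to new-entry.
  Pass nil as old-entry for a create and nil as new-entry for a delete."
  [old-entry new-entry]
  {"event"   "changed"
   "new"     new-entry
   ;; (get nil "slug") is nil, which covers the create case.
   "replace" (get old-entry "slug")})

;; Create, update external id, and delete, in that order:
(def created (changed-event nil {"slug" "bike", "color" "cyan"}))
(def renamed (changed-event {"slug" "bike", "color" "cyan"}
                            {"slug" "ball", "color" "navy"}))
(def deleted (changed-event {"slug" "fish", "color" "red"} nil))
```

Deriving "replace" from the old entry keeps callers from having to pass the obsolete id separately.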

Implementation

The following implementation is made with Clojure to simulate “state management,” but most of the code below is the data for changed events. The function with all the juicy stuff is change-state (function body with four lines of code). This function takes the current state and event that is changing the state and calculates a new state. Leveraging Clojure Atoms for state changes ensures atomic changes (comparable to database transactions).

Run and play with the code on repl.it.

(def ui-state
  "Atom containing UI global state - assuming it has already been initialised."
  (atom {"car"  {"slug" "car", "color" "marine"}
         "fish" {"slug" "fish", "color" "red"}}))

(def create-event
  {"event"   "changed",
   "new"     {"slug" "bike", "color" "cyan"},
   "replace" nil})

(def update-event1
  {"event"   "changed",
   "new"     {"slug" "car", "color" "azure"},
   "replace" "car"})

(def delete-event1
  {"event"   "changed",
   "new"     nil,
   "replace" "fish"})

(def delete-event2
  {"event"   "changed",
   "new"     nil,
   "replace" "car"})

(def update-event2
  {"event"   "changed",
   "new"     {"slug" "ball", "color" "navy"},
   "replace" "bike"})

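(defn change-state
  "Takes the current state along with a \"changed\"-event and returns the new state."
  [state event]
  (-> state
      ;; Remove the obsolete entry; a nil or unknown key is a no-op.
      (dissoc (get event "replace"))
      ;; Add the new entry keyed by its slug; conj with nil (delete) is a no-op.
      (conj (when-let [k (get-in event ["new" "slug"])]
              [k (get event "new")]))))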

;; Apply all events to the UI state in order
(swap! ui-state change-state create-event)
(swap! ui-state change-state update-event1)
(swap! ui-state change-state delete-event1)
(swap! ui-state change-state delete-event2)
(swap! ui-state change-state update-event2)
;; => {"ball" {"slug" "ball", "color" "navy"}}
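Because change-state is a pure function, the same sequence of events can also be replayed without an atom, which makes it easy to test. A small sketch (the names initial-state, events, and final-state are mine, and change-state is repeated so the snippet runs on its own):

```clojure
(defn change-state
  "Same function as in the listing above, repeated to keep this snippet self-contained."
  [state event]
  (-> state
      (dissoc (get event "replace"))
      (conj (when-let [k (get-in event ["new" "slug"])]
              [k (get event "new")]))))

(def initial-state
  {"car"  {"slug" "car", "color" "marine"}
   "fish" {"slug" "fish", "color" "red"}})

(def events
  "Create, update, two deletes, and an external id update."
  [{"event" "changed", "new" {"slug" "bike", "color" "cyan"}, "replace" nil}
   {"event" "changed", "new" {"slug" "car", "color" "azure"}, "replace" "car"}
   {"event" "changed", "new" nil, "replace" "fish"}
   {"event" "changed", "new" nil, "replace" "car"}
   {"event" "changed", "new" {"slug" "ball", "color" "navy"}, "replace" "bike"}])

;; Fold all events into the initial state, one after another.
(def final-state (reduce change-state initial-state events))
```

Replaying a delete for an entry that is already gone leaves the state unchanged, so receiving the same delete event twice does no harm.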

Conclusion

We can easily apply the pattern above to keep several parts of a backend database in sync with a “read view” in the frontend. The frontend will only need a single event listener per entry type to handle create, update, and delete actions and keep data in sync.

This was definitely simpler than I expected when I set out to figure this out.

Why do we allow poor software quality?

Work environment and software

As a professional, you need to have a good work environment for optimal performance. As a software developer, this goes beyond the desk, chair, computer, and colleagues. The entire toolchain (editor, CI/CD, project management, etc.) is a big part of this environment. Regardless of whether these tools are virtual or not, countless hours are spent here. Sadly, we often neglect one of the most critical parts of our work environment: The software itself.

I’ve experienced people mistake a refactoring for “just aesthetics” or “gold plating,” but a tidy piece of software is simply more maintainable. It is easier to navigate, understand, and reason about, which makes the developer more confident when faced with having to implement changes.

How come we allow for poor quality code?! We have to work “in” it every day. I’m confident that if a chef or a carpenter were forced to work in a mess, it would only be a matter of time before they went on a cleaning spree! But developers are not allowed to clean when and however they want to.

In a poor working environment, you lose the overview. You make mistakes. It pulls you down - eventually to the bottom.

As a junior developer, I only saw the software as what was being produced, the output, the “final” result. But software isn’t that final. Nowadays, when a colleague describes our software or parts of it as “done” or “finished,” I annoyingly remind them that:

As long as someone uses the software, the “stream of wishes for features and bugfixes” will flow, and we (developers) will never be truly done.

Developers might not describe their software as part of their work environment, or even think of it as such. They don’t have to. How they describe their frustrations speaks volumes: “I wish we would have time to improve this.” or “Oh no, I’ve been assigned a story in THAT part of the code… AGAIN!”

Nobody invited “poor quality”?!?

There are three things I attribute poor-quality software to: inexperience, lack of time, and decay. An inexperienced or sloppy programmer with limited supervision probably has the most direct impact on code quality. Being a direct cause is NOT the same as being the most significant one, but it is the easiest to explain (and the easiest to blame).

Having a sloppy programmer is slightly different from having an inexperienced one because the poor quality is caused by a deliberate choice not to spend the necessary time. However, having both the experience and the willingness will only take you so far if you aren’t allowed to spend enough time.

“Nothing important comes into being overnight; even grapes or figs need time to ripen. If you say that you want a fig now, I will tell you to be patient.”

— Epictetus

While a company can replace a sloppy programmer, it might not be enough. That company will see hard times ahead if its decision-makers fail to accept or even understand the fact that good software takes time. Lack of time is software quality’s real enemy.

Doing a half-arsed job, either by choice or circumstances, will at the least result in harder-to-maintain software. On top of that, chances are that it is also poorly specified, tested, and documented (if at all). This is how software quality gets sacrificed at the altar of the “deliver now” philosophy, so common among managers in competitive markets.

It doesn’t matter what your excuse or your reason is. If you keep choosing “fast” over “quality” for deliveries, at some point, it becomes IMPOSSIBLE to deliver fast. Also, turning such a piece of software around will take a massive amount of effort and time - and therefore money.

Lastly, software has a tendency to deteriorate over time, even (and sometimes especially) those parts that aren’t changed. This happens because the light in which the software is seen changes as a developer gains experience or new business requirements are introduced. Suddenly flawed decisions are highlighted. Avoiding flawed decisions is unrealistic, and the software will ALWAYS have some code lying around that could be improved. That is OK. It is possible to navigate around a few inconveniences. Problems arise when poor code quality (like weeds) isn’t kept in check. Adding a time constraint on top will prevent the team from addressing these issues and, at the same time, make it less likely that the team chooses optimal implementations for upcoming features.

How do we improve our situation?

The issue runs deep. We are dealing with sloppy or inexperienced programmers, deadlines, poor change management, hasty decisions under competitive pressure, and the widespread practice of evaluating code quality with “but it works?!” - and the list could continue. This way of working is embedded in company cultures big and small and can’t be changed on an individual level. A single person trying to “take enough time” can end up (wrongly) being identified as slow, and even when allowed to stick to their methods, the amount of crap being produced elsewhere will quickly flood the good parts.

The first step of changing the circumstances is admitting that there is an actual problem and that you most likely are part of it yourself.

Demand a good working environment and acknowledge that (software) quality takes time along with continuous care.

How to write concise code in Clojure

During code reviews I’ve seen the following repetition pattern a lot. I am going to use Clojure to illustrate, but it also happens in other programming languages:

(ns myapp.butterfly)

(defn create-butterfly
  [butterfly-attributes]
  ..)

(defn calculate-butterfly-wing-size
  [butterfly-type]
  ..)

Notice how butterfly is being repeated. Imagine how using unnecessarily long symbols over and over again will lengthen the code and slowly shroud the purpose of the function in noise.

butterfly-attributes is already in the context of a create-butterfly function, which in turn already resides in a .butterfly namespace.

I will argue that the following is better. The code is more concise without losing its meaning because the namespace provides a meaningful context.

(ns myapp.butterfly)

(defn create
  [attr]
  ..)

(defn calc-wing-size
  [type]
  ..)

I took the liberty of shortening attributes and calculate to common abbreviations, just like Clojure does with concat over “concatenate”.

Code in a different namespace would have looked like the following:

(ns myapp.other-ns
  (:require [myapp.butterfly :refer [create-butterfly]]))

(create-butterfly {:name "Brimstone" :color "green"})

and now it can look like this:

(ns myapp.other-ns
  (:require [myapp.butterfly :as butterfly]))

(butterfly/create {:name "Brimstone" :color "green"})

It is not about making the code as short as possible… but it is. Just not at the cost of context / readability. Clever use of namespaces can help with that.