A Positive Way to Think About Bugs

Bugs are the things of nightmares. There are the ones that appear unannounced in production systems, at a critical time, and are spotted by the most vocal hater of your company. There are the intermittent ones that you spend hours trying to track down, and the small, irritating ones that you know will never be prioritised. There are the ones you’ve never seen but suspect are there, lurking. Waiting.

Over the years I’ve had an additional bug nightmare. One where someone expects me to spend a significant amount of my time capturing bug report metrics. They want to know how long bugs take to fix, how long we took to find the bug, or how good a developer is based on the number of bug reports we have.

When you take all this negative bug energy and throw in Jira. Or Mantis. Or Bugzilla… it’s easy to understand why we all hate bugs.

But there is a positive side to bugs.

It’s too common to hear of bugs surprising teams and nothing changing. Some bugs are unexpected, and it doesn’t always make sense to try to stop similar things happening again, but those are the exceptions. It is much more common for similar bugs to appear in different parts of the system, and these are strong indicators that things are not quite right. Observing and sharing these patterns can be a great trigger to motivate teams to change.

Ever wondered how you can convince people to give you time to work on tech debt, or process improvement, or make progress on that cool infrastructure project? Well, bugs can be very persuasive…

When a release showstopper bug or production issue is found, use it to its full potential. Understand the root cause and try to establish exactly why the bug appeared when it did. Was the issue poor communication, a lack of test environments, or something else? Don’t make this a blame game about individuals, but use it to find pieces of your process or technology that could be improved. In my experience bugs are almost never a result of poor code; they are much more likely to indicate code complexity, lack of context, poor test data, or simple lack of experience.

Once you have your information, tell your story. Keep telling it. You’re not trying to force people to change things; you simply want to highlight the issues that are occurring as a result of things being the way they are. Once you find your interested audience, work with them to make improvements.

Step by step you’ll find things improve.

What is Quality?

A few months ago I asked people to complete a short survey on quality. Thanks to all who took part.

Quality is often defined using Jerry Weinberg’s definition of “Value to some person who matters”. Although testers tend to use this definition frequently, and comfortably, I’ve found it to have some limitations. The word “value” is almost as fuzzy as “quality”, and who exactly are those people who matter? It seems that even with a widely accepted definition it’s still hard to translate quality into a tangible concept for developers or customers.

I decided to try to turn a definition of quality into a feeling that people could actually relate to, and hopefully use to better their own work. To do this I asked people to rate the quality of LinkedIn. I know many of you will be laughing at the idea that LinkedIn might represent quality, but it turns out to be a very divisive topic. People who want to manage a professional network generally rate LinkedIn’s quality highly. Testing professionals usually don’t.

Although heavily leaning towards the “no”, we can see a decent number of people do think LinkedIn is a quality service.


If a service offers value then many people are able to ignore, or maybe don’t even notice, its shortcomings. I asked people to rate the quality of LinkedIn using just a yes or a no option. As many respondents pointed out, quality is definitely on a scale. What’s interesting to me is that the people who wanted to give a “yes, but no” response tended to say that LinkedIn offers value in managing a professional network (so it is a quality service), but that it is unintuitive to use (so it isn’t a quality service).

The frequency of “intuitiveness” as a measure of quality is of particular interest to the tester side of me. How often do we get caught up testing something against the requirements and ignore the unspecified side of the system? We spend hours hunting down weird and wonderful edge cases, maybe too often choosing to spend our time logging bugs rather than helping to improve the intuitiveness of a system.

We frequently discuss whether testers should learn to code or not. Maybe now is the time to start talking about whether testers should be learning about UX design. At Songkick we see huge benefit in including all team members in testing and quality activities. UI designers are experts at creating usable designs. UX designers understand how users interact with the system. Developers understand how code complexities can cause unexpected behaviour. Ops Engineers know how the system actually runs in production. Product Owners know exactly what they want the system to do. Including all of them in testing discussions brings a wealth of knowledge and skills that, when combined with a testing expert, can have a huge impact on quality.

Quality isn’t the (sole) responsibility of the test team. Everyone involved in developing software should have a good understanding of who the intended users are, their goals, and at least a working understanding of the methods we can use to meet expectations. Facilitate discussions to identify the “people who matter” for your particular system. Help connect team members to users. Even a single session on the customer support desk can have a huge impact on helping people see the real-life impact of a poor design choice. Even just making sure you have a solid approach to managing code maintainability will impact quality at least as much as prioritising features correctly.

Treat quality as a team responsibility. Work together to assist each other in building systems that people actually want to use.

What’s the cost of shipping bugs?

A tiny question to reveal huge insight.

At Songkick we use Continuous Deployment for many of our releases. This requires a risk-based approach to every change. Are we tweaking the copy, or re-designing the sign up system? Are we changing a core system, or an add-on feature? The answers determine how much risk we’re willing to take, and subsequently whether the change can be automatically deployed by our Jenkins pipeline or not. The most useful question we ask ourselves before writing any code is “What’s the cost of shipping bugs?”.

If the answer is “low”, perhaps because this is an experiment, easily fixed, invisible to users, etc., then we know we can move faster. Developers can be more autonomous. Maybe they don’t need a code review. Maybe the testers don’t need to spend so long testing things before we release. Maybe we don’t need to update our test suites.

If, however, the answer is “high”, perhaps because we’ve broken something like this in the past, or it’s going to be hard to fix, or highly damaging, or we’re all about to take a week off to visit New York, then we know that we need to be more cautious. Testers need to be more involved. We need to consider releasing this with feature flippers, or using a dead canary release. We’ll make sure the release takes place at a time when there are people available to monitor the release, and get involved if needed.
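
To make the idea concrete, here is a rough sketch of how the question could be turned into a simple checklist. It is only an illustration, not how our pipeline actually works: the Change fields, the rules of thumb in cost_of_shipping_bugs and the shape of the release plan are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical description of a change; every field is illustrative."""
    description: str
    easily_fixed: bool         # could we roll back or patch it quickly?
    touches_core_system: bool  # core system rather than an add-on feature?
    broken_before: bool        # have we broken something like this in the past?
    people_available: bool     # will anyone be around to monitor the release?


def cost_of_shipping_bugs(change: Change) -> str:
    """Answer the question with "high" or "low" using made-up rules of thumb."""
    high_cost_signals = [
        not change.easily_fixed,
        change.touches_core_system,
        change.broken_before,
        not change.people_available,
    ]
    return "high" if any(high_cost_signals) else "low"


def release_plan(change: Change) -> dict:
    """Map the answer onto an (assumed) release approach."""
    if cost_of_shipping_bugs(change) == "low":
        # Move fast: let the pipeline deploy automatically.
        return {"auto_deploy": True, "feature_flipper": False, "extra_testing": False}
    # Be cautious: involve testers and release behind a feature flipper.
    return {"auto_deploy": False, "feature_flipper": True, "extra_testing": True}


copy_tweak = Change("Tweak the copy", True, False, False, True)
sign_up = Change("Re-design the sign up system", False, True, True, True)

print(copy_tweak.description, release_plan(copy_tweak))
print(sign_up.description, release_plan(sign_up))
```

In practice the answer comes out of a conversation rather than a function, but writing the signals down like this can help a team agree on what “high” and “low” mean for them.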

It’s a tiny question that takes just a minute to ask but this tiny question can shape our entire development and release approach.

How do you estimate the cost of shipping bugs?