Some of the best podcasts about testing aren’t about testing at all. Enter: last week’s episode of 99% Invisible on Cautionary Tales.
The episode is about examples of things going horribly wrong, and why. Obviously there is some superficial connection to testing already: one reason we test software is to help prevent things from going horribly wrong. Often when we see things go publicly wrong, we take note of the #TestingFail that must have happened to get us there. But that’s not the part that interested me.
The story they use to frame the episode is the one about the famous gaffe of announcing La La Land as the winner of Best Picture at the Oscars instead of the actual winner, Moonlight. One often-cited reason for the error was simply the typography of the card inside the envelope. It emphasized the wrong information, and so was easily misread.
But they go one step further. The fact that there was a wrong envelope to open at all was because there are two copies of every envelope, one on each side of the stage. The duplicates were there as a fail-safe, to prevent errors when someone might end up on the wrong side of the stage at the wrong time. In this case, though, the duplicate actually enabled the error. It could not have happened if there were only one envelope per award, because the Best Actress envelope would have already been opened.
This is just one of the episode’s stories in which a safety system put in place to prevent catastrophe ends up causing it. Examples range from award ceremonies to ancient architecture to nuclear meltdowns. Though not mentioned, the recent problems with Boeing’s automated flight-control systems also come to mind: an automated safety system kicks in to correct a perceived fault, with lives lost as a result.
It got me thinking: if tests are a safety system put in place to prevent production bugs, and safety systems can cause the problems they are put in place to prevent, can tests themselves cause bugs?
(Sidenote: Pedants will be quick to jump on the fact that testing doesn’t “prevent” bugs; I’m going to ignore those people.)
Some things that we know to be true:
- Scripted tests (whether executed by a human or a computer) are code.
- Code will have bugs in it.
- More complex code likely has more bugs in it.
So, we already know that tests can have bugs, which can lead to bugs not being identified. The more test code we have, the more likely this will happen. But missing bugs that should have been caught isn’t the same as causing them.
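As a minimal (and entirely hypothetical) sketch of that failure mode, here is a test whose own bug guarantees it passes, so the real bug in the code under test is never caught. All names here are invented for illustration:

```python
def apply_discount(price, percent):
    # Real bug in the code under test: the discount is added
    # instead of subtracted.
    return price + price * (percent / 100)


def test_apply_discount():
    result = apply_discount(100, 10)
    # Bug in the test itself: the `or` clause makes this assertion
    # true for any positive result, so the test always passes.
    assert result == 90 or result > 0


test_apply_discount()  # passes silently, despite the real bug
```

The test suite stays green, and the more test code like this we accumulate, the more likely some of it hides exactly this kind of mistake.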
I can think of other effects:
- The more we trust and rely on our test systems, the less diligent we might become against new threats being introduced.
- Test results that are noisy or inconsistent can lead us to miss real warnings buried among the noise (see: alert fatigue).
- Changes made to an application to support testability could introduce a bug impacting non-test systems.
I think only the last one really qualifies as causing a bug, and even then one might argue whether it was the tests that caused the bug (by requiring a particular feature) or the introduction of the feature itself.
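To make that last effect concrete, here is a hypothetical sketch of a testability hook leaking into non-test behavior. The flag name and function are invented; the point is only the mechanism:

```python
import os


def send_email(recipient, body):
    # Testability hook (hypothetical): tests set this environment
    # flag so that test runs don't deliver real email.
    if os.environ.get("EMAIL_DISABLED"):
        return "skipped"
    # The bug vector: if a test run leaves EMAIL_DISABLED set in a
    # shared environment, production silently stops sending email.
    # In a real app this branch would perform the actual delivery.
    return f"sent to {recipient}"
```

The feature exists only to support testing, yet it creates a new way for production behavior to break.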
Here is one direct—and thankfully fictional—example I came up with from my own testing context:
- Someone writes a test to ensure that visits (or other metrics) on a web app are being tracked correctly. It runs regularly and passes, confirming that the tracking code is correct. By either error or negligence, executing the test sends tracking signals to the production system, thereby making all the tracked stats incorrect.
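The fictional scenario above might look something like this sketch, with the network call stubbed out so it is runnable; every name and URL here is invented:

```python
# Hypothetical endpoints: the test should use the staging one.
PRODUCTION_URL = "https://analytics.example.com/track"
STAGING_URL = "https://analytics-staging.example.com/track"

recorded_hits = []  # stand-in for the analytics backend


def track_visit(page, endpoint=PRODUCTION_URL):
    # In a real app this would be an HTTP request; recorded in a
    # list here so the sketch runs without a network.
    recorded_hits.append((endpoint, page))


def test_tracking():
    # The bug: no endpoint is passed, so the production default is
    # used. Every scheduled test run inflates the very production
    # visit counts the test exists to verify.
    track_visit("/home")
    assert recorded_hits[-1][1] == "/home"
```

The test passes, tracking is confirmed "correct," and the production stats quietly become wrong.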
So my answer is yes. Tests can actually create the problems they are designed to prevent.
Someone just today said to me that one heuristic they use for coming up with test cases is to assume the code, or the user, is maliciously out to get you. This exercise reminds me of that. Start from the assumption that your tests are actually trying to cause bugs instead of catch them, and think about how they might do it.
What other examples can you think of?