Recently I was sent a job posting that embodied what I will call the Pernicious Myth of the QA Automation Engineer. Here’s just one line:
Own automated testing capabilities across Web, mobile and production processes.
An engineer cannot, by definition, take responsibility (that is what "ownership" means, right?) for features that cut across multiple groups. In all but the smallest companies, Web and mobile are managed by different teams. With different MANAGERS.
Engineers don’t take responsibility for the behavior of MULTIPLE managers across different teams.
This is a Director’s job.
Managing software capabilities across multiple teams is the job of a mid-level Engineering manager such as a director or (in larger organizations) a VP.
Why shouldn't an engineer own this? Because decades of software engineering experience show that it never works.
Management (with all its wondrous hierarchical levels) is responsible for the behavior of people within and across teams. Engineers and designers are responsible for the behavior and organization of the product. Not the people. People are a management problem. Especially at scale. Organizations that forget this fail.
Takeaway: do not sign on for a director's job at an engineer's salary.
I'm not saying no one should take responsibility for the "tests and testability" of an application or service.
What I am saying is that someone should be explicitly responsible for testing across the whole organization, and that person should be at the director or executive level. Never at the engineer or team lead level. Ever.
The problem with asking where testing/QA "fits in" to DevOps is that testing/QA is part of dev.
It's a historical mistake that test/QA was marginalized to the point that it's now seen as a separate discipline.
In the tech world we worry a lot about scaling. Whenever someone comes up with a software innovation, one of the first questions they are likely to hear is: nice concept, but will it scale?
But what is it that does not scale?
In growing systems, it is technology and process that may fail to scale. In the life cycle of a mature, legacy, or otherwise successful system, it is communication that does not scale. Communication between individuals and between teams is universally found to be the bottleneck on execution in very large organizations.
Therefore the fact that CI (continuous integration) can scale engineering communication almost indefinitely is critically important! It means that CI is a tool for dealing with diseconomies of scale, at least as they pertain to an engineering organization.
Any "IT Crisis" then can be re-understood as hitting the steep rightward end of the diseconomies of scale curve (shown below) with costs spiking at the beginning and end of the life of the organization. The spike at the end is due to communication costs, which again: CI mitigates communication costs, at least for an engineering team.
More than the act of testing, the act of designing tests is one of the best bug preventers known. The thinking that must be done to create a useful test can discover and eliminate bugs before they are coded — indeed, test-design thinking can discover and eliminate bugs at every stage in the creation of software, from conception to specification, to design, coding and the rest.
The following is excerpted from Software Testing Techniques, 2nd ed., by Boris Beizer.
First Law: The Pesticide Paradox
Every method you use to prevent or find bugs leaves a residue of subtler bugs against which those methods are ineffectual.
That's not too bad, you say, because at least the software gets better and better. Not quite!
Second Law: The Complexity Barrier
Software complexity (and therefore that of bugs) grows to the limits of our ability to manage that complexity.
Corollary to the First Law: Test suites wear out.
Yesterday's elegant, revealing, effective test suite will wear out because programmers and designers, given feedback on their bugs, do modify their programming habits and style in an attempt to reduce the incidence of bugs they know about. Furthermore, the better the feedback, the better the QA, the more responsive the programmers are, the faster those suites wear out. Yes, the software is getting better, but that only allows you to approach closer to, or to leap over, the previous complexity barrier. True, bug statistics tell you nothing about the coming release, only the bugs of the previous release — but that's better than basing your test technique strategy on general industry statistics or myths. If you don't gather bug statistics, organized into some rational taxonomy, you don't know how effective your testing has been, and worse, you don't know how worn out your test suite is. The consequences of that ignorance are a brutal shock. How many horror stories do you want to hear about the sophisticated outfit that tested long, hard, and diligently — sent release 3.4 to the field, confident that it was the best tested product they had ever shipped — only to have it bomb more miserably than any prior release?
Gresham's Law states that bad money drives out good: counterfeit currency will tend to be passed along even by otherwise honest actors. What does that have to do with software engineering? In the programming world, code is the currency of exchange. So Gresham's Law in the programming world is: bad code will tend to get written by otherwise intelligent engineers.
How does Gresham’s law apply to test coverage?
Consider the case where engineers are asked by management to contribute unit tests such that code coverage remains at or above a numerical target such as 80%. By definition there is no direct business benefit to providing these tests, since tests are never seen by customers. Therefore, if it is possible to fake test contributions by gaming the coverage metric, engineers will tend to regard this subversion as the only ethically viable choice: time not spent on test coverage is time spent increasing business ROI.
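To make the gaming concrete, here is a minimal sketch (the function and test names are invented for illustration) of a test that inflates line coverage while verifying nothing:

```python
# Hypothetical module under test (names invented for illustration).
def apply_discount(price, rate):
    """Return price reduced by rate, where 0 <= rate <= 1."""
    if rate < 0 or rate > 1:
        raise ValueError("rate must be between 0 and 1")
    return price * (1 - rate)

# A "coverage-gaming" test: it executes every line of apply_discount,
# so a line-coverage tool reports 100% for the function, yet it asserts
# nothing -- a wrong formula (say, price * rate) would pass just as easily.
def test_apply_discount_for_coverage_only():
    apply_discount(100.0, 0.2)      # happy path: lines run, result ignored
    try:
        apply_discount(100.0, 2.0)  # error path: exercises the raise line
    except ValueError:
        pass                        # exception swallowed; still no assertion

test_apply_discount_for_coverage_only()
```

Such a test satisfies the 80% mandate and always passes, and in the coverage report it is indistinguishable from a real test — which is precisely the subversion described above.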
In light of recent research, serious doubt has been cast on the usefulness of numerical targets such as 80% test coverage.
Modern behavioral research by writers and researchers such as John Seddon, Dan Ariely, and Dan Pink suggests that numerical targets actually harm the emotional well-being (or, if you prefer, reduce the ROI) of knowledge workers.
I have long been fascinated by the phenomenon of software teams pursuing a hard numerical target for code coverage. Therefore I have always made it a point to find out about this practice whenever I visit a software development shop.
Over the years it has always proved interesting to hear engineers’ responses to the following eleven questions:
Eleven weird old questions that will reveal whether your code coverage efforts are useful or just well-intentioned
I try to ask these questions of engineers whenever discussing a new or existing test coverage project.
- What is the specific, day-to-day benefit of covering every single line of code with a unit test?
- What would be the specific, day-to-day benefit of achieving 80% code coverage?
- How would a codebase (and a system) with 80% coverage behave differently than it does today?
- How much worse would it be to achieve, say 60% coverage instead?
- What about 79% coverage?
- Why only 80% coverage as a goal — why not 90%?
- What are the factors that contribute to system determinism?
- Specifically how would increased test coverage contribute to system determinism?
- When you talk about “code coverage” do you mean line coverage, branch coverage or statement coverage, or a combination of some-or-all of these?
- Does your current code coverage metric include files that have no tests at all?
- In other words, does your test coverage metric include all of your untested code, or do you only measure how well you have covered the code for which unit tests exist?
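The coverage-type question matters in practice. Here is a minimal sketch (function name invented for illustration) of a case where a single test yields 100% line and statement coverage while leaving a branch untested:

```python
# Hypothetical function with an implicit "else" branch.
def classify(n):
    label = "number"
    if n < 0:
        label = "negative " + label
    return label

def test_classify_negative():
    # This one test executes every statement in classify, so line and
    # statement coverage both report 100%...
    assert classify(-1) == "negative number"
    # ...but the path where the `if` is NOT taken (n >= 0) never runs.
    # Only branch coverage (e.g. coverage.py's `coverage run --branch`)
    # reports that half of the `if` as missed.

test_classify_negative()
```

As for the untested-files question: coverage.py, for example, only reports on files that were actually imported during the run unless you point it at the whole tree (e.g. `coverage run --source=yourpackage`), so a never-imported module can silently vanish from the denominator.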
It is important to listen carefully to how these questions are answered. Does the team in fact have a reasoned answer for each of the eleven questions? Do the coverage metrics in use actually make sense from a business perspective? Has the team examined low-cost code quality strategies such as code review and static analysis? Is there an explicit mapping of test automation benefit to widespread organizational benefit, at least within the engineering team? Are the problems the team is facing actually soluble by adding test coverage?
It is unfortunately very easy for humans to place undue faith in numerical targets. This is, as far as I can tell, a consequence of our psychology. That numerical targets are intrinsically deceptive is not a problem to be solved; rather, it is a serious limitation that must be considered when designing test infrastructure.
Cf. "Why Most Unit Testing Is Waste" as well as "Stop Writing Automation."