Hello, I'm Noah Sussman. I am a scientist studying technosocial systems in New York.
I recently stumbled across this list of interview questions from back when I was hiring Software Engineers In Test for a team I was leading.
"Technical QA" means a lot of different things to a lot of different people. I found that no matter how I wrote my job postings, I always wound up interviewing candidates with an astoundingly wide range of skills and experience ranging somewhere along the gamut from senior engineer to human-computer interaction researcher.
This is a list of ideas. I’d never ask all of these questions in an interview. I’m also the kind of person who likes to skip around and I wrote this list with that in mind.
However, the questions are ordered roughly according to the level of technical knowledge required to answer them, starting with the least technical.
Here are 38 interview questions to ask when hiring QA Automation Engineers.
- When did you first start using a computer?
- What was the first code you ever wrote?
- When did you become interested in QA? (“Everyone has a different answer.” —Stephen Donner)
- Why are you interested in QA now?
- What blogs do you read to keep up with the QA industry?
- Describe the role of QA in the software life cycle: what does QA do and when does it happen?
- Why does software have so many bugs?
- Would it be a good idea to automate all the tests and dispense entirely with QA?
- How does a file system work? What is a file? How does a directory “contain” files?
- What happens when you empty the trash?
- Imagine you are testing the validation of a new user sign-up form for a Web site. What kinds of names should the validation reject? What about email addresses?
- How do Web sites protect credit cards? From being intercepted in transit? From being stolen off the server?
- What is daylight saving time?
- How do blind people use the Web?
- How good are you at figuring out how to do things using Google?
- What’s an example of a sophisticated thing you’ve taught yourself to do using only resources you found on the internet? E.g., play the ukulele, knit a sock, provision an EC2 instance.
- How good are you at communicating with other people? With engineers?
- How good are you at the command line?
- How would you find a file on a Mac? Using only the shell?
- How good of a programmer are you?
- What programming languages are you comfortable with?
- What is HTML? What is valid HTML? What is a table-based layout?
- What is CSS? The cascade? The box model?
- Describe what happens after I type a URL in my browser and hit return. Explain how the browser and server get from that point to a fully loaded web page.
- Have you used Charles and/or the Net tab in Firebug/Chrome Inspector?
- What are “boundary conditions” aka “edge cases”?
- Why do people use “boundary” and “edge” to describe these types of problems? Edge of what?
- What is a regular expression? What is a search pattern? A wildcard?
- How would you make the same change to two different files? To 2,000 files?
- Have you used an SCM (for instance: git, svn, cvs, vss, ClearCase, Mercurial, Perforce)?
- Imagine you found two copies of the same file on your machine. How do you tell if the copies really are the same, or if one has some changes that didn’t make it into the other copy? (One possible answer is sketched just after this list.)
- How do engineers typically use source control branches?
- What does it mean to patch a web application?
- How does Google search work?
- How do password reset emails work?
- What are some ways that passwords get stolen? How would you advise me to keep my users’ passwords safe?
- What is a botnet? What is malware?
- How does malware get onto computers?
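On that file-comparison question: here is a minimal sketch of one possible answer, in Python, using checksums. The file names are hypothetical; a strong candidate might just as well reach for cmp, diff, or shasum at the command line.

```python
import hashlib

def sha256_of(path):
    """Hash the file in chunks so large files don't exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical file names, for illustration only.
a = sha256_of("report-copy-1.txt")
b = sha256_of("report-copy-2.txt")
print("identical" if a == b else "the copies differ")
```

If the hashes differ, diff will show exactly which changes didn’t make it into the other copy.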
In 2009, I switched from front-end development to focusing on systems. I’ve since had the opportunity to help develop a range of tools including automation for new platforms and continuous integration systems for entire organizations. More importantly, I’ve gotten to use the tools I’ve built in production, and I’ve learned things. And by “learn things,” I mean I’ve made a lot of mistakes. Which I learned from.
Here are 11 things I learned the hard way about implementing test automation for Web-scale systems.
- Nondeterministic automated tests are worse than no automated tests at all.
- If you don’t know what a test does, it’s useless.
- Tests that make network connections are intrinsically intermittent.
- Simple tests help more than complicated tests.
- Every failed test is a context switch.
- Test failures must be actionable. Alerts for test failures that are not actionable are harmful.
- Humans under pressure cannot be relied upon to properly interpret a stack trace.
- Automation adds complexity.
- Writing tests is harder (and a lot more expensive) than not writing tests.
- "Automated testing" actually consists of two distinct features: the test codebase itself and the testability of the application under test.
- If you expect engineers to maintain the automated tests for a software application, implement the tests in the same language(s) as the application.
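Here is that testability sketch: a minimal example in Python, with hypothetical names. The HTTP call is injected rather than hard-coded, so the test is deterministic and never opens a network connection (see the lesson about network tests above), and because it is written in the application’s own language, the application engineers can maintain it.

```python
# Hypothetical example: the HTTP client is passed in rather than
# hard-coded, so a test can substitute a fake and stay deterministic.

def fetch_user_name(user_id, http_get):
    """Return a user's display name, given any HTTP GET function."""
    response = http_get("/users/%d" % user_id)
    return response["name"]

def test_fetch_user_name():
    # The fake stands in for the network, so this test cannot be
    # intermittent the way a test over a real connection can.
    def fake_get(url):
        assert url == "/users/42"
        return {"name": "Ada"}
    assert fetch_user_name(42, fake_get) == "Ada"

test_fetch_user_name()
print("ok")
```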
Any newly successful software organization needs a Continuous Integration (CI) system in order to continue to survive.
What is a CI System? It is a communication tool.
What is a version control system? It is a communication tool.
Who are these communication tools for? These tools are for engineers.
These are the best communication systems for engineers. Way better than email or even IRC.
Why is CI such a good communication system? Because tests break without anyone taking any explicit action — that is, changing application code “naturally” breaks tests, without anyone explicitly changing the test codebase. When an automated test breaks in CI, it means a potential communication path opens up between the person currently working with that area of the application code, and the person who owns (or just as often, last touched) that test.
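As a concrete illustration of that communication path, here is a minimal sketch in Python. The failing-test path and the notification step are hypothetical; the git invocation is the real mechanism a CI job could use to find out who last touched a failing test.

```python
import subprocess

def last_author_of(path):
    """Ask Git for the email of the last person to change this file."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%ae", "--", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Hypothetical failing test reported by a CI run:
failing_test = "tests/test_signup_form.py"
print("notify:", last_author_of(failing_test))
```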
The fact that failing tests can open up the right person-to-person communication path between engineers is a non-trivial competitive advantage. It is a historically successful solution to the “Dunbar’s Number” problem. Software testing is a way to transcend diseconomies of scale.
When we talk about software that “failed to scale” — what specifically is the quality that does not scale? In growing systems, technology and process may not scale. But in a mature / legacy / successful system, it is communication that does not scale.
Communication between individuals and between teams is universally found to be the bottleneck on execution in very large and very old organizations.
Therefore the fact that CI can scale communication between engineers is non-trivial. It means that CI is a tool for dealing with diseconomies of scale, at least as it pertains to an engineering organization.
CI, then, is so valuable because it facilitates communication between engineers and so mitigates the rising communication costs that typify mature organizations.
Wow, that kind of turned into a rant. Sorry. Anyway — as I was saying: Google, Amazon, and Facebook are all using very aggressive Continuous Delivery workflows and have been doing so for years.
Weekly, daily or even hourly production deployments are a commonly-practiced, enterprise-scale software development life cycle. (Editor’s note: in retrospect we can see how “commonly” might be taken to mean “universally” in this sentence. That was not our intent. In our opinion, no methodology is universally practiced. Many successful and competitive organizations still practice Waterfall. See also diseconomies of scale.)
Today, numerous “Web-scale” organizations (notably Google, Amazon and Facebook) practice continuous deployment, with real-time production monitoring and a lot of exploratory testing taking place in production as well.
Here are a dozen articles about Web-scale continuous deployment in the enterprise:
After reading these links it should be clear that Continuous Delivery is Yet Another Mainstream Software Methodology, just like Agile and Waterfall. Please reach out to me on Twitter if you think you can still make the case — in light of this evidence — that Continuous Delivery is “untried” or especially risky ;-)
One other thing to take note of here: almost all these articles are old. Some were written 2 years ago or more. Rapid iterative development may or may not work for a specific organization — but the general problem of implementing Continuous Delivery is clearly well-solved and has been so for a while now.
Google practices Continuous Delivery
- At Google, 15,000 engineers work from the HEAD revision of a single Perforce trunk.
- 50% of the code will be changed in any given month.
- Google’s test infrastructure is legendary and they’ve written a comprehensive book about how they perform QA while continuously releasing.
- They’ve also put a lot of effort into scaling Perforce.
Here is a fantastic deep-dive into Google’s deployment pipeline:
Amazon practices Continuous Delivery
- At Amazon, new code is deployed to production at a staggering rate of once every 11.6 seconds during a normal business day.
- That’s 3,000 production deployments per day.
- They’ve invested an enormous amount of time and money into creating an architecture that facilitates small, orthogonal, frequent code pushes.
Here’s Amazon’s Jon Jenkins breaking down their deploy stats at the O’Reilly Velocity conference:
Facebook practices Continuous Delivery
- At Facebook, each of 5,000 engineers commits to trunk HEAD at least once a day and the code at trunk HEAD is pushed to production once daily.
- Facebook has no dedicated QA team. All responsibility for testing rests with the software engineers.
- They’ve invested heavily in infrastructure that provides zero-downtime deployment at Facebook scale.
Here’s Facebook’s release manager, Chuck Rossi, going into detail about how Facebook engineers balance their experiments against the risk of breaking some fundamentally important part of the site:
I’m surprised there is no commonly-available solution for viewing Git logs as JSON documents.
It is very useful to convert logs to JSON, because JSON is immediately consumable by almost all general-purpose data visualization tools — everything from jQuery UI to matplotlib “speaks” JSON. So with Git logs converted to JSON, it becomes possible to perform all sorts of ad hoc historical analysis of source code repositories.
Historical analysis of source code repositories is very important, because code churn metrics are the best bug predictors known.
However, there is to my knowledge no simple, stand-alone tool to do the conversion! In practice, converting Git logs to JSON requires either relying on some large, third-party library that has already implemented git-log-to-JSON functionality, or writing regular expressions that turn out to be a bit of a pain in the ass.
Since there’s no commonly-available simple tool, I wonder how many people wind up putting off their Git log analysis because of the time overhead involved in JSON output conversion. If this were the case, it would really be too bad, because code churn metrics are the best predictors of bugs, as discussed in this video from GTAC, which details a recent study of the Eclipse code base:
Making it easy to get the Git log as a JSON document!
Here’s a gist I wrote, which hopefully takes the mystery out of converting Git logs to JSON. It has since been referenced from a popular question on Stack Overflow. I’m glad someone else found this useful!
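The gist itself lives at the link above. For flavor, here is a minimal sketch of the same general idea in Python (the JSON field names are my own choices, not necessarily the gist’s): ask git log for records with unambiguous control-character separators, then emit JSON, with no fragile regexes required.

```python
import json
import subprocess

FIELD_SEP = "\x1f"   # "unit separator" control char, between fields
RECORD_SEP = "\x1e"  # "record separator" control char, between commits

def git_log_as_json():
    """Emit the current repository's log as a JSON array of commits."""
    fmt = FIELD_SEP.join(["%H", "%an", "%ae", "%ad", "%s"]) + RECORD_SEP
    out = subprocess.run(
        ["git", "log", "--pretty=format:" + fmt],
        capture_output=True, text=True, check=True,
    ).stdout
    commits = []
    for record in out.split(RECORD_SEP):
        record = record.strip()
        if not record:
            continue
        sha, author, email, date, subject = record.split(FIELD_SEP)
        commits.append({"commit": sha, "author": author, "email": email,
                        "date": date, "subject": subject})
    return json.dumps(commits, indent=2)

if __name__ == "__main__":
    print(git_log_as_json())
```

From there, per-file churn counts (for example via git log --name-only) are a short step away.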
I set up Plato for continuous static analysis recently, and it was pretty simple. I’ve provided the code here: tl;dr! Just take me to the code!
The other day I published the table of contents for a forthcoming book on technical QA. Since then I have heard, more than once, the criticism that many QA analysts could not get started with this process as written.
I agree that setting up a simple Web site with PHP is beyond the technical skills of many people who would otherwise like to learn how to perform technical QA.
But I’d say to everyone who objects — are you seriously telling me that you think someone can “learn to be a technical tester” without also getting to know how the Web stack works?
“learning to automate browsers well is hard enough without having to learn to program at the same time. learn to program first.” — Adam Goucher (@adamgoucher), May 16, 2013
Programming is hard
The mailing lists for open source test tools are full of frustrated people who are legitimately experts in black-box testing, deduction, bug finding, and problem-solving. But they are trying to extend and apply their limited or nonexistent software engineering expertise to, for example, Selenium.
But Selenium is one of the most complex web development tools out there. Selenium is a program inside a server inside a Web browser, and that’s before you’ve written a line of your own test code, let alone picked a test harness and gotten your changes to run in CI.
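To give a sense of those layers, here is roughly the smallest possible Selenium script, in Python; the URL and the title assertion are purely illustrative. Even this much assumes a browser, a driver binary the client library can talk to, and the library itself, before any real test design has happened.

```python
# Even "hello world" involves three layers: this client script, a
# driver server (e.g. chromedriver), and the browser itself.
from selenium import webdriver

driver = webdriver.Chrome()            # starts the driver server and browser
try:
    driver.get("https://example.com")  # illustrative URL
    assert "Example" in driver.title   # the barest possible "test"
finally:
    driver.quit()                      # tear down the browser and server
```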
It’s true that what I’m describing is a learning process that at first glance may appear totally beyond the capability of most QA analysts. But it’s also the case that… this is the only road I know — that anyone knows — to actually being a contributing member of a software test engineering team.
The challenge then is to work out how a Web site’s “non-technical staff” (e.g., QA Analysts) can bridge the software engineering knowledge gap.