Software testers increasingly have to deal with production systems. Some tests make sense only with production systems, such as Nessus-style vulnerability scanning. And an increasing number of systems is hard to reproduce in a test bed as the system is really a mashup of services, sharing infrastructure with other systems on various levels of abstraction.
Testing production systems imposes an additional requirement upon the tester, production safety. Testing is production-safe if it does not cause undesired side-effects for the users of the tested or any other system. Potential side effects are manifold: denial of service, information disclosure, real-world effects caused by test inputs, or alteration of production data, to name just a few. Testers of production systems therefore must take precautions to limit the risks of their testing.
Unfortunately it is not yet very clear what this means in practice. Jeremiah Grossman unwittingly started a discussion when he made production-saftey a criterion in his comparison of Website vulnerability assessment vendors. Yesterday he followed up on this matter with a questionnaire, which is supposed to help vendors and their clients to discuss production-safety.
The time is just right to point to our own contribution to this discussion. We felt a lack of documented best practice for production-safe testing, so we documented what we learned over a few years of security testing. The result is a short paper, which my colleague and co-author Jörn is going to present this weekend at the TAIC PART 2009 conference: Testing Production Systems Safely: Common Precautions in Penetration Testing. In this paper we tried to generalize our solutions to the safety problems we encountered.
The issue is also being discussed in the cloud computing community, but their starting point is slightly different. Service providers might want to ban activities such as automated scanning, and deploy technical and legal measures to enforce such a ban. They have good reason to do so, but their users may have equally good reason to do security testing. One proposal being discussed is a ScanAuth API to separate legitimate from rogue scans. Such an API will, however, only solve the formal part of the problem. Legitimate testing still needs to be production-safe.