4 Types of Continuous Performance Testing for a DevOps World

July 05 2018

Build resilient code using proven test patterns and continuous performance testing.

No one wants to write fragile, unreliable code. Developers want to build software that is bulletproof and that bounces back if there is a deficiency in a backend service. Coding well takes talent and experience. But resiliency is ultimately the result of performance testing, rigor, and quality feedback. In my last blog post, “The Importance of Application Decomposition,” I laid out a methodology for breaking down an app and properly instrumenting it. Today, we’ll look at test patterns that will help you build software that operates efficiently in an environment that is intrinsically unreliable.

1) Establishing a baseline

This test is an extension of app decomposition. We want to measure the performance of the application under no load whatsoever to derive a baseline. To do this, we run a single virtual user in a loop against every touchpoint, e.g., the API endpoints and batch job triggers, and make sure transaction detection is well defined for each one. While I don’t put much emphasis on test harnesses here, the quality of your baselines depends heavily on your ability to invoke your application and make it do important things. Invest time in building out your ability to test as much of the app as seems reasonable.

Baseline tests represent the best-case scenario, i.e., our business transactions are never going to respond faster than this. I generally like to have a single instance of the current release candidate running continuously and reporting to a monitoring solution for this. With every service running in a loop, I have baselines for the fastest and slowest calls and can easily pull up the transactions with the lowest and highest percentage of time spent on the CPU. I’ll discuss how to leverage this information in my next post.
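As a rough illustration of the single-virtual-user loop, here is a minimal Python sketch. It assumes each touchpoint can be wrapped in a plain callable; the names `run_baseline` and `fastest_and_slowest` are hypothetical, and a real setup would report to a monitoring solution rather than return a dictionary.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def run_baseline(touchpoints: Dict[str, Callable[[], object]],
                 iterations: int = 50) -> Dict[str, Dict[str, float]]:
    """Call every touchpoint sequentially (one virtual user, no concurrency)
    and summarize the observed response times in seconds."""
    results: Dict[str, Dict[str, float]] = {}
    for name, call in touchpoints.items():
        samples: List[float] = []
        for _ in range(iterations):
            start = time.perf_counter()
            call()
            samples.append(time.perf_counter() - start)
        results[name] = {
            "min_s": min(samples),
            "median_s": statistics.median(samples),
            "max_s": max(samples),
        }
    return results


def fastest_and_slowest(baseline: Dict[str, Dict[str, float]]) -> Tuple[str, str]:
    """Return the names of the touchpoints with the lowest and highest median."""
    ranked = sorted(baseline, key=lambda name: baseline[name]["median_s"])
    return ranked[0], ranked[-1]
```

In practice each callable would issue a request to an API endpoint or trigger a batch job; stand-ins such as `lambda: time.sleep(0.01)` are enough to try the loop out.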

2) Finding the breakpoints

Now that we have baselines, we want to ramp up load on a single instance of the service until it breaks. From a charting perspective, we are tracking response time and throughput for each transaction. We want to find the performance ceiling relatively quickly, in 30 minutes or less, because we don’t want 4-6 hours of data to comb through. We want to know the throughput at the breaking point, and we want to look at secondary metrics that might be trending as the system breaks down. Above all, we want to identify the root cause of the break and determine whether we want to optimize.
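One way to pin down the breaking point from ramp data is sketched below. The heuristic is an assumption, not a standard definition: a step counts as broken when throughput gains flatten (less than `min_gain`) while p95 latency exceeds a multiple (`latency_factor`) of the baseline. The `RampStep` structure and both thresholds are hypothetical and should be tuned to your service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RampStep:
    users: int             # concurrent virtual users at this step
    throughput_rps: float  # completed requests per second
    p95_latency_s: float   # 95th percentile response time in seconds


def find_breakpoint(steps: List[RampStep],
                    baseline_latency_s: float,
                    latency_factor: float = 3.0,
                    min_gain: float = 0.05) -> Optional[RampStep]:
    """Return the last healthy step before the system breaks, or None
    if no step in the ramp meets the (assumed) break criteria."""
    for prev, cur in zip(steps, steps[1:]):
        gain = (cur.throughput_rps - prev.throughput_rps) / prev.throughput_rps
        too_slow = cur.p95_latency_s > latency_factor * baseline_latency_s
        if gain < min_gain and too_slow:
            return prev
    return None
```

The throughput of the returned step is the ceiling to carry into later tests; the secondary metrics captured around that step are where to start looking for the root cause.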

As most app teams are only concerned with their code, we want to ensure we’re only testing our code and remove the possibility of a dependency being the cause of a breakage during this optimization testing. This is where it becomes really beneficial to have mocks for your downstream dependencies.
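A mock for a downstream dependency can be very small. The sketch below uses only the Python standard library to serve a canned JSON response with a fixed delay; the helper names, the payload, and the delay value are illustrative assumptions, and a dedicated service virtualization tool may suit larger systems better.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_mock_handler(payload: dict, delay_s: float = 0.0):
    """Build a handler class that answers every GET with the same JSON body."""
    body = json.dumps(payload).encode("utf-8")

    class MockHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            time.sleep(delay_s)  # simulate the dependency's typical latency
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep load-test output quiet

    return MockHandler


def start_mock(payload: dict, delay_s: float = 0.0, port: int = 0) -> ThreadingHTTPServer:
    """Start the mock on a background thread; port 0 picks any free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_mock_handler(payload, delay_s))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Pointing the service under test at `server.server_address` keeps the dependency's behavior constant, so any breakage observed during the ramp can be attributed to your own code.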

3) Scaling factors

Instead of a single instance under load, with these tests we are looking at 2-5 instances, and we are load balancing among them. Essentially, we are trying to determine whether the scaling factor is close to one to one, i.e., whether doubling the number of instances roughly doubles the throughput at the breakpoint. We also want to understand whether the application performs better when we scale it horizontally by adding instances or vertically by adding more memory, compute, and I/O shares.
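The scaling factor can be expressed as a simple ratio. In this sketch, `scaling_efficiency` is a hypothetical helper and the throughput figures in the usage note are made-up numbers for illustration only.

```python
def scaling_efficiency(single_rps: float, scaled_rps: float, instances: int) -> float:
    """Ratio of measured throughput to ideal linear throughput.

    1.0 means perfect one-to-one scaling; values well below 1.0 suggest a
    shared bottleneck (database, lock, load balancer) is limiting the gain.
    """
    if instances < 1 or single_rps <= 0:
        raise ValueError("need at least one instance and a positive baseline")
    return scaled_rps / (single_rps * instances)
```

For example, if one instance breaks at 400 requests per second and three load-balanced instances break at 1,120, `scaling_efficiency(400, 1120, 3)` is roughly 0.93, which is reasonably close to one to one.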

4) Soaking

This pattern is designed to expose how well our software can recover from a high-stress situation. We want to take our application up to the breakpoint and then start backing off. We want to see how long we can sustain running a service at 80% or 90% of the breakpoint. And we want to know if we back off load and then increase the load again, is it stable? Is it resilient? A lot of applications are plagued with memory leaks and other antipatterns in code development that prevent them from recovering, and you have to restart the process. Soak tests offer the opportunity to uncover a lot of code deficiencies. Running them will help you lower your cost per transaction in terms of memory and compute resources.
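A soak run can be described as a load profile plus a check for resource growth. The sketch below assumes a hold at 90% of the breakpoint, a back-off to 50%, and a return to 90%; those fractions, the phase names, and the helper names are illustrative. Memory growth is estimated with an ordinary least-squares slope over sampled process memory.

```python
from typing import List, Tuple


def soak_profile(breakpoint_rps: float,
                 hold_fraction: float = 0.9,
                 back_off_fraction: float = 0.5) -> List[Tuple[str, float]]:
    """Target request rates for each phase of a hold / back-off / recover soak."""
    hold = breakpoint_rps * hold_fraction
    return [
        ("hold", hold),
        ("back_off", breakpoint_rps * back_off_fraction),
        ("recover", hold),
    ]


def growth_per_hour(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of (elapsed_hours, memory_mb) samples.

    A slope that stays clearly positive while load is constant is a hint
    (not proof) of a leak.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t
```

Comparing the slope during the "hold" phase with the slope during "recover" gives a quick signal of whether the service returns to a steady state after the back-off or keeps accumulating memory.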


Every time you make a change, whether you are tweaking a configuration or refactoring the code, you start a new baseline, you find the breakpoint, identify the scaling factors, and validate behavior with soak tests. In this way you progressively fine-tune your code.

If you think about continuous performance testing, what we are essentially striving for is Six Sigma, the defect-eliminating methodology that drives toward six standard deviations between the mean and the nearest specification limit. We want to run tests frequently and continuously enough that variations in underlying dependencies, including software, hardware, virtualization, storage, latency, and network, are distilled away until you’re left with statistical deviations that are relevant and baselines that you can really rely on.
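To make the statistical idea concrete, the sketch below compares a new run against the history of earlier baseline runs and reports how many standard deviations it sits from their mean. The function names and the three-sigma threshold are assumptions chosen for illustration; this is not a full Six Sigma process.

```python
import statistics
from typing import Sequence


def deviation_from_baseline(history: Sequence[float], new_value: float) -> float:
    """Number of sample standard deviations between new_value and the
    mean of the historical measurements (positive means slower)."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # requires at least two data points
    if stdev == 0:
        raise ValueError("history has no variation; cannot compute a deviation")
    return (new_value - mean) / stdev


def is_regression(history: Sequence[float], new_value: float,
                  threshold: float = 3.0) -> bool:
    """Flag a run only when it is slower than the baseline by more than
    `threshold` standard deviations."""
    return deviation_from_baseline(history, new_value) > threshold
```

The more runs there are in `history`, the more the noise from the underlying environment averages out, and the more trustworthy a flagged deviation becomes.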

In my final post, we’ll cover best practices for using AppDynamics as part of your continuous performance testing initiative. Stay tuned!

Colin Fallwell is part of the AppDynamics Global Services team, which is dedicated to helping enterprises realize the value of business and application performance monitoring. The team’s consultants, architects, and project managers are experts in unlocking the cross-stack intelligence needed to improve business outcomes and increase organizational efficiency.



Colin Fallwell
Colin Fallwell is a Sr. Architect of DevOps and Performance Engineering at AppDynamics, charged with leading AppDynamics integrations to better support enterprises in improving performance and achieving business outcomes. Prior to AppDynamics, Colin held performance architect leadership roles at Intuit and Compuware and co-founded his own startup.
