Recently I’ve been working with some of my larger customers on ways they can revamp their application performance testing using AppDynamics. During this process, it became evident that performance testing with an APM tool is very different from performance testing without one. Along the way, we uncovered a few best practices that I don’t see very often, even among more sophisticated users of APM. I think these principles apply whether you’re a household name testing a massive web application or a startup just getting off the ground.
Performance testing is a very broad category, so for this post I’ll narrow it down to a few areas:
- Maintaining current performance baselines
- Optimizing performance and increasing throughput of the application
Maintaining Status Quo
Stop me if you’ve heard this one before: after every release build, the performance team steps in and runs its suite of tests. The test harness spits out a couple of numbers, and based on what they’ve seen before, the build gets a thumbs up or down.
That’s great. You know whether your static use case improved or degraded. Now let’s step back. Say this application is made up of over ten application tiers, each with a multitude of potential bottlenecks and knobs to turn to optimize performance. What that process leaves unanswered is: what changed, and how do I fix it?
This came out of a real-life scenario. Starting from a process much like the one I just described, we made a few simple changes that produced quick, dramatic results. First, we installed AppDynamics in the test environment and ran a full test cycle.
From the data collected during that cycle, we created a performance baseline.
That allowed us to monitor and baseline not just average response time, but far more granular metrics: backend performance, JVM health, specific Business Transactions, and so on.
We then set up health rules for key metrics across those baselines. Now, rather than just relying on the test suite’s metrics, we can test for performance deviations in depth.
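To make the idea concrete, here is a minimal sketch of what a health rule amounts to: flag a metric sample that deviates too far from a baseline built during a known-healthy test cycle. The numbers and the 3-sigma threshold are illustrative assumptions on my part, not AppDynamics defaults.

```python
# Hypothetical baseline check: compare a new sample against the mean and
# spread of metrics captured during a prior, healthy test cycle.
from statistics import mean, stdev

def violates_baseline(history, sample, sigmas=3.0):
    """Return True if `sample` exceeds baseline mean + sigmas * stddev."""
    baseline = mean(history)
    spread = stdev(history)
    return sample > baseline + sigmas * spread

# Response times (ms) recorded during the baseline test cycle (made-up data).
history = [102, 98, 110, 105, 99, 101, 107, 103]

print(violates_baseline(history, 104))  # within normal variation
print(violates_baseline(history, 160))  # a clear deviation from baseline
```

In AppDynamics itself, this comparison is configured per metric (response time, errors per minute, JVM heap, etc.) rather than coded by hand, but the logic of "deviation from a learned baseline" is the same.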
Lastly, we automated it. The test suite already ran as part of the CI/CD pipeline, so we attached alerts to each of these health rules; the team is now notified any time build performance degrades and can take action. Alternatively, we could have used these health rules to fail the build automatically, via the REST API or actions from the controller.
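A build-gate step in a pipeline could look something like the sketch below: inspect the health rule violations reported for the test window and fail the build on anything blocking. The payload shape and field names here are illustrative assumptions (a real pipeline would fetch them from the controller's REST API), so treat this as a pattern, not a drop-in script.

```python
# Hypothetical CI/CD gate: decide whether to fail the build based on
# health rule violations fetched earlier in the pipeline.
import sys

def should_fail_build(violations, blocking_severities=("CRITICAL",)):
    """Fail the build if any open violation has a blocking severity."""
    return any(
        v["severity"] in blocking_severities and v["status"] == "OPEN"
        for v in violations
    )

# Example of what a fetched violations payload might contain (made up).
violations = [
    {"name": "BT avg response time", "severity": "WARNING", "status": "OPEN"},
    {"name": "JVM heap usage", "severity": "CRITICAL", "status": "OPEN"},
]

if should_fail_build(violations):
    print("Performance regression detected; failing the build.")
    # sys.exit(1)  # uncomment in an actual pipeline step
```

The useful property of gating on health rules rather than on the harness's raw numbers is that the gate inherits all the granular baselines: a build can fail on heap growth or backend latency even when the headline response time looks fine.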
Once we had reasonable assurance that application performance was stable, that freed up time to go attack performance problems. Most performance teams I see do this; what’s usually missing is an easy way to analyze what’s causing those problems. Using AppDynamics here allowed us to iterate much faster than a traditional test/fix model.
From the same environment, we were able to turn up the heat a little with load. Each time we did this in different scenarios, we identified bottlenecks in the environment: JDBC connections, then heap usage, then JVM parameters, inefficient caching, and so on. During a concentrated effort over the course of two weeks, we found over 20 significant performance improvements and increased the system’s throughput by over 40% on that test suite. These fixes would arguably have been very challenging to make without deep visibility into the application under load.
What it All Means
While these techniques seem basic, I see a lot of test teams hesitant to engage this way, potentially foregoing an opportunity to drive significant value. Hopefully some of these strategies provide new insight. Happy testing!