TAG | Root Cause Analysis
Finding the Root Cause of Application Performance Issues in Production
Posted by App Man | May, 04, 2012 | In The Usual Suspects
The most enjoyable part of my job at AppDynamics is to witness and evangelize customer success. What’s slightly strange is that for this to happen, an application has to slow down or crash.
It’s a bittersweet feeling when End Users, Operations, Developers and many Businesses suffer application performance pain. Outages cost the business money, but sometimes they cost people their jobs–which is truly unfortunate. However, when people solve performance issues, they become overnight heroes with a great sense of achievement, pride, and obviously relief.
To explain the complexity of managing application performance, imagine your application is 100 haystacks that represent tiers, and somewhere a needle is hurting your end user experience. It’s your job to find the needle as quickly as possible! The problem is that each haystack holds over half a million pieces of hay, and each piece represents a line of code in your application. It’s therefore no surprise that organizations can take days or weeks to find the root cause of performance issues in large, complex, distributed production environments.
End User Experience Monitoring, Application Mapping and Transaction Profiling will help you identify unhappy users, slow business transactions, and problematic haystacks (tiers) in your application, but they won’t find needles. To do this, you’ll need X-Ray visibility inside the haystacks to see which pieces of hay (lines of code) are holding the needle (root cause) that is hurting your end users. This X-Ray visibility is known as “Deep Diagnostics” in application monitoring terms, and it represents the difference between isolating performance issues and resolving them.
For example, AppDynamics has great End User Monitoring, Business Transaction Monitoring, Application Flow Maps and very cool analytics all integrated into a single product. They all look and sound great (honestly they do), but they only identify and isolate performance issues to an application tier. This is largely what Business Transaction Management (BTM) and Network Performance Management (NPM) solutions do today. They’ll tell you which business transaction slows down and where, but they won’t tell you the root cause you need to resolve the issue.
Why Deep Diagnostics for Production Monitoring Matters
A key reason AppDynamics has become very successful in just a few years is that our Deep Diagnostics, behavioral learning, and analytics technology is 18 months ahead of the nearest vendor. A bold claim? Perhaps, but it’s backed up by bold customer case studies such as Edmunds.com and Karavel, who compared us against some of the top vendors in the application performance management (APM) market in 2011. Yes, End User Monitoring, Application Mapping and Transaction Profiling are important–but these capabilities will only help you isolate performance pain, not resolve it.
AppDynamics has the ability to instantly show the complete code execution and timing of slow user requests or business transactions for any Java or .NET application, in production, with incredibly small overhead and no configuration. We basically give customers a metal detector and X-Ray vision to help them find needles in haystacks. Locating the exact line of code responsible for a performance issue means Operations and Developers solve business pain faster, and this is a key reason why AppDynamics technology is disrupting the market.
Below is a small collection of needles that customers found using AppDynamics in production. The simple fact is that complete code visibility allows customers to troubleshoot in minutes as opposed to days and weeks. Monitoring with blind spots and configuring instrumentation are a thing of the past with AppDynamics.
Needle #1 – Slow SQL Statement
Industry: Education
Pain: Key Business Transaction with 5 sec response times
Root Cause: Slow JDBC query with full-table scan
Needle #2 – Slice of Death in Cassandra
Industry: SaaS Provider
Pain: Key Business Transaction with 2.5 sec response times
Root Cause: Slow Thrift query in Cassandra
Needle #3 – Slow & Chatty Web Service Calls
Industry: Media
Pain: Several Business Transactions with 2.5 min response times
Root Cause: Excessive Web Service Invocation (5+ per trx)
Needle #4 – Extreme XML Processing
Industry: Retail/E-Commerce
Pain: Key Business Transaction with 17 sec response times
Root Cause: XML serialization over the wire
Needle #5 – Mail Server Connectivity
Industry: Retail/E-Commerce
Pain: Key Business Transaction with 20 sec response times
Root Cause: Slow Mail Server Connectivity
Needle #6 – Slow ResultSet Iteration
Industry: Retail/E-Commerce
Pain: Several Business Transactions with 30+ sec response times
Root Cause: Querying too much data
Needle #7 – Slow Security 3rd Party Framework
Industry: Education
Pain: All Business Transactions with > 3 sec response times
Root Cause: Slow 3rd party code
Needle #8 – Excessive SQL Queries
Industry: Education
Pain: Key Business Transactions with 2 min response times
Root Cause: Thousands of SQL queries per transaction
Needle #9 – Commit Happy
Industry: Retail/E-Commerce
Pain: Several Business Transactions with 25+ sec response times
Root Cause: Unnecessary use of commits and transaction management
Needle #10 – Locking under Concurrency
Industry: Retail/E-Commerce
Pain: Several Business Transactions with 5+ sec response times
Root Cause: Non-thread-safe cache forces locking for read/write consistency
Needle #11 – Slow 3rd Party Search Service
Industry: SaaS Provider
Pain: Key Business Transaction with 2+ min response times
Root Cause: Slow 3rd Party code
Needle #12 – Connection Pool Exhaustion
Industry: Financial Services
Pain: Several Business Transactions with 7+ sec response times
Root Cause: DB Connection Pool Exhaustion caused by excessive connection pool invocation & queries
Needle #13 – Excessive Cache Usage
Industry: Retail/E-Commerce
Pain: Several Business Transactions with 50+ sec response times
Root Cause: Cache Sizing & Configuration
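To make one of these needles concrete, here is a minimal sketch of the pattern behind Needle #8 (thousands of SQL queries per transaction). It is not taken from any customer’s code: the tables, the `CountingConnection` wrapper and the function names are all made up for illustration, and it uses Python’s built-in `sqlite3` module rather than JDBC. The point is only that the chatty version and the joined version return the same data, while the number of statements sent to the database differs by orders of magnitude.

```python
import sqlite3


def build_db():
    """Create a small in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE grades (student_id INTEGER, score INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [(i, f"student-{i}") for i in range(1, 501)],
    )
    conn.executemany(
        "INSERT INTO grades VALUES (?, ?)",
        [(i, 50 + i % 50) for i in range(1, 501)],
    )
    return conn


class CountingConnection:
    """Wraps a connection and counts the statements sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def scores_chatty(db):
    """One query for the student list, then one more query per student."""
    result = {}
    students = db.execute("SELECT id, name FROM students").fetchall()
    for student_id, name in students:
        row = db.execute(
            "SELECT score FROM grades WHERE student_id = ?", (student_id,)
        ).fetchone()
        result[name] = row[0]
    return result


def scores_joined(db):
    """The same data fetched with a single JOIN."""
    rows = db.execute(
        "SELECT s.name, g.score FROM students s "
        "JOIN grades g ON g.student_id = s.id"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_db()
    chatty, joined = CountingConnection(conn), CountingConnection(conn)
    assert scores_chatty(chatty) == scores_joined(joined)
    print("chatty statements:", chatty.count)  # 501
    print("joined statements:", joined.count)  # 1
```

With 500 rows the difference is barely noticeable on an in-memory database. Put a network round trip between the application and the database, and the same loop can easily turn into the kind of multi-second or multi-minute response times listed above.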
If you want to manage and troubleshoot application performance in production, you should seriously consider AppDynamics. We’re the fastest growing on-premise and SaaS based APM vendor in the market right now. You can download our free product AppDynamics Lite or take a free 30-day trial of AppDynamics Pro – our commercial product.
Now go find those needles that are hurting your end users!
App Man.
apm, appdynamics, AppDynamics Pro, application monitoring, Application Performance Management, BTM, Business Transaction Management, CA Wily, Compuware, Deep Diagnostics, Dynatrace, End User Monitoring, New Relic, OpNet, OpTier, Production Monitoring, Root Cause Analysis, Transaction Profiling
Travel Company Karavel Boosts Application Performance by 20% with AppDynamics
Posted by App Man | Feb, 10, 2012 | In App Man's X-Ray Competition
The X-Ray competition winner from last quarter was Karavel, an online travel company in France, who kindly documented their success with AppDynamics Pro. Karavel has been using AppDynamics extensively for custom dashboards, proactive alerting, and optimizing slow business transactions in their production environment.
Here is Karavel’s X-Ray case study as they documented it:
apm, APM Case Study, APM Results, appdynamics, application monitoring, Application Performance Tuning, Karavel, Performance Tuning, Root Cause Analysis
Application Performance Management On-Demand
Posted by App Man | Sep, 29, 2011 | In APM Thought Leadership
I thought it would be good to start blogging about my experiences with customers just so you get an idea of how important Application Performance Management (APM) has become.
A few weeks back I met with a customer who had issues; the expression on their face said it all. The meeting started with an apology that several people couldn’t make it. Why? Because they were investigating a production outage. You might think I’ve just made that up, but I can assure you this was real, and it’s something I’ve witnessed many times. It can be especially annoying when you’ve travelled many miles to chat with a customer, expecting a productive meeting, and then the alarm bells ring. However, an outage in this scenario just validates the reason you’re there in the first place.
apm, APM as a Service, APM on demand, appdynamics, AppDynamics Monitoring, Application Performance Management, Application Performance Monitoring, Business Transactions, Production Outage, Root Cause Analysis, Solving Performance Issues
AppDynamics helps Insurance Customer avoid Production Outage
Posted by App Man | Aug, 03, 2011 | In App Man's X-Ray Competition
Last week I published my winning Customer X-Ray of the Quarter, which showed how AppDynamics was able to help a media customer solve a production issue that had plagued their application for over two years. This week I’m posting the runner-up X-Ray entry. This one describes how AppDynamics was able to help an Insurance customer avoid a production outage by spotting a major bottleneck as their application was migrated from dev to pre-production during performance testing. All of the X-Rays you see published in this blog were written by customers, so the stories you read are real, factual, and credible.
apm, appdynamics, Application Performance Management, BTM, Business Transaction Management, Business Transactions, Insurance, Monitoring Production, MTTR, Prevent business impact, Root Cause Analysis, Slow Application, Slow Response Times
AppDynamics helps Media Customer solve Production Slowdown
Posted by App Man | Jul, 25, 2011 | In App Man's X-Ray Competition
Whilst disrupting the APM marketplace is a primary objective for AppDynamics, our ultimate goal is to help customers manage application performance and solve real world problems using our software. Our unique visibility allows customers to fully understand how their business transactions execute across their infrastructure, so when they slow down, our customers know the where, what and why. I reached out to our customers over the last six weeks and asked them to document how our X-Ray visibility has helped them solve real business problems. The best submission each quarter wins a brand new MacBook Air, with runners-up receiving t-shirts.
The winner this quarter came from a Media and Entertainment company in the US. Once again, the X-Ray below was documented by the customer, and it shows how AppDynamics was able to solve a performance issue in production that had hampered their application for over two years!
apm, appdynamics, Application Performance Management, Appman, BTM, Business Transaction Management, Business Transactions, Fixing Performance, Performance Bottleneck, Root Cause Analysis, Slow Website
The Real Overhead of Managing Application Performance
Posted by App Man | May, 23, 2011 | In APM Thought Leadership
Have you ever tried to troubleshoot a production bottleneck or outage? Let me guess: the first place you’ll look is log files, right? And in those log files lie all the answers to your problems? Err, not exactly. Log files are like haystacks; they take up lots of space, and it takes hours to find the precious needles that are causing you pain. Even with tools like “kerplunk,” your troubleshooting success is only as good as the data you can collect, manage and report. You can’t log everything because disk I/O and debug logging are expensive operations, which is why most log files today only contain basic information about what applications are doing in their various infrastructure silos.
For example, if you need to troubleshoot how a slow distributed business transaction executed across your infrastructure, it could take you hours to piece the jigsaw together, and you may never manage it at all. If one piece of your jigsaw is missing, the trail goes cold and you’re left scratching your head. Bottom line, managing application performance with log files is still a long, manual and tedious task with no guarantee of success. You can’t manage with facts if you don’t have all the facts in the first place.
So even with tools that help you parse and index log files, you’re still dependent on the right data being captured and available. Pointing the finger with weak or incomplete evidence just fuels the fire when it comes to figuring out who and what is causing the issue. For example, have you ever tried telling a DBA his database is the issue just because it’s slow?
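To show what piecing the jigsaw together looks like in practice, here is a rough sketch of stitching log lines from several tiers into one trail per business transaction. Everything about it is an assumption made for illustration: the log format (`<timestamp> <tier> txn=<id> <message>`), the tier names and the function names are hypothetical, and real log files rarely carry a transaction ID on every line, which is exactly the problem.

```python
import re
from collections import defaultdict

# Hypothetical log format: "<ISO timestamp> <tier> txn=<id> <message>"
LINE = re.compile(r"^(\S+) (\S+) txn=(\S+) (.*)$")


def stitch(log_lines):
    """Group lines from all tiers by transaction ID, ordered by timestamp."""
    by_txn = defaultdict(list)
    for line in log_lines:
        match = LINE.match(line)
        if match is None:
            continue  # no transaction ID, so this line can't be correlated
        timestamp, tier, txn, message = match.groups()
        by_txn[txn].append((timestamp, tier, message))
    # ISO timestamps sort correctly as plain strings
    return {txn: sorted(events) for txn, events in by_txn.items()}


def missing_tiers(events, expected_tiers):
    """Return the expected tiers that left no trace for this transaction."""
    seen = {tier for _, tier, _ in events}
    return [tier for tier in expected_tiers if tier not in seen]


if __name__ == "__main__":
    web_log = [
        "2011-05-23T10:00:00.100 web txn=42 GET /checkout",
        "2011-05-23T10:00:04.900 web txn=42 200 OK",
    ]
    app_log = [
        "2011-05-23T10:00:00.150 app txn=42 placeOrder start",
        "2011-05-23T10:00:00.200 app starting debug dump",  # no txn id
    ]
    trail = stitch(web_log + app_log)
    for event in trail["42"]:
        print(event)
    print("missing:", missing_tiers(trail["42"], ["web", "app", "db"]))
```

In this made-up example the database tier logged nothing for transaction 42, so the trail goes cold at the app tier. You know the request took almost five seconds, but not where the time went.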
apm, Application Performance Man, BTM, Business Transactions, Log Files, MTTR, Performance Management, Production Monitoring, Root Cause Analysis
App Man’s view on AppDynamics 3.2
Posted by App Man | Apr, 19, 2011 | In APM Thought Leadership, News
Our mission at AppDynamics is to simplify the process of managing complex and dynamic applications. This only happens if we, the APM vendor, create innovative solutions that enable the monitoring of agile, highly distributed and complex applications in production environments. Our solution must be simple enough for users to solve real business problems without being simplistic or shallow. We pride ourselves on doing all the hard work to enable APM deployment so our users can focus on monitoring their applications to solve problems fast.
Agile Operations, apm, AppDynamics 3.2, Application Performance Management, Big Data, Business Transactions, Root Cause Analysis
AppMan Adventures: The Slow Tier
Posted by App Man | Apr, 15, 2011 | In App Man Adventures