AppDynamics & Splunk – Better Together

A few months ago I saw an interesting partnership announcement from Foursquare and OpenTable. Users can now make OpenTable reservations at participating restaurants from directly within the Foursquare mobile app. My first thought was, “What the hell took you guys so long?” That integration makes sense on so many levels, I’m surprised it hadn’t already been done.

So when AppDynamics recently announced a partnership with Splunk, I viewed that as another no-brainer.  Two companies with complementary solutions making it easier for customers to use their products together – makes sense right?  It does to me, and I’m not alone.

I’ve been demoing a prototype of the integration for a few months now at different events across the country, and at the conclusion of each walk-through I’d get some variation of the same question, “How do I get my hands on this?”  Well, I’m glad to say the wait is over – the integration is available today as an App download on Splunkbase.  You’ll need a Splunk and AppDynamics license to get started – if you don’t already have one, you can sign up for free trials of Splunk and AppDynamics online.

The Top 5 Advantages of SaaS-based Application Performance Management

Software-as-a-Service (SaaS) has seen a lot of success and adoption in the past five years, though less so in application performance management (APM) than in other markets. With Cloud computing gaining momentum, you’re likely to see SaaS APM adoption increase significantly as more applications are deployed to the Cloud. Gartner also recently made SaaS a mandatory requirement for APM vendors to be included in their 2012 APM Magic Quadrant, so SaaS-based APM is definitely becoming hot right now!

Here are the top 5 advantages that SaaS-based APM can offer:

1. TIME-TO-VALUE

SaaS-based APM can be deployed within your organization in the time it takes you to read this article. Think about that for a second – you get to experience the full benefits of APM in just a few minutes with no interaction from sales people or technical consultants. All you need to do is sign up for an account, take a free trial, and evaluate whether APM can meet your needs or solve your problems.

Many cloud providers are now actively partnering with APM vendors to embed agents within the servers they provision for customer applications. I personally know of a company that solved a 6 month production issue within an hour of deploying SaaS-based APM. How about that for ROI and time to value!

2. COST – LICENSES, MAINTENANCE, ADMINISTRATION, HARDWARE

Simply put, subscription-based licenses are cheaper, more flexible and less risky than perpetual licenses. Annual maintenance is included in the subscription, as is the cost of managing and supporting the APM infrastructure required to monitor your applications. You don’t need to buy hardware to run your APM management server, and you don’t need to pay someone to manage it either – you simply deploy your agents and you’re all done. There’s now no need to sign up to a multi-million dollar 3 year APM ELA with a vendor; rather, you can pay as you go. If the APM software rocks, you renew your subscription. If the APM software sucks, you go elsewhere.

3. EASE OF USE

When a customer signs up for a SaaS account and evaluates APM for the first time, there is no pre-sales or technical consultant sitting next to them to configure or demo the solution. The experience from account registration to application monitoring is a journey taken alone by the customer.

First impressions are everything with SaaS. Therefore, the learning curve of APM in this context must be faster and easier, so the APM solution can sell itself to the customer.

SaaS-based APM solutions are also much younger than traditional on-premise software, meaning the technology, UI design principles, and concepts applied are more advanced and more interactive for the user. Try comparing the UI of an iPhone with a Nokia phone from 5 years ago and you’ll see my point.

First generation APM solutions were typically written for developers by developers. Today the value of APM touches many different user skill sets. It is therefore no surprise that SaaS-based APM can appeal to and be adopted by development, operations and business users.

4. MIGRATING TO THE LATEST RELEASE

When an APM vendor announces a new release of its software with lots of cool features, it’s normally down to the customers themselves to migrate to the new release. If things go well, they might spend several days or perhaps a few weeks performing the migration. If things go badly, they might end up spending several weeks working hand in hand with the vendor to complete the migration.

With SaaS-based APM, the vendors themselves are responsible for the migration. Customers simply log in and they get the latest version and features automatically. They get to harness APM innovation as soon as it’s ready, rather than having to wait weeks or months to find the time to migrate by themselves. If anything goes wrong, then it’s the vendor who spends the time and money to fix it, rather than the customer.

Customers today will typically upgrade their APM software once a year because of the time and effort. With SaaS-based APM, they can receive multiple upgrades and always be on the latest version.

5. SCALABILITY

Enterprises and Cloud providers can manage lots of applications, which can span several thousand servers. It is one thing for a customer to deploy APM across two applications and a hundred servers in their organization. It is another thing to deploy it across fifty applications and a thousand servers.

Scaling APM has never been easy. The more agents you deploy, the more management servers you need to collect, process, and manage the data. How quickly can you purchase, provision, and maintain the APM management infrastructure when you’ve got hundreds of applications you want to monitor?

With SaaS-based APM, you let the vendor take care of that for you. I know of a SaaS-based APM user that monitors over 6,000 servers in their organization. Compare that with the largest APM on-premise deployment you know of and you can see why SaaS-based APM is a better scalability option.

So there you have it–five compelling reasons why you should consider SaaS-based APM in your organization. SaaS-based APM isn’t for everyone, though. I typically see less adoption in financial services customers where data privacy and security controls are much tighter.

Appman.

Gartner positions AppDynamics as a Leader in 2012 APM Magic Quadrant

Application Performance Monitoring (APM) has been my life and world for almost a decade. I used APM as a developer, sold it as a sales engineer, built it as a product manager and now I’m evangelizing it as a superhero. In that time, I’ve seen APM evolve from being a pure JavaEE monitoring tool in 2002 that a few developers might use, to a full blown IT monitoring platform in 2012 that aligns development, operations and the business.

Today, the APM market has advanced tenfold, with help from analysts like Gartner, who research APM and take literally hundreds of inquiry calls a year from buyers. As industry and technology trends such as SOA, Agile, web 2.0, cloud computing, devops and big data evolve, so do the market requirements for APM. For APM to deliver the promised benefits, it must enable users to monitor and manage modern applications. If modern buyers commonly require X, Y and Z from APM, then APM vendors must offer X, Y and Z to be considered relevant in the market and to be recognized in analyst research and reports such as the Gartner Magic Quadrant.

For example, let’s take a look at the inclusion criteria from 2012 for a vendor to be included in the Gartner Application Performance Monitoring Magic Quadrant (and if you’d like to get a complimentary copy, be our guest):

  • The vendor’s APM product must include all five dimensions of APM, including application runtime; application architecture discovery and modeling; deep-dive monitoring of one or more key application component types (e.g., database, application server); user-defined transaction profiling; and analytics applied to metric aggregation, trending and pattern discovery techniques.
  • The APM product must provide compiled Java or .NET code instrumentation in a production environment.
  • The vendor should have at least 50 customers that use its APM products actively in a production environment.
  • The APM offering must include part of or the entire solution as a service. This includes managed service provider hosting, regardless of other commercial arrangements, or SaaS delivery through its own distribution channels.
  • Total revenue (including new licenses, updates, maintenance, subscriptions, SaaS, hosting and technical support) must have exceeded $5 million in 2011.
  • Customer references must be located in at least three of the following geographic locations: North America, South America, EMEA, the Asia/Pacific region and/or Japan.
  • The vendor references must monitor more than 200 production application server instances in a production environment.

Raising the APM bar:

The rationale for Gartner’s 2012 APM MQ inclusion criteria is available here. A vendor must provide a broad set of APM functionality, supporting all five dimensions of APM rather than just a few.

Other inclusion criteria I liked from above were that APM vendors must provide compiled Java or .NET code instrumentation in a production environment, that the offering must include part or all of the solution as a service, and that vendor references must now monitor over 200 application server instances in a production environment. These items pretty much hit the sweet spots of AppDynamics, in that we monitor some of the largest production Java and .NET applications in the world, and we offer all 5 dimensions of APM in a single product, which can be deployed either on-premise or via SaaS. Our largest Java deployment is over 6,000 nodes and our largest .NET deployment is now over 5,000 nodes – this is how easy our APM solution is to deploy and scale.

The adoption of public cloud, combined with the fact that APM buyers are looking to simplify their APM purchases, implementations and maintenance, means that AppDynamics is well positioned to capitalize on these opportunities.

Love or Hate the Gartner Magic Quadrant, every vendor wants to be part of it, because everyone wants to be known as a leader in their field. To do this, vendors must meet or exceed Gartner’s inclusion criteria as well as their very detailed requirements matrix, which puts pressure on each vendor to constantly innovate, execute and demonstrate a compelling vision.

AppDynamics named a Leader in 2012:

What I’ve witnessed at AppDynamics since I joined back in 2011 has been nothing short of amazing. We’ve kicked a lot of ass in the last year and have had a lot of fun doing it. You could say AppDynamics being positioned as a leader in the 2012 MQ was perhaps the recognition we deserved for breaking the rules of traditional APM. We believe our MQ position represents a clear testament of our technology, tremendous customer success and disruption in the marketplace. We’re enormously proud and privileged at AppDynamics to be recognized as a leader, but we know our job isn’t done yet. We want to make APM easy to deploy, easy to use and affordable for everyone. If we do this, there’ll be more organizations in the world leveraging the benefits of APM than ever before, which translates to faster applications for everyone. Not a bad thing at all.

You can sign up for a free 30-day trial of AppDynamics Pro right here, and see for yourself why we’ve become a leader in just two years.

App Man.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

How Monitoring Analytics can make DevOps more Agile

The word “analytics” is an interesting and often abused term in the world of application monitoring. For the sake of correctness, I’m going to reference Wikipedia in how I define analytics:

Analytics is the discovery and communication of meaningful patterns in data.

Simply put, analytics should make IT’s life easier. Analytics should point out the bleeding obvious from all the monitoring data available, and guide IT so they can effectively manage the performance and availability of their application(s). Think of analytics as “doing the hard work” or “making sense” of the data being collected, so IT doesn’t have to spend hours figuring out for themselves what is being impacted and why.

Discovery
This is about how effectively a monitoring solution can self-learn the environment it’s deployed in, so it’s able to baseline what is normal and abnormal for the environment. This is really important as every application and business transaction is different. A key reason why many monitoring solutions fail today is that they rely on users to manually define what is normal and abnormal using static or simplistic global thresholds. The classic “alert me if server CPU > 90%” and “alert me if response times are > 2 seconds,” both of which normally result in a full inbox (which everyone loves) or an alert storm for IT to manage.
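To make the difference concrete, here is a minimal sketch that compares a static global threshold with a baseline learned from each transaction's own history. The function names and sample values are hypothetical, and the "mean plus k standard deviations" rule is only one simple way to express a baseline, not a description of how any particular product computes it.

```python
from statistics import mean, stdev

def static_alert(value_ms, threshold_ms=2000):
    """Classic fixed rule: alert when a response time exceeds a global limit."""
    return value_ms > threshold_ms

def baseline_alert(value_ms, history_ms, k=2.0):
    """Alert when a value exceeds the learned mean by more than k standard deviations."""
    mu = mean(history_ms)
    sigma = stdev(history_ms)
    return value_ms > mu + k * sigma

# Hypothetical history: a payment call that normally takes ~7 s, a search call ~150 ms
payment_history = [6900, 7100, 7000, 6950, 7050]
search_history = [140, 150, 160, 145, 155]

print(static_alert(7000))                     # True: fires on every normal payment
print(baseline_alert(7000, payment_history))  # False: 7 s is normal for this transaction
print(static_alert(1500))                     # False: misses a search that is 10x slower
print(baseline_alert(1500, search_history))   # True: flagged relative to its own baseline
```

The static rule is noisy for the slow-by-design transaction and blind for the fast one, which is the alert storm problem described above.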

Communication
The communication bit of analytics is just as important as the discovery bit. How well can IT interpret and understand what the monitoring solution is telling them? Is the data shown actionable–or does it require manual analysis, knowledge or expertise to arrive at a conclusion? Does the user have to look for problems on their own or does the monitoring solution present problems by itself? A monitoring solution should provide answers rather than questions.

One thing we did at AppDynamics was make analytics central to our product architecture. We’re about delivering maximum visibility through minimal effort, which means our product has to do the hard work for our users. Our customers today are solving issues in minutes versus days thanks to the way we collect, analyze and present monitoring data. If your applications are agile, complex, distributed and virtual then you probably don’t want to spend time telling a monitoring solution what is normal, abnormal, relevant or interesting. Let’s take a look at a few ways AppDynamics Pro is leveraging analytics:

Seeing The Big Picture
Seeing the bigger picture of application performance allows IT to quickly prioritize whether a problem is impacting an entire application or just a few users or transactions. For example, in the screenshot to the right we can see that in the last day the application processed 19.2 million business transactions (user requests), of which 0.1% experienced an error. 0.4% of transactions were classified as slow (> 2 SD), 0.3% were classified as very slow (> 3 SD) and 94 transactions stalled. The interesting thing here is that AppDynamics used analytics to automatically discover, learn and baseline what normal performance is for the application. No static, global or user defined thresholds were used – the performance baselines are dynamic and relative to each type of business transaction and user request. So if a credit card payment transaction normally takes 7 seconds, then this shouldn’t be classified as slow relative to other transactions that may only take 1 or 2 seconds.

The big picture here is that application performance generally looks OK, with 99.3% of business transactions having a normal end user experience with an average response time of 123 milliseconds. However, if you look at the data shown, 0.7% of user requests were either slow or very slow, which is almost 140,000 transactions. This is not good! The application in this example is an e-commerce website, so it’s important we understand exactly what business transactions were impacted out of those 140,000 that were classified as slow or very slow. For example, a slow search transaction isn’t the same as a slow checkout or order transaction – different transactions, different business impact.
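The arithmetic behind that figure is easy to check. The sketch below is just a back-of-the-envelope estimate using the rounded percentages quoted above. The variable and function names are mine, and because the on-screen percentages are rounded to one decimal place, the true counts may be somewhat higher or lower than the estimate.

```python
TOTAL = 19_200_000  # business transactions processed in the last day

# Percentages as quoted above, in tenths of a percent to keep the math exact
SHARE_TENTHS = {"error": 1, "slow": 4, "very_slow": 3}

def estimated_count(category, total=TOTAL):
    """Estimate how many transactions fall in a category from its rounded share."""
    return total * SHARE_TENTHS[category] // 1000

slow_or_very_slow = estimated_count("slow") + estimated_count("very_slow")
print(estimated_count("slow"))       # 76800
print(estimated_count("very_slow"))  # 57600
print(slow_or_very_slow)             # 134400
```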

Understanding the real Business Impact
The below screenshot shows business transaction health for the e-commerce application sorted by number of very slow requests. Analytics is used in this view by AppDynamics so it can automatically classify and present to the user which business transactions are erroneous, slow, very slow and stalling relative to their individual performance baseline (which is self-learned). At a quick glance, you can see two business transactions–“Order Calculate” and “OrderItemDisplayView”–are breaching their performance baseline.

This information helps IT determine the true business impact of a performance issue so they can prioritize where and what to troubleshoot. You can also see that the “Order Calculate” transaction had 15,717 errors. Clicking on this number would reveal the stack traces of those errors, thus allowing the APM user to easily find the root cause. In addition, we can see the average response time of the “Order Calculate” transaction was 576 milliseconds and the maximum response time was just over 64 seconds, along with 10,393 very slow requests. If AppDynamics didn’t show how many requests were erroneous, slow or very slow, then the user could spend hours figuring out the true business impact of such an incident. Let’s take a look at those very slow requests by clicking on the 10,393 link in the user interface.

Seeing individual slow user business transactions
As you can probably imagine, using average response times to troubleshoot business impact is like putting a blindfold over your eyes. If your end users are experiencing slow transactions, then you need to see those transactions to effectively troubleshoot them. For example, AppDynamics uses real-time analytics to detect when business transactions breach their performance baseline, so it’s able to collect a complete blueprint of how those transactions executed across and inside the application infrastructure. This enables IT to identify root cause rapidly.

In the screenshot above you can see all “OrderCalculate” transactions have been sorted in descending order by response time, thus making it really easy for the user to drill into any of the slow user requests. You can also see looking at the summary column that AppDynamics continuously monitors the response time of business transactions using moving averages and standard deviations to identify real business impact. Given the results our customers are seeing, we’d say this is a pretty proven way to troubleshoot business impact and application performance. Let’s drill into one of those slow transactions…
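As a rough illustration of the moving average and standard deviation idea, here is a minimal sketch that keeps a sliding window of recent response times for one business transaction and labels each new request against it, using the same > 2 SD and > 3 SD cut-offs mentioned earlier. The `RollingBaseline` class, its window size and the sample values are all hypothetical. A real implementation would likely also account for time of day and keep outliers from polluting the baseline.

```python
from collections import deque
from statistics import mean, pstdev

class RollingBaseline:
    """Sliding-window baseline for one business transaction (illustrative only)."""

    def __init__(self, window=100):
        self.samples = deque(maxlen=window)

    def classify(self, response_ms):
        """Label a request relative to the moving average, then learn from it."""
        label = "normal"
        if len(self.samples) >= 2:
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            if sigma > 0:
                deviations = (response_ms - mu) / sigma
                if deviations > 3:
                    label = "very slow"
                elif deviations > 2:
                    label = "slow"
        self.samples.append(response_ms)
        return label

baseline = RollingBaseline(window=10)
for ms in [480, 520, 500, 510, 490, 535, 600]:
    print(ms, baseline.classify(ms))
```

With these sample values the first five requests come back as normal, 535 ms is labeled slow and 600 ms is labeled very slow, even though none of them would trip a "2 seconds" global threshold.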

Visualizing the flow of a slow transaction
Sometimes a picture says a thousand words, and that’s exactly what visualizing the flow of a business transaction can do for IT. IT shouldn’t have to look through pages of metrics, or GBs of log files to correlate and guess why a transaction may be slow. AppDynamics does all that for you! Look at the screenshot below that shows the flow of an “OrderCalculate” transaction–which takes 63 seconds to execute across 3 different application tiers. You can see the majority of time spent is calling the DB2 database and an external 3rd party HTTP web service. Let’s drill down to see what is causing that high amount of latency.

Automating Root Cause Analysis
Finding the root cause of a slow transaction isn’t trivial, because a single transaction can invoke several thousand lines of code–kind of like finding a needle in a haystack. Call graphs of transaction code execution are useful, but it’s much faster and easier if the user can shortcut to hotspots. AppDynamics uses analytics to do just that by presenting code hotspots to the user automatically so they can pinpoint the root cause in seconds. You can see in the below screenshot that almost 30 seconds (18.8+6.4+4.1+0.6) was spent in a web service call “calculateTaxes” (which was called 4 times) with another 13 seconds being spent in a single JDBC database call (user can click to view SQL query). Root cause analysis with analytics can be a powerful asset for any IT team.
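To show what "shortcutting to hotspots" means in practice, here is a small sketch that rolls up a flattened call graph by call name and sorts by total time. The timings are the ones quoted above; the data structure, the `hotspots` function and the `JDBC:executeQuery` label are assumptions made for illustration, not the product's actual data model.

```python
from collections import defaultdict

# Flattened call graph for one slow transaction: (call, seconds).
# Timings are taken from the text; the labels are illustrative.
calls = [
    ("WebService:calculateTaxes", 18.8),
    ("WebService:calculateTaxes", 6.4),
    ("WebService:calculateTaxes", 4.1),
    ("WebService:calculateTaxes", 0.6),
    ("JDBC:executeQuery", 13.0),
]

def hotspots(calls):
    """Group calls by name and return (name, call count, total seconds), slowest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for name, seconds in calls:
        totals[name][0] += 1
        totals[name][1] += seconds
    rows = [(name, count, round(total, 1)) for name, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)

for name, count, total in hotspots(calls):
    print(f"{name}: {count} call(s), {total} s")
```

The web service call comes out on top with 4 calls and 29.9 seconds, followed by the single 13 second JDBC call, which matches the "almost 30 seconds" reading of the screenshot.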

Verifying Server Resource or Capacity
It’s true that application performance can be impacted by server capacity or resource constraints. When a transaction or user request is slow, it’s always a good idea to check what impact OS and JVM resource is having. For example, was the server maxed out on CPU? Was Garbage Collection (GC) running? If so, how long did GC run for? Was the database connection pool maxed out? All these questions require a user to manually look at different OS and JVM metrics to understand whether resource spikes or exhaustion was occurring during the slowdown. This is pretty much what most sysadmins do today to triage and troubleshoot servers that underpin a slow running application. Wouldn’t it be great if a monitoring solution could answer these questions in a single view, showing IT which OS and JVM resource was deviating from its baseline during the slowdown? With analytics it can.

AppDynamics introduced a new set of analytics in version 3.4.2 called “Node Problems” to do just this. The above screenshot shows this view whereby node metrics (e.g. OS, JVM and JMX metrics) are analyzed to determine if any were breaching their baseline and contributing to the slow performance of the “OrderCalculate” transaction. The screenshot above shows that % CPU idle, % memory used and MB memory used have deviated slightly from their baseline (denoted by blue dotted lines in the charts). Server capacity on this occasion was therefore not a contributing factor to the slow application performance. Hardware metrics that did not deviate from their baseline are not shown, thus reducing the amount of data and noise the user has to look at in this view.
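The filtering idea in this view can be sketched in a few lines: compare each node metric with its own baseline and keep only the ones that deviate. This is my own simplified illustration, not the actual "Node Problems" logic. The metric names, baseline values and the two standard deviation cut-off are all hypothetical.

```python
def deviating_metrics(current, baselines, k=2.0):
    """Return only the metrics whose current value is more than k SDs from baseline.

    current:   {metric name: value observed during the slowdown}
    baselines: {metric name: (baseline mean, baseline standard deviation)}
    """
    flagged = {}
    for name, value in current.items():
        mu, sigma = baselines[name]
        if sigma > 0 and abs(value - mu) > k * sigma:
            flagged[name] = round((value - mu) / sigma, 2)
    return flagged

# Hypothetical node metrics sampled during the slow transaction
current = {"cpu_idle_pct": 55.0, "memory_used_pct": 71.0, "gc_time_ms": 120.0}
baselines = {
    "cpu_idle_pct": (70.0, 5.0),
    "memory_used_pct": (60.0, 4.0),
    "gc_time_ms": (110.0, 30.0),
}

print(deviating_metrics(current, baselines))
# {'cpu_idle_pct': -3.0, 'memory_used_pct': 2.75}
```

Garbage collection time stays within its normal range in this example, so it is left out of the result, which is the noise reduction described above.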

Analytics makes IT more Agile
If a monitoring solution is able to discover abnormal patterns and communicate these effectively to a user, then this significantly reduces the amount of time IT has to spend managing application performance, thus making IT more agile and productive. Without analytics, IT can become a slave to data overload, big data, alert storming and silos of information that must be manually stitched together and analyzed by teams of people. In today’s world, “manually” isn’t cool or clever. If you want to be agile then you need to automate the way you manage application performance, or you’ll end up with the monitoring solution managing you.

If your current monitoring solution requires you to manually tell it what to monitor, then maybe you should be evaluating a next generation monitoring solution like AppDynamics.

App Man.

AppDynamics recognized by Forrester in APM market overview

Interest in the Application Performance Management (APM) category is very high right now.   To stay one step ahead of their clients, the Industry Analysts who cover the category and write research to advise their clients have been very busy.  In December alone, there were six different analyst reports being researched by the major analyst firms.

Forrester published the results of their research in the 2nd week of December with the report: Market Overview: Application Performance Management, Q4 2011.  Forrester clients can access the report at www.forrester.com. In this report, Forrester provides very sound advice on why APM exists and what it should do for clients. Forrester has created their own “Reference Model” for APM and evaluated the vendor landscape against those criteria.

Raison d’etre for APM

Forrester VP and Principal Analyst, JP Garbani, gives readers very pragmatic advice on the raison d’etre for APM.  Simply put, APM’s job is to:

1) Alert IT to application performance and availability issues before a full-scale outage occurs

2) Isolate or pinpoint the problem source

3) Provide deep-diagnostics to enable IT to determine the root cause

For several years now, JP Garbani has been on the forefront of proclaiming that modern APM solutions should enable IT organizations to manage apps not by gauging the health of their servers or servlets, but instead by assessing what the customer or end-user cares about most – whether their Business Transaction completes quickly and doesn’t make them wait. He states that this has become even more critical as applications have gotten more distributed and complex.

Just how complex can a Login Transaction be? Answer: Very!

People in our industry always talk about IT complexity and cost. Cost is pretty easy to calculate, because IT budgets are allocated and audited every year. Complexity is very different–we know it exists, but we can’t really see or measure it. Complexity is often when our brain tries to understand something and stalls in the process, trying to make sense of information that has never been seen before.

Well, this happened to a few of us in AppDynamics last week. A customer was kind enough to share how a single login business transaction flowed across their entire infrastructure. You might be thinking: “How can a login transaction be complex? That’s just a simple call to an LDAP or SiteMinder tier”–which is pretty much what we all thought it was. However, the screenshot that graced us was one of shock, beauty and amazement. In fact, I’m looking at it right now before I scrub the customer details, and I’m still thinking “Hmmmm, this is bonkers.”

Without delaying further, here is that very screenshot showing the Login Business Transaction:

Scary huh? What you see is the flow and timing of a Customer Login business transaction as it executes across a well governed, regulated, SOA environment consisting of many services (denoted by the Java Tiers). The Customer Login transaction begins at the Java node to the right marked “START” and propagates across the entire SOA environment using a combination of sync/async JMS messages, HTTP and RMI communication to notify other Services that a customer is now active and logged in. You can also see many services writing to a database as a result of this transaction. These invocations are simply auditing the customer login to satisfy the legal regulations that this organization has to comply with. So if you ever wonder what impact Governance and Legislation has on IT, this is a perfect example of the complexity storm it creates. What’s interesting is that the Logout business transaction for this application was just as complex!

The screenshot above unfortunately reflects the enormous complexity that many IT departments have to deal with every day, especially when a user complains that their business transaction is slow. The problem for 95% of IT departments is they don’t have this type of visibility in production. They can feel pain, but they can’t see it. A slow business transaction may take 25 seconds to complete and touch many infrastructure tiers along the way. Unless IT sees this end to end journey they’ll always struggle to troubleshoot and manage it.

The good news is you’re 30 minutes away from getting this visibility in production by evaluating a next generation application monitoring solution like AppDynamics Pro. AppDynamics will auto-discover your business transactions, map their specific flows across your infrastructure, and give you a latency breakdown across and inside every tier the business transaction touches.
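If you’re wondering what a latency breakdown across tiers boils down to, here is a deliberately simple sketch. It assumes you already have one timing per hop of a single business transaction; the tier names, timings and the `latency_breakdown` function are hypothetical and only illustrate the idea.

```python
def latency_breakdown(segments):
    """Return each tier's share of total transaction time in percent, largest first.

    segments: list of (tier name, milliseconds spent in that tier) for one transaction
    """
    total = sum(ms for _, ms in segments)
    per_tier = {}
    for tier, ms in segments:
        per_tier[tier] = per_tier.get(tier, 0) + ms
    ranked = sorted(per_tier.items(), key=lambda item: item[1], reverse=True)
    return {tier: round(100 * ms / total, 1) for tier, ms in ranked}

# Hypothetical hops of one login transaction
segments = [
    ("web", 500),
    ("login-service", 1500),
    ("audit-db", 2000),
    ("login-service", 1000),
]

print(latency_breakdown(segments))
# {'login-service': 50.0, 'audit-db': 40.0, 'web': 10.0}
```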

To manage and master IT complexity you have to visualize and see it.  Seeing how your business actually runs across IT is completely different to guessing how your business runs across IT. Next time a user complains that their business transaction is slow, what will you do? Bury your head in a log file, or visualize how that business transaction executed using an application performance monitoring solution like AppDynamics?

Isn’t it about time you mapped your app?

App Man.

Agent Intelligence

How intelligent is your monitoring agent?

To keep its impact on application performance minimal, the agent should not do too much processing locally and should use the smallest CPU and memory footprint possible. On the other hand, offloading some processing to the agent results in less network traffic and better scalability for the monitoring management server.
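One way to picture that trade-off is an agent that aggregates raw timings locally and sends only a small summary per metric on each reporting interval. The sketch below is a hypothetical illustration: the `AgentBuffer` class and the metric name are made up, and the choice of count, sum and max is just one cheap summary that keeps the agent’s footprint small.

```python
from collections import defaultdict

class AgentBuffer:
    """Collects raw timings locally and ships one summary per metric per flush."""

    def __init__(self):
        self._stats = defaultdict(lambda: {"count": 0, "sum": 0.0, "max": 0.0})

    def record(self, metric, value_ms):
        """Constant-time update: no raw samples are kept in memory."""
        stats = self._stats[metric]
        stats["count"] += 1
        stats["sum"] += value_ms
        stats["max"] = max(stats["max"], value_ms)

    def flush(self):
        """Return the payload to send to the management server and reset the buffer."""
        payload = {metric: dict(stats) for metric, stats in self._stats.items()}
        self._stats.clear()
        return payload

agent = AgentBuffer()
for ms in [120.0, 95.0, 310.0]:
    agent.record("checkout.response_ms", ms)

print(agent.flush())  # one summary instead of three raw data points
```

Three measurements leave the agent as a single record, so the network traffic and the server-side work grow with the number of metrics rather than the number of requests.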

Application Performance Management On-Demand

I thought it would be good to start blogging about my experiences with customers just so you get an idea of how important Application Performance Management (APM) has become.

A few weeks back I met with a customer who had issues; the expression on their face said it all. It started with an apology that several people couldn’t make our meeting. Why? Because they were investigating a production outage. You might think I’ve just made that up, but I can assure you this was real, and it’s a frequent event that I’ve witnessed many a time. It can be especially annoying when you’ve travelled many miles to chat with a customer expecting to have a productive meeting and then the alarm bells ring. However, an outage in this scenario just validates the reason why you’re there in the first place.

Gartner’s APM Magic Quadrant: Application Mapping and Transaction Profiling Explained

Gartner recently released their latest magic quadrant for Application Performance Monitoring (APM) and in this report mentioned five key dimensions, two of which were Application Mapping and Transaction Profiling. These two dimensions are critical for users to identify performance bottlenecks in distributed applications, whose architecture design is typically based around SOA or Cloud concepts.

The point we’d like to emphasize in this post is this: To quickly find bottlenecks in distributed or SOA architectures, these two dimensions must be visible simultaneously to the user (troubleshooter).  Ok, get ready, we’re going to say something very “unvendor” – these two dimensions should actually be “features” and not separate “products”.   The only two APM products that combine these views in a single product today are AppDynamics and dynaTrace.

Unfortunately, the rest of the APM solutions don’t work that way. Most of the APM vendors who claimed to support these two dimensions for the MQ require customers to buy two or more distinct products – one of them usually a re-branded CMDB tool. The downside is that it is nowhere near as efficient, especially if the troubleshooter has to log into 2-3 different products and has to try to stitch together this view in their own mind.

Gartner publishes 2011 Magic Quadrant for Application Performance Monitoring

On Tuesday, Gartner announced this year’s Magic Quadrant for Application Performance Monitoring (APM).   I’ll make a few observations from reading the MQ and then suggest 3 additional criteria that APM buyers should consider to make informed buying decisions.

APM demand is strong

The research report started with an analysis of the APM market growth at 15% year-over-year and $2 billion in total market spend. These facts reflect what we see every day – the market for APM is very strong and benefits from the high growth in web-driven commerce. Web apps just can’t be slow.

One key APM growth driver is that modern applications have become more difficult to monitor – with more moving parts and a higher rate of change. Gartner summarizes this nicely in their market overview:

“Unfortunately, at just the moment when executives have become keen about imposing an application-centric view of the world on IT operations, applications have become far more difficult to monitor; in general, architectures have become more modular, redundant, distributed and dynamic, often laying down the particular twists and turns that a code execution path could take at the latest possible moment.”