CAT | APM Thought Leadership

I have yet to meet anyone in Dev or Ops who likes alerts. I’ve also yet to meet anyone who was fast enough to acknowledge an alert and prevent an application from slowing down or crashing. In the real world, alerts just don’t work. Nobody has the time or patience anymore; alerts are truly evil and no one trusts them. The most efficient alert today is an angry end user phone call, because Dev and Ops physically hear and feel the pain of someone suffering :)

Why? There is little or no intelligence in how a monitoring solution determines what is normal or abnormal for application performance. Today, monitoring solutions are only as good as the users that configure them, which is bad news because humans make mistakes, configuration takes time, and time is something many of us have little of.

It’s therefore no surprise to learn that behavioral learning and analytics are becoming key requirements for modern application performance monitoring (APM) solutions. In fact, Will Cappelli from Gartner recently published a report on IT Operational Analytics and pattern-based strategies in the data center. The report covered the role of Complex Event Processing (CEP), behavior learning engines (BLEs) and analytics as a means for monitoring solutions to deliver better intelligence and quality information to Dev and Ops. Rather than just collect, store and report data, monitoring solutions must now learn and make sense of the data they collect, thus enabling them to become smarter and deliver better intelligence back to their users.

Change is constant for applications and infrastructure thanks to agile cycles, so monitoring solutions must also change to adapt and stay relevant. For example, if the response time of a business transaction in an application is 2.5 secs one week and drops to 200ms the week after because of a development fix, then 200ms should become the new performance baseline for that transaction. Otherwise the monitoring solution won’t learn of, or alert on, any performance regression. If the end user experience of a business transaction goes from 2.5 secs to 200ms, then end user expectations change instantly, and users become used to an instant response. Monitoring solutions have to keep up with user expectations, otherwise IT will become blind to the one thing that impacts customer loyalty and experience the most.
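
To make the baseline idea concrete, here is a minimal sketch in Python, assuming a simple rolling window of recent response times and a mean-plus-deviation rule. The `RollingBaseline` class, its `window` and `threshold` parameters, and the 5% floor on the spread are all hypothetical choices for illustration; this is not how any particular APM product learns its baselines.

```python
from collections import deque
from statistics import mean, stdev


class RollingBaseline:
    """Learns what is 'normal' from the most recent response times (illustrative only)."""

    def __init__(self, window=50, threshold=3.0):
        # Only the newest `window` samples are kept, so old behavior ages out.
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def is_abnormal(self, response_ms):
        """Return True if response_ms is far above the current learned baseline."""
        if len(self.samples) < 2:
            return False  # not enough history to judge yet
        baseline = mean(self.samples)
        # Floor the spread at 5% of the baseline (an assumption) so that a very
        # steady transaction does not flag tiny fluctuations.
        spread = max(stdev(self.samples), 0.05 * baseline)
        return response_ms > baseline + self.threshold * spread

    def observe(self, response_ms):
        """Check a sample against the baseline, then add it to the history."""
        abnormal = self.is_abnormal(response_ms)
        self.samples.append(response_ms)
        return abnormal


baseline = RollingBaseline(window=20)

# Week one: the transaction takes roughly 2.5 secs.
for ms in [2500, 2450, 2550] * 10:
    baseline.observe(ms)
print(baseline.is_abnormal(1000))  # 1 sec is well under the old baseline

# Week two: a development fix brings it down to roughly 200ms.
for ms in [200, 210, 190] * 10:
    baseline.observe(ms)
print(baseline.is_abnormal(1000))  # the same 1 sec is now a regression
```

With a static 2.5 sec threshold, a 1 sec response after the fix would never raise an alert. With the rolling window, once the recent samples sit around 200ms, that same 1 sec response stands out as a regression.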


AppDynamics vs CA Wily vs DynaTrace

2011 was an amazing year for AppDynamics. We experienced tremendous growth and success, largely down to the many customers around the world who believed in our vision, technology, and ability to help Dev and Ops teams better manage application performance in production. The Application Performance Management (APM) market isn’t an easy market to succeed in, with well over 30 vendors competing against each other. In just three years we’ve managed to take on the big players like Compuware DynaTrace, CA Wily, HP and IBM to change the industry perception that APM is expensive to own and difficult to deploy/use.

We feel APM should be for everyone. It should be affordable, it should be easy to deploy, and easy to use. APM should not be a luxury that only an elite group of enterprises can afford. Today, we have customers who monitor applications with 5 nodes, 50 nodes, 500 nodes and 5,000 nodes. Application performance impacts organizations of all sizes; that’s why we wanted our APM solution to be accessible to the masses over the web via our free download and SaaS trial. We wanted to be transparent with our buyers and demonstrate that they can evaluate and use our solution all by themselves with no account manager or technical consultant by their side. We really wanted prospects to see for themselves that APM can be simple to deploy and easy to use.

A major validation of this market disruption was when a customer called Karavel in France was looking for an APM solution and evaluated CA Wily, Compuware dynaTrace and AppDynamics. Karavel requested a trial, downloaded our software and we sent them a trial license key for 30 days. The whole AppDynamics install, deployment and evaluation was solely conducted by the customer on their own. This might not sound that impressive, but this is what the software buying experience should be all about: the customer and the solution. If the customer can’t install, deploy and evaluate an APM solution on their own, how will they manage this process when it comes to a production deployment? Software should sell itself these days–if it requires an army of people to sell it, it probably requires an army of people to implement it as well.

You can read the full Karavel press release here:
http://www.appdynamics.com/press/press-release-01-03-12.php

Full case study is available here also:
http://www.appdynamics.com/documents/case_studies/AppDynamics_CS_Karavel.pdf

Remember, software like APM doesn’t have to be complex and expensive. With the internet these days, there is no reason why a prospect can’t download and evaluate solutions online in just a few hours.

App Man.


We recently finished conducting our annual Application Performance Management survey. Over 250 IT professionals participated, and they shared insights such as:
- Many Ops and Dev teams are anticipating growth in their applications by 20% or more
- Over 50% are planning to move to the cloud, and are architecting brand-new applications to be cloud-ready
- Most teams are using log files to monitor application performance, rather than an Application Performance Management (APM) tool

We’ll release the full report soon, but here’s an infographic that summarizes some of the main findings:

AppDynamics Infographic - Storm Clouds in 2012


What I found personally surprising was the heavy reliance on log files. When you’re troubleshooting distributed architectures, time is of the essence–and there’s no way to cut your MTTR down when you’re relying on log files to identify root cause.

In fact, there’s only one guy who ever made using a log file look cool:

And I think we can all agree that’s a pretty unique use case.

We’ll have the full survey results available soon.

People in our industry always talk about IT complexity and cost. Cost is pretty easy to calculate, because IT budgets are allocated and audited every year. Complexity is very different–we know it exists, but we can’t really see or measure it. Complexity is often what we feel when our brain tries to understand something and stalls in the process, trying to make sense of information it has never seen before.

Well, this happened to a few of us at AppDynamics last week. A customer was kind enough to share how a single login business transaction flowed across their entire infrastructure. You might be thinking: “How can a login transaction be complex? That’s just a simple call to an LDAP or SiteMinder tier”–which is pretty much what we all thought it was. However, the screenshot that graced us was one of shock, beauty and amazement. In fact, I’m looking at it right now before I scrub the customer details, and I’m still thinking “Hmmmm, this is bonkers.”

Without delaying further, here is that very screenshot showing the Login Business Transaction:

Scary, huh? What you see is the flow and timing of a Customer Login business transaction as it executes across a well-governed, regulated SOA environment consisting of many services (denoted by the Java Tiers). The Customer Login transaction begins at the Java node to the right marked “START” and propagates across the entire SOA environment using a combination of sync/async JMS messages, HTTP and RMI communication to notify other services that a customer is now active and logged in. You can also see many services writing to a database as a result of this transaction. These invocations are simply auditing the customer login to satisfy the legal regulations that this organization has to comply with. So if you ever wonder what impact governance and legislation have on IT, this is a perfect example of the complexity storm they create. What’s interesting is that the Logout business transaction for this application was just as complex!

The screenshot above unfortunately reflects the enormous complexity that many IT departments have to deal with every day, especially when a user complains that their business transaction is slow. The problem for 95% of IT departments is they don’t have this type of visibility in production. They can feel pain, but they can’t see it. A slow business transaction may take 25 seconds to complete and touch many infrastructure tiers along the way. Unless IT sees this end-to-end journey, they’ll always struggle to troubleshoot and manage it.

The good news is you’re 30 minutes away from getting this visibility in production by evaluating a next generation application monitoring solution like AppDynamics Pro. AppDynamics will auto-discover your business transactions, map their specific flows across your infrastructure, and give you a latency breakdown across and inside every tier the business transaction touches.

To manage and master IT complexity you have to visualize and see it. Seeing how your business actually runs across IT is completely different to guessing how your business runs across IT. Next time a user complains that their business transaction is slow, what will you do? Bury your head in a log file, or visualize how that business transaction executed using an application performance monitoring solution like AppDynamics?

Isn’t it about time you mapped your app?

App Man.


One of my colleagues this week was consolidating the results from our recent Application Performance Management survey, and one interesting finding was that 40% of customers have at least one release cycle a month. Out of those respondents, one third experience a Severity-1 incident each month as well. That’s a pretty compelling pair of statistics, and they might explain the continued frustration and conflict between development and operations teams. It’s also perhaps the reason why this DevOps underground movement can no longer be ignored (even by Gartner). There is no doubt development organizations have become agile, but does deploying this frequent change make the business more or less agile? For example, if one in three releases creates a Severity-1 incident, then surely agile development becomes a risk to the business. We’re at the point where Operations either has to start managing change better or simply restrict the amount of change that can occur.

So why are Sev 1 incidents so common? Based on my experiences and customer interaction, I’d strongly argue that testing in development isn’t enough. At the very least, it’s certainly not an insurance policy for deploying an application in production. When a Formula 1 team designs a car in a wind tunnel and tests it on a simulator pre-season, they don’t assume that the performance they see in test will mirror the results they see in a race. Yet, that is pretty much what happens today in the application development lifecycle. Development teams build and test their apps in pre-production before handing them off to operations for deployment in production, and they assume everything will work just fine. This is probably the worst assumption IT has made over the last decade, because development and production environments differ significantly. It’s also a lame excuse for any development team to use when a production issue occurs: “Well, it ran fine in test so you must have deployed it wrong.” Yes, people make mistakes occasionally, but if one in every three releases has an issue, deployment error may not be the sole reason. If development never gets to see how their baby runs in production, they’ll never learn how to build robust, scalable, and high-performance applications.


Peter Drucker proclaimed: “If you can’t measure it, you can’t manage it.” Do you know what’s “normal” for your mission-critical application? Actually, wait a second–with Halloween having just finished up, maybe the following Young Frankenstein reference is more appropriate. Whenever I focus on the word “normal,” the first thing that pops into my head (pardon the pun) is that famous scene from Young Frankenstein:

DR. FREDERICK FRANKENSTEIN: Abby Normal?

IGOR: I’m almost sure that was the name.

DR. FREDERICK FRANKENSTEIN: [chuckles] Are you saying that I put an abnormal brain into a seven and a half foot long, fifty-four inch wide GORILLA?

[grabs Igor and starts throttling him]

DR. FREDERICK FRANKENSTEIN: Is that what you’re telling me?


On Wednesday I delivered a keynote at WJAX in Munich. Everything went really well, but I was a little shocked at the response I got when I asked the audience “How many of you monitor the performance of your apps in production?” As I scanned the audience, I counted 9 out of ~950 developers who had put their hands up, meaning about 1% had visibility of how their applications actually performed in production. I know what you’re thinking: “But isn’t application performance in production the responsibility of Operations?” Well, it is and it isn’t. Most organizations think that when an application has an issue, it’s related to the infrastructure it runs on. That’s like saying when a car crashes, it’s because a part failed on the car, whereas in actual fact most accidents are caused by the driver. Yes, hardware fails occasionally, but application logic and configuration drive how infrastructure resources are used, which is why most issues today occur when new code is deployed in production.


Hugh Brien

Software that “Just Works”

This is my first blog. I’ve been a sales engineer for three application performance management (APM) products over the last 7 years (CA Wily Introscope, SpringSource/Hyperic, and now AppDynamics). I hadn’t really considered myself much of a “blogger” because I have always thought that actions speak louder than words. So I guess you are wondering why I would start now. I guess you could say I was inspired by a recent experience at a customer site. It was quite a bit different than what I’ve been used to in my earlier APM career.

We recently had an experience with a customer who called and inquired about AppDynamics for monitoring several of their mission-critical applications. As always, our sales team kicked into high gear and had a conversation with the customer. In less than 15 minutes we agreed to start a Proof of Concept for AppDynamics running on a critical application. Later that day, we set up an online conference with the customer and commenced an installation that took about 15 minutes.


The majority of us in IT are specialists, with the exception of a few VPs of engineering who are “special” in their own “special” world of being “special.” What I mean by this is that no single person has the skills or experience to do everything well in IT. IT is too big for me to explain or summarize in a few words, other than it requires a lot of different people with different skills to make it tick along. Despite applications being the living breathing entities of the business, a large portion of folk in IT have little context of how applications are built, how they execute, and how they consume resource across the IT infrastructure. Many people simply don’t care as their responsibilities are completely void of anything application related. That’s fine–but the reality is that everyone in IT should have one eye on the business. The whole reason IT exists is so the business can be more competitive and make more money. If this happens, IT gets more budget and is allowed to innovate more. IT and the business need each other to survive, which is why when applications slow down or break, both parties bitch at each other.

Operations need better visibility

Unfortunately for both the business and IT, the people (Operations) who manage the performance and availability of applications in production aren’t application experts. They are not stupid either; their skill sets are broad, spanning the many technologies and platforms that underpin applications. They manage a lot of things that application developers take for granted, like networks, databases, storage and virtualization. While Operations monitor the health of these infrastructure components, they often get bombarded with crap from the business when end users and business transactions are being impacted by slow performance, despite all system monitoring showing everything is fine. This lack of understanding between the Business and Operations is because both parties see things from different perspectives.


Steve Waterworth

Agent Intelligence

How intelligent is your monitoring agent?

To keep its impact on application performance minimal, the agent should not do too much processing locally; it should use the smallest CPU and memory footprint possible. On the other hand, offloading some processing to the agent results in less network traffic and better scalability for the monitoring management server.
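
As a rough illustration of that trade-off, the sketch below (Python, with a hypothetical `AgentAggregator` class that is an assumption for this example, not any vendor's agent design) keeps only a running count, total and maximum per business transaction on the agent, then ships one small summary per reporting interval instead of every raw measurement.

```python
from collections import defaultdict


class AgentAggregator:
    """Cheap agent-side roll-up of response times (hypothetical, for illustration)."""

    def __init__(self):
        # One tiny record per transaction name: constant memory per transaction,
        # and a few arithmetic operations per request.
        self._stats = defaultdict(
            lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0}
        )

    def record(self, transaction, response_ms):
        """Called on every request; deliberately does almost no work."""
        stats = self._stats[transaction]
        stats["count"] += 1
        stats["total_ms"] += response_ms
        stats["max_ms"] = max(stats["max_ms"], response_ms)

    def flush(self):
        """Called once per reporting interval; returns the summary to send upstream."""
        summary = {
            name: {
                "count": s["count"],
                "avg_ms": s["total_ms"] / s["count"],
                "max_ms": s["max_ms"],
            }
            for name, s in self._stats.items()
        }
        self._stats.clear()  # start the next interval from scratch
        return summary


agent = AgentAggregator()
for ms in (100, 200, 300):
    agent.record("login", ms)
agent.record("checkout", 50)

# Four raw measurements collapse into two summary records for the server.
print(agent.flush())
```

The heavier analysis (baselining, correlation, alerting) stays on the management server in this sketch; the agent only does the arithmetic needed to shrink what goes over the network.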
