TAG | End User Monitoring
Some companies talk about monitoring their end user experience, and other companies take the bull by the horns and get it done. For those who have successfully implemented EUM (RUM, EUEM, or whatever your favorite acronym is), the technology rewards the company and the end user alike. I recently had the opportunity to discuss AppDynamics EUM with one of our customers, and the information shared with me was exciting and gratifying.
ManpowerGroup monitors their intranet and internet applications with AppDynamics. These applications support internal operations as well as customer-facing websites for their global business, and they are accessed from around the world, 24×7. We’re talking about business-critical, revenue-generating applications!
I asked Fred Graichen, Manager of Enterprise Application Support, why he thought ManpowerGroup needed EUM.
“One of the key components for EUM is to shed light on what is happening in the ‘last mile’. Our business involves supporting branch locations. Having an EUM tool allows us to compare performance across all of our branches. This also helps us determine whether any performance issues are localized. Having the insight into the difference in performance by location allows us to make more targeted investments in local hardware and network infrastructure.”
Turning on a monitoring tool doesn’t mean you’ll automagically get the results you want. You also need to make sure your tool is integrated with your people, processes, and technologies. That’s exactly what ManpowerGroup has done with AppDynamics EUM. They have alerts based upon EUM metrics that get routed to the proper people, and they can then correlate the EUM information with data from their other (network) monitoring tools during root cause analysis. Below is an EUM screenshot from ManpowerGroup’s environment.
By implementing AppDynamics EUM, ManpowerGroup has been able to:
- Identify locations that are experiencing the worst performance.
- Successfully illustrate the difference in performance globally as well. (This is key when studying the impact of latency on an application that is accessed from other countries but hosted in a central datacenter.)
- Quickly identify when a certain location is seeing performance issues and correlate that with data from other monitoring solutions.
But what does all of this mean to the business? It means that ManpowerGroup has been able to find and resolve problems faster for their customers and employees. Faster application response time combined with happier customers and more productive employees all contribute to a healthier bottom line for ManpowerGroup.
ManpowerGroup is using AppDynamics EUM to bring a higher level of performance to its employees, customers, and shareholders. Sign up for a free trial today and begin your journey to a healthier bottom line.
Recently Jonah Kowall of Gartner released a research note titled “Use Synthetic Monitoring to Measure Availability and Real-User Monitoring for Performance”. After reading this paper I had some thoughts that I wanted to share based upon my experience as a Monitoring Architect (and certifiable performance geek) working within large enterprise organizations. I highly recommend reading the research note as the information and findings contained within are spot on and highlight important differences between Synthetic and Real-User Monitoring as applied to availability and performance.
My Apps Are Not All 24×7
During my time working at a top 10 Investment Bank I came across many different applications with varying service level requirements. I say requirements because there were rarely any agreements or contracts in place; usually just an organizational understanding of how important each application was to the business and what service level was expected. Many of the applications in the Investment Bank portfolio were only used during the trading hours of the exchanges they interfaced with. These applications also had to be available right as the exchanges opened and performing well for the entire duration of trading activity. Having no real user activity outside those hours meant that the only way to gain any insight into the availability and performance of these applications was by using synthetically generated transactions.
Was this an ideal situation? No, but it was all we had to work with in the absence of real user activity. If the synthetic transactions were slow or throwing errors at least we could attempt to repair the platform before the opening bell. Once the trading day got started we measured real user activity to see the true picture of performance and made adjustments based upon that information.
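To make that concrete, below is a minimal sketch of such a pre-market probe. The endpoint, latency budget, and alerting hook are all hypothetical, and this is generic TypeScript rather than any vendor’s scripting language.

```typescript
// A pre-market synthetic probe: fail loudly if the platform is down or slow
// before real users arrive. The endpoint and thresholds are made up for the sketch.
const ENDPOINT = "https://trading.example.com/health/quote"; // hypothetical
const LATENCY_BUDGET_MS = 2000;

async function probe(): Promise<void> {
  const start = Date.now();
  const res = await fetch(ENDPOINT);
  const elapsedMs = Date.now() - start;

  if (!res.ok) {
    throw new Error(`Availability check failed: HTTP ${res.status}`);
  }
  if (elapsedMs > LATENCY_BUDGET_MS) {
    throw new Error(`Performance check failed: ${elapsedMs}ms > ${LATENCY_BUDGET_MS}ms`);
  }
  console.log(`Synthetic transaction OK in ${elapsedMs}ms`);
}

probe().catch((err) => {
  console.error(err.message); // this is where you would page the on-call team
  process.exitCode = 1;
});
```

Scheduled to run every few minutes ahead of the opening bell, a probe like this buys you the repair window described above.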
Can’t Script It All
Having to rely upon synthetic transactions as a measure of availability and performance is definitely suboptimal. The problem gets amplified in environments where you shouldn’t be testing certain application functionality due to regulatory and other restrictions. Do you really want to be trading securities, derivatives, currencies, etc. with your synthetic transaction monitoring tool? Methinks not!
So if you rely upon synthetic transactions alone, there is a gaping hole in your monitoring strategy. You can’t test all of your business-critical functionality even if you wanted to spend the long hours scripting and testing your synthetics. The scripting and testing investment gets amplified whenever your application code changes: if a code update changes the application response, you will need to re-script for the new response. It’s a vicious cycle that doesn’t happen when you use the right kind of real user monitoring.
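Here is a tiny sketch of that cycle, assuming a hypothetical JSON payload. The assertion is welded to today’s response shape, so the moment a release changes the payload, the script breaks and has to be rewritten.

```typescript
// A scripted assertion that must track the application's response
// (the payload shape is hypothetical).
interface QuoteResponse {
  symbol: string;
  price: number; // a release that renames or restructures this field breaks the script
}

async function checkQuote(url: string): Promise<void> {
  const body = (await (await fetch(url)).json()) as QuoteResponse;
  if (typeof body.price !== "number") {
    // A false alarm after every payload change: time to re-script and re-test.
    throw new Error("Unexpected response shape: re-script required");
  }
}
```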
Real User Monitoring: Accurate and Meaningful
When you monitor real user transactions you get more accurate and relevant information. Here is a list (what would a good blog post be without a list?) of some of the benefits, with a small sketch of the geographic point after the list:
- Understand exactly how your application is being used.
- See the performance of each application function as the end user does, not just within your data center.
- No scripting required. (Scripting can take a significant amount of time and resources.)
- Ensure full visibility of application usage and performance, not just what was scripted.
- Understand the real geographic distribution of your users and the impact of that distribution on end user experience.
- Ability to track the performance of your most important users (particularly useful in trading environments).
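To sketch the geographic point, here is how real measurements, rather than scripted ones, can be rolled up by user region. The types and the percentile cut are illustrative only; a real RUM backend does this continuously and at far larger scale.

```typescript
// Aggregate real user page views into a 95th-percentile latency per region.
interface PageView {
  region: string; // typically resolved server-side from the client IP
  action: string; // the business function the user exercised
  ms: number;     // what that user actually experienced
}

function p95ByRegion(views: PageView[]): Map<string, number> {
  const samplesByRegion = new Map<string, number[]>();
  for (const v of views) {
    const samples = samplesByRegion.get(v.region) ?? [];
    samples.push(v.ms);
    samplesByRegion.set(v.region, samples);
  }
  const p95 = new Map<string, number>();
  for (const [region, samples] of samplesByRegion) {
    samples.sort((a, b) => a - b);
    p95.set(region, samples[Math.floor(0.95 * (samples.length - 1))]);
  }
  return p95;
}

// Every real user becomes a data point; nothing had to be scripted.
console.log(p95ByRegion([
  { region: "LON", action: "login", ms: 180 },
  { region: "NYC", action: "login", ms: 95 },
  { region: "LON", action: "login", ms: 450 },
]));
```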
Synthetic transaction monitoring and real user monitoring can definitely co-exist within the same application environment. Every business is different and has its own unique requirements that can impact the type of monitoring you choose to implement. If you’ve not yet read the Gartner research note, I suggest you go check it out now. It provides a solid analysis of synthetic and real user monitoring tools, companies, and usage scenarios that are completely different from what I have covered here.
Has synthetic or real transaction monitoring saved the day for your company? I’d love to hear about it in the comments below.
Last week I flew into Las Vegas for #Interop fully suited and booted in my big blue costume (no joke). I’d been invited to speak in a vendor debate on User eXperience (UX): Monitor the Application or the Network? NetScout represented the Network, AppDynamics (and me) represented the Application, and “Compuware dynaTrace Gomez” sat on the fence representing both. Moderating was Jim Frey from EMA, who did a great job introducing the subject, asking the questions and keeping the debate flowing.
At the start each vendor gave their usual intro and company pitch, followed by their own definition of what User Experience is.
Defining User Experience
So at this point you’d probably expect me to blabber on about how application code and agents are critical for monitoring the UX? Wrong. For me, users experience business transactions; they don’t experience applications, infrastructure, or networks. When a user complains, they normally say something like “I can’t log in” or “My checkout timed out.” I can honestly say I’ve never heard one say, “The CPU utilization on your machine is too high” or “I don’t think you have enough memory allocated.”
Now think about that from a monitoring perspective. Do most organizations today monitor business transactions? Or do they monitor application infrastructure and networks? The truth is the latter, normally with several toolsets. So the question “Monitor the Application or the Network?” is really the wrong question for me. Unless you monitor business transactions, you are never going to understand what your end users actually experience.
Monitoring Business Transactions
So how do you monitor business transactions? The reality is that both application and network monitoring tools are capable of it, but most solutions have been designed not to, providing instead a more technical view for application developers and network engineers. This is wrong, very wrong, and it’s a primary reason why IT never sees what the end user sees or complains about. Today, SOA means applications are more complex and distributed: a single business transaction could traverse multiple applications that potentially share services and infrastructure. If your monitoring solution doesn’t have business transaction context, you’re basically blind to how the application infrastructure is impacting your UX.
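As a minimal sketch of what “business transaction context” means, the snippet below tags every outbound call with the business transaction name and a correlation ID, so a single “Checkout” can be followed across every tier it touches. The header names are made up; agent-based APM tools inject and propagate this kind of context automatically.

```typescript
import { randomUUID } from "node:crypto";

// Propagate business transaction context to downstream services
// (hypothetical header names).
async function callDownstream(
  url: string,
  btName: string,               // e.g. "Checkout" or "Login": what the user experiences
  btId: string = randomUUID(),  // correlates all the hops of one user action
): Promise<Response> {
  return fetch(url, {
    headers: {
      "X-Business-Transaction": btName,
      "X-BT-Id": btId,
    },
  });
}
```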
The debate then switched to how monitoring the UX differs from an application and network perspective. Simply put, application monitoring relies on agents, while network monitoring relies on passively sniffing network traffic. My point here was that you can monitor user experience with the network, but you can only manage it with the application. With network monitoring you see just business transactions and the application infrastructure, because you’re monitoring at the network layer. In contrast, with application monitoring you see business transactions, application infrastructure, and the application logic (hence the name).
Monitor or Manage the UX?
Both application and network monitoring can identify and isolate UX degradation, because they see how a business transaction executes across the application infrastructure. However, you can only manage UX if you understand what’s causing the degradation, and to do that you need deep visibility into the application run-time and logic (code). Operations telling a Development team that their JVM is responsible for a user experience issue is a bit like FedEx telling a customer their package is lost somewhere in Alaska. Identifying and isolating pain is useful, but one could argue it’s pointless without being able to manage and resolve the pain by finding the root cause.
NetScout made the point that with network monitoring you can identify common bottlenecks in the network that are responsible for degrading the UX. I have no doubt you could, but if you look at the most common reason for UX issues, it’s change, and if you look at what changes the most, it’s application logic. Why? Because Development and Operations teams want to be agile so their applications and business remain competitive in the marketplace. Agile release cycles mean application logic (code) constantly changes. It’s not unusual for an application to change several times a week, and that’s before you count hotfixes and patches. So if applications change more often than the network, one could argue that application monitoring is the more effective way to monitor and manage the end user experience.
UX and Web Applications
We then debated which monitoring concept was better for web-based applications. Obviously, network monitoring is able to monitor the UX by sniffing HTTP packets passively, so it’s possible to get granular visibility on QoS in the network and application. However, the recent adoption of Web 2.0 technologies (Ajax, GWT, Dojo) means application logic is now moving from the application server into the user’s browser, so browser processing time becomes a critical part of the UX. Unfortunately, network monitoring solutions can’t monitor browser processing latency (because they monitor the network), unlike application monitoring solutions, which can use techniques like client-side instrumentation or web-page injection to capture browser latency as part of the UX.
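As a small sketch of that point, the browser’s standard Navigation Timing API can split a page load into time on the wire and time spent inside the browser; injected RUM scripts do something similar. The reporting endpoint is hypothetical.

```typescript
// Split a page load into network time, which a sniffer can see, and browser
// processing time, which it cannot.
window.addEventListener("load", () => {
  // Wait one tick so loadEventEnd has been recorded.
  setTimeout(() => {
    const [nav] = performance.getEntriesByType(
      "navigation",
    ) as PerformanceNavigationTiming[];
    const networkMs = nav.responseEnd - nav.startTime;    // visible on the network
    const browserMs = nav.loadEventEnd - nav.responseEnd; // invisible to a sniffer
    navigator.sendBeacon("/rum/beacon", JSON.stringify({ networkMs, browserMs })); // hypothetical endpoint
  }, 0);
});
```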
The C Word
We then got to the Cloud and which approach made more sense for monitoring UX. Well, network monitoring solutions are normally hardware appliances that plug directly into a network tap or span port. I’ve never asked, but I’d imagine the guys in Seattle (Amazon) and Redmond (Windows Azure) probably wouldn’t let you wheel a network monitoring appliance into their data centre. More importantly, why would you need to if you’re already paying someone else to manage your infrastructure and network for you? Moving to the Cloud is about agility: letting someone else deal with the hardware and pipes so you can focus on making your application and business competitive. It’s actually very easy for application monitoring solutions to monitor UX in the cloud. Agents can piggyback on application code libraries when they’re deployed to the cloud, or cloud providers can embed and provision vendor agents as part of their server builds and provisioning process.
What’s also interesting is that the Cloud is highlighting a trend towards DevOps (or NoOps for a few organizations), where Operations becomes more focused on applications than on infrastructure. As the network and infrastructure become abstracted in the public Cloud, the focus naturally shifts to the application and the deployment of code. For private clouds you’ll still have network Ops and Engineering teams that build and support the Cloud platform, but they won’t be the people who care about user experience. Those people will be the Line of Business or application owners whom the UX impacts.
In reality most organizations today already monitor the application infrastructure and network. However, if you want to start monitoring the true UX, you should monitor what your users experience, and that is business transactions. If you can’t see your users’ business transactions, you can’t manage their experience.
What are your thoughts on this?
I did have a spare hour at #Interop after my debate to meet and greet our competitors before flying back to AppDynamics HQ. It was nice to see so many of them greet the APM Caped Crusader.
App Man.
Application performance pain is a bittersweet thing for End Users, Operations, Developers, and the Business. Outages cost the business money, and sometimes they cost people their jobs, which is truly unfortunate. However, when people solve performance issues they become overnight heroes, with a great sense of achievement, pride, and obviously relief.
To explain the complexity of managing application performance, imagine your application is 100 haystacks, each representing a tier, and somewhere a needle is hurting your end user experience. It’s your job to find the needle as quickly as possible! The problem is, each haystack contains over half a million pieces of hay, each representing a line of code in your application. It’s therefore no surprise that organizations can take days or weeks to find the root cause of performance issues in large, complex, distributed production environments.
End User Experience Monitoring, Application Mapping, and Transaction Profiling will help you identify unhappy users, slow business transactions, and problematic haystacks (tiers) in your application, but they won’t find needles. To do that, you’ll need X-ray visibility inside the haystacks to see which pieces of hay (lines of code) are holding the needle (root cause) that is hurting your end users. This X-ray visibility is known as “Deep Diagnostics” in application monitoring terms, and it represents the difference between isolating performance issues and resolving them.
For example, AppDynamics has great End User Monitoring, Business Transaction Monitoring, Application Flow Maps and very cool analytics all integrated into a single product. They all look and sound great (honestly they do), but they only identify and isolate performance issues to an application tier. This is largely what Business Transaction Management (BTM) and Network Performance Management (NPM) solutions do today. They’ll tell you what and where a business transaction slows down, but they won’t tell you the root cause so you can resolve the issues.
Why Deep Diagnostics for Production Monitoring Matters
A key reason why AppDynamics has become very successful in just a few years is because our Deep Diagnostics, behavioral learning, and analytics technology is 18 months ahead of the nearest vendor. A bold claim? Perhaps, but it’s backed up by bold customer case studies such as Edmunds.com and Karavel, who compared us against some of the top vendors in the application performance management (APM) market in 2011. Yes, End User Monitoring, Application Mapping and Transaction Profiling are important–but these capabilities will only help you isolate performance pain, not resolve it.
AppDynamics has the ability to instantly show the complete code execution and timing of slow user requests or business transactions for any Java or .NET application, in production, with incredibly small overhead and no configuration. We basically give customers a metal detector and X-Ray vision to help them find needles in haystacks. Locating the exact line of code responsible for a performance issue means Operations and Developers solve business pain faster, and this is a key reason why AppDynamics technology is disrupting the market.
Below is a small collection of needles that customers found using AppDynamics in production. The simple fact is that complete code visibility allows customers to troubleshoot in minutes as opposed to days and weeks. Monitoring with blind spots and configuring instrumentation are a thing of the past with AppDynamics.
Needle #1 – Slow SQL Statement
Pain: Key Business Transaction with 5 sec response times
Root Cause: Slow JDBC query with full-table scan
Needle #2 – Slice of Death in Cassandra
Industry: SaaS Provider
Pain: Key Business Transaction with 2.5 sec response times
Root Cause: Slow Thrift query in Cassandra
Needle #3 – Slow & Chatty Web Service Calls
Pain: Several Business Transactions with 2.5 min response times
Root Cause: Excessive Web Service Invocation (5+ per trx)
Needle #4 – Extreme XML Processing
Pain: Key Business Transaction with 17 sec response times
Root Cause: XML serialization over the wire.
Needle #5 – Mail Server Connectivity
Pain: Key Business Transaction with 20 sec response times
Root Cause: Slow Mail Server Connectivity
Needle #6 – Querying Too Much Data
Pain: Several Business Transactions with 30+ sec response times
Root Cause: Querying too much data
Needle #7 – Slow Security 3rd Party Framework
Pain: All Business Transactions with > 3 sec response times
Root Cause: Slow 3rd party code
Needle #8 – Excessive SQL Queries
Pain: Key Business Transactions with 2 min response times
Root Cause: Thousands of SQL queries per transaction
Needle #9 – Commit Happy
Pain: Several Business Transactions with 25+ sec response times
Root Cause: Unnecessary use of commits and transaction management.
Needle #10 – Locking under Concurrency
Pain: Several Business Transactions with 5+ sec response times
Root Cause: Non-thread-safe cache forces locking for read/write consistency
Needle #11 – Slow 3rd Party Code
Industry: SaaS Provider
Pain: Key Business Transaction with 2+ min response times
Root Cause: Slow 3rd Party code
Needle #12 – DB Connection Pool Exhaustion
Industry: Financial Services
Pain: Several Business Transactions with 7+ sec response times
Root Cause: DB Connection Pool Exhaustion caused by excessive connection pool invocation & queries
Needle #13 – Cache Sizing & Configuration
Pain: Several Business Transactions with 50+ sec response times
Root Cause: Cache Sizing & Configuration
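To make one of these needles concrete, here is a sketch of the classic shape behind Needle #8. The `query` helper is hypothetical; the point is the pattern, not any customer’s actual code.

```typescript
// Hypothetical async database helper; any driver with parameterized queries fits.
declare function query(
  sql: string,
  params?: unknown[],
): Promise<Array<Record<string, any>>>;

// Before: one query per order means thousands of round trips per transaction.
async function orderTotalSlow(orderIds: number[]): Promise<number> {
  let total = 0;
  for (const id of orderIds) {
    const rows = await query(
      "SELECT SUM(amount) AS amount FROM order_lines WHERE order_id = ?",
      [id],
    );
    total += rows[0]?.amount ?? 0;
  }
  return total;
}

// After: one set-based query, one round trip.
async function orderTotalFast(orderIds: number[]): Promise<number> {
  const rows = await query(
    "SELECT SUM(amount) AS total FROM order_lines WHERE order_id IN (?)",
    [orderIds], // assumes the driver expands array parameters
  );
  return rows[0]?.total ?? 0;
}
```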
If you want to manage and troubleshoot application performance in production, you should seriously consider AppDynamics. We’re the fastest-growing on-premise and SaaS-based APM vendor in the market right now. You can download our free product, AppDynamics Lite, or take a free 30-day trial of AppDynamics Pro, our commercial product.
Now go find those needles that are hurting your end users!
App Man.