Why End User Monitoring?
In a previous post, my colleague Tom Levey explained the value of Monitoring the Real End User Experience. In this post, we will dive into how we built a service to scale to billions of users.
AppDynamics End User Monitoring enables application owners to:
- Monitor their global audience and track end user experience across the world to pinpoint which geo-locations may be impacted by poor application performance
- Capture end-to-end performance metrics for all business transactions – including page rendering time in the Browser, Network time, and processing time in the Application Infrastructure
- Identify bottlenecks anywhere in the end-to-end business transaction flow to help Operations and Development teams triage problems and troubleshoot quickly
- Compare performance across all browser types and platforms, such as Internet Explorer, Firefox, Google Chrome, Safari, iOS and Android
“Fox News already depends upon AppDynamics for ease-of-use and rapid troubleshooting capability in our production environment,” said Ryan Jairam, Internet Operations Lead at Fox News. “What we’ve seen with AppDynamics’ End-User Monitoring release is an even greater ability to understand application performance, from what’s happening on the browser level to the network all the way down to the code in the application. Getting this level of insight and visibility for an application as complex and agile as ours has been a tremendous benefit, and we’re extremely happy with this powerful new addition to the AppDynamics Pro solution.”
EUM Cloud Service
EUM (End User Monitoring) Cloud Service is our on-demand, cloud-based, multi-tenant SaaS infrastructure that acts as an aggregator for all EUM metrics traffic. EUM metrics from the end user browsers of every customer are reported to the EUM Cloud Service. The raw information received from the browser is verified, aggregated, and rolled up at the EUM Cloud Service. All AppDynamics Controllers (SaaS or on-premise) connect to the EUM Cloud Service to download metrics every minute, for each application.
On-Demand and Highly Available
End users access customer web applications from anywhere in the world, at any time of day, in different time zones. Whenever an AppDynamics-instrumented web page is accessed, the browser reports EUM metrics to the EUM Cloud Service. This requires a highly available, on-demand system that can be reached from different geo-locations and time zones.
Extremely Concurrent Usage
All end users of all AppDynamics customers using the EUM solution continuously report browser information to the same EUM Cloud Service. The EUM Cloud Service processes all the reported browser information concurrently, continuously generating metrics and collecting snapshot samples.
The usage pattern differs from application to application throughout the day, so the number of records to be processed at the EUM Cloud Service varies by application and by time. The EUM Cloud Service automatically scales up to handle any surge in incoming records and scales down again when the load drops.
Multi-Tenancy Support
The EUM Cloud Service processes EUM metrics reported from different applications for different customers, so the service must be multi-tenant. The reported browser information is partitioned by customer and, within each customer, by application. The EUM Cloud Service provides a mechanism for each customer's Controllers to download aggregated metrics and snapshots based on customer and application identification.
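To make the partitioning idea concrete, here is a minimal in-memory sketch of metrics keyed by customer and application and rolled up per minute. It is an illustration only, not the actual EUM Cloud Service code: the names `TenantMetricStore`, `MinuteRollup`, `record`, and `download`, and the choice of page load time as the metric, are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MinuteRollup:
    """Aggregated page load times for one application in one minute."""
    count: int = 0
    total_ms: float = 0.0
    max_ms: float = 0.0

    def add(self, load_time_ms: float) -> None:
        self.count += 1
        self.total_ms += load_time_ms
        self.max_ms = max(self.max_ms, load_time_ms)

    @property
    def average_ms(self) -> float:
        return self.total_ms / self.count if self.count else 0.0


class TenantMetricStore:
    """Hypothetical store: one partition per (customer, application) pair."""

    def __init__(self) -> None:
        # (customer_id, app_id) -> minute index -> rollup
        self._rollups = defaultdict(lambda: defaultdict(MinuteRollup))

    def record(self, customer_id: str, app_id: str,
               timestamp_s: float, load_time_ms: float) -> None:
        """Add one browser-reported timing to its tenant's minute bucket."""
        minute = int(timestamp_s // 60)
        self._rollups[(customer_id, app_id)][minute].add(load_time_ms)

    def download(self, customer_id: str, app_id: str, minute: int):
        """Return the rollup for one tenant partition, or None if absent."""
        partition = self._rollups.get((customer_id, app_id))
        if partition is None or minute not in partition:
            return None
        return partition[minute]


store = TenantMetricStore()
store.record("acme", "shop", 120.0, 800.0)
store.record("acme", "shop", 150.0, 1200.0)
store.record("globex", "shop", 130.0, 5000.0)
rollup = store.download("acme", "shop", minute=2)
print(rollup.count, rollup.average_ms, rollup.max_ms)
```

Because every read and write goes through the `(customer_id, app_id)` key, a Controller asking for one customer's application can never see another tenant's data in this sketch, which is the property the partitioning is meant to provide.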
The EUM Cloud Service needs to scale dynamically with demand. The problem with supporting massive scale on fixed infrastructure is that we would have to pay for hardware upfront and over-provision to handle huge spikes. One of the motivating factors for choosing Amazon Web Services is that costs scale linearly with demand.
The EUM Cloud Service is hosted on Amazon Web Services infrastructure for horizontal scaling. The service has two functional components: the collector and the aggregator. Multiple instances of these components work in parallel to collect and aggregate the EUM metrics received from the end user browser or device. The transient metric data is stored in Amazon S3 buckets. All metadata related to applications, along with other configuration, is stored in Amazon DynamoDB tables.
The nodes receive the metric data from the browser and process it for the Controller:
- Resolve the geo information (the country, region, and city the request came from) and add it to the metric using an in-process MaxMind geo-resolver.
- Parse the User-Agent information and add browser, device, and OS information to the metrics.
- Validate the incoming browser-reported metrics and discard invalid ones.
- Mark the metrics and snapshots as SLOW or VERY SLOW, based on either a dynamic standard deviation algorithm or a static threshold.
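The validation and classification steps in the list above can be sketched as follows. This is a simplified illustration under stated assumptions, not the production algorithm: the 3 and 4 standard deviation cut-offs, the `MAX_PLAUSIBLE_MS` limit, and the function names `is_valid` and `classify` are all hypothetical choices for the example.

```python
import statistics
from typing import Optional, Sequence

# Assumed sanity limit for a page load time; not a value from the service.
MAX_PLAUSIBLE_MS = 10 * 60 * 1000


def is_valid(load_time_ms) -> bool:
    """Accept only real numbers within a plausible, non-negative range."""
    if isinstance(load_time_ms, bool) or not isinstance(load_time_ms, (int, float)):
        return False
    return 0 <= load_time_ms <= MAX_PLAUSIBLE_MS  # NaN fails this comparison


def classify(load_time_ms: float,
             baseline: Sequence[float],
             slow_sigma: float = 3.0,
             very_slow_sigma: float = 4.0,
             static_slow_ms: Optional[float] = None,
             static_very_slow_ms: Optional[float] = None) -> str:
    """Label one timing as NORMAL, SLOW, or VERY SLOW.

    If both static thresholds are given they are used directly; otherwise
    the thresholds are derived from the mean and standard deviation of
    recent timings in `baseline`, which must be non-empty.
    """
    if static_slow_ms is not None and static_very_slow_ms is not None:
        slow, very_slow = static_slow_ms, static_very_slow_ms
    else:
        mean = statistics.mean(baseline)
        stdev = statistics.pstdev(baseline)
        slow = mean + slow_sigma * stdev
        very_slow = mean + very_slow_sigma * stdev
    if load_time_ms > very_slow:
        return "VERY SLOW"
    if load_time_ms > slow:
        return "SLOW"
    return "NORMAL"


recent = [900, 1000, 1100, 1000, 1000]  # recent page load times in ms
for sample in (1050, 1200, 1300, -5):
    if not is_valid(sample):
        continue  # invalid metrics are discarded
    print(sample, classify(sample, recent))
```

Deriving the thresholds from recent data means the definition of "slow" follows each application's own normal behaviour, while the static option gives a fixed limit to applications that need one.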
For maximum scalability, we leverage the global presence of Amazon Web Services to get optimal performance in every region we use (Virginia, Oregon, Ireland, Tokyo, Singapore, Sao Paulo). In our most recent load test, the system as a whole handled about 6.5 billion requests per day without breaking a sweat. The system is designed to scale out horizontally as needed to support higher load.
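To put the load test figure in perspective, the short calculation below converts 6.5 billion requests per day into an average request rate. The even split across the six regions is an assumption made purely for illustration; real traffic is unlikely to be spread evenly, and peak rates will be higher than the daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60
REQUESTS_PER_DAY = 6.5e9   # load test figure quoted in the post
REGION_COUNT = 6           # Virginia, Oregon, Ireland, Tokyo, Singapore, Sao Paulo

average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY
# Assumption for illustration only: traffic divided evenly between regions.
average_rps_per_region = average_rps / REGION_COUNT

print(f"{average_rps:,.0f} requests per second on average")
print(f"{average_rps_per_region:,.0f} requests per second per region if split evenly")
```

That works out to roughly 75,000 requests per second on average across the whole service.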
Check out your end user experience data in AppDynamics
Find out more about AppDynamics Pro and get started monitoring your application with a free 15-day trial.
As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.