Back in 2011, Forbes famously declared that “every company is a software company,” and in 2020, the pace of digital transformation continues faster than ever, accelerated by the COVID-19 pandemic. In fact, according to the AppDynamics Agents of Transformation Report 2020: Covid-19 Special Edition:
- 95% of organizations surveyed during the pandemic have changed their technology priorities
- 74% of technologists reported that digital transformation projects, which would typically take more than a year to be approved, have been signed off in a matter of weeks
- 71% of technologists have witnessed digital transformation projects now being implemented within weeks rather than the months or years it would have taken before the pandemic
- 79% of technologists believe the pandemic will separate the strong from the weak in tech teams across the world
This accelerated pace is further complicated by higher-than-ever technical complexities and scale. IT teams were forced to shift from supporting the organization to being at the heart of the business today.
- 97% of IT leaders reported performance issues in the last six months alone, with a single enterprise service outage in the US costing $402,542 on average
- 91% said monitoring tools only provide data about how releases impact their own area of responsibility
According to Gartner, by 2023, 40% of DevOps teams will augment application and infrastructure monitoring tools with artificial intelligence for IT operations (AIOps) platform capabilities in order to solve these problems (1).
Addressing these challenges demands a new approach
The concept of the Business Transaction has been at the heart of AppDynamics’ strategy since its creation: providing a unified view of the IT landscape, monitoring applications’ health wherever they run, visualizing how the infrastructure supports them, identifying how well digital experiences are delivered, and helping digital teams prioritize by connecting technical performance to business impact.
Today, artificial intelligence (AI) and machine learning (ML) continue to make the headlines, and at AppDynamics, we believe artificial intelligence for IT operations (AIOps) can help IT teams deal with the tsunami of data generated these days, which far exceeds what humans can handle. It’s not about replacing jobs through automation. It’s about providing teams with the right information when and where it’s needed to help them make smart, informed decisions in real time. In fact, according to a report from McKinsey, most companies report measurable benefits from AI where it has been deployed, and expect only a minimal effect on head count. Instead of being concerned that machines will take over their jobs, IT professionals want more AI-driven insight.
In the AppDynamics Agents of Transformation Report 2020, 74% of IT professionals said they want to use monitoring and analytics tools proactively to detect emerging performance issues, optimize user experience, and drive business outcomes such as revenue and conversion. That’s why, for the past few years, we’ve been hard at work to meet these demands with solutions such as Cognition Engine, which complements our observability platform with AIOps capabilities.
A multi-faceted approach to AIOps
The long-term goal we most often hear from customers is to ultimately achieve self-healing and to be able to pilot their applications and supporting infrastructures and services to maximize their business outcomes. This will typically be a multi-step, multi-year journey:
- Data ingestion & aggregation across apps on public, private, and hybrid clouds
- Predictive Analytics and incident management
- Business Journey driven problem resolution
- Extensible Data collection: Our approach is to deliver exhaustive, end-to-end visibility into all technology stacks, wherever they are. AppDynamics auto-discovers any changes and updates to your environments and their dependencies up and down the stack. After all, AI and ML can’t shine and deliver on their theoretical potential without access to large volumes of high-quality data. That’s why we’re expanding our already broad set of data sources with open standards such as OpenTelemetry and Prometheus. The recent ThousandEyes acquisition by Cisco will deliver key synergies to further understand the behavior and performance of your WAN networks, your interconnections with public clouds, and the quality of service of your ISPs, and will provide unparalleled visibility into your SaaS applications such as Office 365, Microsoft Teams, Cisco Webex and Salesforce.
- Unified Data platform: We developed a new unified data model, data repository and query engines to unlock flexible data slicing and dicing for our AI and ML engines, and to provide the insight and visibility that everyone, from technologists to business leaders, needs to run a modern enterprise. The new data model is highly flexible and retains the full auto-discovered topology and dependency information. The new unified data repository spans storage, network, compute and application, and relates that data to the digital experiences delivered and their corresponding business impact.
- Explainable AI: To build trust with users and improve its accuracy over time for your specific contexts, AppDynamics Cognition Engine takes a different approach from other offerings in the market. Our Explainable AI exposes its conclusions, the evidence that led to them, and the recommended next steps. Users can then upvote or downvote Cognition Engine’s reasoning to help it adapt over time to specific contexts.
- Contextual AI: To maximize the value of its insights, Cognition Engine relies heavily on the high-quality, transaction-based data we generate or ingest. We do not base our approach on unstructured data coming from many third-party solutions, but on a unified data model with full awareness of topologies and dependencies, and on domain expertise built over the last decade. Some AIOps solutions sit atop third-party monitoring data; they usually require time-consuming model training and can surface erroneous root causes. For instance, if a server faces issues right before or at the same time as your application, that doesn’t mean the server is to blame, especially if it doesn’t run any code involved in the faulty transaction. That dependency knowledge, along with further context on the transactions under study, persists through all stages of Cognition Engine.
Cognition Engine inner workings
AI has not yet reached the maturity we see in Hollywood movies: there is no super AI that ingests all kinds of data, understands it, and makes sense of it. That’s why Cognition Engine is based on a collection of specialized AI components working in unison.
Automated Anomaly Detection (AD) continually learns what normal looks like for your specific applications and context, then identifies abnormal behaviors. It beats humans at detecting weak signals in the sea of data, so teams can act before problems arise and impact customers and the business.
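To make the idea of "learning what normal looks like" concrete, here is a minimal sketch of baseline-driven anomaly detection. It is not how Cognition Engine is implemented; it simply assumes a rolling mean and standard deviation over recent samples of one metric, and the class name `BaselineDetector` and its parameters are hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class BaselineDetector:
    """Illustrative only: learns a rolling baseline for one metric."""

    def __init__(self, window=30, threshold=3.0, min_samples=10):
        self.samples = deque(maxlen=window)  # recent "normal" values
        self.threshold = threshold           # allowed deviation, in std devs
        self.min_samples = min_samples       # warm-up before flagging anything

    def observe(self, value):
        """Return True if value deviates from the learned baseline."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            if sigma > 0:
                anomalous = abs(value - mu) > self.threshold * sigma
            else:
                # A perfectly flat baseline: any change is a deviation.
                anomalous = value != mu
        # Only fold normal points into the baseline, so an ongoing
        # incident does not become the new "normal".
        if not anomalous:
            self.samples.append(value)
        return anomalous


if __name__ == "__main__":
    detector = BaselineDetector()
    response_times_ms = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99, 100, 102, 400]
    for t in response_times_ms:
        if detector.observe(t):
            print(f"anomaly: {t} ms")
```

A production system would also account for seasonality and trends; the fixed window here is a deliberate simplification.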
Automated Root Cause Analysis (RCA) comes after Anomaly Detection to investigate further. It navigates the dependency tree (i.e., which infrastructure supports which application for which user, and the resulting business impact) to identify the problematic areas (compute, network, storage, or code) and accelerate mean time to resolution (MTTR). To help teams further, we recently released Automated Transaction Diagnostics (ATD), which analyzes all instances of collected telemetry to surface why instances of a given digital service fail.
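The dependency-aware reasoning described above can be sketched in a few lines. This is an illustrative simplification, not the Cognition Engine algorithm: it assumes the topology is available as a plain mapping from each component to the components it depends on, and the function names and sample component names are hypothetical.

```python
def reachable(dependencies, start):
    """Return every component reachable from start, including start."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dependencies.get(node, []))
    return seen


def root_cause_candidates(dependencies, anomalous, transaction):
    """Return anomalous components that serve the transaction and do not
    themselves depend on another anomalous component."""
    involved = reachable(dependencies, transaction)
    # Ignore anomalies on components the transaction never touches.
    suspects = involved & set(anomalous)
    candidates = []
    for suspect in suspects:
        downstream = reachable(dependencies, suspect) - {suspect}
        if not (downstream & suspects):
            candidates.append(suspect)
    return sorted(candidates)


if __name__ == "__main__":
    deps = {
        "checkout": ["web-tier"],
        "web-tier": ["payment-svc", "host-a"],
        "payment-svc": ["orders-db", "host-b"],
        "orders-db": ["host-c"],
    }
    # host-z is misbehaving too, but it runs nothing used by checkout.
    anomalous = {"web-tier", "payment-svc", "orders-db", "host-z"}
    print(root_cause_candidates(deps, anomalous, "checkout"))
```

In this example the unrelated `host-z` is discarded, and the upstream tiers are treated as symptoms of the deepest anomalous dependency, `orders-db`.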
Experience Journey Map then delivers a customer-centric view of their digital journeys and the supporting technology health information to understand its influence. Application owners and IT operations professionals lack visual insight into changing user patterns, where these changes are occurring, and what’s causing them. Using advanced data science algorithms, Experience Journey Map updates continuously, deploying self-learning capabilities to discover changes in user journey patterns and alerting you before they become major problems.
Field tested results
Alaska Airlines’ e-commerce division leveraged AppDynamics Cognition Engine. Nemo Hajiyusuf, software engineering manager, reported: “We were able to reduce the number of outages by 60% [in 2017], and we continue to sustain that. Our mean time to detection went from hours, where customer care was calling us to say customers were having issues, to less than 10 minutes.”
Automated actions further extend AIOps benefits
Having great visibility and deep insights into your whole technology stack, the digital experience delivered to your users, and the business impact helps technologists address many of today’s challenges. At AppDynamics, we believe we can also leverage this information to automate actions.
For instance, AppDynamics integrates with Cisco Intersight Workload Optimizer (IWO) to automatically right-size applications’ supporting infrastructure and workload placement based on an application’s health, optimizing for both performance and costs. AppDynamics has long provided AIOps capabilities to its customers via Cognition Engine, and we’ll continue to add many more to meet the ever-changing needs of our customers. We’re thrilled that this market was recently acknowledged by Forrester in its new report, Now Tech: Artificial Intelligence For IT Operations, Q4 2020.
Referenced in this post
(1) Market Guide for AIOps Platforms, 7 November 2019, Analysts: Charley Rich, Pankaj Prasad, Sanjit Ganguli