
What is mean time to repair (MTTR) & other incident metrics?

Accurately assess the reliability of your system and its components as well as the efficiency of your team’s incident management capabilities by measuring mean time to repair (MTTR).

What is MTTR

Mean time to repair (MTTR) measures the average time from when an issue is initially detected to the moment the component or system's functionality is fully restored. MTTR is a useful metric to assess the maintainability of an application or infrastructure, the lifecycle costs of equipment, and the efficiency of an organization's DevOps team.

Components or systems that can be repaired quickly will have a low MTTR, and the associated outages are likely to have less impact on business outcomes. A high MTTR can result in significant unplanned downtime and may have a negative impact on the overall user experience.

Measuring diagnostic time, repair time, testing and other activities that relate to identifying and mitigating performance issues can provide essential clues to your team's incident management capabilities and may highlight potential areas of improvement that can help optimize your application, infrastructure, and workflow.

How to calculate MTTR

MTTR is a key performance indicator and a critical component of developing an agile and dynamic DevOps strategy.

In the past, MTTR mostly referred to hardware, and IT teams used a combination of redundancy and replacing devices prior to the predicted end of their lifecycle to proactively avoid system failures.

The adoption of cloud computing has shifted much of the responsibility for maintenance and performance to Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) providers, with the acceptable MTTR often negotiated as part of service level agreements (SLAs). As a result, DevOps teams can often focus solely on debugging their own applications or on-premises equipment.

A basic measure of MTTR divides the total time a service was unavailable in a given period by the number of incidents within that period.
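As a rough illustration of that division, here is a minimal Python sketch. The function name `mean_time_to_repair` and the incident durations are hypothetical, chosen only to show the arithmetic.

```python
def mean_time_to_repair(total_downtime_minutes: float, incident_count: int) -> float:
    """Basic MTTR: total downtime in a period divided by the number of incidents."""
    if incident_count <= 0:
        raise ValueError("MTTR is undefined without at least one incident")
    return total_downtime_minutes / incident_count


# Hypothetical month with three incidents lasting 30, 45, and 105 minutes.
downtime_minutes = [30, 45, 105]
mttr = mean_time_to_repair(sum(downtime_minutes), len(downtime_minutes))
print(f"MTTR: {mttr:.0f} minutes")  # MTTR: 60 minutes
```

The guard against a zero incident count matters in practice: a period with no incidents has no meaningful MTTR, and reporting it as zero would be misleading.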

Tracking the total time between when a support ticket is created and when it is closed or resolved is an effective method for obtaining an average MTTR metric.
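The ticket-based approach could be sketched as follows, assuming each ticket is exported as a pair of created and resolved timestamps. The `mttr_from_tickets` helper and the sample tickets are illustrative assumptions, not the export format of any particular ticketing system.

```python
from datetime import datetime, timedelta


def mttr_from_tickets(tickets):
    """Average the created-to-resolved duration over resolved tickets.

    `tickets` is a list of (created, resolved) datetime pairs; a ticket
    that is still open has `resolved` set to None and is skipped.
    """
    durations = [resolved - created
                 for created, resolved in tickets
                 if resolved is not None]
    if not durations:
        raise ValueError("no resolved tickets to average")
    return sum(durations, timedelta()) / len(durations)


# Hypothetical tickets: 40, 120, and 20 minutes to resolve, plus one still open.
tickets = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 40)),
    (datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 16, 0)),
    (datetime(2024, 1, 21, 22, 30), datetime(2024, 1, 21, 22, 50)),
    (datetime(2024, 1, 28, 8, 0), None),
]
average = mttr_from_tickets(tickets)
print(average)  # 1:00:00
```

Skipping open tickets is a design choice in this sketch. It keeps unresolved incidents from distorting the average, but it also means a long-running open incident will not show up in the metric until it is closed.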

Establishing a baseline for identifying and resolving performance issues and working continuously to improve upon that number results in reduced costs, improved reliability, and increased customer satisfaction.

MTTR vs MTTF vs MTBF

Failure metrics are valuable KPIs that allow organizations to track the reliability of their systems. "Failure" doesn't necessarily indicate a complete outage, but can also represent general functionality issues or degradation. Other important failure metrics to be aware of include:

Mean time to recovery

MTTR can also stand for mean time to recovery, resolve, or resolution, and in that sense is the average length of time between when a problem arises and when it is solved.

Mean time to failure

MTTF represents the average duration of a system or component's overall lifecycle and refers to items that are not repairable. There is no need to calculate repair times when an item requires replacement.
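For non-repairable items, MTTF can be estimated as the average observed lifetime across a set of units. The sketch below assumes lifetimes are recorded in hours; the function name and the drive lifetimes are hypothetical.

```python
def mean_time_to_failure(lifetimes_hours):
    """Average lifetime of non-repairable units, in hours."""
    if not lifetimes_hours:
        raise ValueError("at least one observed lifetime is required")
    return sum(lifetimes_hours) / len(lifetimes_hours)


# Hypothetical lifetimes of three disk drives that were replaced on failure.
drive_lifetimes_hours = [30_000, 42_000, 36_000]
mttf = mean_time_to_failure(drive_lifetimes_hours)
print(f"MTTF: {mttf:,.0f} hours")  # MTTF: 36,000 hours
```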

Mean time between failures

MTBF denotes the average operational time between failures and is used to forecast the availability of systems and components. MTBF is calculated by dividing the total operational time in a period by the number of failures in that period.
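A minimal sketch of how MTBF might feed an availability estimate is shown below. It assumes the commonly used steady-state approximation, availability = MTBF / (MTBF + MTTR), which is not stated in this article, and all names and figures are hypothetical.

```python
def mean_time_between_failures(operational_hours: float, failure_count: int) -> float:
    """MTBF: total operational (up) time divided by the number of failures."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operational_hours / failure_count


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Assumed steady-state approximation: uptime share of each failure cycle."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical 30-day period (720 hours) with 2 failures and 4 hours of downtime.
period_hours = 720
downtime_hours = 4
failures = 2

mtbf = mean_time_between_failures(period_hours - downtime_hours, failures)
mttr = downtime_hours / failures
estimate = availability(mtbf, mttr)
print(f"MTBF: {mtbf:.0f} h, MTTR: {mttr:.0f} h, availability: {estimate:.2%}")
# MTBF: 358 h, MTTR: 2 h, availability: 99.44%
```

Note that the operational time excludes downtime, so MTBF and MTTR together account for the full period.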

Collecting data-based evidence of when failures may occur and what the potential impact may be is crucial to effectively managing, monitoring, and mitigating performance issues.

How AppDynamics helps reduce MTTR

AppDynamics helps your organization establish a center of monitoring excellence for efficient and effective performance monitoring and application management. Harness the power of artificial intelligence and machine learning with the Cognition Engine to guide root cause analysis and improve productivity while reducing MTTR, SLA breaches, and system downtime.

Hear from our customers

"The ability to trace a transaction visually and intuitively through the interface was a major benefit. This visibility was especially valuable when Nasdaq was migrating a platform from its internal infrastructure to the AWS Cloud."

Heather Abbott, SVP Corporate Solutions Technology, Nasdaq