What Is Distributed Tracing?

Improve visibility and quickly diagnose bottlenecks by tracking requests as they move through complex software architectures.

What is distributed tracing?

Distributed tracing is a method of observing requests as they advance through a distributed system. Its primary use is to profile and monitor modern applications built using microservices and/or cloud-native architectures, enabling developers to find performance issues.

With distributed tracing, developers can track a single request traversing through an entire system that is distributed across multiple applications, services, and databases.

By using a distributed tracing tool, you can collect data on each request and then present, analyze, and visualize that request in detail. These visual representations allow you to see each step (also known as a span) a request makes and how long each step takes. Developers can review this information to see where the system is experiencing blockages and latency and to determine the root cause. For example, a single request may pass back and forth through multiple microservices before it is fulfilled. Without a way of tracking the entire journey, there is no way to know exactly where the issues occur.
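As a rough illustration of the kind of view described above, the following sketch draws a text "waterfall" of the steps in one request and picks out the slowest one. The `Step` class, the service names, and all timings are hypothetical values invented for this example; a real tracing tool collects this data for you.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step (span) of a request. All values below are made up."""
    service: str
    start_ms: int      # offset from the start of the request
    duration_ms: int


def slowest_step(steps):
    """Return the step with the longest duration."""
    return max(steps, key=lambda s: s.duration_ms)


def render_waterfall(steps, scale_ms=10):
    """Draw one text bar per step, indented by its start time."""
    lines = []
    for s in sorted(steps, key=lambda s: s.start_ms):
        pad = " " * (s.start_ms // scale_ms)
        bar = "#" * max(1, s.duration_ms // scale_ms)
        lines.append(f"{s.service:<10}|{pad}{bar} {s.duration_ms} ms")
    return "\n".join(lines)


# Hypothetical steps of a single request, executed one after another.
steps = [
    Step("gateway", 0, 20),
    Step("auth", 20, 30),
    Step("orders", 50, 40),
    Step("database", 90, 120),
]

print(render_waterfall(steps))
print("slowest step:", slowest_step(steps).service)
```

In this made-up request the database call dominates the total time, which is the kind of observation a trace visualization is meant to surface.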

How distributed tracing works

Distributed tracing begins with a single request. Each request is considered a trace and receives a unique ID known as a trace ID to identify that specific transaction. Traces consist of a series of tagged time intervals called spans.

Spans represent the actual work being performed in a distributed system. Along with a name, timestamp, and optional metadata, each span also has a unique ID known as a span ID. Spans have parent-child relationships between each other that are used to show the exact path a transaction takes through the various components of an application.

When a request moves between services, each unit of work is recorded in its own span. When one service calls another, the new span (the child) stores the ID of the span that triggered it (the parent). Combining all these spans in the right order forms a single distributed trace that provides an overview of an entire request. Once a trace is complete, you can search for it in the presentation layer of a distributed tracing tool.
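The relationship between trace IDs, span IDs, and parent-child links can be sketched in a few lines. This is a minimal illustration, not how any particular tracing tool stores its data: the `Span` class, its field names, and the example services and timings are all assumptions made for this example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical span record: one timed unit of work within a trace."""
    trace_id: str                     # shared by every span in the trace
    name: str
    start_ms: int
    end_ms: int
    parent_id: Optional[str] = None   # None marks the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    tags: dict = field(default_factory=dict)


def build_tree(spans):
    """Group spans by parent ID; root spans are stored under the key None."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)
    return children


def render(children, parent_id=None, depth=0):
    """Return the trace as indented lines, parents before their children."""
    lines = []
    for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
        lines.append(f"{'  ' * depth}{span.name} ({span.end_ms - span.start_ms} ms)")
        lines.extend(render(children, span.span_id, depth + 1))
    return lines


trace_id = uuid.uuid4().hex
root = Span(trace_id, "GET /checkout", 0, 180)
cart = Span(trace_id, "cart-service", 10, 60, parent_id=root.span_id)
pay = Span(trace_id, "payment-service", 65, 170, parent_id=root.span_id)
db = Span(trace_id, "db query", 70, 150, parent_id=pay.span_id)

# Spans may be reported out of order; the parent IDs restore the structure.
lines = render(build_tree([db, pay, root, cart]))
print("\n".join(lines))
```

Because every span carries the same trace ID and a pointer to its parent, the full path of the request can be rebuilt regardless of the order in which services report their spans.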

Why do we need distributed tracing?

Without a way to track requests across different services, it's next to impossible to identify the service that is responsible for a performance issue. Distributed tracing provides a way to track a request from start to finish, making troubleshooting any issues faster and easier.

Modern software architectures provide many advantages to companies. While new practices and technologies like microservices, containers, and DevOps allow teams to manage and operate their individual services more easily, they also bring new challenges. One of the biggest concerns is reduced visibility and the increased difficulty of monitoring your entire IT infrastructure.

In modern applications, a single slow response may involve several microservices and serverless functions that are monitored by different teams.

This increased complexity has prompted companies to adjust their observability strategies to provide visibility of the entire request flow, not just services in isolation.

Distributed tracing provides observability for microservices

Request tracing is straightforward in a monolithic application. It aligns with application performance monitoring (APM) where a reporting tool organizes, processes, and creates visualizations of behavior from requests, helping to show how the system is performing. Developers can use these insights to quickly diagnose and resolve bottlenecks and other performance issues before they impact customer experience.

Traditional tracing is much more challenging in a distributed system consisting of multiple services. Microservices scale independently, creating many instances of the same function. With a monolithic application, you can trace a request through a specific function, but with microservices there could be numerous instances of that function running across different servers and data centers. Distributed tracing allows you to follow requests as they move through each service.
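Following a request across services depends on each service passing the trace context to the next one, typically in request headers. The sketch below uses the header layout from the W3C Trace Context recommendation (`traceparent: 00-<trace-id>-<parent-id>-<flags>`), which is one common choice and is not named in this article. The `handle_request` function and the service names are hypothetical, and no real network calls are made.

```python
import re
import secrets

# version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(headers, trace_id, span_id):
    """Write the current trace context into outgoing request headers."""
    headers["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return headers


def extract(headers):
    """Return (trace_id, parent_span_id) from incoming headers, or None."""
    match = TRACEPARENT.match(headers.get("traceparent", ""))
    return (match.group(1), match.group(2)) if match else None


def handle_request(headers, service_name):
    """Hypothetical handler: continue the caller's trace or start a new one."""
    context = extract(headers)
    trace_id, parent_id = context if context else (secrets.token_hex(16), None)
    span_id = secrets.token_hex(8)  # 16 hex characters
    record = {"service": service_name, "trace_id": trace_id,
              "span_id": span_id, "parent_id": parent_id}
    # Headers this service would attach to any downstream call it makes.
    outgoing = inject({}, trace_id, span_id)
    return record, outgoing


# Simulate one request passing through three services in turn.
first, headers = handle_request({}, "frontend")
second, headers = handle_request(headers, "orders")
third, _ = handle_request(headers, "inventory")

for record in (first, second, third):
    print(record)
```

Each service keeps the incoming trace ID and records the caller's span ID as its parent, so the three records can later be joined into one trace even though they were produced by independent processes.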

What is the difference between distributed tracing and logging?

The main difference between logging and distributed tracing is that logging provides records from a single application, while distributed tracing tracks requests traveling through multiple applications. Both methods help to find and debug issues by allowing you to monitor systems in real time and look back in time to analyze previous issues.

The rising use of microservices has introduced new complexity to software systems and by extension, system-monitoring practices. Metrics and logs lack the necessary visibility across all services to provide proper support for distributed systems.

Logs only provide insight into the state of a single application with specific time-stamped events that took place in the system. Application performance monitoring provides a more comprehensive way to find the root cause of performance issues. Most APM tools offer some form of distributed tracing while also providing detailed diagnostic data including code-level insights and queries.
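Logs and traces can complement each other when each log line carries the ID of the trace it belongs to, so that entries from different applications can be matched to the same request. The following is a minimal sketch using Python's standard `logging` module. The `TraceIdFilter` class, the `current_trace_id` variable, the logger name, and the example trace ID are assumptions made for illustration, not part of any specific APM product.

```python
import contextvars
import io
import logging

# Holds the trace ID of the request currently being handled (hypothetical).
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the active trace ID to every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True  # never drop the record


stream = io.StringIO()  # stands in for a log file or log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s trace=%(trace_id)s %(message)s"))
handler.addFilter(TraceIdFilter())

logger = logging.getLogger("orders")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example output in one place

# In a real service this would be set from the incoming request's trace context.
current_trace_id.set("4bf92f3577b34da6a3ce929d0e0e4736")
logger.info("order received")
logger.warning("inventory lookup slow")

print(stream.getvalue(), end="")
```

With the trace ID on every line, a log search for that ID returns the events from one request, which can then be read alongside the corresponding trace.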

Examples of distributed tracing tools

There are many options available for implementing distributed tracing including both open source and enterprise tracing tools. Here are some of the more popular tools:

OpenTracing

OpenTracing is a vendor-neutral API designed to help developers easily incorporate tracing into their code base. It is both a distributed tracing tool and a framework. Libraries written for the OpenTracing specification can be used with any system that is OpenTracing-compliant.

OpenCensus

Like OpenTracing, OpenCensus is both a tool and a framework. It provides observability for both microservices and monoliths using a common context propagation format. Originally created within Google, it provides a set of libraries for various languages that allow you to collect application metrics and distributed traces and then transfer the data to your backend. Developers can analyze this data to understand the state of the application.

OpenTelemetry

OpenTelemetry is the result of merging OpenTracing and OpenCensus, combining the strengths of both projects. It provides a way to gain insight into the status of applications, web servers, or other software in near real time.

Zipkin

Zipkin is an open source distributed tracing system developed by Twitter. It is written in Java, and it can use Cassandra or Elasticsearch as a scalable storage backend.

Reporting trace data to Zipkin requires instrumenting applications. This is usually done by configuring a tracer or instrumentation library. There are several ways to report data to Zipkin, including over HTTP, Apache Kafka, and Apache ActiveMQ. Users can follow the source code and any issues on GitHub.

Jaeger

Jaeger is a newer project from Uber that is hosted by the Cloud Native Computing Foundation (CNCF). It is written in Go and, like Zipkin, supports Cassandra and Elasticsearch as scalable storage backends. It is also compatible with the OpenTracing standard. Jaeger is lightweight, making it a good fit for highly elastic, containerized environments such as multi-tenant Kubernetes clusters.

How AppDynamics can help

AppDynamics application performance monitoring (APM) provides end-to-end monitoring for microservices architectures. This includes the ability to trace transactions across hundreds of microservice calls in production environments, allowing customers to track business transactions end-to-end to rapidly identify and resolve any issues.

Hear from our customers

"With AppDynamics, we gain better visibility into how microservices interface with the rest of the components of our application, the ability to proactively troubleshoot emerging issues, and the increased velocity to resolve issues faster than ever."

Nuno Pereira, CTO, iJET