As more enterprises distribute applications not only between data centers, but also across data centers and multiple clouds, the application footprint is growing in size and complexity. And with companies increasingly relying on better end-to-end performance as a key requirement for business success, performance implications for these highly distributed and scalable applications are greater than ever.
Indeed, application performance in today’s hyper-connected social world directly impacts a business’s brand, revenues and customer stickiness. Application performance monitoring (APM) is critically important, of course, but APM can provide far better results when application and business performance metrics are leveraged to program the underlying network policy. The end result can be application-driven, end-to-end control that’s highly effective regardless of the underlying network/cloud infrastructure.
In this blog—the first in a series—we’ll examine the pain points associated with the lack of application and network correlation, and discuss the benefits of APM when correlated with underlying network visibility and monitoring. We’ll explore how business and application performance metrics and policy, when correlated with underlying network information, can provide the fastest root cause analysis (RCA). We’ll also look at how this integration between application and network performance can reduce the risk of unexpected application outages, simplify application deployment, and boost trust and understanding across teams. These benefits ultimately will lead to better customer experiences and business outcomes for critical application and business transactions.
The Benefits of Modern Apps
Most applications developed in recent years are highly distributed from the ground up. Traditional client-server models have given way to containerized, virtualized, distributed apps built using state-of-the-art frameworks, technologies and specialized third-party services. A modern app may even be written as a wrapper/enclosure for a legacy application in application-modernization projects. And the use of agile DevOps methods to develop and operate these apps can mean frequent rollouts and changes to production environments.
Modern apps are growing in complexity and scale. They’re capable of running in multiple environments and are accessible via myriad devices, including PCs, mobile gadgets and IoT endpoints. These apps traverse a variety of networks, from traditional data centers to multiple WAN links to the cloud. Within the data center (DC), whether a private DC or a public cloud colo facility, the size and complexity of the underlying network are growing to support modern application deployment models and to scale as needed. All of this is driving the need for faster root-cause identification of problems.
But Modern Apps Can Bring Pain, Too
In contrast to the growing complexity of modern-app deployment, the end-user experience requires great simplicity. Complicating matters is the fact that users demand flawless app performance 24/7. Unsurprisingly, many pain points are associated with achieving this goal.
Let’s examine the traditional network issues that adversely affect application performance, since understanding them is critical to finding the root cause faster. Application slowdowns or failures lead to a poor end-user experience, and they can be caused by a number of network-related issues, including:
- Incorrect network configuration for the application’s needs; something as simple as the duplex or speed of a switch port can cause big problems.
- Firewall or load balancer misconfiguration that blocks traffic for a particular application component.
- Improper permissions that block good traffic from accessing an application service or, conversely, allow bad traffic to access an app component or service.
- Packet loss due to overwhelming load on a network device, insufficient bandwidth, or other factors.
- Packet loops or extra, inefficient hops in the network.
- Network policies that inadvertently impact application performance, such as those discussed below.
A large portion of modern enterprise application traffic can be classified as east-west—in a datacenter environment, that’s traffic moving between application servers, databases, firewalls, load balancers and enterprise storage devices. Some network issues are unique to modern data centers and can adversely impact both application performance and the end-user experience. Examples include:
- Incorrect mapping of application requirements (policy) to the underlying switch fabric/ports.
- Incorrect switch configuration, causing fabric loops for data between systems, or incorrect drops.
- Wrong or outdated storage access policy or configuration.
- Inefficient virtual machine-to-physical port configuration, i.e., wrong virtual-to-physical (v-to-p) or physical-to-virtual (p-to-v) mappings.
- Cabling issues on top-of-rack (ToR) or end-of-row (EoR) switches.
- Inefficient or wrong power budget, and other factors.
Cloud-related network issues can also impact app performance. These include incorrectly configured virtual private gateways and security groups, insufficient virtual router capacity, and incorrect gateway settings for the traditional DC and the cloud DC.
The Problem with IT Silos
Application outages and slowdowns are often technological in nature, although many are exacerbated by organizational issues. Most IT organizations evolve from silo-based org structures and skill sets, including app ops, data center network, wide area network, security, desktops, cloud, and so on. In many cases, these siloed organizations don’t communicate or work well together.
Furthermore, these silos often use their own set of tools for performance monitoring and troubleshooting—different tools for network monitoring of routers, switches, firewalls and load-balancers, for instance. And while these tools may do a decent job of detecting problems, they solve siloed problems for their respective domains.
Another issue is that these tools don’t provide cross-domain correlation, nor are they able to map application slowdowns to specific network issues. And while some tools attempt to do this, they don’t map from business transactions (how an end user interacts with or uses the application) all the way through the network without extensive war-room involvement.
In production environments (where there is tremendous pressure from the business), these balkanized orgs and tools focus on silo-specific, “not-my-problem” outcomes that fail to resolve end-user or customer problems. This phenomenon, known as mean-time-to-innocence (MTTI), saps time, effort and energy from companies, resulting in a loss of productivity and customer stickiness.
How the Integration of Network, APM and Troubleshooting Brings Value to Ops Teams
The ability to see application performance issues in near-real time, correlated to underlying network performance, is exceptionally valuable. Mapping application changes and policies to underlying data center policy can go a long way toward driving efficiencies inside an organization, as more than three-fourths of data center traffic is east-west, according to Cisco’s Global Cloud Index.
The ability to dynamically discover application topology, as well as proactively identify application performance bottlenecks all the way down to a specific data center or network segment, can prove very beneficial to an organization.
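To make the idea of tracing an application slowdown down to a network segment more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular APM or network product works: the record shapes (`TransactionSample`, `SegmentSample`), the segment names, and the thresholds are all assumptions we made up for the example. It flags a segment as suspect when a slow business transaction traversed it during a minute in which that segment also reported elevated packet loss.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionSample:
    """One APM measurement (hypothetical shape)."""
    name: str            # business transaction, e.g. "checkout"
    minute: int          # time bucket the sample falls into
    response_ms: float   # end-to-end response time
    segments: tuple      # network segments the request traversed


@dataclass(frozen=True)
class SegmentSample:
    """One network measurement (hypothetical shape)."""
    segment: str
    minute: int
    packet_loss_pct: float


def suspect_segments(transactions, segment_samples, slow_ms=1000.0, loss_pct=1.0):
    """Map each lossy segment to the slow transactions that crossed it
    in the same minute. Thresholds are illustrative assumptions."""
    lossy = {
        (s.segment, s.minute)
        for s in segment_samples
        if s.packet_loss_pct >= loss_pct
    }
    suspects = defaultdict(set)
    for t in transactions:
        if t.response_ms < slow_ms:
            continue  # only slow transactions are interesting here
        for segment in t.segments:
            if (segment, t.minute) in lossy:
                suspects[segment].add(t.name)
    return {segment: sorted(names) for segment, names in suspects.items()}


# Made-up sample data for illustration.
transactions = [
    TransactionSample("checkout", 10, 2400.0, ("wan-1", "dc1-leaf-3")),
    TransactionSample("checkout", 11, 310.0, ("wan-1", "dc1-leaf-3")),
    TransactionSample("search", 10, 1800.0, ("wan-1", "dc1-leaf-7")),
]
segment_samples = [
    SegmentSample("dc1-leaf-3", 10, 4.2),
    SegmentSample("wan-1", 10, 0.1),
    SegmentSample("dc1-leaf-7", 10, 0.0),
]

print(suspect_segments(transactions, segment_samples))
```

With this sample data, only the slow "checkout" transaction overlaps with packet loss, so the sketch points at `dc1-leaf-3`. The slow "search" transaction has no lossy segment on its path, which suggests looking outside the network. A real correlation would need far richer signals than a single loss threshold, but the shape of the join (business transaction, time window, network path) is the point.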
This integration of network, APM and troubleshooting offers many benefits. Some key ones are:
- Fastest app-to-network root cause analysis: Fast and flexible mapping of application changes to the underlying DC network. By mapping application policy to underlying network policy, network ops teams can receive application-driven information quickly. This increases productivity by avoiding war-room scenarios, and is by far the biggest benefit in modern networks and data centers where enterprise apps are deployed.
- Reduced risk of unexpected application outages: When app ops can provide proactive alerts to network ops on specific network or data center slowdowns involving an application or business transaction, network ops can focus on the root cause to prevent further performance degradation and/or outages.
- Simplified application deployment: The ability to generate network policy based on application topology (the whitelist model) helps simplify app deployment.
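As a rough illustration of the whitelist model mentioned in the last item, the Python sketch below derives allow rules from an observed application topology and ends with a default deny. Everything here is hypothetical: the `Flow` record, the tier names, the ports, and the rule format are assumptions chosen for the example and do not correspond to any vendor's policy syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One observed dependency in the application topology (hypothetical shape)."""
    source: str
    destination: str
    port: int
    protocol: str = "tcp"


def build_allow_list(flows):
    """Turn observed flows into explicit allow rules plus a final default deny."""
    unique = sorted({(f.source, f.destination, f.protocol, f.port) for f in flows})
    rules = [
        {"action": "allow", "src": src, "dst": dst, "protocol": proto, "port": port}
        for src, dst, proto, port in unique
    ]
    # Whitelist model: whatever is not explicitly allowed is denied.
    rules.append(
        {"action": "deny", "src": "any", "dst": "any", "protocol": "any", "port": "any"}
    )
    return rules


def is_permitted(rules, flow):
    """A flow is permitted only if an explicit allow rule matches it exactly."""
    wanted = (flow.source, flow.destination, flow.protocol, flow.port)
    return any(
        rule["action"] == "allow"
        and (rule["src"], rule["dst"], rule["protocol"], rule["port"]) == wanted
        for rule in rules
    )


# Made-up three-tier topology; the duplicate flow collapses into one rule.
observed = [
    Flow("web", "app", 8080),
    Flow("app", "db", 5432),
    Flow("web", "app", 8080),
]
policy = build_allow_list(observed)

for rule in policy:
    print(rule)
print(is_permitted(policy, Flow("web", "db", 5432)))  # web never talks to db directly
```

Because the rules come straight from the discovered topology, a new application tier only needs its dependencies observed or declared once. The direct web-to-database flow in the last line is rejected because nothing in the topology calls for it.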
Finally, from an organizational perspective, correlated views can reduce mean-time-to-innocence. This helps app ops work better with network ops when reporting slowdowns to the business. A common dashboard with important KPIs makes this effort a lot easier. This cooperation not only promotes trust between app ops/DevOps and network ops teams, but also provides a better operational view for the business.
A Major Win for App Ops, Network Ops, and the Business
The correlation of application performance metrics—from business transaction and end-user experience all the way through the underlying network—is critical for business and operational excellence. This shared view of application and network performance delivers key benefits such as reduced mean-time-to-innocence, better cross-team collaboration, and a simplified operational business model. Having an app-centric and business-level view of underlying network performance bottlenecks leads to greater customer satisfaction overall.
Schedule a demo to learn how AppDynamics and Cisco are working together to bring this visibility to enterprises everywhere.