Server Monitoring – Think You’re Covered? You’re not!

Server monitoring is a foundational component of any data center monitoring architecture, but it has become a crutch and a deterrent to building out a holistic monitoring platform. Servers exist to run applications, and you will never properly monitor applications with server monitoring alone. In this post we will explore what server monitoring is, some of the available tools, their benefits and drawbacks, and how to move to the next level.

What is server monitoring?

Server monitoring consists of monitoring operating system and associated hardware metrics. It’s the view of the world from the perspective of the server, never from inside the running processes. It covers the basics of CPU, memory, disk, and I/O, but those are simplified categories for the real underlying metrics such as CPU sys time, CPU wait time, used memory, free memory, disk queue length, % disk used, network collisions, adapter transmit rate, etc. Server monitoring is used by every IT organization in some shape or form.
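To make those metrics concrete, here is a minimal Python sketch that turns one line of Linux vmstat output into named values. The sample text and its numbers are invented for illustration, and parse_vmstat is a hypothetical helper, not part of any monitoring tool. The column names follow the standard Linux vmstat header.

```python
# Illustrative sample only: the numbers below are made up, not real measurements.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 812340  94212 1532408    0    0    12    34  210  480 23  5 70  2  0
"""


def parse_vmstat(text):
    """Map the column names on the second header line to the last sample line."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[1].split()   # r, b, swpd, free, ... us, sy, id, wa, st
    values = lines[-1].split()  # most recent sample
    return dict(zip(header, (int(value) for value in values)))


metrics = parse_vmstat(SAMPLE_VMSTAT)
cpu_busy = metrics["us"] + metrics["sy"]  # user + system CPU percentage
print(f"CPU busy: {cpu_busy}%, I/O wait: {metrics['wa']}%, free memory: {metrics['free']} KB")
```

Every key in the resulting dictionary describes the machine. None of them names a request, a transaction, or a line of application code.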

What are some server monitoring tools?

When I was a Unix system administrator I used tools like sar, vmstat, nmon, top, topas, and netstat to monitor my servers in real time. In the Windows world I used perfmon for my real-time monitoring needs. I also had other tools at my disposal for alerting and keeping historical metrics. Some popular examples of those tools are BMC Patrol, HP OpenView, Nagios, Zenoss, Cacti, Zabbix, Ganglia, GroundWork, and Hyperic.

All of these tools are useful. All of these tools also fall woefully short of achieving the goal of minimizing application downtime and maximizing application performance.

vmstat

vmstat output from a Linux server. Is there any problem with the running application?

What’s the problem?

The problem is that none of the server monitoring tools are capable of knowing how your applications are performing. Some of them can probe your application to see whether it is available, but none can tell you why your application has ceased to function. No server monitoring tool can tell you any of the following:

  • What is the response time of every request to my application?
  • What components of my application are involved in any of my transactions and where is the slowdown?
  • How does the application code execute at run time?
  • What part of the application code is slow?
  • What application functionality is used, how often, and how does it perform?
  • What application functionality is throwing exceptions and what are they?
  • Did a slow external service call impact my application response time and by how much?
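Answering questions like these requires measurements taken inside the application process. The following is a minimal Python sketch of that idea, not how any particular APM product works: a decorator records the duration of every call and the type of any exception it raises. The names instrumented, CALLS, checkout, and summarize are hypothetical and exist only for this example.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: function name -> list of (seconds, exception name or None).
CALLS = defaultdict(list)


def instrumented(func):
    """Record the duration and any exception type for every call to func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = type(exc).__name__
            raise  # the caller still sees the original exception
        finally:
            CALLS[func.__name__].append((time.perf_counter() - start, error))
    return wrapper


@instrumented
def checkout(cart_total):
    """Stand-in for a piece of application functionality."""
    if cart_total < 0:
        raise ValueError("negative total")
    return round(cart_total * 1.08, 2)


def summarize(name):
    """Answer: how often was it called, how slow was it, what did it throw?"""
    records = CALLS[name]
    durations = [seconds for seconds, _ in records]
    errors = [err for _, err in records if err]
    return {"calls": len(records), "slowest": max(durations), "errors": errors}


checkout(10.0)
try:
    checkout(-1)
except ValueError:
    pass
report = summarize("checkout")
print(report)
```

The report is keyed by application function rather than by host, which is exactly the perspective that operating system metrics cannot provide.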

Without answering those fundamental questions you don’t stand a chance of restoring application service in minutes instead of hours or days. Jonah Kowall of Gartner recently wrote a post titled “Got Nagios? Get Rid of It.” I’d suggest you take a moment to read it when you can.

nagios graph explorer

Nagios HTTP check response times. Is the application experiencing problems as a whole? Problems with individual functions? Are there application errors?

nagios performance charts

Nagios server monitoring charts. What does this tell us about our application?

What’s the solution?

Many companies have turned to log monitoring and analytics as a solution to this problem, but as my colleague Dustin Whittle explained, “If all you have is logs, you are doing it wrong!” Log file monitoring is nice to have, but it also can’t answer many of the questions posed in the above section without a lot of customization. The best solution to the problem at hand is to use the latest generation of Application Performance Monitoring (APM) tools. APM tools understand the inner workings of your applications. They can see the code executing, the entry and exit calls to the application, the transactions flowing through and across multiple application components, exceptions and their associated impact, and much, much more.
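The idea of following a transaction through its entry and exit calls can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the mechanism of any real APM agent: the Trace class, its span method, and the step names are all hypothetical, and time.sleep stands in for real work.

```python
import time
from contextlib import contextmanager


class Trace:
    """Collects nested timed spans for one transaction (hypothetical helper)."""

    def __init__(self):
        self.spans = []  # entries of (depth, name, seconds), in entry order
        self._depth = 0

    @contextmanager
    def span(self, name):
        depth = self._depth
        self._depth += 1
        index = len(self.spans)
        self.spans.append(None)  # reserve the slot so parents precede children
        start = time.perf_counter()
        try:
            yield
        finally:
            self._depth -= 1
            self.spans[index] = (depth, name, time.perf_counter() - start)

    def render(self):
        """Indent each span by its depth to form a simple call graph."""
        return "\n".join(
            f"{'  ' * depth}{name}: {seconds * 1000:.1f} ms"
            for depth, name, seconds in self.spans
        )


trace = Trace()
with trace.span("GET /checkout"):            # entry point of the transaction
    with trace.span("load_cart"):
        time.sleep(0.01)                     # stand-in for local work
    with trace.span("call payment service"):
        time.sleep(0.02)                     # stand-in for a remote call
print(trace.render())
```

Because each span records its own elapsed time, the rendered output shows which step inside the transaction accounts for most of the total.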

Application Flow Map

Dynamic application flow map showing all application components.

Business Transactions

Business transactions automatically detected, tracked, and classified.

Call Graph

Call graph of a single business transaction with all methods, timing, and remote calls.

What’s the impact?

Ultimately APM products offer a tremendous amount of value. Here are just a few of the benefits of using APM:

  • Reducing MTTR from hours/days to minutes.
  • Faster development due to less time tracking down bugs.
  • Fewer bugs released because they are easier to identify and remediate.
  • Faster QA cycle due to rapid problem detection, isolation, and resolution.
  • More stable production environment due to better development and QA.

If you’re still using server monitoring and log analysis to try to figure out your application problems, you’re wasting valuable time and doing a disservice to the business you support. Server monitoring and log analytics complement a good APM tool and strategy; they do not replace one. There are still too many organizations today that are doing the bare minimum of server monitoring. Do yourself and your business a favor and start down the path of a holistic monitoring strategy by trying AppDynamics for free today.

  • Scott

    Jim, APM tools are definitely needed and instrument what is happening in the App Server via transaction tracing and other technologies. Some APM vendors offer really powerful end user experience monitoring as well, but most do not offer the depth of infrastructure monitoring (you referred to it as server monitoring) that Network Operations or IT Operations needs on a daily basis. One of the products you mentioned, Zenoss, offers a paid-for component called Service Impact that discovers the dependencies between technology components and identifies what applications and services are impacted by various technology faults or performance issues. Zenoss Service Dynamics includes comprehensive resource discovery and management (compute, storage, network, virtual infrastructure, cloud-based infrastructure) and is not at all a “server monitoring solution”. Ultimately, I think both capabilities are needed. APM is needed to ensure application performance and gauge end user experience, and is often a must-have for DevOps and Application Owners. On the other hand, Service-centric infrastructure monitoring is critical as well but more applicable to network operations and IT Operations in general. Zenoss, for one, offers comprehensive infrastructure monitoring for IT Operations with the added benefit of real-time Service Model Dependency discovery and management, service alerts, and both Service-level and infrastructure-level analytics as well. I think there is a “both-and” rather than an “either-or” reality for modern Hybrid IT organizations.

    • Jim Hirschauer

      Hi Scott, thanks for your comment. In the spirit of full disclosure you probably should have mentioned that you are a Zenoss employee somewhere in your comment but I’m sure it was just an oversight.

      I absolutely agree with your point that organizations need both server monitoring and APM. That is why I mentioned a holistic monitoring approach. Many server monitoring (or infrastructure monitoring) tools have tried to provide some sort of view into application health but there is currently no way to provide true application visibility (transaction tracing, deep code diagnostics, etc) without having an application specific agent running with the application code itself. There is no way I would ever consider any of the tools I listed in my post as a viable substitute for APM but they are certainly complementary.

      By the way, I actually like Zenoss as an infrastructure monitoring tool. Thanks again for your comment.

      • Scott

        Hi Jim. Yeah I figured it was visible via my LinkedIn profile… Not trying to hide! :) I recently joined Zenoss after a stint at Quest Software.

        At Quest Software (now part of Dell), I was responsible for product marketing for Foglight at one time. We had an APM offering as well as an infrastructure monitoring offering. These were sold by different sales teams to different buyers!

        I have been in performance and security management just about all of my career. What I find interesting is that our large enterprise IT Operations buyers are looking to find something that works more easily than the traditional Big-4 frameworks, especially for UCS, VMware, Flexpod, vBlock infrastructure. They tell me they want to know when something fails or when there is a performance exception/issue, what application or service is impacted. They really aren’t looking for deep APM transaction level tracing. They just want to know what application or service is impacted and to what extent. So some lightweight APM is useful for the IT Operations folks, but from my experience the kind of APM that AppDynamics delivers is really purchased/used by DevOps and Application teams.

        Good chatting with you….

  • http://www.neteffects.com.au/ Server Management

    There is a lot of server monitoring software available on the market. At the end of the day you have to find the one that suits your company’s needs: a server monitoring tool that is reliable, transparent, and delivers efficient results.

Copyright © 2014 AppDynamics. All rights Reserved.