The Real Cost of Logging

image_pdfimage_print

In order to manage today’s highly dynamic application environments, many organizations turn to their logging system for answers – but reliance on these systems may be having an undesired impact on the applications themselves.

The vast majority of organisations use some sort of logging system — it could log errors, traces, information messages or debugging information. Applications can write logs to a file, database, Windows event log, or big data store and there are many logging frameworks and practices being used.

Logging brings good insights about the application behavior, especially about failures. However, by being part of the application, logging also participates in the execution chain, which can have its disadvantages. While working with customers we often see the negative consequences when logging alone introduced the adverse impact to the application.

Most of the time the overhead of logging is negligible. It only matters when the application is under significant load — but these are the times when it matters the most. Think about Walmart or Best Buy during Black Friday and Cyber Monday. Online sales are particularly crucial for these retail companies during this period and this is the time when their applications are under most stress.

To better explain the logging overhead I created a lightweight .NET application that:

  1. implemented using ASP.NET
  2. performs lightweight processing
  3. has an exception built in
  4. exceptions are always handled within try…catch statement
  5. exceptions are either logged using log4net or ignored based on the test

In my example I used log4net as I recently diagnosed a similar problem with a customer who was using log4net, however this could be replaced for any other framework that you use.

Test #1

First, we set up a baseline by running an application when exceptions are not being logged from the catch statement.

 

Test #2

Next I enabled logging exceptions by logging those to a local file and running same load test.

As you can see not only is the average response time significantly higher now, but also the throughput of the application is lower.

The AppDynamics snapshot is collected automatically when there is a performance problem or failure and includes full call graph with timings for each executed method.

By investigating the call graph AppDynamics produces, we see that log4net method, FileAppender, renders error information to a file using FileStream. On the right you can see duration of each call and the most time was spent in WriteFileNative as it was competing with similar requests trying to append the log file with error details.

Test #3

I often come across attempting to make exception logging asynchronous by using ThreadPool. Below is how the performance looks like in this setup under exactly the same load.

This is a clever concept and works adequately for low throughput applications, but as you can see the average response time is still in a similar range as the non-asynchronous version, but a slightly lower throughput is achieved.

Why is this? Having logging running in separate threads means the resources are still consumed, there are less threads available, and the number of context switches will be greater.

In my experience logging to a file is exceptionally common. Other types of storage could introduce better performance, however they always need to be further tested and logging to a file is the easier solution.

Summary

While logging is important and helps with application support and troubleshooting, logging should be treated as part of the application logic. This means logging has to be designed, implemented, tested, monitored and managed. In short, it should become a part of full application lifecycle management.

How well do you understand the your logging framework and it’s performance cost?

 Take five minutes to get complete visibility into the performance of your production applications with AppDynamics today.

  • Andreas Grabner

    Can you give more details on the number of exceptions being thrown per request? Was this just one? Or hundreds? Was the full stavk trace logged or just the top frames? How many worker threads did you have running in parallel to write to that logfike?
    I think its an interesting test – but it would be helpful to better understand your test scenario

  • Pingback: How Do you Monitor Your Logging? - Application Performance Monitoring Blog from AppDynamics

Copyright © 2014 AppDynamics. All rights Reserved.