Novus

Novus reduces application performance troubleshooting time from weeks to minutes using the AppDynamics platform

Novus is the world's leading portfolio intelligence platform with over $2 trillion in client assets under management. Novus provides financial insights and skills assessments to large institutional investors and direct money managers via its proprietary analytics platform. Built in Scala, the company's language of choice, Novus's platform is a sophisticated, distributed network of services that decouples the various analytical procedures needed to provide real-time analysis and reports to its customers.

Challenge: "Hundreds of machines, hundreds of logs"

Novus's engineering team built a platform responsible for mission-critical performance. As the Novus architecture grew, the team responded by splitting its platform into various internal services. Isolating responsibilities into decoupled services helped the team meet the scalability demands of that growth. But while Novus's environment grew in scale, its method of diagnosing problems did not: using logs to diagnose application and performance issues quickly proved tedious, difficult, and unreliable. "Previously, Novus was entirely log driven - hundreds of machines, hundreds of logs," says Brian KimJohnson, Software Engineer at Novus.
Furthermore, associating a particular user with the correct log data in order to remedy a flawed user experience proved difficult. Noah Zucker, Software Engineer at Novus, recalls, "The best example of this that stands out in my mind is we had a case where, for some reason, our entire platform was grinding to a halt." Noah and his team could not diagnose quickly enough why the user was hitting the issue and, as a result, "the user was taking down one server after another. We could not keep up with the speed in which we were experiencing problems," says Brian. "That is an example of an issue that we would now solve in minutes, thanks to AppDynamics."

Solution: A unified monitoring platform that provides transparency into production issues

Implementing AppDynamics as a unified APM platform gave Novus insight into the actual problems causing application performance issues in production. "Performance issues have gone from an all-hands-on-deck show-stopper to one person only spending 10 minutes to get an answer by using AppDynamics. That is a huge productivity booster," says Zucker.

In addition to consolidating critical features into a single interface, AppDynamics helped uncover application problems that had previously gone undetected. "We had a memory leak that had gone unnoticed for almost a year. After we had turned on the memory monitoring, we were alerted to a memory leak and quickly had it resolved," says Zucker. "Targeting memory leaks is a big win for us."

Since Novus began using AppDynamics to pinpoint performance issues, resolution time has gone from weeks to a matter of minutes. Novus estimates a 25% reduction in the man-hours needed to diagnose and fix problems with AppDynamics compared with traditional logging methods. "Traditionally, resolving an issue would involve three engineers for at least an hour, and now it takes only one engineer resolving it in 5 minutes," says Brian KimJohnson.

Benefits: Faster MTTR so engineers can focus on core business features

The Novus engineering team has seen enormous productivity gains since adopting AppDynamics. "Investigating a problem would take days and most of the team would get sucked into it. Now, almost all of them can now be handled by one person within 10 minutes," says Noah Zucker.

The Novus team no longer needs third-party profilers or monitoring solutions, because AppDynamics provides the suite of critical features it requires. "One of the most useful features that I discovered in AppDynamics was the JMX monitoring," says Brian KimJohnson. "The JMX console removed the need to write our own code in order to achieve what AppDynamics provides out of the box."
When a performance problem is reported, the transaction snapshots feature can correlate the exact diagnostic data with the user experiencing the problem. "When a user complains that a page was slow, we can zero in on that specific issue and explain exactly why that specific user experienced slowness," says Noah. With AppDynamics baselining and automatic anomaly detection, the Novus engineers have a clear understanding of what healthy behavior looks like. "Now, there is no more guessing about what is healthy and how it is supposed to behave. We know what to expect and aspire to."

In short, "AppDynamics dramatically improved our ability to respond to performance issues, which translated to a huge improvement in user experience and a solid increase in customer retention."
