Case Studies
Introduction
FamilySearch International, a non-profit organization sponsored by The Church of Jesus Christ of Latter-day Saints, is the world’s largest genealogy organization. Millions of people worldwide rely on FamilySearch to preserve and provide free access to genealogical records of every kind.
Managing Big Data: 10 PB and Growing Fast
Managing all of this data, currently 10 petabytes, is a challenge, especially as it grows. “We’re growing the system at an incredibly fast pace of 300 million new records per year and 300 million new images,” said Bob Hartley, Principal Engineer and Development Manager at FamilySearch. This growth rate is impressive, especially considering that most of FamilySearch’s records are input by volunteers. “We’re on the path to 20 petabytes of data, and more.”
FamilySearch started out with an Oracle database, Java, and EJBs to power its search engine, but quickly found that this architecture could not support the quantity of data it had to handle. “The amount of time it took to query the database against 3 billion records to return one record was just completely unacceptable to the user,” Hartley said. “They don’t want to wait minutes.”
FamilySearch quickly realized that simply scaling infrastructure would not be an efficient way to manage performance. Its plans for the next year included increasing the load its web app could handle by a factor of 10, but growing its infrastructure by 10x would be prohibitively expensive. The team needed another way to manage performance while both the load on the application and the data within it grew exponentially.
Getting to 10x: Finding a Solution to Manage Performance
Hartley and his team are nothing if not thorough: During their search for an application performance management (APM) solution, they evaluated over 20 different vendors. Most of these solutions were not well suited to FamilySearch’s unique environment. “Most of the monitoring solutions we evaluated followed the traditional model – they assume we have a static infrastructure, which has to be defined up front. The software doesn’t adapt well to change, which is constant in our environment.”
AppDynamics, on the other hand, presented a different approach. The software could automatically detect the application’s architecture and would make adjustments as it grew and changed. “AppDynamics came in and within 4 hours they had the software up and running. We immediately had the developers looking over our shoulders, asking, ‘Why is that service talking to that service? Why are we seeing that problem?’”
Hartley and the development team at FamilySearch were able to quickly identify and fix several issues they had suspected existed for years. “AppDynamics was able to discover issues that we had been trying to track down in the software, in the monolithic components as well as the service-oriented components of our system, for many years,” said Hartley. “These issues were immediately apparent to us in the way AppDynamics instruments the code, presents the data to the user, and provides a way to identify and troubleshoot the problems.”
10x Throughput and 10x ROI
Hartley found that by giving his team access to AppDynamics, they were able to more quickly respond to performance issues that arose in the application. “We’ve taken the mean-time-to-recovery for our problems from days down to a matter of minutes,” said Hartley. “We estimate we’ve saved about $460,000 in productivity costs alone, just by reducing the amount of time we all spend troubleshooting issues.”
In addition, Hartley and his team have been able to reduce the response time of their longest-running transactions. In the long run, this has enabled them to improve application performance while increasing load and data – without adding any new infrastructure. “Adding more infrastructure would have been very expensive for us – it would have cost us millions in hardware, power, and administration costs. The fact that we’ve avoided having to do that has been huge for us.” Hartley and his team estimate that the deferred hardware costs amount to almost $3.5 million in savings for FamilySearch.
A year after first implementing AppDynamics, Hartley and his team reviewed their progress. They found that they had scaled the load on their application by over 10x: throughput had increased from 11,500 transactions per minute to 122,000 transactions per minute. “We didn’t buy any new infrastructure, we didn’t need any new hardware, and now we’re supporting more than 10x load on the system that we were a year ago,” said Hartley. “But the best thing is, our end users would never know – despite our massive scale in load and in data, our performance is actually improving. That seems impossible, but that’s what we’ve managed to do with AppDynamics.”
Hard ROI

| Example Use Case | Before AppDynamics | After AppDynamics | Benefits |
|---|---|---|---|
| Reduction in MTTR for production issues | 227 incidents/year, 33 man-hours each | MTTR reduced by 45% (conservative) | $460,836 in productivity savings |
| Reduction in MTTR for pre-production issues | 300 incidents/year, 49 man-hours each | MTTR reduced by 45% (conservative) | $885,170 in productivity savings |
| Hardware and indirect costs (power/AC) | Estimated 1,200 additional units required to achieve 10x scalability | No additional infrastructure needed | $3,464,584 |
| Total savings after 2 years | | | $4,810,590 |

Soft ROI

| Example Use Case | Before AppDynamics | After AppDynamics | Benefits |
|---|---|---|---|
| Increased agility | 12 application releases a year | 20 application releases a year | Enhanced end user experience |
| Increased throughput | 11,500 tpm | 122,000 tpm | Service more end users |
| Increased user concurrency | 6,000 users per minute | 25,000 users per minute | Support more users |
| Improved performance | Key transactions would take minutes | Key transactions now take seconds | Enhanced end user experience |
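The arithmetic behind the ROI figures above can be checked with a short script. This is only an illustrative sketch: the numbers are copied from the tables, while the variable names, the `hours_saved` helper, and the assumption that the 45% MTTR reduction applies uniformly to every incident are not from the case study.

```python
# Dollar figures copied from the Hard ROI rows; the key names are illustrative.
hard_roi_savings = {
    "production_mttr": 460_836,
    "pre_production_mttr": 885_170,
    "deferred_hardware": 3_464_584,
}

# The stated total should equal the sum of the three line items.
total_savings = sum(hard_roi_savings.values())


def hours_saved(incidents_per_year, hours_per_incident, reduction=0.45):
    """Man-hours saved per year, assuming (hypothetically) that the
    stated MTTR reduction applies equally to every incident."""
    return incidents_per_year * hours_per_incident * reduction


production_hours = hours_saved(227, 33)
pre_production_hours = hours_saved(300, 49)

# Growth multipliers taken from the Soft ROI rows.
throughput_multiplier = 122_000 / 11_500
concurrency_multiplier = 25_000 / 6_000

print(f"Total hard savings: ${total_savings:,}")
print(f"Production man-hours saved per year: {production_hours:,.2f}")
print(f"Pre-production man-hours saved per year: {pre_production_hours:,.2f}")
print(f"Throughput growth: {throughput_multiplier:.1f}x")
print(f"Concurrency growth: {concurrency_multiplier:.1f}x")
```

The three line items sum to the stated $4,810,590, and the throughput figures work out to roughly 10.6x, which is consistent with the "over 10x" claim in the text.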

