Cloud Migration Tips Part 4: Failure Breeds Success
Post by Jim Hirschauer | Apr, 25, 2013 | In Cloud
Welcome back to my series on migration to the cloud. In my last post we discussed all of the effort you need to put into the planning phase of your migration. In this post we are going to focus on what should happen directly after the migration has been completed.
Regardless of how well you planned or if you just decided to dive right in without any forethought, there are steps that need to be taken after your migration to ensure your application is working properly and performing up to snuff. These steps need to be performed whether you chose to use a public, private or hybrid cloud implementation.
Step 1: Take Your New Cloud Based Application for a Test Drive
Go easy at first and just roll through the functionality as a user would. If it doesn’t work well for you, then you know it won’t work well when there are a bunch of users hitting it.
Assuming things went well with your functional test it’s time to go bigger. Lay down a load test and see step 2 below.
Step 2: Monitoring is Not the Job of Your Users
If you’re relying on the users of your application to let you know if there are performance or stability issues you are already a major step behind your competition. If you planned properly then you have a monitoring system in place. If you’re just winging it, put in a monitoring system now!!!
Here are the things your monitoring tool should help you understand:
- Architecture and Flow: You design an application architecture to support the type of application you are building. How do you really know if you have deployed the architecture you designed in the first place? How do you know if your application flow changes over time and causes problems? Cloud computing environments are dynamic and can shift at any given time. You need to have a tool in place that lets you know exactly what happened, when it happened, and whether it caused any impact.
What happens if you don’t have a flow map? Simple, when there’s a problem you waste a bunch of time trying to figure out what components were involved in the problematic transaction so that you can isolate the problem to the right component.
- Response Times: Slow sucks! You moved to the cloud for many potential reasons but one thing is certain: your users don’t want your application(s) to run slowly. It seems obvious to monitor the response time of your applications but I’m constantly amazed by how many organizations still don’t have this type of monitoring in place. There are really only 2 options in this category: let your users tell you when (notice I didn’t say if) your application is slow, or have a monitoring tool alert you right away.
- Resources: You need to keep an eye on the resources you are consuming in the cloud. New instances of your application can quickly add up to a large expense if your code is inefficient. You need to understand how well your application scales under load and fix the resource hogs so that you can drive better value out of your application as usage increases.
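To make the response time point concrete, here is a minimal sketch of the kind of check a monitoring tool performs: compare the current response time against a baseline built from recent samples. The function name, the sample numbers, and the three-standard-deviation cutoff are all assumptions chosen for illustration, not how any particular product computes its baselines.

```python
from statistics import mean, stdev


def is_slow(baseline_ms, current_ms, num_stdevs=3.0):
    """Return True when current_ms is well above the baseline.

    "Well above" is assumed here to mean more than num_stdevs
    standard deviations over the baseline mean.
    """
    if len(baseline_ms) < 2:
        raise ValueError("need at least two baseline samples")
    threshold = mean(baseline_ms) + num_stdevs * stdev(baseline_ms)
    return current_ms > threshold


# Hypothetical response times (in milliseconds) from a normal period.
baseline = [120, 130, 125, 118, 127]

print(is_slow(baseline, 126))  # False: within the normal range
print(is_slow(baseline, 400))  # True: far above the baseline
```

A check like this only helps if it runs continuously and alerts someone; otherwise your users are still your monitoring system.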
Step 3: Elasticity
Elasticity is a key benefit of migrating your application to the cloud. Traditional application architectures accounted for periodic spikes in workload by permanently over-allocating resources. Put simply, we used to buy a bunch of servers so that we could handle the monthly or yearly spikes in activity. Most of these servers sat nearly idle the rest of the year and generated heat.
If you’re going to take advantage of the inherent elasticity within your cloud environment you need to understand exactly how your application will respond to being overloaded and how your infrastructure adapts to this condition. Cloud providers have tools to execute the dynamic shift in resources but ultimately you need a tool to detect the trigger conditions and then interface with the dynamic provisioning features of your cloud.
The combination of slow transactions AND resource exhaustion would be a great trigger to spin up new application instances. Each condition on its own does not justify adding a new resource.
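As a rough sketch of that trigger logic, the snippet below only recommends a new instance when both conditions hold at the same time. The field names and the threshold values are hypothetical; in practice they would come from your monitoring tool and your own load tests.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    slow_transaction_pct: float  # share of transactions slower than normal, in percent
    cpu_pct: float               # average CPU utilization across instances, in percent


def should_scale_out(snapshot, slow_pct_limit=10.0, cpu_limit=85.0):
    """Recommend a new instance only when transactions are slow AND resources are exhausted."""
    slow = snapshot.slow_transaction_pct > slow_pct_limit
    exhausted = snapshot.cpu_pct > cpu_limit
    return slow and exhausted


# Slow transactions alone (for example, a slow downstream service): no new instance.
print(should_scale_out(Snapshot(slow_transaction_pct=25.0, cpu_pct=40.0)))  # False
# High CPU alone (for example, a batch job with happy users): no new instance.
print(should_scale_out(Snapshot(slow_transaction_pct=2.0, cpu_pct=95.0)))   # False
# Both together: spin up another instance.
print(should_scale_out(Snapshot(slow_transaction_pct=25.0, cpu_pct=95.0)))  # True
```

Requiring both conditions keeps you from paying for instances that would not fix the problem.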
The point here is that migrating to the cloud is not a magic bullet. You need to know how to use the features that are available and you need the right tools to help you understand exactly when to use those features. You need to stress your new cloud application to the point of failure and understand how to respond BEFORE you set users free on your application. Your users will certainly break your application, and the middle of an outage is not the time to figure out how to manage your application in the cloud.
Let failure be your guide to success. Fail when it doesn’t matter so that you can succeed when the pressure is on. The cloud auto-scaling features shown in this post are part of AppDynamics Pro 3.7. Click here to start your free trial today.
Welcome, let’s start with a quick introduction of who you are and what your role is at PagerDuty.
My name is Alex, I’m PagerDuty’s CEO and co-founder.
And what beer will you be drinking tonight?
Guinness! It’s the office favorite.
What problems does your solution or service solve for customers?
PagerDuty provides centralized, highly-targeted alerting, escalation, and incident management. We integrate with the monitoring systems you have in place to provide a single place to manage all of your alerting. When things go down, we wake you up.
What types of technology trends are you seeing within your customer base?
We have a great mix of customers, from large enterprises (we have 15 of the Fortune 100 as customers), to mid-size companies like Splunk, Citrix, and Box, to two-person startups, which gives us an interesting perspective. The most obvious trend that we’re seeing is companies looking to automate the notification piece of ‘monitoring and alerting’ right off the bat. With IaaS providers like Rackspace and AWS, the need to do a costly NOC build-out has been almost entirely eliminated. Folks who do have a NOC in place are looking to automate what they can, while allowing on-site engineers to tackle the problems they’re best equipped to solve.
What application performance pain and challenges do you typically see within customer accounts?
All our customers have mission-critical infrastructure in place, and for many of them, slow applications directly equate to lost revenue. When company performance depends on applications being available and responsive, having an APM solution deployed isn’t a luxury, it’s a necessity.
Why does partnering with AppDynamics make good sense?
AppDynamics is the enterprise APM solution, and PagerDuty is the enterprise alerting solution. There couldn’t be a more natural fit – when AppDynamics detects an issue, PagerDuty wakes you up.
What’s your favorite thing about AppDynamics?
The ROI! We’re big believers in actively monitoring revenue-critical production applications. Anything you can deploy and see a 2x, 4x, 10x return, quickly, that’s pretty great in my book.
How can someone find out more about PagerDuty?
Just sign up for a free trial at pagerduty.com. Or just shoot us an email at sales@pagerduty.com and a member of our sales staff will reach out to set up a demo.
Which software companies inspire you?
B2B companies that make something people want and that solves a clear need. We like companies that can tackle hair on fire problems. AppDynamics, obviously, and Splunk come to mind.
Who is your favorite super hero and why?
Spider-Man. He’s powerful yet flawed – a real human super hero.
Intelligent Alerting for Complex Applications – PagerDuty & AppDynamics
Post by Marcus | Apr, 16, 2013 | In APM Best Practice, APM Thought Leadership, News
Today AppDynamics announced integration with PagerDuty, a SaaS-based provider of IT alerting and incident management software that is changing the way IT teams are notified, and how they manage incidents in their mission-critical applications. By combining AppDynamics’ granular visibility of applications with PagerDuty’s reliable alerting capabilities, customers can make sure the right people are proactively notified when business impact occurs, so IT teams can get their apps back up and running as quickly as possible.
You’ll need a PagerDuty and AppDynamics license to get started – if you don’t already have one, you can sign up for free trials of PagerDuty and AppDynamics online. Once you complete this simple installation, you’ll start receiving incidents in PagerDuty created by AppDynamics out-of-the-box policies.
Once an incident is filed it will have the following list view:
When the ‘Details’ link is clicked, you’ll see the details for this particular incident including the Incident Log:
If you are interested in learning more about the event itself, simply click ‘View message’ and all of the AppDynamics event details are displayed, showing which policy was breached, the violation value, severity, etc.:
Let’s walk through some examples of how our customers are using this integration today.
Say Goodbye to Irrelevant Notifications
Is your work email address included in some sort of group email alias at work and you get several, maybe even dozens, of notifications a day that aren’t particularly relevant to your responsibilities or are intended for other people on your team? I know I do. Imagine a world where your team only receives messages when the notifications have to do with their individual role and only get sent to people that are actually on call. With AppDynamics & PagerDuty you can now build in alerting logic that routes specific alerts to specific teams and only sends messages to the people that are actually on-call. App response time way above the normal value? Send an alert to the app support engineer that is on call, not all of his colleagues. Not having to sift through a bunch of irrelevant alerts means that when one does come through you can be sure it requires YOUR attention right away.
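Here is a minimal sketch of that routing idea in plain Python. It does not call the AppDynamics or PagerDuty APIs; the alert types, team names, and people are made up for illustration, and the real routing is configured in the products themselves.

```python
# Hypothetical mapping of alert types to the team responsible for them.
ROUTES = {
    "response_time": "app-support",
    "node_down": "infrastructure",
}

# Hypothetical on-call roster: one person per team at any given time.
ON_CALL = {
    "app-support": "dana",
    "infrastructure": "lee",
}


def route_alert(alert_type):
    """Return (team, person) for the single on-call recipient, or None if unrouted."""
    team = ROUTES.get(alert_type)
    if team is None:
        return None
    return team, ON_CALL[team]


print(route_alert("response_time"))  # ('app-support', 'dana')
print(route_alert("disk_full"))      # None: no route is defined for this alert type
```

The important property is that each alert reaches exactly one on-call person rather than an entire email alias.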
Automatic Escalations
If you are only sending a notification and assigning an incident to one person, what happens if that person is out of the office or doesn’t have access to the internet / phone to respond to the alert? Well, the good thing about the power of PagerDuty is that you can build in automatic escalations. So, if you have a trigger in AppDynamics to fire off a PagerDuty alert when a node is down, and the infrastructure manager isn’t available, you can automatically escalate and re-assign / alert a backup employee or admin.
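The escalation idea can be sketched in a few lines: if the current assignee has not acknowledged the incident within a timeout, move to the next person in the chain. The chain, the 15-minute timeout, and the function name are assumptions for illustration, not PagerDuty's actual configuration model.

```python
def who_to_notify(escalation_chain, minutes_unacknowledged, timeout_minutes=15):
    """Pick the person to alert based on how long the incident has gone unacknowledged.

    Each full timeout that passes moves one step down the chain;
    the last person in the chain stays responsible after that.
    """
    if not escalation_chain:
        raise ValueError("escalation chain must not be empty")
    level = min(minutes_unacknowledged // timeout_minutes, len(escalation_chain) - 1)
    return escalation_chain[level]


chain = ["infrastructure manager", "backup admin"]

print(who_to_notify(chain, 0))    # infrastructure manager
print(who_to_notify(chain, 20))   # backup admin
print(who_to_notify(chain, 100))  # backup admin (end of the chain)
```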
The Sky is Falling! Oh Wait – We’re Just Conducting Maintenance…
Another potentially annoying situation for IT teams is the flood of alerts that gets fired off during a maintenance window. PagerDuty has the concept of a maintenance window so your team doesn’t get a bunch of doomsday messages during maintenance. You can even set up a maintenance window with one click if you prefer to go that route.
Either way, no new incidents will be created during this time period… meaning your team will be spared having to open, read, and file the alerts and update / close out the newly-created incidents in the system.
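Conceptually, a maintenance window is just a time range during which incoming alerts do not create incidents. The sketch below shows that check in plain Python; the function names and the dictionary used for an incident are hypothetical and only illustrate the behavior described above.

```python
from datetime import datetime


def in_maintenance(now, windows):
    """Return True if now falls inside any (start, end) maintenance window."""
    return any(start <= now < end for start, end in windows)


def create_incident(alert, now, windows):
    """Create an incident for the alert, unless a maintenance window is active."""
    if in_maintenance(now, windows):
        return None  # suppressed: nobody gets paged
    return {"alert": alert, "created_at": now}


# Hypothetical two-hour maintenance window.
windows = [(datetime(2013, 4, 16, 1, 0), datetime(2013, 4, 16, 3, 0))]

print(create_incident("node down", datetime(2013, 4, 16, 2, 0), windows))  # None
print(create_incident("node down", datetime(2013, 4, 16, 4, 0), windows) is not None)  # True
```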
We’re confident this integration of the leading application performance management solution with the leading IT incident management solution will save your team time and make them more productive. Check out the AppDynamics and PagerDuty integration today!
Storage is Killing Your Database Performance
Post by Jim Hirschauer | Apr, 11, 2013 | In Database, Storage
The other day I had the opportunity to speak with a good friend of mine who also happens to be a DBA at a global Financial Services company. We were discussing database performance and I was surprised when he told me that the most common cause of database performance issues (from his experience) was a direct result of contention on shared storage arrays.
After recovering from my initial surprise I had an opportunity to really think things through and realized that this makes a lot of sense. Storage requirements in most companies are growing at an ever increasing pace (big data anyone?). Storage teams have to rack, stack, allocate, and configure new storage quickly to meet demand and don’t have the time to do a detailed analysis on the anticipated workload of every application that will connect to and use the storage. And therein lies the problem.
Workloads can be really unpredictable and can change considerably over time within a given application. Databases that once played nicely together on the same spindles can become the worst of enemies and sink the performance of multiple applications at the same time. So what can you do about it? How can you know for sure if your storage array is the cause of your application/database performance issues? Well, if you use NetApp storage then you’re in luck!
AppDynamics for Databases remotely connects (i.e. no agent required) to your NetApp controllers and collects the performance and configuration information that you need to identify the root cause of performance issues. Before we take a look at the features, let’s look at how it gets set up.
The Config
Step 1: Prepare the remote user ID and privileges on the NetApp controller. The following commands are used for the configuration.
useradmin role add AppD_Role -a api-*,login-http-admin
useradmin group add AppD_Group -r AppD_Role
useradmin user add appd -g AppD_Group
Note: Make sure you set a password for the appd user.
Step 2: Configure AppDynamics to monitor the NetApp controller. Notice that we configure AppDynamics with the username and password created in step 1.
Step 3: Enjoy your awesome new monitoring (yep, it’s that easy).
The Result
After an incredibly difficult 2 minutes of configuration work we are ready for the payoff. In the AppDynamics for Databases main menu you will see a section for all of your NetApp agents.
Let’s do a “drill-up” from the NetApp controller to our impacted database. Clicking into our monitored instance we see the following activity screen.
By clicking on the purple latency line inside of the red box in the image above we can drill into the volume that has the highest response time. Notice in the screen grab below that we have a link at the bottom of the page where we can drill-up into the database that is attached to this storage volume. This relationship is built automatically by AppDynamics for Databases.
Clicking on the “Launch In Context” link we are immediately transferred to the Oracle instance activity page shown below.
In just the same manner as we can drill-up from storage to database, we can also drill-down from database to storage. Notice the screen grab below from an Oracle instance activity screen. Clicking on the “View NetApp Volume Activity” link will launch the NetApp activity screen shown earlier for the volumes associated with this Oracle instance. It’s that easy to switch between the views you need to solve your application’s performance issues.
Imagine being able to detect an end user problem, drill down through the code execution, identify the slow SQL query, and isolate the storage volume that is causing the poor performance. That’s exactly what you can do with AppDynamics.
Storage monitoring in AppDynamics for Databases is another powerful feature that enables application support, database support, and storage support to get on the same page and restore service as quickly as possible. If you have databases connected to NetApp storage you need to take a free trial of AppDynamics for Databases today.