TAG | Problem Resolution
Planning to deploy or migrate an application to a cloud environment is a big deal. In my last post we discussed the value of using real business and IT requirements to drive the justification of using a cloud architecture. We also explored the importance of using monitoring information to understand your before and after picture of application performance and overall success.
In this post I am going to dive deeper into the planning phase. You can’t expect to throw a half assed plan in place and just deal with problems as they pop up during an application migration. That will almost certainly result in frustration for the end users, IT staff, and the business who relies upon the application.
In reality, at least 90% of your total project time should be dedicated to planning and at most 10% to the actual implementation. The big question is “What are the most important aspects of the planning phase?”. Here’s my cloud migration planning top 10 list:
- Application Portfolio Rationalization - Let’s face reality for a moment… If you’re in a large enterprise you have multiple apps that perform a very similar business function at some level. Application Portfolio Rationalization is a method of discovering the overlap between your application and consolidating where it makes sense. It’s like spring cleaning for your IT department. You need to get your house in order before you decide to start moving applications or you will just waste a lot of time and money moving duplicate business functionality across your portfolio.
- Business Justification and Goal Identification – If there is one thing I try to make clear in every blog post it is the fact that you need to justify your activities using business logic. If there is no business driver for a change then why make the change? Even very techie-like activities can be related back to business drivers.
Example… Techie Activity: Quarterly server patching Business Driver: Failure to patch exposes the business to risk of being hacked which could cause brand damage and loss of revenue.
I included goal identification with business justification because your goals should align with the business drivers responsible for the change.
- Current State Architecture Assessment (App and Infra) – This task sounds simple but is really difficult for most companies. Current State Architecture Assessment is all about documenting the actual deployed application components, infrastructure components, and application dependencies. Why is this so difficult? Most enterprises have implemented a CMDB to try and document this information but the CMDB is typically manually populated and updated. What happens in reality is that over time the CMDB is neglected when application and infrastructure changes occur. In order to solve this problem some companies have turned to Automated discovery and dependency mapping tools. These tools are usually agentless so they login to each server and scan for processes, network connections, etc… at pre-defined intervals and create a very detailed mapping that includes all persistent connections to and from each host regardless of whether or not they are application related. The periodic scans also miss the short lived services calls between applications unless the scan happens to be at approximately the same time of the transient application call. An agent based APM tool covers all the gaps associated with these other methods.
- Current State Performance Assessment – Traditional monitoring metrics (CPU, Memory, Disk I/O, Network I/O, etc…) will help you size your cloud environment but tell you nothing about the actual application performance. The important performance metrics encompass end user response time, business transaction response time, external service response time, error and exception rates, transaction throughput, with baselines for each. This is also a good time to make sure there are no glaring performance issues that you are going to promote into your cloud environment. It’s better to fix any known issues before you migrate as the extra complexity of the cloud can amplify your application problems.
- Architectural Change Impact Assessment – Now that you know what your real application and infrastructure components are, you need to assess the impact of the difference between traditional and cloud architectures. Are there components that wont work well (or at all) in a cloud architecture? Are code changes required to take advantage of the dynamic features available in your cloud of choice? You need to have a very good understanding of how your application works today and how you want it to work after migration and plan accordingly.
- Problem Resolution Planning – Problem resolution planning is about making a commitment to your monitoring tools and strategy as a core layer of your overall application architecture. The number of potential points of failure increases dramatically from traditional to cloud environments due to increased virtualization and dynamic scaling. In highly distributed applications you need monitoring tools that will tell you exactly where problems are occurring or you will spend too much time isolating the problem location. Make monitoring a part of your application deployment and support culture!!!
- Process re-alignment – Just hearing the word “process” makes me cringe and have flashbacks to the giant, bloated , slow moving enterprise environments that I used to call my workplace. The unfortunate truth is that we really do need solid processes if we want to maintain order and have any chance of managing a large environment in a sustainable fashion. Many of the traditional IT development and operations processes need to be modified when we migrate to the cloud so you can’t overlook this task.
- Application re-development – The fruits of your Architectural Change Impact Assessment work will probably precipitate some level of development work within your application. Maybe only minor tweaks are required, maybe significant portions of your code need to change, maybe this application should never have been selected as a cloud migration candidate. If you need to change the application code you need to test it all over again and measure the performance.
- Application Functional and Performance Testing – No surprises here, after the code has been modified to function as intended with your cloud deployment it needs to be tested. APM tools really speed up the testing process since they show you the root of your performance problems down to the line of code level. If you rely only upon the output of your application testing suite your developers will spend hours trying to figure out what code to change instead of minutes fixing the problematic code.
- Training (New Processes and Technology) – With all of those new and/or modified processes and new software required to support your cloud application training is imperative. Never forget the “people” part of “people, process, technology”.
There’s a lot more that really goes into planning a cloud migration but these major tasks are the big ones in my book. Give these 10 tasks the attention they deserve and the odds will shift in your favor for a successful cloud migration. Next week we’ll talk about important work that should happen after your application gets migrated.Link to this post:
We recently finished conducting our annual Application Performance Management survey. Over 250 IT professionals participated, and they shared insights such as:
- Many Ops and Dev teams are anticipating growth in their applications by 20% or more
- Over 50% are planning to move to the cloud, and are architecting brand-new applications to be cloud-ready
- Most teams are using log files to monitor application performance, rather than an Application Performance Management (APM) tool.
We’ll release the full report soon, but here’s an infographic that summarizes some of the main findings:
Embed this image on your site:
What I found personally surprising was the heavy reliance on log files. When you’re troubleshooting distributed architectures, time is of the essence–and there’s no way to cut your MTTR down when you’re relying on log files to identify root cause.
In fact, there’s only one guy who ever made using a log file look cool:
And I think we can all agree that’s a pretty unique use case.
We’ll have the full survey results available soon.
Link to this post: