Gartner predicts that by 2024, 30% of enterprises will adopt observability techniques to improve their business service performance — up from less than 10% in 2020. Indeed, observability is becoming a competitive advantage, and companies with a system in place to determine the health and state of business-critical services without human intervention have a distinctive edge.
At the same time, adoption of Infrastructure-as-Code (IaC) is growing rapidly, because managing IT infrastructure (networks, containers, virtual machines, load balancers and more) as code enables organizations to provision and manage large-scale architectures effectively. An overwhelming amount of monitoring data now needs to be put into context to become useful information, and automation has become non-negotiable for achieving this.
As applications are built to scale and organizations adopt multi-cloud strategies, it becomes both more important and more complicated to observe the entire software lifecycle effectively, from code commit through build and deployment to alerting and dashboarding. As a result, there is growing demand for a new dimension of observability strategy: a scalable method that plugs easily into existing IaC and GitOps workflows and does not require organizations to change their established IaC tooling or approaches.
This scalable monitoring methodology is often referred to as Monitoring as Code (MaC), Monitoring as a Service (MaaS), Everything as Code (EaC), or Observability as Code. The core principle remains the same regardless of what it is called: monitoring and observability should be treated and managed as code without the need for manual intervention.
Observability as Code extends IaC, including GitOps workflows and/or continuous integration/continuous delivery (CI/CD) pipelines.
Anatomy of Observability-as-Code
At AppDynamics, we firmly believe that observability and monitoring configurations should be treated and managed as code, and we are committed to ensuring that our customers can achieve this goal via our open APIs and the provision of an AppDynamics Terraform Provider and Ansible Collections—both of which are currently on our roadmap.
Observability-as-Code can be subcategorized into two broad areas, as depicted in Fig. 1.0.
- Deployment as Code: Automating the agent deployment using the same approach as infrastructure deployment. The outcome of this is to instrument the applications, profile the infrastructure, and send metrics, events, logs and traces (MELT) to AppDynamics.
- Configuration as Code: Configuring alerts, transaction discovery, RBAC, dashboards, health rules, etc., as code.
Deployment as Code
This blog provides practical steps, nuances, rationales and sample implementation manifests to help you gain a quicker return on your investment. We will show you working code for instrumenting Amazon ECS workloads as code using AWS CloudFormation templates, Terraform and Ansible. For this demo, we chose ECS Fargate because it’s gaining popularity in the industry and fits quite well with the top three IaC tools in the marketplace today: AWS CloudFormation, Red Hat Ansible and HashiCorp Terraform.
Please note that we won’t spend time covering standard AWS setups for stacks, policies, roles and the like in great detail, with the assumption that you already have these in place. For help with the setup, please reach out to your AppDynamics representative and we’ll be glad to assist.
Are you ready to get started? Let’s dive in!
1. The AWS CloudFormation Approach
AWS CloudFormation provides AWS customers with a simple way to provision and manage their infrastructure and other AWS resources as code. Using a CloudFormation template, users can describe all the AWS resources they want to provision (such as ECS, EC2 or Amazon RDS) in a YAML or JSON template file. CloudFormation takes care of provisioning and configuring those resources.
Here’s how you can natively embed AppDynamics agents into ECS using CloudFormation.
How are we adding agents?
In the examples above, we have provided a working CloudFormation template with and without AppDynamics agents. Perform a diff on both templates to see the difference and get hands-on right away, or continue reading for further information.
Note the two changes: a volume mounted into the application container and an explicit container dependency (declared with the DependsOn parameter). These are the key updates we’re performing to build observability into this application process.
We use the ECS ContainerDependency feature to inject AppDynamics agents into a shared ephemeral volume mount. This approach is similar to how Kubernetes init containers work. Unlike Kubernetes init containers, which must complete successfully before the main container comes up, ECS ContainerDependency offers finer control over the dependent container through four (super useful) dependency conditions: START, COMPLETE, SUCCESS and HEALTHY.
Using Fig. 3.0 as a reference, the AppDynamics container is introduced as a dependent container that copies the agent and then exits, and the main container waits for it with the COMPLETE condition. To create a hard dependency on the dependency (AppDynamics) container, we recommend the SUCCESS condition instead; doing this means your main container will not start up if the dependent container fails for any reason. Because ECS does not accept the COMPLETE or SUCCESS condition on a container marked as essential, leave the essential flag of the AppDynamics container set to “false”.
Note: With the COMPLETE condition, the main container starts even if the AppDynamics container exits with an error, so you run the risk of running the application without any monitoring, which may also result in false-positive alerts from AppDynamics.
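The container dependency described above can be sketched as follows. This is a minimal illustration, not the template from our repository: it builds the relevant part of an ECS task definition as a Python dictionary, and the container names, image names, volume name and mount paths are all hypothetical. It assumes the SUCCESS condition, and it marks the agent container as non-essential because the ECS API does not accept COMPLETE or SUCCESS dependencies on an essential container. Settings that Fargate also requires (CPU, memory, network mode, roles) are omitted.

```python
import json

# Hypothetical names for illustration only.
AGENT_CONTAINER = "appd-agent"
SHARED_VOLUME = "appd-agent-volume"
VALID_CONDITIONS = {"START", "COMPLETE", "SUCCESS", "HEALTHY"}


def build_container_definitions(app_image, agent_image, condition="SUCCESS"):
    """Return two container definitions: an init-style agent and the main app."""
    if condition not in VALID_CONDITIONS:
        raise ValueError(f"unknown dependency condition: {condition}")
    agent = {
        "name": AGENT_CONTAINER,
        "image": agent_image,
        # Runs once and exits, so it is not essential to the running task.
        "essential": False,
        "mountPoints": [{"sourceVolume": SHARED_VOLUME, "containerPath": "/shared"}],
    }
    app = {
        "name": "app",
        "image": app_image,
        "essential": True,
        # The app starts only after the agent container meets the condition.
        "dependsOn": [{"containerName": AGENT_CONTAINER, "condition": condition}],
        "mountPoints": [
            {"sourceVolume": SHARED_VOLUME, "containerPath": "/opt/appdynamics"}
        ],
    }
    return [agent, app]


task_definition = {
    "family": "demo-app",
    "requiresCompatibilities": ["FARGATE"],
    # A task-scoped ephemeral volume shared by both containers.
    "volumes": [{"name": SHARED_VOLUME}],
    "containerDefinitions": build_container_definitions(
        "example/app:latest", "example/appd-agent:latest"
    ),
}

print(json.dumps(task_definition, indent=2))
```

The printed JSON has the same shape as the container definitions in a CloudFormation or Terraform task definition, so the dependency and the shared volume can be compared against your own template.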
The image of the AppDynamics container can and should be acquired from the official images in Docker Hub; acquiring the image from the official Docker Hub repo means your organization does not have to maintain AppDynamics images. If, however, you prefer to maintain your own base images internally because they need to be hardened (or for any other reason), we recommend you use a multi-stage build process to create your custom images from the official image. An example is provided here.
In addition, we wrote a custom entry point copy command that copies the agent binaries from the AppDynamics agent (dependent) container onto the ephemeral volume mount. This step ensures the agent binaries are available to the main container when it starts up, as shown in Fig. 4.0.
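A copy step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the exact command from our templates: the source and target paths are hypothetical, and it assumes the agent image ships a shell and the cp utility. The entryPoint and command keys are the standard ECS container definition parameters.

```python
def with_copy_command(agent_container, source_dir="/opt/appdynamics", target_dir="/shared"):
    """Return a copy of the agent container definition whose only job is to
    copy the agent binaries onto the shared volume and then exit."""
    updated = dict(agent_container)
    updated["entryPoint"] = ["/bin/sh", "-c"]
    # "source/." copies the directory's contents, including hidden files.
    updated["command"] = [f"cp -r {source_dir}/. {target_dir}/"]
    return updated


agent = with_copy_command({"name": "appd-agent", "image": "example/appd-agent:latest"})
print(agent["command"][0])  # cp -r /opt/appdynamics/. /shared/
```

Because the command exits as soon as the copy finishes, the container's exit code tells ECS whether the binaries reached the shared volume.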
Another option, which involves a bit more work and automation scripts and carries the inherent risk of the agent running out of date, is to mount the agent binaries from persistent storage such as EFS or EBS. You may consider using EFS for the Fargate launch type and EBS for the EC2 launch type. This approach eliminates the need to add the dependent agent container.
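For the EFS variant, the task definition needs a volume backed by the file system instead of an ephemeral one. The sketch below shows one plausible shape, assuming the agent binaries have already been copied to the file system by some separate process; the file system ID, directory and volume name are hypothetical placeholders.

```python
def efs_agent_volume(file_system_id, root_directory="/appdynamics"):
    """Return an ECS volume definition backed by an existing EFS file system."""
    return {
        "name": "appd-agent-efs",  # hypothetical volume name
        "efsVolumeConfiguration": {
            "fileSystemId": file_system_id,
            "rootDirectory": root_directory,
            "transitEncryption": "ENABLED",
        },
    }


def efs_mount_point(container_path="/opt/appdynamics"):
    """Mount the agent directory read-only into the application container."""
    return {
        "sourceVolume": "appd-agent-efs",
        "containerPath": container_path,
        "readOnly": True,
    }


volume = efs_agent_volume("fs-12345678")  # placeholder file system ID
print(volume["efsVolumeConfiguration"]["fileSystemId"])
```

With this layout the application container mounts the agent directly, so no agent container or dependency condition is involved.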
In summary, the instrumentation logic and outcome are the same regardless of whether you choose to acquire the official image from Docker Hub at runtime, embed the agents into your image at build time, or load the agent binaries from persistent storage. AppDynamics supports all three methods and we encourage our customers to choose what works best for them.
Next, we added all the necessary environment variables needed to dynamically configure the language agent, instrument the application, and send metrics to the controller (these steps can also be reviewed in the AppDynamics official documentation):
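As a rough sketch of what such a set of variables can look like, the helper below builds the environment list of an ECS container definition. It assumes a Java workload: the variable names follow the Java agent's documented environment variables as we understand them, so verify them against the official documentation for your agent and version. The agent jar path, controller host and all sample values are hypothetical. The access key is deliberately left out because it is a secret.

```python
def agent_environment(controller_host, account_name, application, tier,
                      controller_port=443, ssl_enabled=True,
                      agent_jar="/opt/appdynamics/javaagent.jar"):
    """Build the ECS 'environment' list that configures the agent at startup."""
    settings = {
        "APPDYNAMICS_CONTROLLER_HOST_NAME": controller_host,
        "APPDYNAMICS_CONTROLLER_PORT": str(controller_port),
        "APPDYNAMICS_CONTROLLER_SSL_ENABLED": str(ssl_enabled).lower(),
        "APPDYNAMICS_AGENT_ACCOUNT_NAME": account_name,
        "APPDYNAMICS_AGENT_APPLICATION_NAME": application,
        "APPDYNAMICS_AGENT_TIER_NAME": tier,
        # Standard JVM variable; attaches the agent without changing the image.
        "JAVA_TOOL_OPTIONS": f"-javaagent:{agent_jar}",
    }
    return [{"name": name, "value": value} for name, value in settings.items()]


for entry in agent_environment("controller.example.com", "customer1", "demo-app", "web"):
    print(f"{entry['name']}={entry['value']}")
```

Setting JAVA_TOOL_OPTIONS in the task definition keeps the application image unaltered, since the JVM picks the agent up from the shared volume at startup.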
We are adding this as its own chapter to highlight that the AppDynamics controller access key (or any other secret!) should be kept confidential at all times and should not be checked into your code repository. We also added a section in the CloudFormation template to demonstrate how you would read the access key from AWS Secrets Manager.
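One way to express this in a task definition is the secrets list of the container definition, which tells ECS to resolve a value from AWS Secrets Manager at launch. The following is a minimal sketch under assumptions: the secret ARN is a placeholder, the variable name assumes the Java agent, and the helper itself is hypothetical. The task execution role also needs permission to read the secret.

```python
ACCESS_KEY_VAR = "APPDYNAMICS_AGENT_ACCOUNT_ACCESS_KEY"


def attach_access_key(container, secret_arn):
    """Return a copy of the container definition that reads the access key
    from Secrets Manager and never carries it as a plaintext variable."""
    updated = dict(container)
    # Drop any plaintext copy of the key from the regular environment.
    updated["environment"] = [
        e for e in container.get("environment", []) if e["name"] != ACCESS_KEY_VAR
    ]
    # ECS injects the secret's value as an environment variable at launch.
    other_secrets = [
        s for s in container.get("secrets", []) if s["name"] != ACCESS_KEY_VAR
    ]
    updated["secrets"] = other_secrets + [
        {"name": ACCESS_KEY_VAR, "valueFrom": secret_arn}
    ]
    return updated


# Placeholder ARN for illustration only.
ARN = "arn:aws:secretsmanager:us-east-1:123456789012:secret:appd-access-key"
container = {
    "name": "app",
    "environment": [{"name": ACCESS_KEY_VAR, "value": "do-not-do-this"}],
}
secured = attach_access_key(container, ARN)
print(secured["secrets"])
```

Only the ARN appears in the template, so the template can be committed to your repository while the key itself stays in Secrets Manager.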
In conclusion, refer to the monitoring-as-code: ECS guide for all of the files used in the above exercise. Should you like to set this up from scratch as a proof-of-concept, we have a full example, including scripts to help create the necessary policies and roles. Please refer to the corresponding README for further guidance.
2. The Terraform Approach
The Terraform example uses the same logic as the CloudFormation template to instrument your ECS Fargate workloads using AppDynamics. All the artifacts are stored in this GitHub repository.
The primary considerations that went into the design of this project are:
- Customers’ existing container images and/or the image build process should be unaltered.
- The deployment process must remain immutable.
- Idempotency: customers should get the same instrumentation result even if the Terraform config is applied multiple times.
- The AppDynamics access key must be stored in and read from AWS Secrets Manager, not kept as plaintext.
In addition, we leveraged the ECS container dependency setting (dependsOn in the task definition) to:
- Dynamically acquire the AppDynamics agent image from DockerHub. You may copy the image to your preferred registry.
- Copy the content of the agent image into an ephemeral volume.
- Mount the shared volume into the main application’s container at runtime.
Refer to the ecs-main.tf file in the ECS module for details.
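To illustrate the idempotency consideration listed earlier in this section, here is a small sketch. Terraform achieves idempotency through its declarative plan and apply model; the Python function below is not part of the project and only demonstrates the property itself, using hypothetical container and image names: applying the instrumentation to an already instrumented definition changes nothing.

```python
import copy


def instrument(container_definitions, agent_name="appd-agent",
               agent_image="example/appd-agent:latest"):
    """Add the agent container and its dependency, doing nothing if present."""
    defs = copy.deepcopy(container_definitions)
    if not any(c["name"] == agent_name for c in defs):
        defs.insert(0, {"name": agent_name, "image": agent_image, "essential": False})
    dependency = {"containerName": agent_name, "condition": "SUCCESS"}
    for container in defs:
        if container["name"] == agent_name:
            continue
        depends_on = container.setdefault("dependsOn", [])
        if dependency not in depends_on:
            depends_on.append(dependency)
    return defs


original = [{"name": "app", "image": "example/app:latest", "essential": True}]
once = instrument(original)
twice = instrument(once)
print(once == twice)  # True
```

The second call finds the agent container and the dependency already in place, so the result is identical to the first.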
3. The Ansible Playbook Approach
Please refer to the AppDynamics repository Monitoring as Code: Cloud with Ansible for source code and details on how to execute the playbook. This project leverages the AWS Ansible module to demonstrate how to embed the AppDynamics agent into an ECS Fargate instance.
Furthermore, the project creates an ECS Fargate cluster in which you define tasks that describe your containers, as specified in the task definition. Tasks can be scaled in and out with ECS services, including all the supporting services. Also, the project creates secrets, like the controller access key, and stores them in AWS Secrets Manager.
Refer to the ecs_task_definition.yml file to see how AppDynamics agents are added to the ECS Fargate instance.
Today’s software challenges and the market shift towards the cloud are fueling automation and promoting strategic thinking over tactical course changes. It’s the path we’re walking with our customers every day.
Knowing the ins and outs of customers’ tools and infrastructure, asking the right questions, and predicting obstacles along the way helps us successfully drive AppDynamics’ observability-as-code adoption with cloud teams across organizations. This approach provides the native and built-in solutions for understanding each system based on its internal mechanisms and outputs.