Treating configuration management and infrastructure as code is an important part of a successful, high-performing DevOps organization. Why? Because when people working within a cloud provider apply the same rigor to systems that development teams apply to application code, they help prevent environmental drift, enable more rapid deployments, provide solutions that scale, and keep system integrity maintainable. Using source control, continuous integration, testing, and other software development practices allows for sustainable automation at the operational level. Infrastructure as Code is especially powerful because it creates the underlying base of stability that operations-focused teams demand, while also supporting the speed that business and application developers often require.

Also see part 2 here: Demystifying Configuration Management.

Infrastructure as Code (IaC)

What is Infrastructure as Code (IaC)? It is a tool, process, or system that allows developers to represent virtual systems, such as EC2 instances in AWS, as code. This representation helps make cloud environments reproducible: the process generates the same environment each time it runs. Well-behaved IaC is also idempotent, meaning that applying the same code again leaves an already-correct environment unchanged. IaC is usually declarative, meaning that the code specifies the end state the environment should reach, rather than listing the procedures to build it like a run book.
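
To make the declarative and idempotent ideas concrete, here is a minimal Python sketch. It is not how any particular IaC tool is implemented; the `plan` and `apply` functions and the dictionary-based environment are hypothetical names chosen for illustration. The desired end state is declared as data, a plan is computed by comparing it to the current state, and applying the same declaration a second time produces no further changes.

```python
# Toy model: an "environment" is a dict of resource name -> settings.
# All names here (plan, apply, the resource keys) are illustrative assumptions.

def plan(desired, current):
    """Compare desired and current state; return the actions needed to converge."""
    actions = []
    for name, settings in desired.items():
        if name not in current:
            actions.append(("create", name))
        elif current[name] != settings:
            actions.append(("update", name))
    for name in current:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired, current):
    """Return a new current state that matches the desired state."""
    new_state = dict(current)
    for action, name in plan(desired, current):
        if action == "delete":
            del new_state[name]
        else:  # create or update
            new_state[name] = dict(desired[name])
    return new_state


desired = {
    "app_server": {"instance_type": "t2.micro"},
    "assets_bucket": {"versioning": True},
}
current = {
    "app_server": {"instance_type": "t2.small"},  # differs from the declaration
    "old_queue": {},                               # no longer declared
}

first_plan = plan(desired, current)
current = apply(desired, current)
second_plan = plan(desired, current)  # nothing left to do after converging

print(first_plan)
print(second_plan)
```

The first plan lists an update, a create, and a delete; the second plan is empty because the environment already matches the declaration, which is the idempotent behavior described above.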

Listed below are examples of Infrastructure as Code tools, along with some introductory information about each.

Terraform

Terraform configurations are written in HashiCorp Configuration Language (HCL), a human-readable, declarative configuration language. There are providers for each of the major cloud services. Terraform is stateful: it records the resources it manages in a state file, which lets it track changes across builds and deployments. You can configure, run, and manage the state and processes within your own ecosystem, or use Terraform Cloud, which provides a hosted backend to manage state on HashiCorp’s platform. Terraform also has a CDK for building more complex configurations in TypeScript. The basic steps include:

  • Installation
  • Authentication
  • Write configs
  • Initialize, format, validate
  • Deploy

Code looks something like this when built:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.16"
    }
  }

  required_version = ">= 1.2.0"
}

provider "aws" {
  region  = "us-west-2"
}

resource "aws_instance" "app_server" {
  ami           = "ami-830c94e3"
  instance_type = "t2.micro"

  tags = {
    Name = "ExampleAppServerInstance"
  }
}
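
The initialize, format, validate, and deploy steps listed earlier correspond to the `terraform init`, `terraform fmt`, `terraform validate`, and `terraform apply` subcommands. As a rough sketch, assuming the Terraform CLI is installed and the configuration lives in a hypothetical `./infra` directory, a small Python wrapper could run them in order. The `run_workflow` helper and its `dry_run` flag are illustrative names, not part of Terraform.

```python
import subprocess

# Subcommands run in the order the workflow describes.
TERRAFORM_STEPS = [
    ["terraform", "init"],      # download providers, set up the backend
    ["terraform", "fmt"],       # rewrite files into canonical formatting
    ["terraform", "validate"],  # check the configuration for errors
    ["terraform", "apply"],     # show the plan, ask for approval, then deploy
]


def run_workflow(workdir, dry_run=True):
    """Run each step in order, stopping at the first failure.

    With dry_run=True the commands are only collected, not executed.
    """
    commands = []
    for step in TERRAFORM_STEPS:
        commands.append(" ".join(step))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code
            subprocess.run(step, cwd=workdir, check=True)
    return commands


# "./infra" is a hypothetical directory holding the .tf files
for command in run_workflow("./infra"):
    print(command)
```

By default the wrapper only prints the commands; passing `dry_run=False` would actually invoke Terraform in the given directory.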

Pulumi

Pulumi supports common programming languages like Python, JavaScript, and TypeScript. It works with the major providers and can build on existing Terraform providers. Pulumi manages the state for you and works like a Software Development Kit out of the box. Basic steps look like this:

  • Install
  • Select runtime
  • Authenticate
  • Create project scaffold
  • Write configs
  • Deploy

The code looks something like this when built:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as awsx from "@pulumi/awsx";

// Create an AWS resource (S3 Bucket)
const bucket = new aws.s3.Bucket("my-bucket");

// Export the name of the bucket
export const bucketName = bucket.id;

Cloud Native Solutions

Each cloud provider supports its own native solution, which lets you write infrastructure code within the same ecosystem as the infrastructure you are building. These solutions are most beneficial when you work in one cloud provider and don’t need to learn a cloud-agnostic tool to build against multiple cloud infrastructure providers.

AWS CDK

AWS CDK builds on AWS CloudFormation, letting developers define infrastructure in languages they are comfortable with, like Python or TypeScript. This goes beyond what plain JSON/YAML CloudFormation templates offer and makes more complex environments easier to create. With AWS CDK, the general steps are:

  • Install AWS CLI, Node.js, an IDE, and CDK Toolkit
  • Authenticate
  • Initialize, bootstrap
  • Write, synth, deploy

Sample code looks like this when you first start writing:

import * as cdk from 'aws-cdk-lib';
import * as sns from 'aws-cdk-lib/aws-sns';
import * as subs from 'aws-cdk-lib/aws-sns-subscriptions';
import * as sqs from 'aws-cdk-lib/aws-sqs';

export class CdkWorkshopStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const queue = new sqs.Queue(this, 'CdkWorkshopQueue', {
      visibilityTimeout: cdk.Duration.seconds(300)
    });

    const topic = new sns.Topic(this, 'CdkWorkshopTopic');

    topic.addSubscription(new subs.SqsSubscription(queue));
  }
}
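
The synth step turns a CDK program into a CloudFormation template. As a simplified illustration of that idea, the Python sketch below builds a template dictionary by hand for the same queue, topic, and subscription. The `build_template` function and the logical IDs (`Queue`, `Topic`, `Subscription`) are made up for the example, and real `cdk synth` output differs: it generates its own logical IDs and also includes a queue policy and metadata.

```python
import json


def build_template(visibility_timeout_seconds):
    """Return a CloudFormation-style template as a plain dictionary."""
    return {
        "Resources": {
            "Queue": {
                "Type": "AWS::SQS::Queue",
                "Properties": {"VisibilityTimeout": visibility_timeout_seconds},
            },
            "Topic": {"Type": "AWS::SNS::Topic"},
            "Subscription": {
                "Type": "AWS::SNS::Subscription",
                "Properties": {
                    "Protocol": "sqs",
                    # Ref on a topic resolves to the topic ARN
                    "TopicArn": {"Ref": "Topic"},
                    "Endpoint": {"Fn::GetAtt": ["Queue", "Arn"]},
                },
            },
        }
    }


template = build_template(300)
print(json.dumps(template, indent=2))
```

Writing the template by hand like this is what the JSON/YAML approach amounts to; the CDK code above expresses the same resources in a few lines and fills in the wiring between them.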

Azure Bicep

Bicep is a declarative, configuration-like language that is simpler to read than the JSON ARM templates Azure supports. Code is typically written in VS Code with the Bicep extension. The general steps are:

  • Install VSCode, Bicep Extension
  • Install Azure CLI, Azure PowerShell
  • Install Bicep
  • Write Bicep
  • Deploy

A Bicep file looks something like this:

param location string = resourceGroup().location
param storageAccountName string = 'toylaunch${uniqueString(resourceGroup().id)}'

resource storageAccount 'Microsoft.Storage/storageAccounts@2021-06-01' = {
  name: storageAccountName
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
  properties: {
    accessTier: 'Hot'
  }
}

Google Cloud Deployment Manager

Deployment Manager configurations are written in declarative YAML. The general steps are:

  • Configure cloud account
  • Enable API
  • Install gcloud
  • Write configs
  • Deploy

A configuration looks something like this:

resources:
- name: vm-created-by-deployment-manager
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/n1-standard-1
    disks:
    - deviceName: boot
      type: PERSISTENT
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-9
    networkInterfaces:
    - network: global/networks/default

Where to Implement IaC

There’s no simple way to mock or replicate a cloud environment on a local laptop. So it’s important to have a robust and repeatable way to safely spin up sandbox accounts to test cloud capabilities. Within those accounts, IaC can be written to replicate what is often done by hand in the console. From that point forward, as the code is promoted toward production like a standard application deployment, controls should tighten so that higher environments are changed only through IaC and not manually by humans. Manual changes will either be overwritten by the next IaC deployment or conflict with it.
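
One way to picture the conflict between manual changes and IaC is a drift check that compares what the code declares with what is actually running. The sketch below uses a hypothetical `find_drift` helper, not any tool's real API; in practice the live settings would come from the cloud provider, and tools such as Terraform report this kind of difference when they plan a deployment.

```python
def find_drift(declared, live):
    """Return {setting: (declared_value, live_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(declared) | set(live)):
        declared_value = declared.get(key)
        live_value = live.get(key)
        if declared_value != live_value:
            drift[key] = (declared_value, live_value)
    return drift


# Settings as written in the IaC code (values borrowed from the Terraform example)
declared = {"instance_type": "t2.micro", "Name": "ExampleAppServerInstance"}
# Hypothetical live settings after someone resized the instance in the console
live = {"instance_type": "t2.large", "Name": "ExampleAppServerInstance"}

print(find_drift(declared, live))
```

A non-empty result in a higher environment is a signal that someone bypassed the pipeline; the next IaC deployment would typically revert the change.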