The title of section 7.7 in the book references an Azure resource group, but we're working with AWS, so let's do it our way! AWS has resource groups too, so we can do the same thing, but there are a few differences in this example. We'll need to use the AWS provider instead of the Azure provider, connect to our AWS account, and fill out a bit more information for our resource group than Azure requires.
To use the AWS provider, let's check out the Terraform docs. Here we see we can set up the provider with some simple code like below:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-2"
}
Let's replace our providers.tf with this. If you remember from previous entries, I'll be using the us-east-2 region for this series. If you're using a different region, make sure to update it here.
To connect our AWS account, we can just reference the credentials in our .aws directory. If you don't have credentials in your .aws directory, you can follow along with the beginning of part one of this series here. If you have multiple profiles defined in your .aws directory like I do, you can specify that here too. Let's update our AWS provider code:
provider "aws" {
region = "us-east-2"
shared_credentials_files = ["/Users/YOUR USER NAME/.aws/credentials"]
profile = "YOUR PROFILE (if necessary)"
}
The file path above for credentials will work for Linux/Mac users. If you're using Windows, try C:\Users\USERNAME\.aws\credentials
Now we just have to update our resource group. If you look at the Terraform docs for creating a resource group, you'll see there are some extra required fields compared to the Azure provider. We can fill these out with a resource query that allows all supported AWS resource types and only includes resources tagged with the key Stage and the value Test, using the code below.
resource "aws_resourcegroups_group" "flixtube" {
name = "flixtube"
description = "resource group for flixtube"
resource_query {
query = jsonencode({
ResourceTypeFilters = ["AWS::AllSupported"],
TagFilters = [{
"Key" : "Stage",
"Values" : ["Test"]
}]
})
}
}
And that's all for example 1! Try running terraform init and then terraform apply and see if this creates a new resource group in your account. Since we're just getting used to provisioning infrastructure with code, let's check the AWS Console to confirm. We can do this by navigating to the Resource Groups & Tag Editor service in the console, where we should see something like this
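If you prefer the terminal over the console, the AWS CLI can confirm the same thing. This is just an optional sanity check, not a step from the book:

aws resource-groups list-groups --region us-east-2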
You can also see the API call Terraform made for you by navigating to CloudTrail in your AWS Console. CloudTrail is a service that logs API calls made in your AWS account by default. If you click through your events, you should find something that looks like this
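You can query CloudTrail from the CLI as well. The event name for creating a resource group should be CreateGroup, and keep in mind that CloudTrail events can take a few minutes to show up:

aws cloudtrail lookup-events \
    --region us-east-2 \
    --lookup-attributes AttributeKey=EventName,AttributeValue=CreateGroup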
Once you see your resource group in your account, make sure to follow along in the book and run terraform destroy, then confirm that your resource group gets removed. This will help us with the next examples, but it also shows one of the biggest benefits of using Terraform: how easy it is to take down multiple resources.
Great! Now let's move on to example 2. This example has us creating a container registry. This step is quite easy with AWS! According to the docs, we just need to provide a name for the ECR repository. While we're at it, we should add the tags needed to include our repository in the resource group.
The example in the book also has us outputting the registry hostname, username, and password. As mentioned in the book, outputs are useful for debugging, but you should avoid outputting sensitive values. Azure's container registry has a static username and password that can be used as an output, but AWS doesn't use a static username and password to authenticate to ECR. Instead, AWS uses temporary tokens, as we've used in previous entries. So for our AWS version of this chapter, let's simply output the repository URL to be sure that our resource was deployed as expected and so we understand how outputs work. Experimenting with the sensitive = true argument is good practice too. This leaves our code looking like this:
resource "aws_ecr_repository" "flixtube" {
name = "flixtube"
Stage = "Test"
}
}
output "repository_url" {
value = aws_ecr_repository.flixtube.repository_url
}
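If you want to see what sensitive = true actually does, here's a minimal sketch for practice (registry_id is just your AWS account ID, so it isn't truly secret; this is only to see the mechanics). With this in place, Terraform hides the value in the plan and apply output:

output "registry_id" {
  value     = aws_ecr_repository.flixtube.registry_id
  sensitive = true
}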
Since we're in a new directory from the last example, we'll need to run terraform init again. Then we should be able to run terraform apply and see a new repository in the console if we navigate to ECR! We should also see this repo as part of the resource group, as shown below.
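Alternatively, a quick CLI check should show the new repository too:

aws ecr describe-repositories --repository-names flixtube --region us-east-2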
That was easy! Now let's run terraform destroy and move on to example 3.
Now it's time for what we've all been waiting for: deploying a Kubernetes cluster! Before we go any further, remember from part 3 that this will cost us a little bit of money, but if we do everything right, it should be around 12 cents. If we don't destroy our resources right after we're done with them, though, the cost can skyrocket, so make sure not to leave anything running!
To launch our cluster, we'll need to define an IAM role for our cluster, create our cluster, define an IAM role for our node group, and create our node group. It's important that we do these steps in that order.
Let's start by defining our cluster's IAM role. There are a few ways we can write the trust policy: heredoc syntax, jsonencode, or an aws_iam_policy_document data source. The first two have us embedding an arbitrary string in our file that Terraform can't validate, so if we had any typos, we wouldn't get any sort of meaningful error message. Defining the policy document as a data source, however, gives Terraform the ability to catch typos when we run terraform validate, so let's go ahead and choose that route.
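For comparison, here's roughly what the jsonencode approach would look like (just a sketch to show the trade-off; we won't be using it). The trust policy is built as a plain string of JSON keys and values, so a typo in an action or key name wouldn't surface until AWS rejects it at apply time:

resource "aws_iam_role" "example_cluster_role" {
  name = "example_eks_cluster_role"

  # The policy below is an arbitrary string as far as Terraform is concerned.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "eks.amazonaws.com" }
      Action    = "sts:AssumeRole" # a typo here would only fail at apply time
    }]
  })
}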
Let's create a new file for our role called eks-role.tf. We'll start by defining an IAM policy document to allow EKS to assume an IAM role.
data "aws_iam_policy_document" "assume_role_cluster" {
statement {
effect = "Allow"
principals {
type = "Service"
identifiers = ["eks.amazonaws.com"]
}
actions = ["sts:AssumeRole"]
}
}
Now, let's create the IAM role for our cluster.
resource "aws_iam_role" "flixtube_eks_role" {
name = "flixtube_eks_cluster_role"
assume_role_policy = data.aws_iam_policy_document.assume_role_cluster.json
}
Lastly, let's attach the policy to the role.
resource "aws_iam_role_policy_attachment" "eks_cluster_policy_attachment" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.flixtube_eks_role.name
}
This role has the same managed policy we used in part 3, just in code form. By now, hopefully you've noticed that we're doing all of the same steps as part 3. So we now have the role for our cluster, but we need to create the cluster itself. When we created the cluster in part 3, we decided to use all available subnets in our default VPC for ease of launch. With Terraform, we'll first need to write code to get information about our VPC and subnets. This is as easy as getting the default VPC with
data "aws_vpc" "default_vpc" {
default = true
}
then getting the subnets associated with the default VPC with
data "aws_subnets" "subnets" {
filter {
name = "vpc-id"
values = [data.aws_vpc.default_vpc.id]
}
}
Now we have all the information we need to launch our cluster. Let's create a new file for our cluster called kubernetes-cluster.tf and use the following code
resource "aws_eks_cluster" "flixtube" {
name = var.app_name
version = var.kubernetes_version
tags = {
Stage = "Test"
}
vpc_config {
subnet_ids = data.aws_subnets.subnets.ids
}
role_arn = aws_iam_role.flixtube_eks_role.arn
depends_on = [aws_iam_role_policy_attachment.eks_cluster_policy_attachment]
}
There are some key things to notice in the code here. First, notice how the subnet IDs are referenced from the aws_subnets data source we defined above. You'll also see that the IAM role we defined earlier gets referenced by its ARN here. Most importantly, though, look at the array for depends_on. This references the action of attaching our policy to our role and ensures that the role has the appropriate policy attached before our cluster is launched.
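One thing the snippet above assumes is that var.app_name and var.kubernetes_version are declared somewhere. As per the book, we'll supply app_name when we run terraform apply, but the declarations still need to exist, e.g. in a variables.tf that looks roughly like this (the default version below is just a placeholder; use whichever version you ran in part 3):

variable "app_name" {
  description = "Name of the application, also used as the EKS cluster name"
}

variable "kubernetes_version" {
  description = "Kubernetes version for the EKS cluster"
  default     = "1.28" # placeholder: match the version you used in part 3
}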
Awesome! We've now written code to create our cluster. We just need to create a node group. Again, we'll need to start by defining an IAM role, this time for our worker nodes. Create a file called worker-node-role.tf and use this code to create the IAM role
data "aws_iam_policy_document" "assume_role_node" {
statement {
effect = "Allow"
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
actions = ["sts:AssumeRole"]
}
}
resource "aws_iam_role" "flixtube_node_role" {
name = "flixtube_node_role"
assume_role_policy = data.aws_iam_policy_document.assume_role_node.json
}
resource "aws_iam_role_policy_attachment" "ecr_readonly_policy_attachment" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
role = aws_iam_role.flixtube_node_role.name
}
resource "aws_iam_role_policy_attachment" "eks_cni_policy_attachment" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
role = aws_iam_role.flixtube_node_role.name
}
resource "aws_iam_role_policy_attachment" "eks_worker_node_policy_attachment" {
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
role = aws_iam_role.flixtube_node_role.name
}
You should notice that this is using the same policies as part 3. One note I forgot to mention last time is that the read-only ECR policy gives all of your nodes permission to read every repository in your account. To follow the principle of least privilege, you can create a custom policy that only allows your nodes to read from the ARN of your desired ECR repo. Since we're just launching quickly here, we'll be fine with the managed policy, but be sure to keep that in mind when creating any clusters in the real world.
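If you do want to scope this down, a rough sketch of such a policy document might look like the following (illustrative only; it assumes you look up the repository with a data source, and the action list isn't exhaustive). You'd then create an aws_iam_policy from this document and attach that instead of the managed AmazonEC2ContainerRegistryReadOnly policy:

# Hypothetical scoped-down alternative to AmazonEC2ContainerRegistryReadOnly.
data "aws_ecr_repository" "flixtube" {
  name = "flixtube"
}

data "aws_iam_policy_document" "ecr_read_flixtube" {
  statement {
    effect = "Allow"
    actions = [
      "ecr:GetDownloadUrlForLayer",
      "ecr:BatchGetImage",
      "ecr:BatchCheckLayerAvailability"
    ]
    resources = [data.aws_ecr_repository.flixtube.arn]
  }

  statement {
    effect    = "Allow"
    actions   = ["ecr:GetAuthorizationToken"]
    resources = ["*"] # GetAuthorizationToken doesn't support resource-level restrictions
  }
}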
For the final step of launching our cluster, we'll just need to define our node group. Create one last file called node-group.tf and use this code
resource "aws_eks_node_group" "flixtube" {
cluster_name = var.app_name
node_group_name = "flixtube"
node_role_arn = aws_iam_role.flixtube_node_role.arn
subnet_ids = data.aws_subnets.subnets.ids
instance_types = ["t3.small"]
tags = {
Stage = "Test"
}
scaling_config {
desired_size = 1
max_size = 2
min_size = 1
}
update_config {
max_unavailable = 1
}
depends_on = [
aws_eks_cluster.flixtube,
aws_iam_role_policy_attachment.ecr_readonly_policy_attachment,
aws_iam_role_policy_attachment.eks_cni_policy_attachment,
aws_iam_role_policy_attachment.eks_worker_node_policy_attachment
]
}
There shouldn't be anything groundbreaking in here. This is simply all of the configuration we used in part 3 in code form. A few things to notice are how we can reuse the subnet IDs defined in our kubernetes-cluster.tf file, how we can set the cluster name with a variable we'll define when we run terraform apply (as per the book), and that the depends_on array includes all of the role attachments PLUS the creation of the cluster. Can't have a node group without a cluster!
Okay, we've done a lot so far and we're almost home free. We now have all of the Terraform code we need to launch our EKS cluster. If you haven't done so already, let's run terraform init and then see what we'll provision with terraform plan. You should see something like this for the first 2 of 10 resources you'll provision.
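Note that terraform plan (and later terraform apply) will prompt you for var.app_name interactively. If you'd rather not type it each time, Terraform's -var flag lets you pass the value on the command line instead, for example:

terraform plan -var="app_name=flixtube"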
If all looks good, run terraform apply and sit back and watch the magic happen. Terraform should ask you for a value for var.app_name; make sure to input flixtube. You should then confirm that you want to create 10 resources, but remember, this will cost a little bit of money (~12 cents), so be sure to destroy your resources as soon as you're done. Just as with part 3, we'll need to wait for our cluster and our node group to spin up, and each can take up to 10 minutes. This time you can take one long snack break instead of two smaller ones. Once this process is done running, look around in the AWS console and see that everything is up and running as expected. You should also see an endpoint for your cluster and a URL for your ECR repository. If it's looking good, then we can start with our final test: deploying our application!
Navigate to example-4 in my companion repo and you should see an updated deploy.yaml script. Make sure to fill in your AWS account ID for the image URL and try deploying! These are the same steps as in part 3, but our application is now called "flixtube" instead of "video-streaming." Guess our marketing team finally came up with a good name.
As a refresher, here are the commands you'll have to run. Make sure you're at the root of example-4. First, run
aws eks update-kubeconfig \
    --region us-east-2 \
    --name flixtube
to update kubectl to connect to our cluster through the AWS CLI. Then run
docker buildx build --platform=linux/amd64 -t flixtube:1 --file Dockerfile-prod .
if you have an ARM processor or
docker build -t flixtube:1 --file Dockerfile-prod .
if you have an x86 processor to build our image. Then, run
aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin YOUR_ACCOUNT_ID.dkr.ecr.us-east-2.amazonaws.com
to log in to our ECR repo and
docker tag flixtube:1 YOUR_ACCOUNT_ID.dkr.ecr.us-east-2.amazonaws.com/flixtube:1
to tag our built image. Finally, run
docker push YOUR_ACCOUNT_ID.dkr.ecr.us-east-2.amazonaws.com/flixtube:1
to push our built and tagged image to our ECR repo then
kubectl apply -f scripts/deploy.yaml
to run our deployment script. That should do it!
You should now be able to run kubectl get pods to see your pods and kubectl get services to get the URL of your load balancer. Copy the URL of your load balancer, navigate to /video, and you should get the familiar but nevertheless glorious dinosaur video.
And that's it! Congratulations on deploying a microservices application to an EKS cluster you created entirely with code! When you're done poking around and experimenting with the result, make sure to run kubectl delete -f scripts/deploy.yaml to delete the deployment from our cluster, delete the image in our ECR repo, and most importantly, navigate back to example-3 and run terraform destroy to get rid of all these resources before they start racking up charges to your account. I recommend double-checking in your console under EKS and EC2 to make sure there are no services still running.
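For reference, the whole teardown boils down to something like this (the relative path to example-3 and the image tag are assumptions based on my repo layout, so adjust as needed):

# remove the Kubernetes deployment and its load balancer service
kubectl delete -f scripts/deploy.yaml

# delete the pushed image so the ECR repository can be destroyed cleanly
aws ecr batch-delete-image \
    --repository-name flixtube \
    --image-ids imageTag=1 \
    --region us-east-2

# tear down the cluster, node group, roles, repository, and resource group
cd ../example-3
terraform destroy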
I hope you found this helpful!