AWS CloudFormation StackSet Orchestration: Automated deployment using AWS Step Functions
We often use AWS CloudFormation StackSets to automatically deploy infrastructure into many different accounts. Whether they are managed by AWS Control Tower or AWS Organizations, StackSets provide a simple and automated way to handle the creation of resources and infrastructure right after provisioning a new account.
You can automatically deploy StackSets to accounts that belong to one or many organizational units in AWS Organizations. This workflow is not suitable for every use-case, however, especially when you must override some parameters of the StackSets, depending on the target account.
To solve this problem, we have created a solution that allows you to automatically deploy StackSet instances into specific accounts. This solution uses Amazon S3, AWS Step Functions, and YAML configuration files. We use this implementation because it allows us to specify the StackSet deployment configuration of our accounts as source code files. This is a characteristic of the infrastructure as code paradigm, and it fits well with the automation side of DevOps culture. If you are interested, read on!
Overview
Here is an architectural diagram of this implementation:
The workflow is triggered as soon as an authorized IAM user pushes a specially formatted YAML file to an Amazon S3 bucket. This file specifies an AWS account number and a list of StackSets to deploy into it. For more information about the structure of this file, check the usage section later in this post. An authorized user is a user who is authorized to assume an IAM role that allows the PutObject action on the previously mentioned S3 bucket.
On object creation, Amazon S3 sends a notification to an AWS Lambda function. This function fetches the YAML file, parses its contents, and triggers an AWS Step Functions state machine, sending the YAML file contents as input. The state machine then creates all the specified StackSets on the target account.
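The first half of that Lambda function can be sketched in a few lines. This is a minimal illustration, not the code from the repository: the function names and the shape of the state machine input are assumptions. A real handler would additionally fetch the object with the S3 `get_object` call, parse it with a YAML library, and pass the JSON-encoded result to the Step Functions `start_execution` call.

```python
from urllib.parse import unquote_plus


def extract_s3_location(event):
    """Return (bucket, key) for the first record of an S3 notification event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 notifications (a space becomes '+').
    key = unquote_plus(record["object"]["key"])
    return bucket, key


def build_execution_input(config):
    """Validate the parsed YAML document and shape it as state machine input.

    The output layout is hypothetical; it mirrors the YAML file structure.
    """
    if "account" not in config or not config.get("stacksets"):
        raise ValueError("config needs an 'account' and a non-empty 'stacksets' list")
    return {
        # Account IDs are kept as strings so leading zeros are not lost.
        "account": str(config["account"]),
        "terminate": bool(config.get("terminate", False)),
        "stacksets": [
            {"name": s["name"], "parameters": s.get("parameters", {})}
            for s in config["stacksets"]
        ],
    }


sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "Application1-Dev-Account.yaml"},
            }
        }
    ]
}
bucket, key = extract_s3_location(sample_event)
```

Keeping the parsing and validation in small pure functions like these makes the handler easy to test without any AWS access.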
This workflow can also be used to automatically delete all of the StackSet instances specified in the file for a specific account. For information about how to delete the StackSet instances, check the usage section later in this post.
The following diagram shows the state machine definition:
This state machine uses the Map state to iterate over the list of StackSet instances to be created.
The CreateUpdateDeleteStackInstances and VerifyStackInstanceStatus states are tasks that are backed by AWS Lambda functions. The IsStackSetInstanceReady state directs the workflow to different states, depending on the previous function output.
The CreateUpdateDeleteStackInstances state creates, updates, or deletes a StackSet instance from the specified account, according to the configuration specified in the YAML file. The next step in the state machine is the VerifyStackInstanceStatus state.
The VerifyStackInstanceStatus state checks the status of the StackSet instance. If it is OUTDATED, and there is an error message in the StatusReason field of the StackSet instance, the state machine fails. In all other cases, the Lambda function sends the StackSet instance status to the IsStackSetInstanceReady state.
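The check described above could look roughly like the following. It is a sketch under assumptions: the function and exception names are hypothetical, and the input is assumed to be the `StackInstance` dictionary returned by the CloudFormation `describe_stack_instance` call, which carries `Status` and, when relevant, `StatusReason`.

```python
class StackInstanceFailed(Exception):
    """Raised when a stack instance is OUTDATED and reports an error message."""


def verify_stack_instance(instance):
    """Return the instance status, or raise if the deployment has failed.

    `instance` is assumed to be the 'StackInstance' dictionary from a
    describe_stack_instance response.
    """
    status = instance["Status"]
    reason = instance.get("StatusReason")
    if status == "OUTDATED" and reason:
        # An unhandled exception fails the Lambda task, which in turn
        # fails the state machine execution.
        raise StackInstanceFailed(reason)
    # Every other case is handed to the next state for a decision.
    return {"status": status}
```

An OUTDATED instance without a status reason is still passed on, because an operation that is in progress also reports OUTDATED until it completes.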
The IsStackSetInstanceReady state redirects the workflow to the VerifyStackInstanceStatus state when the StackSet instance is not yet in a CURRENT state or to the Done state when it is finished.
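To make the loop concrete, here is one way the states named above could be wired together in Amazon States Language, built as a Python dictionary. This is an illustrative sketch, not the definition from the repository: the outer state name `DeployStackSets`, the JSON paths (`$.stacksets`, `$.account`, `$.terminate`, `$.status`), and the Lambda ARNs are assumptions, and it assumes each task passes along the fields that the next state needs. The deletion path is not modeled.

```python
import json


def build_definition(create_fn_arn, verify_fn_arn):
    """Build a hypothetical state machine definition for the described loop."""
    per_stackset_states = {
        "CreateUpdateDeleteStackInstances": {
            "Type": "Task",
            "Resource": create_fn_arn,
            "Next": "VerifyStackInstanceStatus",
        },
        "VerifyStackInstanceStatus": {
            "Type": "Task",
            "Resource": verify_fn_arn,
            "Next": "IsStackSetInstanceReady",
        },
        "IsStackSetInstanceReady": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.status", "StringEquals": "CURRENT", "Next": "Done"}
            ],
            # Anything other than CURRENT goes back for another status check.
            "Default": "VerifyStackInstanceStatus",
        },
        "Done": {"Type": "Succeed"},
    }
    return {
        "StartAt": "DeployStackSets",
        "States": {
            "DeployStackSets": {
                "Type": "Map",
                "ItemsPath": "$.stacksets",
                # Give every iteration the account plus one StackSet entry.
                "Parameters": {
                    "account.$": "$.account",
                    "terminate.$": "$.terminate",
                    "stackset.$": "$$.Map.Item.Value",
                },
                # Newer definitions use "ItemProcessor" for the same purpose.
                "Iterator": {
                    "StartAt": "CreateUpdateDeleteStackInstances",
                    "States": per_stackset_states,
                },
                "End": True,
            }
        },
    }


definition_json = json.dumps(
    build_definition("arn:aws:lambda:REGION:ACCOUNT:function:create",
                     "arn:aws:lambda:REGION:ACCOUNT:function:verify"),
    indent=2,
)
```

A Wait state between status checks is a common addition to a polling loop like this one, although the post does not describe one.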
Security considerations
From a security point of view, automating all of the deployment and deletion operations reduces the opportunity for manual error. The remaining risk surface is the upload of files to the S3 bucket. You can protect it with standard Amazon S3 security mechanisms, such as S3 bucket policies or IAM policies, or with another process of your choice.
This example uses an IAM role (StacksetAdministrator). This role is created with a trust relationship that allows an AWS principal specified as a parameter at deployment time to assume the role and put objects in the bucket.
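As a rough illustration of what such a role involves, the two policy documents might look like the following. This is a sketch, not the template from the repository: the helper names are hypothetical, and the actual role may grant more or different permissions.

```python
import json


def trust_policy(principal_arn):
    """Trust relationship: lets the given principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def put_object_policy(bucket_name):
    """Permissions policy: allows uploading objects to the input bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Example principal taken from the deployment command in this post;
# the bucket name below is a placeholder.
trust_json = json.dumps(trust_policy("arn:aws:iam::123456789876:role/Admin"), indent=2)
permissions_json = json.dumps(put_object_policy("example-input-bucket"), indent=2)
```

Scoping the permissions policy to `s3:PutObject` on a single bucket keeps the role limited to the one action the workflow needs from a human user.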
Prerequisites
To use this solution, you need the following:
- This implementation uses the AWS Serverless Application Model (AWS SAM) to deploy the required infrastructure. Be sure to install the AWS SAM CLI if you want to deploy the code. Follow the recommended installation procedure appropriate for your operating system.
- Docker is required for the AWS SAM build (--use-container), which packages the Lambda functions locally in a Docker container. This container mimics the Lambda environment, so the Lambda functions are in the right format when you deploy them to the AWS Cloud.
- An S3 bucket is used by the AWS SAM CLI to upload the Lambda packages that are used to provision the Lambda functions. This bucket is different from the bucket that contains the YAML configuration files. That bucket is created by the AWS CloudFormation template at deployment time.
- Lastly, make is used to deploy the resources, wrapping the AWS SAM CLI commands. If you do not have make on your workstation, or you do not want to install it, you can run the commands that are specified inside of the makefile manually.
Deployment
To deploy the solution, follow these steps:
# Clone the repository
git clone https://github.com/aws-samples/aws-cloudformation-stackset-orchestration
# Move to the repository's directory
cd aws-cloudformation-stackset-orchestration
Be sure to configure your AWS credentials before running the next step. The next step packages all of the Lambda functions and uploads them to the specified S3 bucket.
# Use the deployment target on the provided Makefile, provide your own bucket name and IAM principal
make deploy s3_bucket=yourbucketname stackset_administrator_principal=arn:aws:iam::123456789876:role/Admin
An IAM role named StacksetAdministrator is created with this deployment. The role allows IAM users to push objects to the S3 bucket. If you are not assuming this role, be sure that the principal you are using to push objects has sufficient permissions on the S3 bucket. Otherwise, you will get a permissions error. The S3 bucket is named stackset-orchestration-bucket-${AWS::AccountId}.
After the deployment is complete, you can use the solution.
Usage
First, you need a StackSet to test the solution. We have provided a sample, vpc.yaml, in the “stackset-examples” directory.
To create the StackSet, follow these steps:
# Create the example VPC StackSet
aws cloudformation create-stack-set --stack-set-name vpc \
  --template-body file://stackset-examples/vpc.yaml \
  --administration-role-arn arn:aws:iam::xxxxxxxxxxxx:role/AWSCloudFormationStackSetAdministrationRole \
  --execution-role-name AWSCloudFormationStackSetExecutionRole
Be sure to specify the administration role and the execution role.
You need a YAML file for each account that you are going to process. Although you can give the file any name, we suggest you name it according to the account’s name, usage, or purpose (for example, Application1-Dev-Account.yaml). After the file is created, you can upload it to the input S3 bucket, stackset-orchestration-bucket-${AWS::AccountId}.
The YAML file must be structured like so:
---
account: '123456789876'
stacksets:
  - name: vpc
    parameters:
      CidrBlock: '10.0.0.0/24'
      EnableDnsHostnames: "true"
This example has one StackSet in the stacksets section, but you can add as many as you like.
In this example, the vpc StackSet is deployed into the 123456789876 account. The CidrBlock parameter is overridden to 10.0.0.0/24, and the EnableDnsHostnames parameter is overridden to true. All other parameters keep their default values.
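CloudFormation expects parameter overrides for stack instances as a list of `ParameterKey` and `ParameterValue` pairs, so the parameters mapping from the file has to be converted at some point. A minimal sketch of that conversion follows; the function name is hypothetical and this is not necessarily how the repository does it.

```python
def to_parameter_overrides(parameters):
    """Convert a {name: value} mapping into CloudFormation's override list."""
    overrides = []
    for key, value in parameters.items():
        if isinstance(value, bool):
            # An unquoted YAML true/false is parsed as a boolean; templates
            # usually expect the lowercase strings "true" and "false".
            text = "true" if value else "false"
        else:
            # Parameter values are always sent to the API as strings.
            text = str(value)
        overrides.append({"ParameterKey": key, "ParameterValue": text})
    return overrides


overrides = to_parameter_overrides(
    {"CidrBlock": "10.0.0.0/24", "EnableDnsHostnames": "true"}
)
```

Parameters that are left out of the list keep the values defined on the StackSet, which matches the behavior described above.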
You can also delete the provisioned StackSet instances by setting the terminate field in the YAML file to True as follows:
---
account: '123456789876'
terminate: True
stacksets:
  - name: vpc
    parameters:
      CidrBlock: '10.0.0.0/24'
      EnableDnsHostnames: "true"
This deletes the VPC StackSet instance from the 123456789876 account.
Cleanup
To delete all of the resources created in the Deployment and Usage sections, follow these steps:
# Delete the StackSet instances
aws cloudformation delete-stack-instances --stack-set-name vpc --regions <AWS_REGION> --accounts 123456789876 --no-retain-stacks
# Delete the StackSet
aws cloudformation delete-stack-set --stack-set-name vpc
# Delete the account file
aws s3 rm s3://stackset-orchestration-bucket-<AWS_ACCOUNT_NUMBER>/<ACCOUNT_FILENAME>.yaml
# Delete the resources created by AWS CloudFormation
aws cloudformation delete-stack --stack-name stackset-orchestration
Caveats
Be mindful of AWS CloudFormation API throttling and StackSet provisioning limits. If you make too many simultaneous AWS CloudFormation API requests, the requests get throttled or eventually result in a server error. If you try to create more than one instance of the same StackSet simultaneously, you will also run into an error because operations on a StackSet run sequentially.
These problems are solved by using retries with exponential backoff and jitter at the Step Functions and Lambda levels, respectively. We tested this by uploading 50 YAML files at once to the S3 bucket. The run time for 48 relatively simple to mildly complex StackSets is nearly 2 hours compared to 6–8 hours to deploy them manually.
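For readers unfamiliar with the technique, the following sketch shows exponential backoff with "full" jitter in plain Python. It illustrates the general idea only and is not the code from the repository; the function names, the default delays, and the `is_retryable` callback are assumptions.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    ceiling = min(cap, base * 2 ** attempt)
    # Randomizing the delay spreads out clients that failed at the same time.
    return rng.uniform(0.0, ceiling)


def call_with_retries(operation, is_retryable, max_attempts=5, sleep=time.sleep):
    """Call `operation`, retrying retryable errors with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as error:
            # Give up on the last attempt or on errors that retrying cannot fix.
            if attempt == max_attempts - 1 or not is_retryable(error):
                raise
            sleep(backoff_delay(attempt))
```

In practice, `operation` would wrap a CloudFormation API call and `is_retryable` would inspect the error for a throttling or operation-in-progress condition. Step Functions can apply a similar policy declaratively through the `Retry` field of a task state, using `IntervalSeconds`, `BackoffRate`, and `MaxAttempts`.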
Despite these workarounds, we suggest you avoid uploading these files in large batches. It’s rarely necessary and results in longer provisioning time.
Conclusion
In this post, we have shown you how automating the deployment of StackSets through Step Functions and Amazon S3 objects can vastly simplify the operations on your AWS accounts. You can manage all of the StackSet infrastructure of your accounts as code through the use of configuration files. Each file represents an account and contains detailed information, including which StackSets are deployed and the parameters used. This, in turn, allows you to simplify the deployment, updates, and deletion of resources.
We hope you find this solution useful. Feel free to contribute to the GitHub repository if you see any possible improvements.
See you next time!
About the Authors
Sebastian Caceres is a DevOps consultant in Professional Services at Amazon Web Services. He helps customers solve problems related to Automation, Cloud Architecture and Infrastructure, Development, Operations and Organizations.
Paul de Monchy is a Principal Solutions Architect at Amazon Web Services, focusing on Telco global accounts. He was previously part of the AWS Professional Services organization for 3 years, leading customer and partner engagements in Infrastructure and Architecture. Prior to joining AWS, he co-founded Spideo, a recommendation engine for video-on-demand and live TV content. He ran this company as its CTO for 7 years.