While containers have revolutionized how development teams package and deploy applications, these teams have had to carefully monitor releases and build custom tooling to mitigate deployment risks, which slows down shipping velocity. At scale, development teams spend valuable cycles building and maintaining undifferentiated deployment tools instead of innovating for their business.
Starting today, you can use the built-in blue/green deployment capability in Amazon Elastic Container Service (Amazon ECS) to make your application deployments safer and more consistent. This new capability eliminates the need to build custom deployment tooling while giving you the confidence to ship software updates more frequently with rollback capability.
Here’s how you can enable the built-in blue/green deployment capability in the Amazon ECS console.
You create a new “green” application environment while your existing “blue” environment continues to serve live traffic. After monitoring and testing the green environment thoroughly, you route the live traffic from blue to green. With this capability, Amazon ECS now provides built-in functionality that makes containerized application deployments safer and more reliable.
Below is a diagram illustrating how blue/green deployment works by shifting application traffic from the blue environment to the green environment. You can learn more at the Amazon ECS blue/green service deployments workflow page.
Amazon ECS orchestrates this entire workflow while providing event hooks to validate new versions using synthetic traffic before routing production traffic. You can validate new software versions in production environments before exposing them to end users and roll back near-instantaneously if issues arise. Because this functionality is built directly into Amazon ECS, you can add these safeguards by simply updating your configuration without building any custom tooling.
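To make the traffic flow concrete, here is a toy model of the workflow in Python. It is only an illustration: the `Deployment` class, its `run` method, and the percentage weights are names I made up for this sketch and do not correspond to any Amazon ECS API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Deployment:
    """Toy model of a blue/green deployment (hypothetical, not an ECS API)."""
    # Share of traffic, in percent, that each environment receives.
    production: dict = field(default_factory=lambda: {"blue": 100, "green": 0})
    test: dict = field(default_factory=lambda: {"blue": 100, "green": 0})
    log: list = field(default_factory=list)

    def run(self, validate_green: Callable[[], bool]) -> str:
        # 1. Green is scaled up next to blue; blue keeps all production traffic.
        self.log.append("green scaled up")
        # 2. Only test traffic is shifted to green.
        self.test = {"blue": 0, "green": 100}
        self.log.append("test traffic shifted to green")
        # 3. Validate green with synthetic traffic before touching production.
        if not validate_green():
            # Roll back: test traffic returns to blue, production never moved.
            self.test = {"blue": 100, "green": 0}
            self.log.append("validation failed, rolled back to blue")
            return "blue"
        # 4. Validation passed, so production traffic moves to green.
        self.production = {"blue": 0, "green": 100}
        self.log.append("production traffic shifted to green")
        return "green"


failed = Deployment()
print(failed.run(lambda: False), failed.production)

passed = Deployment()
print(passed.run(lambda: True), passed.production)
```

The point of the model is the ordering: validation runs against green while blue still holds all production traffic, so a failed check costs nothing for end users.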
Getting started
Let me walk you through a demonstration of how to configure and use blue/green deployments for an ECS service. Before that, I need to complete a few setup steps, including configuring AWS Identity and Access Management (IAM) roles. You can find these steps on the Required resources for Amazon ECS blue/green deployments documentation page.
For this demonstration, I want to deploy a new version of my application using the blue/green strategy to minimize risk. First, I need to configure my ECS service to use blue/green deployments. I can do this through the ECS console, AWS Command Line Interface (AWS CLI), or using infrastructure as code.
Using the Amazon ECS console, I create a new service and configure it as usual.
In the Deployment Options section, I choose ECS as the Deployment controller type, then Blue/green as the Deployment strategy. Bake time is the time after the production traffic has shifted to green, when instant rollback to blue is available. When the bake time expires, blue tasks are removed.
We’re also introducing deployment lifecycle hooks. These are event-driven mechanisms you can use to augment the deployment workflow. I can select which AWS Lambda function I’d like to use as a deployment lifecycle hook. The Lambda function can perform the required business logic, but it must return a hook status.
Amazon ECS supports the following lifecycle hooks during blue/green deployments. You can learn more about each stage on the Deployment lifecycle stages page.
- Pre scale up
- Post scale up
- Production traffic shift
- Test traffic shift
- Post production traffic shift
- Post test traffic shift
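The hook payload shown later in this post identifies its stage as `POST_TEST_TRAFFIC_SHIFT`. Assuming the other stages follow the same upper snake case pattern (an assumption on my part, so please confirm against the API reference), a small helper can convert the console labels above into identifiers. The function name `stage_identifier` is hypothetical.

```python
def stage_identifier(console_label: str) -> str:
    """Convert a console label such as 'Post test traffic shift' into the
    upper snake case form seen in hook payloads (assumed naming pattern)."""
    return "_".join(console_label.strip().upper().split())


# Labels exactly as they are listed in the console.
CONSOLE_LABELS = [
    "Pre scale up",
    "Post scale up",
    "Production traffic shift",
    "Test traffic shift",
    "Post production traffic shift",
    "Post test traffic shift",
]

for label in CONSOLE_LABELS:
    print(f"{label} -> {stage_identifier(label)}")
```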
For my application, I want to test when the test traffic shift is complete and the green service handles all of the test traffic. Since there’s no end-user traffic, a rollback at this stage will have no impact on users. This makes Post test traffic shift suitable for my use case as I can test it first with my Lambda function.
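If I made the same choices through an SDK or infrastructure as code instead of the console, the deployment settings might look roughly like the dictionary below. Treat this as a sketch and not a verified request: the field names (`strategy`, `bakeTimeInMinutes`, `lifecycleHooks`, `hookTargetArn`, `roleArn`, `lifecycleStages`) reflect my understanding of the ECS API and should be checked against the current API reference. The function name and the placeholder ARNs are hypothetical.

```python
import json


def build_deployment_configuration(hook_function_arn: str,
                                   hook_role_arn: str,
                                   bake_time_minutes: int = 5) -> dict:
    """Build a blue/green deployment configuration (field names assumed)."""
    if bake_time_minutes < 0:
        raise ValueError("bake time cannot be negative")
    return {
        "strategy": "BLUE_GREEN",
        # How long blue stays available for instant rollback after the shift.
        "bakeTimeInMinutes": bake_time_minutes,
        "lifecycleHooks": [
            {
                # Lambda function that validates the green revision.
                "hookTargetArn": hook_function_arn,
                # Role that allows ECS to invoke the function.
                "roleArn": hook_role_arn,
                # Run the hook once green handles all test traffic.
                "lifecycleStages": ["POST_TEST_TRAFFIC_SHIFT"],
            }
        ],
    }


# Placeholder ARNs, for illustration only.
config = build_deployment_configuration(
    hook_function_arn="<validation-lambda-function-arn>",
    hook_role_arn="<hook-invocation-role-arn>",
)
print(json.dumps(config, indent=2))
```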
Switching context for a moment, let’s focus on the Lambda function that I use to validate the deployment before allowing it to proceed. In my Lambda function as a deployment lifecycle hook, I can perform any business logic, such as synthetic testing, calling another API, or querying metrics.
Within the Lambda function, I must return a hookStatus. A hookStatus can be SUCCEEDED, which will move the process to the next step. If the status is FAILED, it rolls back to the blue deployment. If it’s IN_PROGRESS, then Amazon ECS retries the Lambda function in 30 seconds.
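The `IN_PROGRESS` status is useful when validation takes longer than a single invocation, because each retry can check on work that was started earlier. The sketch below maps the state of a long-running test job to a hook response. The job states (`queued`, `running`, `passed`) and the function name `hook_response` are hypothetical; only the three `hookStatus` values come from this post.

```python
def hook_response(job_state: str) -> dict:
    """Map the state of a (hypothetical) long-running test job to a hook response."""
    if job_state == "passed":
        # Validation finished and passed: let the deployment continue.
        return {"hookStatus": "SUCCEEDED"}
    if job_state in ("queued", "running"):
        # Not finished yet: ask Amazon ECS to invoke the hook again later.
        return {"hookStatus": "IN_PROGRESS"}
    # Failed or unknown states fail closed and trigger a rollback.
    return {"hookStatus": "FAILED"}


for state in ("queued", "running", "passed", "errored"):
    print(state, "->", hook_response(state)["hookStatus"])
```

Treating unknown states as `FAILED` is a deliberate choice in this sketch: a validation hook that cannot tell whether the tests passed should not let the deployment proceed.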
In the following example, I set up my validation with a Lambda function that performs file upload as part of a test suite for my application.
import json
import urllib3
import logging
import base64
import os

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

# Initialize HTTP client
http = urllib3.PoolManager()


def lambda_handler(event, context):
    """
    Validation hook that tests the green environment with file upload
    """
    logger.info(f"Event: {json.dumps(event)}")
    logger.info(f"Context: {context}")

    try:
        # In a real scenario, you would construct the test endpoint URL
        test_endpoint = os.getenv("APP_URL")

        # Create a test file for upload
        test_file_content = "This is a test file for deployment validation"
        test_file_data = test_file_content.encode('utf-8')

        # Prepare multipart form data for file upload
        fields = {
            'file': ('test.txt', test_file_data, 'text/plain'),
            'description': 'Deployment validation test file'
        }

        # Send POST request with file upload to /process endpoint
        response = http.request(
            'POST',
            test_endpoint,
            fields=fields,
            timeout=30
        )

        logger.info(f"POST /process response status: {response.status}")

        # Check if response has OK status code (200-299 range)
        if 200 <= response.status < 300:
            logger.info("File upload test passed - received OK status code")
            return {
                "hookStatus": "SUCCEEDED"
            }
        else:
            logger.error(f"File upload test failed - status code: {response.status}")
            return {
                "hookStatus": "FAILED"
            }

    except Exception as error:
        logger.error(f"File upload test failed: {str(error)}")
        return {
            "hookStatus": "FAILED"
        }
When the deployment reaches the lifecycle stage that is associated with the hook, Amazon ECS automatically invokes my Lambda function with deployment context. My validation function can run comprehensive tests against the green revision—checking application health, running integration tests, or validating performance metrics. The function then signals back to ECS whether to proceed or abort the deployment.
As I chose the blue/green deployment strategy, I also need to configure the load balancers and/or Amazon ECS Service Connect. In the Load balancing section, I select my Application Load Balancer.
In the Listener section, I use an existing listener on port 80 and select two Target groups.
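Outside the console, the two target groups would have to be expressed in the service's load balancer configuration. The sketch below shows how I would expect that entry to look. The field names under `advancedConfiguration` (`alternateTargetGroupArn`, `productionListenerRule`, `roleArn`) are my reading of the ECS API and should be verified before use. The function name and the placeholder values are hypothetical.

```python
import json


def build_load_balancer_entry(production_target_group_arn: str,
                              alternate_target_group_arn: str,
                              production_listener_rule_arn: str,
                              role_arn: str,
                              container_name: str,
                              container_port: int = 80) -> dict:
    """Build one load balancer entry for a blue/green service (field names assumed)."""
    if production_target_group_arn == alternate_target_group_arn:
        # Blue and green tasks must register in separate target groups.
        raise ValueError("blue/green needs two distinct target groups")
    return {
        "targetGroupArn": production_target_group_arn,
        "containerName": container_name,
        "containerPort": container_port,
        "advancedConfiguration": {
            "alternateTargetGroupArn": alternate_target_group_arn,
            "productionListenerRule": production_listener_rule_arn,
            # Role that lets ECS modify the listener during traffic shifts.
            "roleArn": role_arn,
        },
    }


# Placeholder values, for illustration only.
entry = build_load_balancer_entry(
    production_target_group_arn="<first-target-group-arn>",
    alternate_target_group_arn="<second-target-group-arn>",
    production_listener_rule_arn="<port-80-listener-rule-arn>",
    role_arn="<load-balancer-management-role-arn>",
    container_name="nginx",
)
print(json.dumps(entry, indent=2))
```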
Happy with this configuration, I create the service and wait for ECS to provision my new service.
Testing blue/green deployments
Now, it’s time to test my blue/green deployments. For this test, Amazon ECS will trigger my Lambda function after the test traffic shift is completed. My Lambda function will return FAILED in this case as it performs file upload to my application, but my application doesn’t have this capability.
I update my service and check Force new deployment, knowing the blue/green deployment capability will roll back if it detects a failure. I select this option because I haven’t modified the task definition but still need to trigger a new deployment.
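The same action can be scripted. With boto3, the ECS client's `update_service` call accepts `forceNewDeployment=True`, which starts a deployment without a task definition change. In the sketch below, the client is passed in as a parameter so the function can be exercised without AWS access; `DryRunClient` is a stand-in I wrote for that purpose, and the cluster and service names are the ones that appear in this post's example ARNs.

```python
class DryRunClient:
    """Stand-in for boto3.client("ecs") so this sketch runs without AWS access."""

    def __init__(self):
        self.calls = []

    def update_service(self, **kwargs):
        # Record the request instead of sending it to Amazon ECS.
        self.calls.append(kwargs)
        return {"service": {"serviceName": kwargs["service"]}}


def force_new_deployment(ecs_client, cluster: str, service: str) -> dict:
    """Trigger a new deployment without changing the task definition."""
    return ecs_client.update_service(
        cluster=cluster,
        service=service,
        forceNewDeployment=True,
    )


# Dry run; with real credentials, pass boto3.client("ecs") instead.
client = DryRunClient()
force_new_deployment(client, "EcsBlueGreenCluster", "nginxBGservice")
print(client.calls)
```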
At this stage, I have both blue and green environments running, with the green revision handling all the test traffic. Meanwhile, based on Amazon CloudWatch Logs of my Lambda function, I also see that the deployment lifecycle hooks work as expected and emit the following payload:
[INFO] 2025-07-10T13:15:39.018Z 67d9b03e-12da-4fab-920d-9887d264308e Event:
{
    "executionDetails": {
        "testTrafficWeights": {},
        "productionTrafficWeights": {},
        "serviceArn": "arn:aws:ecs:us-west-2:123:service/EcsBlueGreenCluster/nginxBGservice",
        "targetServiceRevisionArn": "arn:aws:ecs:us-west-2:123:service-revision/EcsBlueGreenCluster/nginxBGservice/9386398427419951854"
    },
    "executionId": "a635edb5-a66b-4f44-bf3f-fcee4b3641a5",
    "lifecycleStage": "POST_TEST_TRAFFIC_SHIFT",
    "resourceArn": "arn:aws:ecs:us-west-2:123:service-deployment/EcsBlueGreenCluster/nginxBGservice/TFX5sH9q9XDboDTOv0rIt"
}
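A hook that serves several services or stages can read this payload to decide what to test. The sketch below pulls out the lifecycle stage and splits the target service revision ARN into its parts. It assumes the ARN layout seen in the payload above (`service-revision/<cluster>/<service>/<revision-id>`); the function name `summarize_hook_event` is hypothetical.

```python
def summarize_hook_event(event: dict) -> dict:
    """Extract the stage, cluster, service, and revision ID from a hook payload."""
    details = event.get("executionDetails", {})
    revision_arn = details.get("targetServiceRevisionArn", "")
    # The resource part follows the fifth colon of the ARN.
    resource = revision_arn.split(":", 5)[-1] if revision_arn else ""
    parts = resource.split("/")
    if len(parts) != 4 or parts[0] != "service-revision":
        # Unexpected layout: report the stage only.
        parts = [None, None, None, None]
    return {
        "stage": event.get("lifecycleStage"),
        "cluster": parts[1],
        "service": parts[2],
        "revision_id": parts[3],
    }


# Payload fields copied from the log output above.
sample_event = {
    "executionDetails": {
        "testTrafficWeights": {},
        "productionTrafficWeights": {},
        "serviceArn": "arn:aws:ecs:us-west-2:123:service/EcsBlueGreenCluster/nginxBGservice",
        "targetServiceRevisionArn": "arn:aws:ecs:us-west-2:123:service-revision/EcsBlueGreenCluster/nginxBGservice/9386398427419951854",
    },
    "executionId": "a635edb5-a66b-4f44-bf3f-fcee4b3641a5",
    "lifecycleStage": "POST_TEST_TRAFFIC_SHIFT",
}
print(summarize_hook_event(sample_event))
```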
As expected, my AWS Lambda function returns FAILED as hookStatus because it failed to perform the test.
[ERROR] 2025-07-10T13:18:43.392Z 67d9b03e-12da-4fab-920d-9887d264308e File upload test failed: HTTPConnectionPool(host='xyz.us-west-2.elb.amazonaws.com', port=80): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f8036273a80>, 'Connection to xyz.us-west-2.elb.amazonaws.com timed out. (connect timeout=30)'))
Because the validation wasn’t completed successfully, Amazon ECS tries to roll back to the blue version, which is the previous working deployment version. I can monitor this process through ECS events in the Events section, which provides detailed visibility into the deployment progress.
Amazon ECS successfully rolls back the deployment to the previous working version. The rollback happens near-instantaneously because the blue revision remains running and ready to receive production traffic. There is no end-user impact during this process, as production traffic never shifted to the new application version—ECS simply rolled back test traffic to the original stable version. This eliminates the typical deployment downtime associated with traditional rolling deployments.
I can also see the rollback status in the Last deployment section.
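The rollback status can also be read programmatically. To my knowledge, the boto3 ECS client offers `describe_service_deployments`, which takes a list of service deployment ARNs and returns a `status` for each; please confirm the call and its status values in the boto3 documentation. In this sketch, `DryRunClient` is a stand-in I wrote so the code runs without AWS access, and its canned response is made up for the dry run.

```python
class DryRunClient:
    """Stand-in for boto3.client("ecs"); returns a canned, made-up response."""

    def describe_service_deployments(self, serviceDeploymentArns):
        return {
            "serviceDeployments": [
                {"serviceDeploymentArn": arn, "status": "ROLLBACK_SUCCESSFUL"}
                for arn in serviceDeploymentArns
            ]
        }


def deployment_status(ecs_client, service_deployment_arn: str) -> str:
    """Return the status string of a single service deployment."""
    response = ecs_client.describe_service_deployments(
        serviceDeploymentArns=[service_deployment_arn]
    )
    deployments = response.get("serviceDeployments", [])
    if not deployments:
        raise LookupError(f"no deployment found for {service_deployment_arn}")
    return deployments[0]["status"]


def was_rolled_back(status: str) -> bool:
    """True for any status that belongs to the rollback path (assumed prefix)."""
    return status.startswith("ROLLBACK_")


# Service deployment ARN taken from the hook payload earlier in this post.
arn = "arn:aws:ecs:us-west-2:123:service-deployment/EcsBlueGreenCluster/nginxBGservice/TFX5sH9q9XDboDTOv0rIt"
status = deployment_status(DryRunClient(), arn)
print(status, was_rolled_back(status))
```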
Throughout my testing, I observed that the blue/green deployment strategy provides consistent and predictable behavior. Furthermore, the deployment lifecycle hooks provide more flexibility to control the behavior of the deployment. Each service revision maintains immutable configuration including task definition, load balancer settings, and Service Connect configuration. This means that rollbacks restore exactly the same environment that was previously running.
Additional things to know
Here are a couple of things to note:
- Pricing – The blue/green deployment capability is included with Amazon ECS at no additional charge. You pay only for the compute resources used during the deployment process.
- Availability – This capability is available in all commercial AWS Regions.
Get started with blue/green deployments by updating your Amazon ECS service configuration in the Amazon ECS console.
Happy deploying!
— Donnie