Migrating X-Ray tracing to AWS Distro for OpenTelemetry

In containerized microservices, we face the challenge of telling where along the request path things happen and of efficiently drilling into signals and logs. As a developer, you don’t want to fly blind, and one popular way to gain these insights is distributed tracing. In this post we walk you through migrating your distributed tracing setup for AWS X-Ray to AWS Distro for OpenTelemetry, using Amazon Elastic Kubernetes Service (EKS).

AWS Distro for OpenTelemetry (ADOT) is the AWS distribution of the Cloud Native Computing Foundation (CNCF) OpenTelemetry project. ADOT enables you to use a standardized set of open source APIs, SDKs, and agents to instrument your applications once and collect signals for multiple analytics solutions. In this post we focus on distributed traces and their consumption in AWS X-Ray. We assume that you’re already using X-Ray today and want to migrate to ADOT. Since we’re using EKS, some familiarity with Kubernetes is required. Note that while we demonstrate the setup in a Kubernetes environment, the same functionality can be achieved in Amazon Elastic Container Service (ECS).

The target setup, with ADOT-enabled tracing for X-Ray, looks as follows:

ADOT tracing setup

In this setup the ADOT Collector (see the “Background” section below for details) runs as a sidecar of your application and sends the traces to X-Ray. For the ADOT Collector to be able to write to X-Ray, we use an EKS feature that supports least privilege, called IAM roles for service accounts (IRSA).

Before we get into the nitty-gritty details, let’s step back a bit and make sure we’re all on the same page concerning the terms used in OpenTelemetry.

Background

The most important OpenTelemetry terms in the context of our discussion are as follows:

A collector is a set of components that collect and process the traces emitted by instrumented applications. The collector can do aggregation and smart sampling, and can export traces to one or more tracing backends. It also allows further processing of the collected telemetry, such as adding attributes or scrubbing personal information.

Within a collector, one or more pipelines may be defined, each defining a path the data follows by using one or more of:

  • A receiver is how data gets into the OpenTelemetry collector. Generally, a receiver accepts data in a specified format, translates it into the internal format and passes it to one or more processors.
  • A processor can transform the data before forwarding it: it can drop data or add to it, and then forwards it to an exporter.
  • An exporter typically forwards the data it gets to a destination, for example over the network to a backend like X-Ray, or to a local file.
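To make the receiver, processor, and exporter roles more concrete, here is a minimal sketch of a pipeline in plain Python. It is not the collector’s implementation: the `Pipeline` class, the three processor functions, and the dictionary-based span format are all hypothetical and only illustrate how data flows through the stages.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A span is modelled as a plain dict here (an assumption for illustration);
# the real collector uses its own internal data format.
Span = Dict[str, str]
# A processor returns a (possibly modified) span, or None to drop it.
Processor = Callable[[Span], Optional[Span]]
Exporter = Callable[[Span], None]


@dataclass
class Pipeline:
    processors: List[Processor] = field(default_factory=list)
    exporters: List[Exporter] = field(default_factory=list)

    def receive(self, span: Span) -> None:
        """Entry point a receiver would call after translating incoming data."""
        current: Optional[Span] = dict(span)
        for process in self.processors:
            current = process(current)
            if current is None:  # a processor dropped the span
                return
        for export in self.exporters:
            export(current)


def drop_health_checks(span: Span) -> Optional[Span]:
    """Hypothetical processor: drop spans for health-check requests."""
    return None if span.get("name") == "GET /healthz" else span


def add_environment(span: Span) -> Span:
    """Hypothetical processor: add an attribute to every span."""
    return {**span, "deployment.environment": "demo"}


def scrub_email(span: Span) -> Span:
    """Hypothetical processor: remove an attribute holding personal data."""
    return {key: value for key, value in span.items() if key != "user.email"}


exported: List[Span] = []
pipeline = Pipeline(
    processors=[drop_health_checks, add_environment, scrub_email],
    exporters=[exported.append],  # stand-in for an exporter such as awsxray
)
pipeline.receive({"name": "GET /orders", "user.email": "jane@example.com"})
pipeline.receive({"name": "GET /healthz"})
print(exported)
```

Only the first span reaches the exporter, with the extra attribute added and the email removed; the health-check span is dropped by the first processor.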

Further, in the context of distributed traces, we’re using the following terms:

A trace in OpenTelemetry can be thought of as a directed acyclic graph (DAG) of spans, where the edges between spans are defined as parent/child relationships. A span encapsulates one or more events along with its start and finish timestamps as well as attributes (key-value pairs).
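The parent/child structure can be sketched with a small data model. This is an illustration only: the `Span` class and its field names are assumptions made for this example, not the OpenTelemetry SDK types, and the sample trace is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    name: str
    start: float                      # start timestamp in seconds
    end: float                        # finish timestamp in seconds
    parent_id: Optional[str] = None   # None marks the root span
    attributes: Dict[str, str] = field(default_factory=dict)


def children_by_parent(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Index spans by parent ID; these entries are the edges of the trace graph."""
    index: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        index.setdefault(span.parent_id, []).append(span)
    return index


def render(spans: List[Span]) -> List[str]:
    """Walk the trace from the root and return one indented line per span."""
    index = children_by_parent(spans)
    lines: List[str] = []

    def visit(span: Span, depth: int) -> None:
        duration_ms = round((span.end - span.start) * 1000)
        lines.append(f"{'  ' * depth}{span.name} ({duration_ms} ms)")
        for child in sorted(index.get(span.span_id, []), key=lambda s: s.start):
            visit(child, depth + 1)

    for root in index.get(None, []):
        visit(root, 0)
    return lines


trace = [
    Span("a", "GET /orders", 0.000, 0.120),
    Span("b", "auth.check", 0.005, 0.020, parent_id="a"),
    Span("c", "db.query", 0.025, 0.110, parent_id="a",
         attributes={"db.system": "mysql"}),
]
print("\n".join(render(trace)))
```

The indentation of the rendered lines mirrors the parent/child edges, which is essentially the view a tracing backend reconstructs from the span IDs.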

In ADOT we maintain the AWS OTel Collector, with a default configuration allowing you to send traces to X-Ray and metrics to CloudWatch:

AWS OTEL collector

With terminology and the relevant ADOT components out of the way, let’s see it in action.

Preparation

We want to set up an EKS cluster using eksctl that allows us to send traces to X-Ray using ADOT. For this, we first define a cluster configuration (see also the eksctl configuration docs for more on this) in a file called cluster-config.yaml:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: adotxray
  region: eu-west-1
  version: '1.18'
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: xray
      namespace: adot
      labels: {aws-usage: "application"}
    attachPolicy:
      Version: "2012-10-17"
      Statement:
      - Effect: Allow
        Action:
        - "logs:PutLogEvents"
        - "logs:CreateLogGroup"
        - "logs:CreateLogStream"
        - "logs:DescribeLogStreams"
        - "logs:DescribeLogGroups"
        - "xray:PutTraceSegments"
        - "xray:PutTelemetryRecords"
        - "xray:GetSamplingRules"
        - "xray:GetSamplingTargets"
        - "xray:GetSamplingStatisticSummaries"
        - "ssm:GetParameters"
        Resource: '*'
managedNodeGroups:
- name: default-ng
  minSize: 1
  maxSize: 3
  desiredCapacity: 2
  ssh:
    allow: true
    publicKeyPath: ~/.ssh/work-default.pub
  labels: {role: mngworker}
  iam:
    withAddonPolicies:
      imageBuilder: true
      autoScaler: true
      externalDNS: true
      certManager: true
      ebs: true
      albIngress: true
      cloudWatch: true
cloudWatch:
  clusterLogging:
    enableTypes: ["*"]
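The policy attached to the service account is what keeps the collector’s permissions narrow. As a rough sketch of that idea, the following Python checks which actions a policy document would allow. It assumes a heavily simplified evaluation model (only Allow statements, wildcard matching on action names) and ignores Deny statements, Resource, and Condition, all of which real IAM evaluation takes into account. The `allowed_by` helper is hypothetical, and the statement lists only a few of the actions from the configuration above.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List


def as_list(value: Any) -> List[str]:
    """IAM accepts a single string or a list for Action; normalise to a list."""
    return [value] if isinstance(value, str) else list(value)


def allowed_by(policy: Dict[str, Any], action: str) -> bool:
    """Simplified check: True if any Allow statement has a matching Action.

    Not a real IAM evaluator: Deny, Resource, and Condition are ignored.
    """
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        patterns = as_list(statement.get("Action", []))
        # Compare in lower case, since action names are case-insensitive.
        if any(fnmatchcase(action.lower(), p.lower()) for p in patterns):
            return True
    return False


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:PutLogEvents",
                "xray:PutTraceSegments",
                "xray:GetSamplingRules",
                "xray:GetSamplingTargets",
            ],
            "Resource": "*",
        }
    ],
}

for action in ("xray:PutTraceSegments", "xray:GetSamplingRules", "xray:DeleteGroup"):
    print(action, allowed_by(policy, action))
```

The write and sampling actions the collector needs are allowed, while an unrelated X-Ray action such as `xray:DeleteGroup` is not.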

With the above file in place, we have everything we need to create the EKS cluster using the following command:

$ eksctl create cluster -f cluster-config.yaml
[ℹ] eksctl version 0.33.0
[ℹ] using region eu-west-1
[ℹ] setting availability zones to [eu-west-1a eu-west-1c eu-west-1b]
...
[ℹ] building iamserviceaccount stack "eksctl-adotxray-addon-iamserviceaccount-kube-system-aws-node"
[ℹ] building iamserviceaccount stack "eksctl-adotxray-addon-iamserviceaccount-adot-xray"
[ℹ] deploying stack "eksctl-adotxray-addon-iamserviceaccount-kube-system-aws-node"
[ℹ] deploying stack "eksctl-adotxray-addon-iamserviceaccount-adot-xray"
...
[✔] EKS cluster "adotxray" in "eu-west-1" region is ready

With the infrastructure set up, let’s move on to the app level.

Sending traces to X-Ray

As pointed out in the beginning, we want to use the ADOT Collector as a sidecar to our app. Using the public image of our collector, served via Amazon ECR Public, and a sample app that is instrumented with ADOT to emit traces, we can launch the following (store it in a file called app.yaml):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: adot-trace
  namespace: adot
spec:
  selector:
    matchLabels:
      app: sample
  replicas: 1
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: trace-emitter
        image: public.ecr.aws/g9c4k4i4/trace-emitter:1
        env:
        - name: OTEL_OTLP_ENDPOINT
          value: "localhost:55680"
        - name: OTEL_RESOURCE_ATTRIBUTES
          value: "service.namespace=AWSObservability,service.name=ADOTEmitService"
        - name: S3_REGION
          value: "eu-west-1"
        imagePullPolicy: Always
      - name: adot-collector
        image: public.ecr.aws/aws-observability/aws-otel-collector:latest
        env:
        - name: AWS_REGION
          value: "eu-west-1"
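The OTEL_RESOURCE_ATTRIBUTES value in the manifest is a comma-separated list of key=value pairs that describe the service emitting the traces. As a minimal sketch of how such a string breaks down, the following Python parses it into a dictionary. The `parse_resource_attributes` function is a hypothetical helper written for this example, not part of any SDK, and it does not handle escaped or encoded characters.

```python
from typing import Dict


def parse_resource_attributes(raw: str) -> Dict[str, str]:
    """Split a 'k1=v1,k2=v2' string into a dict (simplified: no escaping)."""
    attributes: Dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        key, separator, value = pair.partition("=")
        if not separator:
            continue  # skip malformed entries without '='
        attributes[key.strip()] = value.strip()
    return attributes


raw = "service.namespace=AWSObservability,service.name=ADOTEmitService"
attributes = parse_resource_attributes(raw)
print(attributes["service.namespace"], attributes["service.name"])
```

With the value from app.yaml, this yields the namespace AWSObservability and the service name ADOTEmitService.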

Now launch the app using kubectl apply -f app.yaml, then check the ADOT Collector’s output. You should see something like the following:

$ kubectl logs pod/adot-trace-b45bdbdd9-2zqwl adot-collector
AWS OTel Collector version: v0.4.0
2020-12-11T11:40:09.521Z INFO service/service.go:397 Starting AWS OTel Collector... {"Version": "v0.4.0", "GitHash": "be64e63fbd972170e024cbf10d41b7fad0e94394", "NumCPU": 2}
2020-12-11T11:40:09.522Z INFO service/service.go:241 Setting up own telemetry...
2020-12-11T11:40:09.523Z INFO service/telemetry.go:101 Serving Prometheus metrics {"address": "localhost:8888", "level": 0, "service.instance.id": "31343a96-ae2c-4333-9f30-d33f2e5a5f62"}
2020-12-11T11:40:09.523Z INFO service/service.go:278 Loading configuration...
2020-12-11T11:40:09.525Z INFO service/service.go:289 Applying configuration...
2020-12-11T11:40:09.525Z INFO service/service.go:310 Starting extensions...
2020-12-11T11:40:09.525Z INFO builder/extensions_builder.go:53 Extension is starting... {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check"}
2020-12-11T11:40:09.525Z INFO healthcheckextension/healthcheckextension.go:40 Starting health_check extension {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check", "config": {"TypeVal":"health_check","NameVal":"health_check","Port":13133}}
2020-12-11T11:40:09.525Z INFO builder/extensions_builder.go:59 Extension started. {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check"}
2020-12-11T11:40:09.525Z INFO builder/exporters_builder.go:306 Exporter is enabled. {"component_kind": "exporter", "exporter": "awsxray"}
2020-12-11T11:40:09.526Z INFO builder/exporters_builder.go:306 Exporter is enabled. {"component_kind": "exporter", "exporter": "awsemf"}
2020-12-11T11:40:09.526Z INFO service/service.go:325 Starting exporters...
2020-12-11T11:40:09.526Z INFO builder/exporters_builder.go:92 Exporter is starting... {"component_kind": "exporter", "component_type": "awsxray", "component_name": "awsxray"}
2020-12-11T11:40:09.526Z INFO builder/exporters_builder.go:97 Exporter started. {"component_kind": "exporter", "component_type": "awsxray", "component_name": "awsxray"}
2020-12-11T11:40:09.526Z INFO builder/exporters_builder.go:92 Exporter is starting... {"component_kind": "exporter", "component_type": "awsemf", "component_name": "awsemf"}
2020-12-11T11:40:09.526Z INFO builder/exporters_builder.go:97 Exporter started. {"component_kind": "exporter", "component_type": "awsemf", "component_name": "awsemf"}
2020-12-11T11:40:09.526Z INFO builder/pipelines_builder.go:207 Pipeline is enabled. {"pipeline_name": "traces", "pipeline_datatype": "traces"}
2020-12-11T11:40:09.526Z INFO builder/pipelines_builder.go:207 Pipeline is enabled. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2020-12-11T11:40:09.526Z INFO service/service.go:338 Starting processors...
2020-12-11T11:40:09.526Z INFO builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "traces", "pipeline_datatype": "traces"}
2020-12-11T11:40:09.526Z INFO builder/pipelines_builder.go:61 Pipeline is started. {"pipeline_name": "traces", "pipeline_datatype": "traces"}
2020-12-11T11:40:09.526Z INFO builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2020-12-11T11:40:09.526Z INFO builder/pipelines_builder.go:61 Pipeline is started. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2020-12-11T11:40:09.526Z INFO builder/receivers_builder.go:235 Receiver is enabled. {"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp", "datatype": "traces"}
2020-12-11T11:40:09.526Z INFO builder/receivers_builder.go:235 Receiver is enabled. {"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp", "datatype": "metrics"}
2020-12-11T11:40:09.526Z INFO awsxrayreceiver@v0.14.1-0.20201111210848-994cabe5d596/receiver.go:61 Going to listen on endpoint for X-Ray segments {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray", "udp": "0.0.0.0:2000"}
2020-12-11T11:40:09.526Z INFO udppoller/poller.go:105 Listening on endpoint for X-Ray segments {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray", "udp": "0.0.0.0:2000"}
2020-12-11T11:40:09.526Z INFO awsxrayreceiver@v0.14.1-0.20201111210848-994cabe5d596/receiver.go:73 Listening on endpoint for X-Ray segments {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray", "udp": "0.0.0.0:2000"}
2020-12-11T11:40:09.526Z INFO builder/receivers_builder.go:235 Receiver is enabled. {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray", "datatype": "traces"}
2020-12-11T11:40:09.526Z INFO service/service.go:350 Starting receivers...
2020-12-11T11:40:09.526Z INFO builder/receivers_builder.go:70 Receiver is starting... {"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-11T11:40:09.536Z INFO builder/receivers_builder.go:75 Receiver started. {"component_kind": "receiver", "component_type": "otlp", "component_name": "otlp"}
2020-12-11T11:40:09.536Z INFO builder/receivers_builder.go:70 Receiver is starting... {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray"}
2020-12-11T11:40:09.536Z INFO awsxrayreceiver@v0.14.1-0.20201111210848-994cabe5d596/receiver.go:98 X-Ray TCP proxy server started {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray"}
2020-12-11T11:40:09.537Z INFO builder/receivers_builder.go:75 Receiver started. {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray"}
2020-12-11T11:40:09.537Z INFO healthcheck/handler.go:128 Health Check state change {"component_kind": "extension", "component_type": "health_check", "component_name": "health_check", "status": "ready"}
2020-12-11T11:40:09.537Z INFO service/service.go:253 Everything is ready. Begin running and processing data.
2020-12-11T11:41:09.526Z WARN awsemfexporter@v0.14.1-0.20201117192543-4a81c809e720/metric_translator.go:241 Unhandled metric data type. {"component_kind": "exporter", "component_type": "awsemf", "component_name": "awsemf", "DataType": "None", "Name": "processedSpans", "Unit": "1"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.getCWMetrics
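If you want to check such output programmatically rather than by eye, a small script can pull out the interesting lines. The following Python is a sketch under the assumption that each log line has the shape shown above (timestamp, level, caller, message, and an optional trailing JSON object, separated by whitespace). The `parse_line`, `enabled_exporters`, and `is_ready` helpers are hypothetical, and the sample lines are copied from the output above.

```python
import json
import re
from typing import Any, Dict, List, Optional

# timestamp, LEVEL, caller, message, optional trailing JSON object
LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<caller>\S+)\s+"
    r"(?P<message>.*?)(?:\s+(?P<fields>\{.*\}))?$"
)


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Return the parts of one log line, or None if it has another shape."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    record: Dict[str, Any] = match.groupdict()
    try:
        record["fields"] = json.loads(record["fields"]) if record["fields"] else {}
    except json.JSONDecodeError:
        record["fields"] = {}  # trailing braces that are not valid JSON
    return record


def enabled_exporters(lines: List[str]) -> List[str]:
    """Collect the exporter names from 'Exporter is enabled.' lines."""
    names: List[str] = []
    for line in lines:
        record = parse_line(line)
        if record and record["message"] == "Exporter is enabled.":
            names.append(record["fields"].get("exporter", ""))
    return names


def is_ready(lines: List[str]) -> bool:
    """True once the collector reports that it is processing data."""
    records = (parse_line(line) for line in lines)
    return any(r and r["message"].startswith("Everything is ready") for r in records)


sample = [
    "AWS OTel Collector version: v0.4.0",
    '2020-12-11T11:40:09.525Z INFO builder/exporters_builder.go:306 Exporter is enabled. {"component_kind": "exporter", "exporter": "awsxray"}',
    '2020-12-11T11:40:09.526Z INFO builder/exporters_builder.go:306 Exporter is enabled. {"component_kind": "exporter", "exporter": "awsemf"}',
    "2020-12-11T11:40:09.537Z INFO service/service.go:253 Everything is ready. Begin running and processing data.",
]
print(enabled_exporters(sample), is_ready(sample))
```

Lines that do not follow the assumed shape, such as the version banner, are simply skipped.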

That’s all that’s necessary to send the traces using ADOT. Now head over to the X-Ray console and check out what you can find there in the service map:

X-Ray service map

As well as the trace analytics:

X-Ray trace analytics

Now that you’ve seen the setup in practice, you likely wonder what’s next. Let’s have a look at a few resources to get you started.

As a developer, you’re first and foremost interested in instrumenting your services. Depending on the programming language you’re using, we have SDKs available in various stages of maturity. You can already use the Java and Go SDKs in production environments today. If you’re interested in Java in particular, we recommend reading the blog post on Distributed Tracing using AWS Distro for OpenTelemetry. The JavaScript and Python SDKs are works in progress, and we’re working on getting them production-ready in the coming months.

In this post we demonstrated how to send traces from an ADOT-enabled app to X-Ray. We hope you can start your migration to ADOT for traces and, going forward, for metrics and logs. This is a fast-moving space, so keep an eye out for more posts in this direction. Learn more about AWS Distro for OpenTelemetry and how to get started using tracing on different compute services in our developer portal. You can connect with us and provide feedback in our forum.