OpenTelemetry: The Complete Developer's Guide to Distributed Tracing
You deploy a microservices architecture. A user reports that checkout is slow. You check the API gateway logs -- response time looks normal. You check the order service -- fine. The payment service -- fine. The inventory service -- also fine. But somehow the end-to-end request takes 4 seconds. Where did the time go?
This is the problem distributed tracing solves, and OpenTelemetry (OTel) is how you implement it without locking yourself into a specific vendor. It's the CNCF project that unified OpenTracing and OpenCensus into a single standard for telemetry data. Every major observability platform -- Datadog, Grafana, Honeycomb, New Relic, Jaeger -- now speaks OpenTelemetry.
The Three Pillars, Explained Simply
OpenTelemetry deals with three types of telemetry data. You've heard these called "the three pillars of observability," but that framing obscures how they actually work together.
Traces follow a single request across service boundaries. A trace is a tree of "spans" -- each span represents a unit of work (an HTTP request, a database query, a function call). When service A calls service B which calls service C, you get a trace showing exactly how long each step took and where failures occurred.
Metrics are aggregated measurements over time -- request count, error rate, response time percentiles, queue depth. Unlike traces (which capture individual requests), metrics summarize behavior across all requests. They're cheap to collect and ideal for dashboards and alerting.
Logs are timestamped text records of events. The key insight from OTel: logs become much more useful when they're correlated with traces. Instead of searching through millions of log lines, you find the trace for a slow request and see exactly which log entries belong to it.
Core Concepts
Before writing code, you need to understand a few OTel primitives.
Spans and Traces
A span has:
- A name (e.g., "HTTP GET /api/orders")
- A start time and duration
- A parent span ID (except for the root span)
- Attributes (key-value metadata)
- Events (timestamped annotations within the span)
- A status (OK, ERROR, UNSET)
A trace is the entire tree of spans that originates from a single root span. The trace ID propagates across service boundaries via HTTP headers (typically traceparent from the W3C Trace Context standard).
Trace: abc123
├── [210ms] HTTP GET /checkout (api-gateway)
│ ├── [12ms] HTTP POST /orders (order-service)
│ │ └── [8ms] INSERT INTO orders (postgres)
│ ├── [180ms] HTTP POST /payment (payment-service) <-- slow!
│ │ ├── [150ms] Stripe API call (external) <-- root cause
│ │ └── [3ms] UPDATE orders (postgres)
│ └── [5ms] HTTP POST /inventory (inventory-service)
│ └── [2ms] UPDATE stock (postgres)
Looking at this trace, you immediately see the Stripe API call is responsible for the slow checkout. Without tracing, you'd be guessing.
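The span model above can be made concrete with a toy sketch. This is not OTel's API -- the Span shape here is invented for illustration, and the durations loosely mirror the example trace -- but it shows how a trace tree pinpoints a bottleneck via "self time" (a span's duration minus the time spent in its children):

```typescript
// Simplified span shape for illustration; a real OTel span also carries
// attributes, events, and a status (see the list above).
interface Span {
  name: string;
  durationMs: number;
  children: Span[];
}

// A trace tree loosely mirroring the checkout example above
const checkoutTrace: Span = {
  name: 'HTTP GET /checkout', durationMs: 210, children: [
    { name: 'HTTP POST /orders', durationMs: 12, children: [
      { name: 'INSERT INTO orders', durationMs: 8, children: [] },
    ]},
    { name: 'HTTP POST /payment', durationMs: 180, children: [
      { name: 'Stripe API call', durationMs: 150, children: [] },
      { name: 'UPDATE orders', durationMs: 3, children: [] },
    ]},
    { name: 'HTTP POST /inventory', durationMs: 5, children: [
      { name: 'UPDATE stock', durationMs: 2, children: [] },
    ]},
  ],
};

// "Self time" = a span's duration minus its children's durations;
// the span with the largest self time is where the request actually waited.
function slowestBySelfTime(span: Span): { name: string; selfMs: number } {
  const childTotal = span.children.reduce((sum, c) => sum + c.durationMs, 0);
  let worst = { name: span.name, selfMs: span.durationMs - childTotal };
  for (const child of span.children) {
    const candidate = slowestBySelfTime(child);
    if (candidate.selfMs > worst.selfMs) worst = candidate;
  }
  return worst;
}

console.log(slowestBySelfTime(checkoutTrace));
// → { name: 'Stripe API call', selfMs: 150 }
```

This is exactly the reasoning a trace viewer's flame graph does visually: the widest bar with no children underneath it is your culprit.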
The OTel Collector
The OpenTelemetry Collector is a vendor-agnostic proxy that receives, processes, and exports telemetry data. You deploy it as a sidecar or standalone service, and your applications send telemetry to it instead of directly to your backend.
App → OTel Collector → Jaeger (traces)
                     → Prometheus (metrics)
                     → Loki (logs)
This architecture means you can switch observability backends without changing application code. The Collector handles batching, retry, sampling, and data transformation.
Instrumenting a Node.js Application
Here's how to add OpenTelemetry to an Express application. The process has three parts: install the SDK, configure providers, and add instrumentation.
Installation
npm install @opentelemetry/sdk-node \
  @opentelemetry/api \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http \
  @opentelemetry/exporter-metrics-otlp-http
Configuration
Create a tracing.ts file that initializes OTel before your app starts:
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { Resource } from '@opentelemetry/resources';
import {
  ATTR_SERVICE_NAME,
  ATTR_SERVICE_VERSION,
} from '@opentelemetry/semantic-conventions';

const sdk = new NodeSDK({
  resource: new Resource({
    [ATTR_SERVICE_NAME]: 'order-service',
    [ATTR_SERVICE_VERSION]: '1.4.2',
  }),
  traceExporter: new OTLPTraceExporter({
    url: 'http://otel-collector:4318/v1/traces',
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://otel-collector:4318/v1/metrics',
    }),
    exportIntervalMillis: 15000,
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      // Disable noisy fs instrumentation
      '@opentelemetry/instrumentation-fs': { enabled: false },
    }),
  ],
});

sdk.start();

process.on('SIGTERM', () => {
  sdk.shutdown().then(() => process.exit(0));
});
Load this before your application code:
node --require ./tracing.js ./server.js
# Or with ts-node:
node --require ts-node/register --require ./tracing.ts ./server.ts
Auto-Instrumentation vs Manual Spans
The auto-instrumentation package automatically creates spans for HTTP requests, database queries, gRPC calls, and many other libraries. This gives you 80% of the value with zero code changes.
For the remaining 20%, add manual spans around business logic:
import { trace, SpanStatusCode } from '@opentelemetry/api';
const tracer = trace.getTracer('order-service');
async function processOrder(orderId: string, items: OrderItem[]) {
  return tracer.startActiveSpan('processOrder', async (span) => {
    span.setAttribute('order.id', orderId);
    span.setAttribute('order.item_count', items.length);
    try {
      // Validate inventory
      await tracer.startActiveSpan('validateInventory', async (childSpan) => {
        try {
          for (const item of items) {
            const available = await checkStock(item.sku, item.quantity);
            if (!available) {
              childSpan.addEvent('insufficient_stock', {
                'item.sku': item.sku,
                'item.requested': item.quantity,
              });
              childSpan.setStatus({ code: SpanStatusCode.ERROR });
              throw new Error(`Insufficient stock for ${item.sku}`);
            }
          }
        } finally {
          // End the child span even when validation throws,
          // so it isn't leaked as an unfinished span
          childSpan.end();
        }
      });

      // Process payment
      const paymentResult = await processPayment(orderId, calculateTotal(items));
      span.setAttribute('payment.transaction_id', paymentResult.transactionId);
      span.setStatus({ code: SpanStatusCode.OK });
      return { success: true, transactionId: paymentResult.transactionId };
    } catch (error) {
      span.setStatus({
        code: SpanStatusCode.ERROR,
        message: (error as Error).message,
      });
      span.recordException(error as Error);
      throw error;
    } finally {
      span.end();
    }
  });
}
Instrumenting Python
Python's OTel SDK follows the same pattern. Auto-instrumentation covers Flask, Django, FastAPI, SQLAlchemy, and most common libraries.
pip install opentelemetry-api \
  opentelemetry-sdk \
  opentelemetry-exporter-otlp \
  opentelemetry-instrumentation-flask \
  opentelemetry-instrumentation-sqlalchemy \
  opentelemetry-instrumentation-requests
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor
# Configure the tracer
resource = Resource.create({
    "service.name": "user-service",
    "service.version": "2.1.0",
})

provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(
    OTLPSpanExporter(endpoint="http://otel-collector:4317")
)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# Auto-instrument Flask and SQLAlchemy
FlaskInstrumentor().instrument()
SQLAlchemyInstrumentor().instrument()

tracer = trace.get_tracer(__name__)

@app.route('/users/<user_id>')
def get_user(user_id):
    with tracer.start_as_current_span("fetch_user_profile") as span:
        span.set_attribute("user.id", user_id)
        user = db.session.query(User).get(user_id)
        if not user:
            span.set_attribute("user.found", False)
            abort(404)
        span.set_attribute("user.found", True)
        return jsonify(user.to_dict())
Setting Up the OTel Collector
The Collector configuration is YAML-based. Here's a production-oriented config:
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s
    send_batch_size: 1024
  # Head-based 10% sampling, as an alternative; not wired into the
  # pipelines below, because tail_sampling's probabilistic policy covers it
  probabilistic_sampler:
    sampling_percentage: 10
  # Always keep traces with errors and slow requests,
  # sample 10% of everything else
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-requests
        type: latency
        latency:
          threshold_ms: 1000
      - name: probabilistic
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889
  loki:
    endpoint: http://loki:3100/loki/api/v1/push

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, tail_sampling, batch]
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [loki]
Deploy with Docker Compose:
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
      - "8889:8889"   # Prometheus metrics
Sampling Strategies
In production, you can't trace every request -- the data volume and cost would be enormous. Sampling strategies let you collect enough data to debug problems while keeping costs manageable.
| Strategy | How It Works | Best For |
|---|---|---|
| Head-based (probabilistic) | Decide at the start of a trace whether to sample it | High-throughput services with uniform traffic |
| Tail-based | Decide after the trace completes, based on outcomes | Keeping all errors and slow requests |
| Rate-limiting | Sample up to N traces per second | Controlling exact ingestion volume |
| Always-on (debug) | Sample 100% | Development and staging environments |
Tail-based sampling is the most useful for production debugging because it guarantees you capture the traces that matter -- errors and high-latency requests -- while sampling down the happy path.
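Head-based sampling is typically deterministic on the trace ID, so that every service in a call chain makes the same keep/drop decision for the same trace. The sketch below shows the core idea; it is not the SDK's exact algorithm (OTel's TraceIdRatioBased sampler has its own precise definition), just an illustration of deterministic sampling:

```typescript
// Decide whether to sample a trace from its ID, deterministically:
// interpret the last 8 hex digits of the 128-bit trace ID as a 32-bit
// number and compare it against the sampling ratio. Any service applying
// the same rule to the same trace ID reaches the same decision.
function shouldSample(traceId: string, ratio: number): boolean {
  const low32 = parseInt(traceId.slice(-8), 16);  // 0 .. 2^32 - 1
  return low32 < ratio * 0x100000000;
}

// The example trace ID from the traceparent section
shouldSample('0af7651916cd43dd8448eb211c80319c', 0.10);  // → false
shouldSample('0af7651916cd43dd8448eb211c80319c', 0.20);  // → true
```

The payoff of determinism: no coordination between services is needed, yet you never get a trace where some services sampled their spans and others dropped theirs.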
Context Propagation: How Traces Cross Service Boundaries
The magic of distributed tracing is that a single trace follows a request across multiple services. This works through context propagation -- trace context is serialized into HTTP headers (or gRPC metadata) and deserialized by the receiving service.
The W3C Trace Context standard defines two headers, traceparent and tracestate. The traceparent header carries four hyphen-separated fields:

traceparent: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01

- 00 -- version
- 0af7651916cd43dd8448eb211c80319c -- trace-id (128 bits as 32 hex chars)
- b7ad6b7169203331 -- parent-id, the span ID of the calling span (64 bits as 16 hex chars)
- 01 -- trace-flags (low bit set = sampled)
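To make the format concrete, here is a hand-rolled parser for illustration only -- in real code the SDK's W3C propagator handles this, and you should never parse these headers yourself:

```typescript
// Parse a W3C traceparent header into its four hyphen-separated fields.
// For illustration only; use the SDK's W3CTraceContextPropagator in practice.
interface TraceParent {
  version: string;
  traceId: string;    // 32 hex chars (128 bits)
  parentId: string;   // 16 hex chars (64 bits)
  sampled: boolean;   // low bit of the trace-flags field
}

function parseTraceparent(header: string): TraceParent | null {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return null;  // malformed header: ignore rather than guess
  const [, version, traceId, parentId, flags] = match;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

parseTraceparent('00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01');
// → version '00', the trace and parent IDs, and sampled: true
```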
When service A calls service B, the OTel SDK automatically:
- Injects the current trace context into outgoing request headers
- Extracts the trace context from incoming request headers
- Creates child spans that reference the parent span ID
If you're using auto-instrumentation, this happens transparently. For custom HTTP clients or message queues, you might need to propagate context manually:
import { propagation, context } from '@opentelemetry/api';
// Inject context into outgoing request headers
const headers = {};
propagation.inject(context.active(), headers);
await fetch('http://other-service/api', { headers });
// Extract context from incoming request
const parentContext = propagation.extract(context.active(), req.headers);
context.with(parentContext, () => {
  // Spans created here will be children of the incoming trace
});
Custom Metrics
Beyond traces, OTel's metrics API lets you define application-specific metrics:
import { metrics } from '@opentelemetry/api';
const meter = metrics.getMeter('order-service');
// Counter: things that only go up
const ordersCreated = meter.createCounter('orders.created', {
  description: 'Total number of orders created',
});

// Histogram: distribution of values
const orderValue = meter.createHistogram('orders.value', {
  description: 'Order value in cents',
  unit: 'cents',
});

// Up-down counter: things that go up and down
const activeConnections = meter.createUpDownCounter('db.connections.active', {
  description: 'Number of active database connections',
});
// Usage
ordersCreated.add(1, { 'order.type': 'subscription', 'order.region': 'us-west' });
orderValue.record(4999, { 'payment.method': 'card' });
activeConnections.add(1);
// ... later
activeConnections.add(-1);
Debugging Common OTel Issues
Traces aren't showing up: Check that your exporter endpoint is reachable. The most common mistake is using localhost:4317 when the Collector is running in a different container. Use the service name (otel-collector:4317) in Docker Compose.
Spans are disconnected: Context propagation is broken somewhere. Check that auto-instrumentation covers the HTTP client library you're using. For async operations, ensure you're not losing context across setTimeout or event emitter boundaries.
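On Node, the OTel context manager is built on AsyncLocalStorage, which is also a useful mental model for why context gets lost: anything that runs outside the storage's run() callback can't see the store. A minimal sketch in plain Node, with no OTel dependency:

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// OTel's Node context manager uses AsyncLocalStorage under the hood:
// context.with(ctx, fn) is conceptually als.run(ctx, fn).
const als = new AsyncLocalStorage<{ traceId: string }>();

function currentTraceId(): string | undefined {
  return als.getStore()?.traceId;
}

als.run({ traceId: 'abc123' }, () => {
  // Inside run(), the store is visible -- a span created here would
  // attach to trace abc123, including across awaits started here.
  console.log(currentTraceId());  // → 'abc123'
});

// Outside run(), the store is gone -- a span created here would start
// a brand-new, disconnected trace.
console.log(currentTraceId());  // → undefined
```

Callbacks registered outside the active context (a listener attached at module load, a queue consumer, a raw setTimeout captured earlier) run with no store, which is exactly when you see orphaned spans.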
High memory usage in Collector: The batch processor buffers spans before exporting. Reduce send_batch_size or increase export frequency. Add the memory_limiter processor (you should always have this).
Missing attributes: Semantic conventions define standard attribute names. Use http.request.method instead of method or httpMethod. Consistent naming lets your observability backend correlate data across services.
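To illustrate why this matters: if each service invents its own key, the backend can't aggregate across them. A hypothetical normalization shim -- the legacy keys on the left are made up for this example, while the targets are the stable OTel HTTP semantic conventions:

```typescript
// Map ad-hoc attribute keys to their semantic-convention equivalents.
// The legacy keys here are hypothetical; the target names come from the
// stable OTel HTTP semantic conventions.
const SEMCONV_ALIASES: Record<string, string> = {
  method: 'http.request.method',
  httpMethod: 'http.request.method',
  status: 'http.response.status_code',
  url: 'url.full',
};

function normalizeAttributes(attrs: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(attrs)) {
    out[SEMCONV_ALIASES[key] ?? key] = value;  // unknown keys pass through
  }
  return out;
}

normalizeAttributes({ httpMethod: 'GET', status: 200 });
// → { 'http.request.method': 'GET', 'http.response.status_code': 200 }
```

In practice, do this kind of cleanup in the Collector (the transform processor) rather than in every service; better still, use the constants from @opentelemetry/semantic-conventions so the wrong names never get emitted.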
Where to Start
If you're adding observability to an existing system, start here:
- Deploy the OTel Collector as a central telemetry pipeline. Even if you only have one service, this separates your app from your backend choice.
- Add auto-instrumentation to your most critical service. This gives you HTTP and database traces with minimal effort.
- Export to Jaeger (open source, free) for local development and initial exploration.
- Add manual spans around business logic that auto-instrumentation doesn't cover.
- Implement tail-based sampling before going to production -- you want all error traces but don't need every successful health check.
The investment in instrumentation pays for itself the first time you debug a cross-service latency issue in minutes instead of hours.