Digital Power Tools
Measuring Go Applications

If traces tell you why something is wrong, metrics tell you when something is wrong. In my previous article Tracing Go Applications, we focused on causality and context for debugging. This post is about measurement: using OpenTelemetry metrics in Go to quantify service health and business outcomes.
Why metrics matter
Measurement turns system behavior into signals that people can act on. In day-to-day operations you need quick, reliable indications of whether users are getting the experience you promised and whether the system that delivers it is healthy. Metrics provide those fast, low-cost signals. They capture trends over time, make service levels visible, and form the backbone of capacity planning and incident response. Metrics are the first to tell you that something needs attention.
Useful measurement starts from outcomes, not tools. Identify a small set of Service Level Indicators (SLIs) that reflect user experience and commit them to Service Level Objectives (SLOs) so the team has a clear target. Typically, this is some combination of request rate, error rate, and latency. The RED and USE heuristics can help you think systematically about these metrics: RED focuses on Rate, Errors, and Duration at the request boundary; USE looks at Utilization, Saturation, and Errors for resources like CPU, memory, queues, and external dependencies. Taken together, they keep you honest about both the experience you deliver and the capacity you consume.
Beyond system health, measure the moments that matter for your product or business: events such as orders placed, signups started and completed, or jobs processed. These signals let you connect technical performance to outcomes, spot breakage sooner, and talk about impact in a language the whole organization understands.
Metrics in OpenTelemetry
In OpenTelemetry, you record metrics through 'instruments.' Instruments are how you record numbers in context: you call them in your code, they capture values plus attributes, and the SDK takes care of aggregation and export.
The instruments you’ll use most often are:
- Counters: monotonic totals (e.g., requests, errors); they only ever increase over the application lifetime.
- UpDownCounters: counters that can go up and down (e.g., in-flight requests, queue depth).
- Histograms: distributions (latency, payload size) with buckets.
- Observables: pull-based gauges for runtime stats.
Counters, UpDownCounters, and Histograms are typically used synchronously in your request code paths. By contrast, Observables are asynchronous: the SDK calls your callbacks on a collection interval to read values like current memory usage or GC pause time, keeping hot paths lean.
Just as tracing has a TracerProvider and a Tracer, metrics have a MeterProvider and a Meter that provide the API and SDK, along with the Exporter concept for shipping metrics to various backends. Unlike trace exporters, which are all push-based, metric exporters can be either push- or pull-based, to accommodate the existing Prometheus scrape model. To make sure observables are queried consistently, the SDK exposes a Reader concept that harvests measurements. The periodic reader reads the observables at fixed time intervals, while the pull-based reader reads them when the exporter retrieves the value.
Two more concepts round out the model:
The Resource describes the entity emitting telemetry (service name, version, environment, region, etc.). Set it once so every metric carries the same identity. If you already configured a Resource for tracing, reuse it to keep signals aligned across metrics and traces.
A View lets you reshape metrics without changing the recording code: choose aggregations (e.g., explicit-bucket histograms for latency), rename instruments, or drop high-cardinality attributes. Views are your primary tool for keeping cardinality under control and aligning buckets with your SLOs.
Minimal Setup (Reusing Tracing Resource)
A compact setup that exports metrics to an OpenTelemetry Collector using OTLP over gRPC:
// go.mod: require go.opentelemetry.io/otel plus the metric SDK and OTLP exporter modules
import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.24.0"
)

func setupMetrics(ctx context.Context) func(context.Context) error {
	res, err := resource.Merge(resource.Default(), resource.NewWithAttributes(
		semconv.SchemaURL,
		semconv.ServiceName("users-api"),
		semconv.ServiceVersion("1.4.2"),
		attribute.String("deployment.environment", "prod"),
	))
	if err != nil {
		log.Fatal(err)
	}

	exp, err := otlpmetricgrpc.New(ctx) // or use the Prometheus exporter
	if err != nil {
		log.Fatal(err)
	}

	reader := sdkmetric.NewPeriodicReader(exp)
	mp := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(reader),
		sdkmetric.WithResource(res),
	)
	otel.SetMeterProvider(mp)
	return mp.Shutdown
}
Alternatively, the Prometheus exporter exposes a /metrics endpoint to be scraped by Prometheus directly if you're not using the Collector.
As with traces, the OpenTelemetry ecosystem already provides a lot of metrics out of the box: Go runtime (GC pauses, heap size, goroutines, ...), HTTP/gRPC (server and client metrics from the standard OTel instrumentation), SQL databases (query duration, open and closed connections, ...), and many more.
Designing Useful Application Metrics
In many applications, the default instrumentation on the entry and exit points of your code will already give you a ton of information. Sometimes, though, you want to track more business-specific SLIs. In that case, start from the SLIs and pick the instruments accordingly:
- Rate (requests/sec): Counter.
- Errors: Counter with an error=true or status_class label.
- Duration: Histogram in seconds.
- Saturation: in-flight requests, queue depth, cache size, pool utilization (UpDownCounter/Observable).
If you want to reduce the cost and improve the performance of your observability backend, there's a neat little trick to keep your metric count to a minimum: a histogram can double as a rate and an error counter. An example of this practice is the "http.server.request.duration" metric in the default instrumentation. Its main purpose is to track request duration, but to work properly a histogram also tracks the total number of recorded values, which incidentally acts as a counter for requests per second. The histogram also accepts attributes, like the HTTP status code, which lets it serve as an error rate too.
When designing the attributes you want to track, there's one rule you need to keep in mind: keep attributes bounded. Each attribute may only have a limited set of values (e.g., status_class = 2xx/4xx/5xx). Never use free text, and do not label metrics with attributes whose possible values grow over time, like user IDs, sessions, or IP addresses. Since observability backends often store a time series per attribute value, unbounded attributes explode the amount of data stored.
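One way to enforce boundedness is to collapse raw values into a small enumeration before attaching them as attributes. A minimal sketch:

```go
package main

import "fmt"

// statusClass collapses an HTTP status code into a small, bounded set of
// attribute values (1xx through 5xx), keeping metric cardinality in check.
func statusClass(code int) string {
	if code >= 100 && code < 600 {
		return fmt.Sprintf("%dxx", code/100)
	}
	return "unknown"
}

func main() {
	fmt.Println(statusClass(204)) // 2xx
	fmt.Println(statusClass(404)) // 4xx
	fmt.Println(statusClass(503)) // 5xx
}
```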
var hashLatency metric.Float64Histogram
var hashError metric.Int64Counter

func signinHandler(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()

	// Extract the user ID from the query parameters.
	userID := r.URL.Query().Get("id")
	if userID == "" {
		http.Error(w, "Missing user ID", http.StatusBadRequest)
		return
	}

	// Get the user from the database.
	user, err := getUserFromDB(ctx, userID)
	if err != nil {
		http.Error(w, "Failed to get user", http.StatusInternalServerError)
		return
	}

	// Time the hash comparison and record it with a bounded attribute.
	start := time.Now()
	err = bcrypt.CompareHashAndPassword(user.password, []byte(r.FormValue("password")))
	hashLatency.Record(ctx, time.Since(start).Seconds(),
		metric.WithAttributes(attribute.Bool("correct", err == nil)))
	if errors.Is(err, bcrypt.ErrHashTooShort) {
		hashError.Add(ctx, 1)
	}
	if err != nil {
		http.Error(w, "Not authenticated", http.StatusUnauthorized)
		return
	}
	json.NewEncoder(w).Encode(struct{ Status string }{"OK"})
}
Aggregation, Buckets, and Views
Histograms capture distributions and enable accurate percentiles in your backend. However, the default bucket boundaries may not align with your SLOs, reducing their utility. For instance, if your p95 should stay below 250 ms, you want a bucket boundary at or near 0.25 s, but the default buckets don't provide one. You can change the buckets when you create the metrics, but sometimes those metrics are created by third-party libraries whose code you can't change. In that case, you can use the previously mentioned View concept to change the buckets. You can also use a View to drop high-cardinality attributes that would cause severe performance degradation in your observability backend.
view := sdkmetric.NewView(
	sdkmetric.Instrument{Name: "myapp.process.duration"},
	sdkmetric.Stream{
		Aggregation: sdkmetric.AggregationExplicitBucketHistogram{
			Boundaries: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2, 5},
		},
		AttributeFilter: func(k attribute.KeyValue) bool { return k.Key != "myapp.user_id" },
	},
)

mp := sdkmetric.NewMeterProvider(
	sdkmetric.WithReader(reader),
	sdkmetric.WithResource(res),
	sdkmetric.WithView(view),
)
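Why boundary placement matters: backends estimate percentiles by interpolating inside the bucket that contains the target rank, so a missing boundary near your SLO inflates the error. A self-contained sketch of that estimation (simplified from how Prometheus' histogram_quantile works; the sample data is invented):

```go
package main

import "fmt"

// estimateQuantile approximates a quantile from an explicit-bucket histogram
// by linear interpolation inside the bucket containing the target rank.
// bounds are upper bucket boundaries; counts[i] is the count in bucket i.
func estimateQuantile(q float64, bounds []float64, counts []uint64) float64 {
	var total uint64
	for _, c := range counts {
		total += c
	}
	rank := q * float64(total)
	var cum uint64
	lower := 0.0
	for i, c := range counts {
		upper := bounds[i]
		if float64(cum)+float64(c) >= rank {
			if c == 0 {
				return upper
			}
			return lower + (upper-lower)*(rank-float64(cum))/float64(c)
		}
		cum += c
		lower = upper
	}
	return bounds[len(bounds)-1]
}

func main() {
	// 1000 requests, true p95 around 0.24 s; the SLO target is 0.25 s.
	// Coarse buckets without a 0.25 boundary smear the estimate to 0.40 s.
	coarse := estimateQuantile(0.95, []float64{0.1, 0.5}, []uint64{800, 200})
	// A boundary at 0.25 pins the estimate to the SLO edge: 0.25 s.
	fine := estimateQuantile(0.95, []float64{0.1, 0.25, 0.5}, []uint64{800, 150, 50})
	fmt.Printf("coarse p95 = %.2f s, fine p95 = %.2f s\n", coarse, fine)
}
```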
Correlating Metrics with Traces
Usually, metrics are aggregated and individual samples don't carry any extra contextual information. Exemplars are special, randomly sampled data points that carry contextual information (most importantly a trace_id, and sometimes a span_id and attributes) and are stored alongside an aggregated metric. In practice, exemplars let you click from a spike on a metric graph straight into a representative trace that happened during that spike. The trace will then carry a lot more contextual information that allows you to drill down to the root cause of the spike. For more information about traces, see my previous blogpost titled "Tracing Go Applications".
In the Go OpenTelemetry implementation, the creation of exemplars is abstracted away, and you typically don’t attach exemplars manually. The SDK will automatically record exemplars when there’s an active recording span in the Context you pass to metric.Record/Add. That means the main thing you do in code is ensure you record metrics within the same Context as your current span. Naturally, it also implies your exporter and observability backend support exemplars.
The latency and error metrics in the example below are both recorded inside an active span that is started just before the hash comparison. Because the ctx carries the trace_id of that span, it is passed on to the metrics as well. The SDK then decides whether the measurement is turned into an exemplar or not. By default, exemplars are recorded when the trace is marked as sampled. If you've set the trace sampler to AlwaysSample, be aware that exemplars will be recorded for every trace, even those filtered out by the Collector later on, which can cause a performance hit in the metric backend.
In the example below, the hashLatency and hashError metrics are recorded inside the span 'password_check', which allows OpenTelemetry to link the trace to the metric as an exemplar.
var hashLatency metric.Float64Histogram
var hashError metric.Int64Counter

func signinHandler(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()

	// Extract the user ID from the request.
	userID := r.FormValue("id")
	if userID == "" {
		http.Error(w, "Missing user ID", http.StatusBadRequest)
		return
	}

	// Get the user from the database.
	user, err := getUserFromDB(ctx, userID)
	if err != nil {
		http.Error(w, "Failed to get user", http.StatusInternalServerError)
		return
	}

	// Start a span; ctx now carries its trace_id, so measurements recorded
	// with this ctx are eligible to become exemplars.
	ctx, span := otel.Tracer("test").Start(ctx, "password_check")
	defer span.End()

	start := time.Now()
	err = bcrypt.CompareHashAndPassword(user.password, []byte(r.FormValue("password")))
	hashLatency.Record(ctx, time.Since(start).Seconds(),
		metric.WithAttributes(attribute.Bool("correct", err == nil)))
	if errors.Is(err, bcrypt.ErrHashTooShort) {
		hashError.Add(ctx, 1)
	}
	if err != nil {
		http.Error(w, "Not authenticated", http.StatusUnauthorized)
		return
	}
	json.NewEncoder(w).Encode(struct{ Status string }{"OK"})
}
Production Hardening and Pitfalls
As we wrap up this post, here's a recap of some of the things you need to keep in mind when working with metrics:
Control the cardinality. Keep your label sets predictable and bounded. Avoid user- or session-identifiers in metrics and use enumerated values rather than free text. If third‑party instrumentation emits overly detailed labels, use Views to drop or remap attributes. High cardinality creates memory pressure, slow queries, and unexpected billing spikes.
Version your metrics. Treat metric names and label sets as a public contract. Follow OpenTelemetry semantic conventions where applicable. When you must change an instrument or its attributes, deprecate intentionally: dual‑write the new series, keep the old for a deprecation window, and communicate the change. Renames and label shape changes break dashboards and alerts; prefer additive changes and reserve breaking changes for major versions.
Units discipline. Use consistent SI units and encode units in metadata, not names: durations in seconds, sizes in bytes, ratios as unitless values between 0 and 1. Don't publish request_duration_ms; publish http.server.duration measured in seconds, and choose histogram buckets that match those units (e.g., 0.005, 0.01, ...). Consistent units make cross-service comparisons, SLO calculations, and backend transformations reliable and less error-prone.
Sampling is for traces, not metrics. Don’t drop metric data at the source with ad‑hoc sampling: it corrupts counters and distorts histograms. If you need to control cost or volume, aggregate or downsample via the Collector (e.g., metric aggregation, rate conversion, or remote‑write filtering). Reduce cardinality first, then adjust scrape and collection intervals. Keep in mind that exemplar emission follows trace sampling.
Summary
Metrics are your fastest, lowest-cost signal of user impact and system health. They make SLOs actionable, surface regressions quickly, and help you plan capacity sensibly, all while letting you connect technical performance to business outcomes like orders placed or jobs processed.
OpenTelemetry makes metrics simple: set a Resource (service name, version, environment), create a MeterProvider, and pick an exporter. Most coverage comes out of the box, from runtime metrics, HTTP/gRPC client and server middleware, database instrumentation, and host/container metrics via the Collector. You only need to sprinkle a few custom counters and histograms for your SLIs and key business events. Keep labels bounded, choose sane histogram buckets, and let the wrappers do the heavy lifting; you’ll get robust, comparable metrics with minimal code.
Convinced by the benefits of meaningful metrics, but not comfortable instrumenting your Go services on your own? Get in touch and let's make your production safer and faster together.
Want to read more? Check out my previous post Tracing Go Applications for span setup and context propagation.