There are tons of observability solutions out there, and most of them ship their own SDK for instrumentation. The problem with that is vendor lock-in: once the SDK is used all over your codebase, it can be very tedious to swap it out, or even just to test something else.
I most often reach for OpenTelemetry, or Otel for short. The aim of Otel is to provide SDKs that are consistent across languages and can be used across providers. An app can be instrumented once and its data then consumed through Datadog, Grafana, grepping the console, etc.
Perhaps the biggest downside with Otel, at least from a Go point of view, is that the project specification doesn’t really map nicely to idiomatic Go code. But overall, I believe things are going in the right direction. See for example synchronous gauges.
note: Please take everything below as a starting point and crude reference. It is not meant to be run in production as-is.
First up are “resources”: tags or metadata that will be stamped onto any exported data.
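import (
	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.26.0"
)
// Merge the SDK defaults with attributes describing this service.
res, err := resource.Merge(resource.Default(),
	resource.NewWithAttributes(semconv.SchemaURL,
		semconv.ServiceName("my-service"),
		semconv.ServiceVersion("0.1.0"),
	),
)
if err != nil {
	return fmt.Errorf("resource: %w", err)
}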
Note that semconv is special.
It looks deceptively similar to a regular Go package, but it has its version in the import path.
This means that v1.24.0 and v1.25.0 are entirely separate Go packages, each with its own schema URL.
This can be unexpected when dependencies pull in different versions of semconv, and go get cannot help because the version is picked by the import statement rather than by go.mod.
If, god forbid, different semconv packages cause problems (a typical symptom is resource.Merge failing on conflicting schema URLs), you’ll have to change the semconv import paths in the source code.
Next up are logs, our bread and butter. Many environments will scoop up anything written to stdout and forward it automagically. In such a case, there is no need for Otel to package and export the logs (unless you really want to). Then just use a plain logging library.
But when there isn’t a magic scoop for stdout, Otel can save the day. With this setup we can use the standard library logger slog across the codebase, backed by an Otel exporter.
import (
	"context"
	"fmt"
	"log/slog"

	"go.opentelemetry.io/contrib/bridges/otelslog"
	"go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp"
	"go.opentelemetry.io/otel/log/global"
	"go.opentelemetry.io/otel/sdk/log"
)

// Create an OTLP/HTTP exporter. WithInsecure disables TLS.
logExporter, err := otlploghttp.New(ctx, otlploghttp.WithInsecure())
if err != nil {
	return fmt.Errorf("otlploghttp: %w", err)
}
// Create a logger provider.
// You can pass this instance directly when creating bridges.
lp := log.NewLoggerProvider(
	log.WithResource(res),
	log.WithProcessor(log.NewBatchProcessor(logExporter)),
)
// Handle shutdown properly so nothing leaks.
go func() {
	<-ctx.Done()
	if err := lp.Shutdown(context.Background()); err != nil {
		slog.Warn("log provider shutdown", "error", err)
	}
}()
// Use it with slog.
global.SetLoggerProvider(lp)
logger := otelslog.NewLogger("pkgname", otelslog.WithLoggerProvider(lp))
slog.SetDefault(logger)
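NOTE: global.SetLoggerProvider(lp) is probably not necessary for this logger, since otelslog.WithLoggerProvider(lp) gives slog a direct path to the provider; otelslog only falls back to the global provider when that option is left out. Setting it anyway lets other bridges and libraries that rely on the global provider find it.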
Next up: metrics, with more of the same.
import (
	"context"
	"fmt"
	"log/slog"

	"go.opentelemetry.io/contrib/instrumentation/runtime"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
	"go.opentelemetry.io/otel/sdk/metric"
)

// Create an OTLP/HTTP exporter. WithInsecure disables TLS.
metricExporter, err := otlpmetrichttp.New(ctx, otlpmetrichttp.WithInsecure())
if err != nil {
	return fmt.Errorf("otlpmetrichttp: %w", err)
}
// The periodic reader collects and exports metrics on an interval.
mp := metric.NewMeterProvider(
	metric.WithResource(res),
	metric.WithReader(metric.NewPeriodicReader(metricExporter)),
)
go func() {
	<-ctx.Done()
	if err := mp.Shutdown(context.Background()); err != nil {
		slog.Warn("metric provider shutdown", "error", err)
	}
}()
// Register globally before starting anything that looks up the global provider.
otel.SetMeterProvider(mp)
// Baseline metrics of the Go runtime.
err = runtime.Start()
if err != nil {
	return fmt.Errorf("runtime metrics: %w", err)
}
Finally, there are traces.
import (
	"context"
	"fmt"
	"log/slog"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	"go.opentelemetry.io/otel/sdk/trace"
)

// Create an OTLP/HTTP exporter. WithInsecure disables TLS.
exp, err := otlptracehttp.New(ctx, otlptracehttp.WithInsecure())
if err != nil {
	return fmt.Errorf("otlptracehttp: %w", err)
}
// WithBatcher wraps the exporter in a batching span processor.
tp := trace.NewTracerProvider(
	trace.WithResource(res),
	trace.WithBatcher(exp),
)
go func() {
	<-ctx.Done()
	err := tp.Shutdown(context.Background())
	if err != nil {
		slog.Warn("trace provider shutdown", "error", err)
	}
}()
otel.SetTracerProvider(tp)
Typically, traces depend a lot on the ecosystem you are deploying apps into. Since the whole point is piecing together information across boundaries, your app will need to be a good citizen. This usually means using the same kind of headers that everyone else is using for propagation. Configure accordingly.