ABP VNext + OpenTelemetry + Jaeger: Distributed Tracing and Call-Chain Visualization 🚀
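📚 Table of Contents
- ABP VNext + OpenTelemetry + Jaeger: Distributed Tracing and Call-Chain Visualization 🚀
- Background and Motivation 🌟
- Environment and Dependencies 📦
- Required NuGet Packages
- System Architecture Overview 🖥️
- Running the Example 🚀
- `Program.cs`
- `docker-compose.yml` 🐳
- Automatic + Manual Instrumentation 🔍
- Automatic Instrumentation
- Custom Business Spans
- Exception and Status Code Marking ⚠️
- Correlating Logs and Metrics 📑
- Jaeger Deployment and High Availability 🏗️
- Sampling Strategy and Performance Tuning 🛡️
- Extensions: OTel Collector & Grafana Tempo 🔗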
Version note: this article targets `OpenTelemetry.Extensions.Hosting` >= 1.4.0 and recommends the unified `.AddOpenTelemetry().WithTracing(...).WithMetrics(...)` API.
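TL;DR
- Complete `Program.cs` + `docker-compose.yml` example 🏃‍♂️
- Automatic + manual instrumentation: covers HTTP/gRPC, databases, outbound calls, and custom spans 🔍
- High-performance sampling: Parent-Based + TraceIdRatioBasedSampler, dynamically configurable ⚙️
- Production-grade deployment: batch export, OTel Collector, log/metrics correlation, exception marking 🚀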
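Background and Motivation 🌟
In a distributed microservice architecture, a single call chain spans multiple processes and network hops, so "who called whom" and "which step is slow" become hard questions to answer.
OpenTelemetry (OTel) and Jaeger provide an open-source, non-intrusive, end-to-end distributed tracing solution that gives us:
- Automated collection: one-line instrumentation for inbound HTTP/gRPC, databases, outbound HTTP, and more 📡
- Custom business spans: flexible instrumentation of key business logic 🛠️
- Unified visualization: the full call chain displayed in the Jaeger UI or Grafana Tempo 📈
This article is based on ABP VNext 6.x + .NET 6+ and walks from an initial setup through production-grade tuning, covering automatic and manual instrumentation, sampling strategy, and exception and log correlation.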
Environment and Dependencies 📦
- .NET SDK: 6.0+
- ABP Framework: vNext 6.x
- OpenTelemetry.Extensions.Hosting: >=1.4.0
- Jaeger: all-in-one (testing); separate Agent/Collector/Storage (production)
- Optional: OpenTelemetry Collector, Grafana Tempo, Prometheus/Grafana
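Required NuGet Packages
dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package OpenTelemetry.Instrumentation.Http
dotnet add package OpenTelemetry.Instrumentation.SqlClient
dotnet add package OpenTelemetry.Instrumentation.GrpcNetClient
dotnet add package OpenTelemetry.Exporter.Jaeger
dotnet add package OpenTelemetry.Exporter.Prometheus.AspNetCore
dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol
No separate package is needed for inbound gRPC: gRPC services run on ASP.NET Core, so `OpenTelemetry.Instrumentation.AspNetCore` already sees those requests. Some of these packages (for example the SqlClient instrumentation and the Prometheus exporter) have been published as prerelease versions; append `--prerelease` if no stable version resolves.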
System Architecture Overview 🖥️
- Collection: inbound HTTP/gRPC, EF Core/SqlClient, HttpClient, gRPC client, logs, metrics
- Export: traces → Jaeger; metrics → Prometheus; logs → OTLP → logging backend; (optional) traces → Tempo
Running the Example 🚀
Program.cs
using OpenTelemetry;
using OpenTelemetry.Logs;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using Volo.Abp;

var builder = WebApplication.CreateBuilder(args);

// 1. Register the ABP module
builder.Services.AddApplication<MyProjectHttpApiHostModule>();

// 2. Register OpenTelemetry (tracing + metrics)
builder.Services.AddOpenTelemetry()
    // Tracing
    .WithTracing(tracing => tracing
        .SetResourceBuilder(ResourceBuilder.CreateDefault()
            .AddService("OrderService", serviceVersion: "1.0.0"))
        // Inbound HTTP; gRPC services hosted on ASP.NET Core pass through here too
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSqlClientInstrumentation()
        .AddGrpcClientInstrumentation()
        .AddSource("MyCompany.MyProduct")
        .AddJaegerExporter(opts =>
        {
            opts.AgentHost = builder.Configuration["Jaeger:Host"] ?? "localhost";
            opts.AgentPort = builder.Configuration.GetValue("Jaeger:Port", 6831);
            // The processor type is a property of the exporter options
            opts.ExportProcessorType = ExportProcessorType.Batch;
        })
        .SetSampler(new ParentBasedSampler(new TraceIdRatioBasedSampler(0.1))))
    // Metrics
    .WithMetrics(metrics => metrics
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddPrometheusExporter());

// 3. Correlate logs with the trace context
builder.Logging.AddOpenTelemetry(logging =>
{
    logging.SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("OrderService"));
    logging.IncludeFormattedMessage = true;
    logging.IncludeScopes = true;
    logging.ParseStateValues = true;
    logging.AddOtlpExporter(); // requires OpenTelemetry.Exporter.OpenTelemetryProtocol
});

var app = builder.Build();

// 4. Expose the default Prometheus scraping endpoint (/metrics); no extra configuration needed
app.MapPrometheusScrapingEndpoint();

// 5. Start the ABP application
app.InitializeApplication();
app.Run();
docker-compose.yml 🐳
version: '3.8'
services:
  jaeger:
    image: jaegertracing/all-in-one:1.45
    ports:
      - "6831:6831/udp"   # agent, Thrift compact over UDP (used by the Jaeger exporter)
      - "16686:16686"     # Jaeger UI
      - "14250:14250"     # collector, gRPC
Quick start:
docker-compose up -d
dotnet run --project src/MyProject.HttpApi.Host
- Call the API, then open http://localhost:16686 to inspect the traces in the Jaeger UI
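Automatic + Manual Instrumentation 🔍
Automatic Instrumentation
- Inbound HTTP/gRPC: `.AddAspNetCoreInstrumentation()` (gRPC services hosted on ASP.NET Core are served through the same pipeline)
- Outbound calls: `.AddHttpClientInstrumentation()`, `.AddGrpcClientInstrumentation()`
- Database: `.AddSqlClientInstrumentation()`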
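Custom Business Spans
using System.Diagnostics;
using OpenTelemetry.Trace; // RecordException extension method

public class OrderAppService : ApplicationService
{
    // The name must match the AddSource("MyCompany.MyProduct") registration
    private static readonly ActivitySource Source = new("MyCompany.MyProduct");

    // OrderManager is a placeholder type for the domain service behind _orderManager
    private readonly OrderManager _orderManager;

    public OrderAppService(OrderManager orderManager) => _orderManager = orderManager;

    public async Task ProcessOrderAsync(Guid orderId)
    {
        // StartActivity returns null when no listener is interested in this source, hence "?."
        using var activity = Source.StartActivity("ProcessOrder");
        activity?.SetTag("order.id", orderId);
        try
        {
            await _orderManager.HandleOrderAsync(orderId);
        }
        catch (Exception ex)
        {
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            activity?.RecordException(ex);
            throw;
        }
    }
}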
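Why the `activity?.` null checks matter can be seen without the OpenTelemetry SDK at all. The following is a minimal console sketch that uses only `System.Diagnostics`; the hand-written `ActivityListener` is a stand-in for what the SDK sets up when `AddSource("MyCompany.MyProduct")` is called, not a description of the SDK's internals.

```csharp
using System;
using System.Diagnostics;

var source = new ActivitySource("MyCompany.MyProduct");

// No listener is registered yet, so StartActivity has nothing to create a span for.
using (var unobserved = source.StartActivity("ProcessOrder"))
{
    Console.WriteLine($"Without a listener, activity is null: {unobserved is null}");
}

// Stand-in listener: subscribe to the source by name and record everything.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "MyCompany.MyProduct",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = stopped =>
    {
        var orderId = stopped.GetTagItem("order.id");
        Console.WriteLine($"Stopped {stopped.DisplayName}, order.id = {orderId}");
    }
};
ActivitySource.AddActivityListener(listener);

// With a listener in place the same call yields a real Activity.
using (var observed = source.StartActivity("ProcessOrder"))
{
    observed?.SetTag("order.id", Guid.NewGuid());
    Console.WriteLine($"With a listener, activity is null: {observed is null}");
}
```

The practical consequence is that a custom `ActivitySource` whose name is missing from the `AddSource(...)` registration silently produces no spans, so the two strings are worth keeping in one shared constant.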
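💡 Tip: use an ABP interceptor to instrument services uniformly:
using System.Diagnostics;
using OpenTelemetry.Trace; // RecordException extension method
using Volo.Abp.Application.Services;
using Volo.Abp.DependencyInjection;
using Volo.Abp.DynamicProxy;

public class TraceInterceptor : AbpInterceptor, ITransientDependency
{
    private static readonly ActivitySource Source = new("MyCompany.MyProduct");

    public override async Task InterceptAsync(IAbpMethodInvocation invocation)
    {
        using var activity = Source.StartActivity(invocation.Method.Name);
        activity?.SetTag("abp.service", invocation.TargetObject.GetType().Name);
        try
        {
            await invocation.ProceedAsync();
        }
        catch (Exception ex)
        {
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            activity?.RecordException(ex);
            throw;
        }
    }
}

// Register the interceptor in the module's PreConfigureServices.
// ABP 6.x spells this hook OnRegistred; later versions rename it to OnRegistered.
// Limiting it to application services is one possible scope, chosen here for illustration.
context.Services.OnRegistred(ctx =>
{
    if (typeof(IApplicationService).IsAssignableFrom(ctx.ImplementationType))
    {
        ctx.Interceptors.TryAdd<TraceInterceptor>();
    }
});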
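Exception and Status Code Marking ⚠️
- `activity?.SetStatus(ActivityStatusCode.Error, message)` marks the span as failed
- `activity?.RecordException(ex)` records the exception details on the span
This makes successful and failed call chains easy to tell apart in the Jaeger UI.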
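To make concrete what these two calls put on a span, here is a small sketch of an equivalent helper built only on `System.Diagnostics`. The `MarkFailed` name and the `ActivityFailureExtensions` class are hypothetical, introduced for illustration; the event name and attribute keys follow the OpenTelemetry exception conventions, which is roughly what `RecordException` emits.

```csharp
using System;
using System.Diagnostics;

public static class ActivityFailureExtensions
{
    // Hypothetical helper: sets the error status and attaches an "exception" event.
    public static void MarkFailed(this Activity? activity, Exception ex)
    {
        if (activity is null)
        {
            return; // nothing is listening, so there is no span to annotate
        }

        // Error status is what lets a tracing UI flag the span as failed.
        activity.SetStatus(ActivityStatusCode.Error, ex.Message);

        // The exception details travel as a span event with conventional attribute keys.
        var tags = new ActivityTagsCollection
        {
            { "exception.type", ex.GetType().FullName },
            { "exception.message", ex.Message },
            { "exception.stacktrace", ex.ToString() }
        };
        activity.AddEvent(new ActivityEvent("exception", tags: tags));
    }
}
```

With such a helper a `catch` block shrinks to `activity.MarkFailed(ex); throw;`, and code that only references `System.Diagnostics` does not need the OpenTelemetry API package just to mark failures.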
Correlating Logs and Metrics 📑
- Logs:
builder.Logging.AddOpenTelemetry(logging => logging.AddOtlpExporter());
- Metrics:
  - Request count and latency are collected automatically
  - Custom `Meter` instruments are exported to Prometheus
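A custom `Meter` could look like the sketch below. The class name `OrderMetrics`, the instrument names `orders.processed` and `orders.duration`, and the `status` tag are assumptions made for illustration; only the `System.Diagnostics.Metrics` types themselves come from .NET.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public static class OrderMetrics
{
    // Reuses the ActivitySource name so traces and metrics share one identifier (an assumption).
    public const string MeterName = "MyCompany.MyProduct";

    private static readonly Meter OrderMeter = new(MeterName, "1.0.0");

    // Monotonic count of processed orders.
    private static readonly Counter<long> OrdersProcessed =
        OrderMeter.CreateCounter<long>(
            "orders.processed", unit: "{order}", description: "Number of processed orders");

    // Distribution of processing time in milliseconds.
    private static readonly Histogram<double> OrderDuration =
        OrderMeter.CreateHistogram<double>(
            "orders.duration", unit: "ms", description: "Order processing time");

    public static void RecordProcessed(string status, double elapsedMs)
    {
        // The same tag on both instruments allows splitting by outcome in queries.
        var tag = new KeyValuePair<string, object?>("status", status);
        OrdersProcessed.Add(1, tag);
        OrderDuration.Record(elapsedMs, tag);
    }
}
```

For these instruments to reach the `/metrics` endpoint, the meter has to be registered alongside the built-in instrumentation, for example `.WithMetrics(metrics => metrics.AddMeter(OrderMetrics.MeterName))`; an unregistered meter is ignored by the SDK.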
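Jaeger Deployment and High Availability 🏗️
- Testing: the all-in-one image, started with a single command
- Production:
  - Component split: deploy Agent, Collector, and Query/UI separately
  - Backend storage: Cassandra or Elasticsearch, optionally with Kafka as a buffer in front of storage
  - Security: enable TLS and authentication (mTLS, token), or put an OTel Collector in front for unified ingestion and traffic control
  - Multiple replicas: scale horizontally for high availability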
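Sampling Strategy and Performance Tuning 🛡️
- ParentBasedSampler: child spans follow the parent's sampling decision, so a trace is kept or dropped consistently across services
- Batch mode: spans are exported in batches, reducing network and CPU overhead
- Dynamic tuning: the `OTEL_TRACES_SAMPLER` / `OTEL_TRACES_SAMPLER_ARG` environment variables, where the SDK version in use supports them
- Per-environment settings: `AlwaysOnSampler` in development; 5–10% in production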
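The per-environment choice above can also be made explicitly in code, which works regardless of environment-variable support. This is a minimal sketch: `SamplerFactory` and the `Tracing:SampleRatio` configuration key mentioned below are hypothetical names, while the sampler classes are the ones already used in `Program.cs`.

```csharp
using System;
using OpenTelemetry.Trace;

public static class SamplerFactory
{
    // Hypothetical helper: sample everything in development, a fixed ratio elsewhere.
    public static Sampler Create(bool isDevelopment, double productionRatio)
    {
        if (isDevelopment)
        {
            return new AlwaysOnSampler();
        }

        // TraceIdRatioBasedSampler only accepts probabilities in [0, 1], so clamp bad input.
        var ratio = Math.Clamp(productionRatio, 0.0, 1.0);

        // ParentBasedSampler honors an upstream decision and applies the ratio to root spans only.
        return new ParentBasedSampler(new TraceIdRatioBasedSampler(ratio));
    }
}
```

It would be wired in with something like `.SetSampler(SamplerFactory.Create(builder.Environment.IsDevelopment(), builder.Configuration.GetValue("Tracing:SampleRatio", 0.1)))`, so the production ratio can be changed through configuration without a rebuild.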
Extensions: OTel Collector & Grafana Tempo 🔗
- Collector: unified ingestion, filtering, authentication, and forwarding to Jaeger/Tempo/Prometheus
- Grafana Tempo: focused on trace storage; combine it with Prometheus and Loki for full-stack observability
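Routing traces through a Collector mainly means swapping the exporter on the application side. The sketch below assumes a Collector with an OTLP gRPC receiver on port 4317 (the standard OTLP port) and a hypothetical `Otlp:Endpoint` configuration key; it relies on the `OpenTelemetry.Exporter.OpenTelemetryProtocol` package already listed earlier.

```csharp
using System;
using OpenTelemetry.Exporter;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("OrderService"))
        .AddAspNetCoreInstrumentation()
        .AddSource("MyCompany.MyProduct")
        // Send spans to the Collector over OTLP instead of straight to a Jaeger agent.
        .AddOtlpExporter(otlp =>
        {
            // "Otlp:Endpoint" is an assumed key; the fallback targets a local Collector.
            otlp.Endpoint = new Uri(
                builder.Configuration["Otlp:Endpoint"] ?? "http://localhost:4317");
            otlp.Protocol = OtlpExportProtocol.Grpc;
        }));

var app = builder.Build();
app.Run();
```

With this shape the application no longer knows which backend stores the traces; switching between Jaeger and Tempo becomes a change in the Collector's pipeline rather than in application code.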