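Yesterday's post covered the terminology and concepts of OpenTelemetry.
Today we'll set up one of the tracing systems that supports OpenTelemetry.
Jaeger is a CNCF project inspired by Dapper and OpenZipkin.
Open-sourced by Uber, it is a distributed tracing system for monitoring and troubleshooting distributed systems.
Uber also published a post on its own blog, Evolving Distributed Tracing, describing how its distributed tracing evolved from the early days to the birth of Jaeger.
Jaeger's service architecture is shown in the figure below.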
The Jaeger client is really just a client library, with implementations for OpenTelemetry and OpenTracing.
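The Jaeger Agent is an implementation of the sidecar pattern: it takes the spans that the client sends over UDP and pushes them to the Collector in batches.
Its main purpose is to shield the client from the implementation details of routing to the Collector.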
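To make the batching idea concrete, here is a minimal Go sketch of "collect spans, then push them in batches". It is not the Agent's actual code: the `span` and `batcher` types, the `maxSize` threshold, and the `send` callback are all names made up for illustration.

```go
package main

import "fmt"

// span is a hypothetical, stripped-down span record.
type span struct {
	TraceID string
	Name    string
}

// batcher buffers spans and hands them to send once maxSize is reached.
// This is an illustrative type, not part of Jaeger.
type batcher struct {
	maxSize int
	pending []span
	send    func(batch []span)
}

// add buffers one span and flushes when the buffer is full.
func (b *batcher) add(s span) {
	b.pending = append(b.pending, s)
	if len(b.pending) >= b.maxSize {
		b.flush()
	}
}

// flush pushes whatever is buffered, if anything, as a single batch.
func (b *batcher) flush() {
	if len(b.pending) == 0 {
		return
	}
	batch := b.pending
	b.pending = nil
	b.send(batch)
}

func main() {
	b := &batcher{
		maxSize: 2,
		send: func(batch []span) {
			// A real agent would forward the batch to the collector here.
			fmt.Printf("pushing %d spans to the collector\n", len(batch))
		},
	}
	for i := 0; i < 5; i++ {
		b.add(span{TraceID: "t1", Name: fmt.Sprintf("op-%d", i)})
	}
	b.flush() // push the leftover span
}
```

The client only needs to know where the local agent is; where the batches go afterwards is the agent's business, which is the "shielding" mentioned above.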
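The Jaeger Collector receives spans, then validates, transforms, and indexes them before writing them to the DB.
The Collector can be configured with sampling logic, and it collects and processes spans according to those sampling settings.
Because this component is stateless, you can run many Collectors to speed up writes to the DB.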
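As a rough illustration of what a sampling rule can look like, here is a minimal Go sketch of a probabilistic sampler that decides from the trace ID, so every span of the same trace gets the same answer. This is only the general idea under my own assumptions, not Jaeger's implementation; the `sampler` type and the `newSampler` and `keep` names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// sampler keeps roughly `rate` of all traces (illustrative type, not Jaeger's).
type sampler struct {
	rate      float64
	threshold uint64
}

// newSampler precomputes the trace ID threshold for a rate between 0 and 1.
func newSampler(rate float64) sampler {
	s := sampler{rate: rate}
	if rate > 0 && rate < 1 {
		s.threshold = uint64(rate * float64(math.MaxUint64))
	}
	return s
}

// keep reports whether a trace should be sampled.
// The decision depends only on the trace ID, so it is stable per trace.
func (s sampler) keep(traceID uint64) bool {
	switch {
	case s.rate <= 0:
		return false
	case s.rate >= 1:
		return true
	default:
		return traceID < s.threshold
	}
}

func main() {
	s := newSampler(0.25)
	step := uint64(math.MaxUint64 / 1000)
	kept := 0
	for i := uint64(0); i < 1000; i++ {
		if s.keep(i * step) {
			kept++
		}
	}
	fmt.Printf("kept %d of 1000 evenly spread trace IDs\n", kept)
}
```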
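The supported storage backends are Cassandra, Elasticsearch, and Kafka.
The official recommendation is Cassandra, for two reasons: Cassandra is a key-value database, so lookups by TraceID are very efficient, and its write throughput is quite good.
For analytical queries, though, Elasticsearch is the more practical choice.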
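The difference between the two access patterns can be sketched with a toy in-memory store in Go. This says nothing about how Cassandra or Elasticsearch work internally; the `store` type and its methods are hypothetical, and the only point is that a lookup by trace ID is a direct key access while a query by tag has to look at every span unless there is a separate index for it.

```go
package main

import "fmt"

// span is a hypothetical, stripped-down span record.
type span struct {
	TraceID   string
	Operation string
	Tags      map[string]string
}

// store keeps spans keyed by trace ID, like a key-value database would.
type store struct {
	byTrace map[string][]span
}

func newStore() *store {
	return &store{byTrace: make(map[string][]span)}
}

// write appends a span under its trace ID.
func (s *store) write(sp span) {
	s.byTrace[sp.TraceID] = append(s.byTrace[sp.TraceID], sp)
}

// trace is the cheap path: one key lookup.
func (s *store) trace(id string) []span {
	return s.byTrace[id]
}

// findByTag is the expensive path here: it scans every stored span.
func (s *store) findByTag(key, value string) []span {
	var out []span
	for _, spans := range s.byTrace {
		for _, sp := range spans {
			if v, ok := sp.Tags[key]; ok && v == value {
				out = append(out, sp)
			}
		}
	}
	return out
}

func main() {
	s := newStore()
	s.write(span{TraceID: "t1", Operation: "foo", Tags: map[string]string{"http.status_code": "200"}})
	s.write(span{TraceID: "t1", Operation: "bar", Tags: map[string]string{"http.status_code": "500"}})
	s.write(span{TraceID: "t2", Operation: "foo"})

	fmt.Println("spans in t1:", len(s.trace("t1")))
	fmt.Println("spans with status 500:", len(s.findByTag("http.status_code", "500")))
}
```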
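Jaeger Query accepts query requests, retrieves the matching traces from the DB, and displays them in the UI.
Jaeger Query is also stateless, so you can run multiple instances.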
Install Jaeger-Collector, Jaeger-Agent, and Jaeger-Query,
along with an Elasticsearch cluster (master + node).
version: "3.6"
services:
  jaeger-collector:
    image: jaegertracing/jaeger-collector
    command:
      - --es.num-shards=2
      - --es.num-replicas=0
      - --es.server-urls=http://172.16.230.100:9200,http://172.16.230.101:9201
      - --collector.zipkin.host-port=:9411
    ports:
      - "14269"
      - "14268:14268"
      - "14250"
      - "9411:9411"
    environment:
      - SPAN_STORAGE_TYPE=elasticsearch
      - LOG_LEVEL=debug
    networks:
      jaeger_net:
        ipv4_address: 172.16.230.2
    depends_on:
      - elasticsearch-master
  jaeger-query:
    image: jaegertracing/jaeger-query
    command:
      - --es.num-shards=2
      - --es.num-replicas=0
      - --es.server-urls=http://172.16.230.100:9200,http://172.16.230.101:9201
    ports:
      - "16686:16686"
      - "16687"
    environment:
      - SPAN_STORAGE_TYPE=elasticsearch
      - LOG_LEVEL=debug
    networks:
      jaeger_net:
        ipv4_address: 172.16.230.3
    depends_on:
      - elasticsearch-master
  jaeger-agent:
    image: jaegertracing/jaeger-agent
    command:
      - --reporter.grpc.host-port=jaeger-collector:14250
      - --reporter.grpc.retry.max=1000
    ports:
      - "5775:5775/udp"
      - "6831:6831/udp"
      - "6832:6832/udp"
      - "5778:5778"
    environment:
      - LOG_LEVEL=debug
    networks:
      jaeger_net:
        ipv4_address: 172.16.230.4
    depends_on:
      - jaeger-collector
  elasticsearch-master:
    container_name: es-master01
    hostname: es-master01
    image: elasticsearch:7.1.1
    volumes:
      - ./elasticsearch/master/conf/es-master.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./elasticsearch/master/data:/usr/share/elasticsearch/data
      - ./elasticsearch/master/logs:/usr/share/elasticsearch/logs
    environment:
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - 9200:9200
      - 9300:9300
    expose:
      - 9200
    networks:
      jaeger_net:
        ipv4_address: 172.16.230.100
  elasticsearch-slave1:
    container_name: es-slave01
    hostname: es-slave01
    image: elasticsearch:7.1.1
    volumes:
      - ./elasticsearch/slave1/conf/es-slave1.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./elasticsearch/slave1/data:/usr/share/elasticsearch/data
      - ./elasticsearch/slave1/logs:/usr/share/elasticsearch/logs
    environment:
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - 9100:9100
      - 9201:9201
    expose:
      - 9201
    networks:
      jaeger_net:
        ipv4_address: 172.16.230.101
networks:
  jaeger_net:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.16.230.0/24
es-master.yml
cluster.name: es-cluster
node.name: es-master
node.master: true
node.data: true
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
discovery.seed_hosts:
- 172.16.230.100
- 172.16.230.101
cluster.initial_master_nodes:
- es-master
http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false
es-slave1.yml
cluster.name: es-cluster
node.name: es-slave1
node.master: false
node.data: true
network.host: 0.0.0.0
http.port: 9201
discovery.seed_hosts:
- 172.16.230.100
- 172.16.230.101
cluster.initial_master_nodes:
- 172.16.230.100
http.cors.enabled: true
http.cors.allow-origin: "*"
xpack.security.enabled: false
For the command flags used above, see the CLI flags reference.
There really isn't much that needs configuring.
Next, open a browser and go to http://172.16.230.3:16686/
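If the UI loads but later shows no data, one thing worth checking is whether both Elasticsearch nodes actually joined the cluster. Below is a small Go sketch that calls Elasticsearch's `_cluster/health` endpoint on the master address from the compose file and prints the `status` and `number_of_nodes` fields. The `clusterHealth` type and `parseHealth` helper are names of my own, and the 3-second timeout is an arbitrary choice.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// clusterHealth holds the two fields of the _cluster/health response we care about.
type clusterHealth struct {
	Status        string `json:"status"`
	NumberOfNodes int    `json:"number_of_nodes"`
}

// parseHealth decodes the JSON body returned by _cluster/health.
func parseHealth(body []byte) (clusterHealth, error) {
	var h clusterHealth
	err := json.Unmarshal(body, &h)
	return h, err
}

func main() {
	client := &http.Client{Timeout: 3 * time.Second}
	// Address of the master node as assigned in the compose file.
	resp, err := client.Get("http://172.16.230.100:9200/_cluster/health")
	if err != nil {
		fmt.Println("cluster not reachable:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading response failed:", err)
		return
	}
	h, err := parseHealth(body)
	if err != nil {
		fmt.Println("unexpected response:", err)
		return
	}
	fmt.Printf("status=%s nodes=%d\n", h.Status, h.NumberOfNodes)
}
```

With the two-node setup in this post, a node count of 2 would suggest that the second node found the master.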
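Let's try it out with the official example.
Here I send spans to the Agent, which then forwards them to the Collector.
You can of course send directly to the Collector instead.
It depends on your architecture and throughput.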
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel/api/global"
	"go.opentelemetry.io/otel/exporters/trace/jaeger"
	"go.opentelemetry.io/otel/label"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// initTracer creates a new trace provider instance and registers it as global trace provider.
func initTracer() func() {
	// Create and install Jaeger export pipeline
	flush, err := jaeger.InstallNewPipeline(
		// Send spans to the Jaeger Agent over UDP.
		jaeger.WithAgentEndpoint("172.16.230.4:6831"),
		// Alternative: send directly to the Collector over HTTP.
		// jaeger.WithCollectorEndpoint("http://localhost:14268/api/traces"),
		jaeger.WithProcess(jaeger.Process{
			ServiceName: "trace-demo",
			Tags: []label.KeyValue{
				label.String("exporter", "jaeger"),
				label.Float64("float", 312.23),
			},
		}),
		// Sample every trace, which is fine for a demo.
		jaeger.WithSDK(&sdktrace.Config{DefaultSampler: sdktrace.AlwaysSample()}),
	)
	if err != nil {
		log.Fatal(err)
	}

	// The returned function flushes buffered spans before the program exits.
	return func() {
		flush()
	}
}

func main() {
	fn := initTracer()
	defer fn()

	ctx := context.Background()

	tr := global.Tracer("component-main")
	// Start the parent span; the returned ctx carries it.
	ctx, span := tr.Start(ctx, "foo")
	bar(ctx)
	span.End()
}

func bar(ctx context.Context) {
	tr := global.Tracer("component-bar")
	// Starting from the ctx passed in makes this span a child of foo.
	_, span := tr.Start(ctx, "bar")
	defer span.End()

	// Do bar...
}
In the Jaeger UI, select trace-demo and click Find Traces.
The traces will show up. Click into one to see the details.
foo is the parent span,
and bar is the child span.
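Why does bar end up under foo? Because bar is started from the ctx that the first Start call returned. Here is a toy Go sketch of that mechanism using only the standard library. `startSpan`, `spanKey`, and `demo` are hypothetical stand-ins, not OpenTelemetry APIs, and a real tracer links spans by span and trace IDs rather than by name.

```go
package main

import (
	"context"
	"fmt"
)

// spanKey is the private context key under which the current span is stored.
type spanKey struct{}

// span is a toy span that only remembers its own name and its parent's name.
type span struct {
	Name   string
	Parent string // empty for a root span
}

// startSpan creates a span whose parent is whatever span ctx already carries,
// and returns a new ctx carrying the new span.
func startSpan(ctx context.Context, name string) (context.Context, span) {
	s := span{Name: name}
	if parent, ok := ctx.Value(spanKey{}).(span); ok {
		s.Parent = parent.Name
	}
	return context.WithValue(ctx, spanKey{}, s), s
}

// demo mirrors the example in this post: foo first, then bar from foo's ctx.
func demo() (foo, bar span) {
	ctx, foo := startSpan(context.Background(), "foo")
	_, bar = startSpan(ctx, "bar")
	return foo, bar
}

func main() {
	foo, bar := demo()
	fmt.Printf("%s: parent=%q\n", foo.Name, foo.Parent)
	fmt.Printf("%s: parent=%q\n", bar.Name, bar.Parent)
}
```

If bar were started from context.Background() instead of the ctx passed in, it would have no parent and would show up as its own root span.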
That completes the basic environment setup.
We'll look at the rest tomorrow.