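Tracing is used to follow and record applications running under a microservice architecture. It ties the application's components together so that it is easier to locate failures and the causes of poor performance.
When someone issues a request, a trace may be started. The trace presents the calls and processing between upstream and downstream services as a directed acyclic graph (DAG).
Many OpenTelemetry concepts come from tracing systems such as Google Dapper and X-Trace.
All of them exist so that profiles can be recorded and analyzed as a request crosses service boundaries.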
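As in this figure, there is one Parent Span and several Child Spans.
The Parent Span represents the total end-to-end time cost of the whole trace.
A Child Span is one of the sub-operations within that trace.
A Trace is one complete call chain; it represents how a transaction executes in a distributed system.
A Trace is a tree (a kind of DAG) made up of multiple Spans, and the Trace itself can be seen as the root of that tree.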
        [Span A]  ←←←(the root span)
            |
     +------+------+
     |             |
 [Span B]      [Span C] ←←←(Span C is a `ChildOf` Span A)
     |             |
 [Span D]      +---+-------+
               |           |
           [Span E]    [Span F] >>> [Span G] >>> [Span H]
                                       ↑
                                       ↑
                                       ↑
                         (Span G `FollowsFrom` Span F)
Next, each node of the tree is projected onto a chart along the time dimension.
––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time
 [Span A···················································]
   [Span B··············································]
      [Span D··········································]
    [Span C········································]
         [Span E·······]        [Span F··] [Span G··] [Span H··]
A Trace ID is a [16]byte array, that is, a 128-bit unique identifier.
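To make the 16-byte layout concrete, here is a minimal sketch that fills a [16]byte with random data and renders it as 32 hex characters. The names TraceID and newTraceID are hypothetical and used only for illustration; they are not the OpenTelemetry SDK's own ID generator.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// TraceID is a hypothetical stand-in for a 16-byte (128-bit) trace identifier.
type TraceID [16]byte

// newTraceID fills a TraceID with cryptographically random bytes.
func newTraceID() (TraceID, error) {
	var id TraceID
	if _, err := rand.Read(id[:]); err != nil {
		return TraceID{}, err
	}
	return id, nil
}

// String renders the ID as 32 lowercase hex characters.
func (t TraceID) String() string {
	return hex.EncodeToString(t[:])
}

func main() {
	id, err := newTraceID()
	if err != nil {
		panic(err)
	}
	// 16 bytes * 8 = 128 bits, printed as 32 hex characters.
	fmt.Println(len(id)*8, "bits:", id)
}
```

Every Span that belongs to the same Trace would carry the same value, which is what lets a backend group them together.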
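A Span is the basic unit of work: one call in the chain. Whether it is an RPC or a DB call (in short, any call to a service outside the current process), a Span can be created for it.
A Span contains the operation name, start and end timestamps, the Attributes and Events recorded during that period, Links to other Spans, and an operation status.
The Span in OpenTelemetry: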
type Span struct {
	lock          sync.RWMutex
	tracer        *Tracer
	spanContext   trace.SpanContext
	parentSpanID  trace.SpanID
	ended         bool
	name          string
	startTime     time.Time
	endTime       time.Time
	statusCode    codes.Code
	statusMessage string
	attributes    map[label.Key]label.Value
	events        []Event
	links         map[trace.SpanContext][]label.KeyValue
	spanKind      trace.SpanKind
}
The label package mainly defines the KeyValue struct, along with many methods for working with it.
It is used by Attributes, Events, and so on.
// KeyValue holds a key and value pair.
type KeyValue struct {
	Key   Key
	Value Value
}
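The lines connecting Spans in the figure above can be thought of as the Span Context, a piece of information that can be passed between functions or across RPCs.
It carries things such as the parent span or some custom Labels.
Its main job is to tell the next Span who its parent span is and which Trace ID they both belong to.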
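Attributes use Labels to provide a key-value structure for storing information that any Span can carry for reference.
This information does not include a timestamp.
So what goes in there? Things like StatusCode and SpanKind, which are handy for querying or for grouping and analysis.
Note!
A Span's attributes field is a map[label.Key]label.Value, so when the same key is set more than once, the later value overwrites the earlier one!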
func (s *Span) SetAttributes(attrs ...label.KeyValue) {
	s.lock.Lock()
	defer s.lock.Unlock()
	if s.ended {
		return
	}
	for _, attr := range attrs {
		s.attributes[attr.Key] = attr.Value
	}
}
func (s *Span) SetAttribute(k string, v interface{}) {
	s.SetAttributes(label.Any(k, v))
}
// Attributes returns the attributes set on the Span, either at or after creation time.
// If the same attribute key was set multiple times, the last call will be used.
// Attributes cannot be changed after End has been called on the Span.
func (s *Span) Attributes() map[label.Key]label.Value {
	s.lock.RLock()
	defer s.lock.RUnlock()
	attributes := make(map[label.Key]label.Value)
	for k, v := range s.attributes {
		attributes[k] = v
	}
	return attributes
}
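The last-write-wins behavior is easy to see in isolation. The sketch below strips the Span down to just its attribute map; the span type, the setAttribute method, and the "http.status_code" key are hypothetical simplifications, with plain strings standing in for label.Key and label.Value.

```go
package main

import "fmt"

// span is a stripped-down, hypothetical stand-in that keeps only the attribute map.
type span struct {
	attributes map[string]interface{}
}

// setAttribute mirrors the map assignment in SetAttributes:
// writing to an existing key replaces the previous value.
func (s *span) setAttribute(key string, value interface{}) {
	s.attributes[key] = value
}

func main() {
	s := &span{attributes: make(map[string]interface{})}
	s.setAttribute("http.status_code", 200)
	s.setAttribute("http.status_code", 500) // overwrites the 200

	fmt.Println(len(s.attributes), s.attributes["http.status_code"]) // prints "1 500"
}
```

If both values matter, they would need distinct keys, or they could be recorded as Events instead, since those are appended rather than overwritten.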
SpanKind indicates the type of a Span. These are the built-in SpanKind values:
const (
	SpanKindUnspecified SpanKind = 0
	SpanKindInternal    SpanKind = 1
	SpanKindServer      SpanKind = 2
	SpanKindClient      SpanKind = 3
	SpanKindProducer    SpanKind = 4
	SpanKindConsumer    SpanKind = 5
)
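An Event represents something that happened during the Span.
An Event mainly lets you set an event name, a timestamp, and some Attributes.
Events are kept in a slice because the order in which they happen must not be shuffled; they are tied to time, after all.
(Not everyone can jump through time, and even a replay with the Time Stone has to go in order.)
Example: Can't connect to mysql server on 'xxx.xx.xx.xx'(10061)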
func (s *Span) AddEvent(ctx context.Context, name string, attrs ...label.KeyValue) {
	s.AddEventWithTimestamp(ctx, time.Now(), name, attrs...)
}
func (s *Span) AddEventWithTimestamp(ctx context.Context, timestamp time.Time, name string, attrs ...label.KeyValue) {
	s.lock.Lock()
	defer s.lock.Unlock()
	if s.ended {
		return
	}
	attributes := make(map[label.Key]label.Value)
	for _, attr := range attrs {
		attributes[attr.Key] = attr.Value
	}
	s.events = append(s.events, Event{
		Timestamp: timestamp,
		Name:      name,
		Attributes: attributes,
	})
}
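Two properties of the code above are worth seeing in a small, standalone form: events keep their insertion order, and events added after the span has ended are silently dropped. The span and event types here are hypothetical simplifications without locking or attributes, and the event names are made up for the example.

```go
package main

import (
	"fmt"
	"time"
)

// event is a simplified, hypothetical event: a name plus when it happened.
type event struct {
	Timestamp time.Time
	Name      string
}

// span keeps events in a slice so that insertion order is preserved.
type span struct {
	ended  bool
	events []event
}

// addEvent appends an event unless the span has already ended.
func (s *span) addEvent(name string) {
	if s.ended {
		return
	}
	s.events = append(s.events, event{Timestamp: time.Now(), Name: name})
}

// end marks the span as finished; later addEvent calls become no-ops.
func (s *span) end() {
	s.ended = true
}

func main() {
	s := &span{}
	s.addEvent("dial mysql")
	s.addEvent("connection refused")
	s.end()
	s.addEvent("too late") // ignored because the span has ended

	for i, e := range s.events {
		fmt.Println(i, e.Name)
	}
}
```

Unlike the attribute map, two events with the same name are both kept, each with its own timestamp.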
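If the work inside a Span is done in batches, Links are usually the tool for the job.
They associate the current Span with the sub-spans of the batch processing inside it.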
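As a rough sketch of the batch case, the code below lets one batch span collect a Link for every item it processes. All types and names (SpanContext, Link, Span, AddLink, the "batch.index" key) are hypothetical, and a slice is used for the links for simplicity, whereas the Span struct shown earlier stores them in a map keyed by trace.SpanContext.

```go
package main

import "fmt"

// SpanContext is a simplified, hypothetical identifier pair for a span.
type SpanContext struct {
	TraceID string
	SpanID  string
}

// Link points from one span to another span's context, with optional attributes.
type Link struct {
	SpanContext SpanContext
	Attributes  map[string]string
}

// Span is a toy span that only tracks its name and links.
type Span struct {
	Name  string
	Links []Link
}

// AddLink associates this span with another span's context.
func (s *Span) AddLink(sc SpanContext, attrs map[string]string) {
	s.Links = append(s.Links, Link{SpanContext: sc, Attributes: attrs})
}

func main() {
	// Span contexts of the items handled in one batch; they may come from different traces.
	items := []SpanContext{
		{TraceID: "trace-1", SpanID: "span-a"},
		{TraceID: "trace-2", SpanID: "span-b"},
	}

	batch := &Span{Name: "process-batch"}
	for i, sc := range items {
		batch.AddLink(sc, map[string]string{"batch.index": fmt.Sprint(i)})
	}

	for _, l := range batch.Links {
		fmt.Println(batch.Name, "->", l.SpanContext.TraceID, l.SpanContext.SpanID)
	}
}
```

A Link differs from a parent/child relationship in that the linked spans do not have to share a Trace ID with the batch span.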
Propagator defines a set of interfaces for serializing and deserializing the SpanContext information that is passed across services.
// Propagators is the interface to a set of injectors and extractors
// for all supported carrier formats. It can be used to chain multiple
// propagators into a single entity.
type Propagators interface {
	// HTTPExtractors returns the configured extractors.
	HTTPExtractors() []HTTPExtractor
	// HTTPInjectors returns the configured injectors.
	HTTPInjectors() []HTTPInjector
}
Many Extractor implementations rely on Go's context.WithValue; see the article I wrote last year for background.
Take TraceContext's Extract, for example: