iT邦幫忙 (iThome IT Help)

2022 iThome Ironman Contest

DAY 8
DevOps

A Brief Look at DevOps and Observability series, Part 8

A Brief Look at the OpenTelemetry Specification - Metrics

Today we continue with Metrics, one of the three pillars of observability, and look at how the OTel specification defines it.

OTel Metrics

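Link to the official documentation

Design goals

  • OTel wants metrics to be connectable to the other signals. Metrics and traces can be correlated through exemplars; metric attributes can be enriched through the Baggage and Context covered on Day 6; and the Resource context from Day 7 is applied to metrics, logs, and traces in the same way, which is what lets metrics and logs be tied together.
  • Migrating from OpenCensus metrics to OTel metrics should be easy.
  • Widely used metrics instrumentation protocols and standards should be fully supported. The minimum bar is Prometheus and StatsD.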

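Concepts

As with the client architecture described on Day 6, OTel Metrics is split into two major parts, the API and the SDK.

  • API
    • Captures raw measurements.
    • Decouples instrumentation from the SDK, so the application itself decides which SDK to include and how to configure it.
    • If the SDK is not enabled in the project, no telemetry data is collected!
  • SDK
    • Implements the API.
    • Provides configuration, aggregation, processors, and exporters, along with the extension points for them.
    • The point of separating the API from the SDK is that a different SDK can be configured at run time.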
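
To make the API/SDK split concrete, here is a minimal standard-library sketch. It is not the OTel Go API: `Counter`, `Provider`, `noopProvider`, `sdkProvider`, and `SetProvider` are hypothetical names invented for illustration. It only shows the idea that instrumented code talks to an interface, measurements are dropped while no SDK is installed, and an SDK can be swapped in at run time.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is the (hypothetical) API surface that instrumented code depends on.
type Counter interface {
	Add(n int64)
}

// Provider hands out counters by name; instrumented code only sees this interface.
type Provider interface {
	Counter(name string) Counter
}

// noopProvider is the default when no SDK is installed: measurements are dropped.
type noopProvider struct{}
type noopCounter struct{}

func (noopProvider) Counter(string) Counter { return noopCounter{} }
func (noopCounter) Add(int64)               {}

// sdkProvider is a toy "SDK" that keeps running sums in memory.
type sdkProvider struct {
	mu   sync.Mutex
	sums map[string]int64
}

type sdkCounter struct {
	p    *sdkProvider
	name string
}

func newSDKProvider() *sdkProvider { return &sdkProvider{sums: map[string]int64{}} }

func (p *sdkProvider) Counter(name string) Counter { return &sdkCounter{p: p, name: name} }

func (c *sdkCounter) Add(n int64) {
	c.p.mu.Lock()
	defer c.p.mu.Unlock()
	c.p.sums[c.name] += n
}

// Sum returns the aggregated value recorded so far for one counter.
func (p *sdkProvider) Sum(name string) int64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.sums[name]
}

// current is the globally registered provider; it starts out as the no-op one.
var current Provider = noopProvider{}

// SetProvider installs an SDK at run time.
func SetProvider(p Provider) { current = p }

// handleRequest stands in for instrumented library code: it only touches the API.
func handleRequest() {
	current.Counter("requests").Add(1)
}

func main() {
	handleRequest() // no SDK installed yet, so this measurement is dropped

	sdk := newSDKProvider()
	SetProvider(sdk)
	handleRequest()
	handleRequest()

	fmt.Println(sdk.Sum("requests")) // only the two calls made after SetProvider are counted
}
```

The instrumented function never changes; only the provider registered behind the interface does.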

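Programming Model

With OTel metrics, a global MeterProvider creates one or more Meters, and each Meter creates the Instruments used to obtain metrics. Each Instrument produces a series of Measurements.
These Measurements are aggregated into a metric in the in-memory state, with Views controlling how that aggregation is done.
A MetricReader then reads the aggregated metrics and hands them to a MetricExporter for export (either pull or push).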
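
The flow above (Measurements, in-memory aggregation, reader, exporter) can be sketched with the standard library alone. This is a toy model, not the SDK: `Measurement`, `state`, `reader`, and `textExporter` are hypothetical names, and the aggregation is assumed to be a plain sum per instrument.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Measurement is one raw data point reported by an instrument.
type Measurement struct {
	Instrument string
	Value      int64
}

// state is the in-memory aggregation: here, a running sum per instrument.
type state struct {
	sums map[string]int64
}

func (s *state) record(m Measurement) { s.sums[m.Instrument] += m.Value }

// Exporter turns collected metrics into some output format.
type Exporter func(metrics map[string]int64) string

// reader takes a snapshot of the aggregated state and hands it to the exporter.
type reader struct {
	st  *state
	exp Exporter
}

func (r reader) collect() string {
	snapshot := make(map[string]int64, len(r.st.sums))
	for name, sum := range r.st.sums {
		snapshot[name] = sum
	}
	return r.exp(snapshot)
}

// textExporter renders one "name value" line per metric, sorted by name.
func textExporter(metrics map[string]int64) string {
	names := make([]string, 0, len(metrics))
	for name := range metrics {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%s %d\n", name, metrics[name])
	}
	return b.String()
}

func main() {
	st := &state{sums: map[string]int64{}}
	r := reader{st: st, exp: textExporter}

	// Three raw measurements collapse into two aggregated metrics.
	for _, m := range []Measurement{{"requests", 1}, {"requests", 1}, {"errors", 1}} {
		st.record(m)
	}
	fmt.Print(r.collect())
}
```

In a pull setup such as Prometheus, `collect` would run when the endpoint is scraped; in a push setup it would run on a timer.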

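MeterProvider, Meter, and Instrument all belong to the API.

  • MeterProvider is the entry point of the API. It provides access to Meters and holds any configuration and stateful objects.
  • A Meter is responsible for creating Instruments. It offers six instrument types: Counter, Asynchronous Counter, Histogram, UpDownCounter, Asynchronous UpDownCounter, and Asynchronous Gauge. These map neatly onto the Prometheus metric types (Counter, Gauge, Histogram).
  • An Instrument captures and reports Measurements.
  • A Measurement wraps a value and a set of attributes.

Asynchronous instrument types use Observe to read the current value of whatever is being watched, typically something like resource utilization.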
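
The difference between synchronous and asynchronous instruments is easier to see in a small sketch. The types below (`syncCounter`, `asyncGauge`, `collect`) are hypothetical and use only the standard library; the assumption being illustrated is that a synchronous instrument is updated inline by the measured code, while an asynchronous one is a callback that is only read when a collection happens.

```go
package main

import "fmt"

// syncCounter is updated inline by the code path being measured.
type syncCounter struct {
	sum int64
}

func (c *syncCounter) Add(n int64) { c.sum += n }

// asyncGauge holds a callback; its value is only read when a collection happens.
type asyncGauge struct {
	observe func() float64
}

// collect simulates one collection cycle: it reads the accumulated counter
// and invokes the gauge callback to get the value as of right now.
func collect(c *syncCounter, g *asyncGauge) (int64, float64) {
	return c.sum, g.observe()
}

func main() {
	requests := &syncCounter{}
	memoryInUse := 100.0 // stand-in for a real resource reading
	gauge := &asyncGauge{observe: func() float64 { return memoryInUse }}

	requests.Add(1)
	requests.Add(1)
	memoryInUse = 250.0 // intermediate values between collections are never seen

	sum, mem := collect(requests, gauge)
	fmt.Println(sum, mem)
}
```

Every `Add` contributes to the counter, but only the value present at collection time is reported for the gauge, which is why asynchronous instruments suit readings such as memory or CPU usage.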

Here is a very rough example that uses the Prometheus exporter together with OTel metrics
to show how a custom metric can be exposed.

package main

import (
	"context"
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"os"
	"os/signal"
	"time"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/prometheus"
	"go.opentelemetry.io/otel/metric"
	"go.opentelemetry.io/otel/metric/global"
	"go.opentelemetry.io/otel/metric/instrument"
	"go.opentelemetry.io/otel/metric/unit"
	"go.opentelemetry.io/otel/sdk/metric/aggregator/histogram"
	controller "go.opentelemetry.io/otel/sdk/metric/controller/basic"
	"go.opentelemetry.io/otel/sdk/metric/export/aggregation"
	processor "go.opentelemetry.io/otel/sdk/metric/processor/basic"
	selector "go.opentelemetry.io/otel/sdk/metric/selector/simple"
	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.10.0"
)

func initMeter() error {
	res, err := resource.New(context.Background(),
		resource.WithFromEnv(),
		resource.WithProcess(),
		resource.WithTelemetrySDK(),
		resource.WithHost(),
		resource.WithAttributes(
			// the service name attached to exported telemetry in backends
			semconv.ServiceNameKey.String("ITHOME_14th_Server"),
			attribute.String("environment", "LOCAL"),
		),
	)
	if err != nil {
		return fmt.Errorf("failed to create resource: %w", err)
	}

	config := prometheus.Config{
		DefaultHistogramBoundaries: []float64{1, 2, 5, 10, 20, 50},
	}

	c := controller.New(
		processor.NewFactory(
			selector.NewWithHistogramDistribution(
				histogram.WithExplicitBoundaries(config.DefaultHistogramBoundaries),
			),
			aggregation.CumulativeTemporalitySelector(),
			processor.WithMemory(true),
		),
		controller.WithResource(res),
	)

	exporter, err := prometheus.New(config, c)
	if err != nil {
		return fmt.Errorf("failed to initialize prometheus exporter: %w", err)
	}
	global.SetMeterProvider(exporter.MeterProvider())
	http.HandleFunc("/", exporter.ServeHTTP)
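	go func() {
		// Surface listener failures (e.g. port already in use) instead of discarding them.
		if err := http.ListenAndServe(":2222", nil); err != nil {
			log.Printf("metrics endpoint stopped: %v", err)
		}
	}()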

	fmt.Println("Prometheus server running on :2222")
	return nil
}

func main() {
	if err := initMeter(); err != nil {
		log.Fatal(err)
	}

	meter := global.Meter("ithome.com/14th")
	go counter(context.Background(), meter)
	go counterWithLabels(context.Background(), meter)
	go upDownCounter(context.Background(), meter)
	go histogramCase(context.Background(), meter)
	go gaugeCase(context.Background(), meter)


	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
	defer stop()

	fmt.Println("Metrics are being updated, please visit localhost:2222")

	<-ctx.Done()

}

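func gaugeCase(ctx context.Context, meter metric.Meter) {
	memoryUsage, _ := meter.AsyncFloat64().Gauge(
		"MemoryUsage",
		instrument.WithUnit(unit.Bytes),
	)
	// Assumption: in this pre-1.0 metric API, asynchronous instruments are meant
	// to be observed from inside a registered callback that the SDK invokes on
	// each collection; Observe calls made elsewhere may be dropped.
	err := meter.RegisterCallback(
		[]instrument.Asynchronous{memoryUsage},
		func(ctx context.Context) {
			// Random stand-in value; a real callback would read actual memory usage.
			memoryUsage.Observe(ctx, rand.Float64())
		},
	)
	if err != nil {
		log.Printf("failed to register gauge callback: %v", err)
	}
}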


func counter(ctx context.Context, meter metric.Meter) {
	counter, _ := meter.SyncInt64().Counter(
		"request_counter", // instrument names may not contain spaces
		instrument.WithUnit("1"),
		instrument.WithDescription("total request"),
	)

	for {
		counter.Add(ctx, 1)
		time.Sleep(time.Second)
	}
}

func counterWithLabels(ctx context.Context, meter metric.Meter) {
	counter, _ := meter.SyncInt64().Counter(
		"cache",
		instrument.WithDescription("Cache hits and misses"),
	)
	for {
		if rand.Float64() < 0.3 {
			// increment hits
			counter.Add(ctx, 1, attribute.String("type", "hits"))
		} else {
			// increments misses
			counter.Add(ctx, 1, attribute.String("type", "misses"))
		}

		time.Sleep(time.Second)
	}
}

func upDownCounter(ctx context.Context, meter metric.Meter) {
	counter, _ := meter.SyncInt64().UpDownCounter(
		"up_down_counter",
		instrument.WithUnit("1"),
		instrument.WithDescription("up down counter"),
	)

	for {
		if rand.Float64() >= 0.5 {
			counter.Add(ctx, +1)
		} else {
			counter.Add(ctx, -1)
		}

		time.Sleep(time.Second)
	}
}

func histogramCase(ctx context.Context, meter metric.Meter) {
	durRecorder, _ := meter.SyncInt64().Histogram(
		"histogram",
		instrument.WithUnit("us"), // UCUM code for microseconds
		instrument.WithDescription("histogram"),
	)

	for {
		dur := time.Duration(rand.NormFloat64()*5000000) * time.Microsecond
		if dur < 0 {
			// NormFloat64 is centered on zero; a duration should not be negative.
			dur = -dur
		}
		durRecorder.Record(ctx, dur.Microseconds())

		time.Sleep(time.Millisecond)
	}
}

Open localhost:2222 in a browser
and you will see output similar to this:
https://ithelp.ithome.com.tw/upload/images/20220909/20104930VMiE5wpYTe.png

Some of the resource context shown there is detected automatically by the OTel resource package.

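Today's summary

OTel metrics still aims to extend the popular existing standards, but you have to follow the semantics and conventions it defines.
The learning curve is noticeably steeper than with Prometheus or StatsD alone, because you now need to know how to use both sides.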



1 comment

高魁良
iT邦新手 Level 2 ‧ 2022-09-08 12:32:23

雷N, will a later post cover a comparison with Prometheus?

雷N iT邦研究生 Level 1 ‧ 2022-09-08 12:41:19

I'm planning to, haha.
(Give me the weekend to dig through the official site and absorb more.)
