iT邦幫忙

12th iThome Ironman Contest

DAY 25
Software Development

服務開發雜談 series, part 25

Introduction to NATS Monitoring


As mentioned on Day 23, http_port 8222 is the port that serves the monitoring API.
NATS exposes several endpoints there that return statistics and other information.

varz: general statistics for this server node

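Open localhost:18222/varz and you will see output like the JSON below.
connect_urls lists the addresses of the cluster nodes.
max_connections is the maximum number of connections; it is configurable.
max_payload is the upper limit, in bytes, on the size of a single payload.
max_pending is the size, in bytes, of the buffer kept for each connection.
cluster holds information about the cluster.
cpu, mem, and cores report CPU usage, memory usage, and the number of cores.
slow_consumers is one of the values to monitor; if it stays non-zero for a long time, it usually warrants an alert.
subscriptions is the number of subscriptions.
http_req_stats is the cumulative number of requests made to each monitoring endpoint.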

{
  "server_id": "NDNNR4NOELGPAB2HTRSWFLSN5DFXGFNST5Z2PVYFY4BUQI7A6E6IH6XK",
  "server_name": "NDNNR4NOELGPAB2HTRSWFLSN5DFXGFNST5Z2PVYFY4BUQI7A6E6IH6XK",
  "version": "2.1.8",
  "proto": 1,
  "git_commit": "c0b574f",
  "go": "go1.14.8",
  "host": "0.0.0.0",
  "port": 4222,
  "connect_urls": [
    "172.16.230.101:4222",
    "172.16.230.100:4222"
  ],
  "max_connections": 65536,
  "ping_interval": 120000000000,
  "ping_max": 2,
  "http_host": "0.0.0.0",
  "http_port": 8222,
  "http_base_path": "",
  "https_port": 0,
  "auth_timeout": 1,
  "max_control_line": 4096,
  "max_payload": 1048576,
  "max_pending": 67108864,
  "cluster": {
    "addr": "0.0.0.0",
    "cluster_port": 6222,
    "auth_timeout": 1,
    "urls": [
      "nats-1:6222"
    ]
  },
  "gateway": {},
  "leaf": {},
  "tls_timeout": 0.5,
  "write_deadline": 2000000000,
  "start": "2020-09-30T14:36:09.165131164Z",
  "now": "2020-09-30T15:45:32.530157919Z",
  "uptime": "1h9m23s",
  "mem": 16031744,
  "cores": 12,
  "gomaxprocs": 12,
  "cpu": 0,
  "connections": 0,
  "total_connections": 4,
  "routes": 1,
  "remotes": 1,
  "leafnodes": 0,
  "in_msgs": 1126704,
  "out_msgs": 1126704,
  "in_bytes": 44688058,
  "out_bytes": 44688058,
  "slow_consumers": 0,
  "subscriptions": 24,
  "http_req_stats": {
    "/": 0,
    "/connz": 24,
    "/gatewayz": 0,
    "/routez": 0,
    "/subsz": 0,
    "/varz": 17
  },
  "config_load_time": "2020-09-30T14:36:09.165131164Z"
}
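
The fields described above lend themselves to a small health check. The following is a minimal sketch rather than an official tool: the function names `fetch_json` and `check_varz` and the 80% connection threshold are assumptions made for illustration; only the field names come from the /varz output above.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=5):
    """Fetch a monitoring endpoint and parse its JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def check_varz(varz, conn_ratio=0.8):
    """Return a list of warning strings for one /varz snapshot.

    conn_ratio is a hypothetical threshold, not a NATS default.
    """
    warnings = []
    if varz.get("slow_consumers", 0) > 0:
        warnings.append(f"slow_consumers={varz['slow_consumers']}")
    max_conn = varz.get("max_connections", 0)
    if max_conn and varz.get("connections", 0) / max_conn >= conn_ratio:
        warnings.append(
            f"connections={varz['connections']} is near max_connections={max_conn}"
        )
    return warnings
```

Calling `check_varz(fetch_json("http://localhost:18222/varz"))` on a snapshot like the one above would return an empty list, since slow_consumers is 0 and no connections are open. A single snapshot cannot tell whether a value has been non-zero "for a long time", so the caller would still need to compare successive snapshots before alerting.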

connz: information about current and recently closed connections

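Open localhost:18222/connz to see connection information like the following.
This endpoint accepts a few query-string parameters that I use regularly:

  1. subs 1/0 : whether to include subscription details
  2. limit : number of entries returned; the default is 1024, which is plenty for a small company QQ
  3. offset : for paging; the default is 0
  4. sort : sorts the results by the given key (see the NATS documentation for the available keys)

The main fields to watch are idle and pending_bytes.
subscriptions_list is the list of subscriptions held by that connection.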

{
  "server_id": "NBOJ6Q6DCTJJ7HENSLUH4JHZ6LXF6QLXWJQQCW4EUUSU5J375CPTBY6O",
  "now": "2020-09-30T16:12:56.042362837Z",
  "num_connections": 16,
  "total": 16,
  "offset": 0,
  "limit": 1024,
  "connections": [
    {
      "cid": 36,
      "ip": "172.16.230.1",
      "port": 47700,
      "start": "2020-09-30T16:12:54.11845346Z",
      "last_activity": "2020-09-30T16:12:56.041943175Z",
      "rtt": "446µs",
      "uptime": "1s",
      "idle": "0s",
      "pending_bytes": 0,
      "in_msgs": 5432,
      "out_msgs": 5431,
      "in_bytes": 259659,
      "out_bytes": 164682,
      "subscriptions": 5,
      "lang": "go",
      "version": "1.10.0",
      "subscriptions_list": [
        "_STAN.acks.UgTiAgxx9Uk0Y8Pj2xwfx3",
        "_INBOX.UgTiAgxx9Uk0Y8Pj2xwfz2",
        "_INBOX.UgTiAgxx9Uk0Y8Pj2xwfr6",
        "_INBOX.UgTiAgxx9Uk0Y8Pj2xwft5",
        "_INBOX.UgTiAgxx9Uk0Y8Pj2xwfv4.*"
      ]
    }
  ]
}
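
To make the query-string options concrete, here is a small sketch that builds a /connz URL and picks out connections with buffered data. The helper names `connz_url` and `backed_up` are hypothetical, and the `sort` value is passed through unchanged, so which keys are accepted depends on your server version.

```python
from urllib.parse import urlencode


def connz_url(base, subs=False, limit=1024, offset=0, sort=None):
    """Build a /connz URL from the query-string options listed above."""
    params = {"subs": 1 if subs else 0, "limit": limit, "offset": offset}
    if sort is not None:
        params["sort"] = sort  # forwarded as-is; not validated here
    return f"{base.rstrip('/')}/connz?{urlencode(params)}"


def backed_up(connz, min_pending=1):
    """Return (cid, pending_bytes) for connections with buffered bytes."""
    return [
        (c["cid"], c["pending_bytes"])
        for c in connz.get("connections", [])
        if c.get("pending_bytes", 0) >= min_pending
    ]
```

For the sample response above, `backed_up` would return an empty list because the only connection shown has pending_bytes of 0.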

subsz: current subscription information

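Open localhost:18222/subsz to see subscription information like the following.
This endpoint accepts a few query-string parameters that I use regularly:

  1. subs 1/0 : whether to include subscription details
  2. limit : number of entries returned; the default is 1024, which is plenty for a small company QQ
  3. offset : for paging; the default is 0
  4. test : a subject to test subscriptions against
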
{
  "server_id": "NBOJ6Q6DCTJJ7HENSLUH4JHZ6LXF6QLXWJQQCW4EUUSU5J375CPTBY6O",
  "now": "2020-09-30T16:31:50.348121693Z",
  "num_subscriptions": 26,
  "num_cache": 299,
  "num_inserts": 104,
  "num_removes": 78,
  "num_matches": 943,
  "cache_hit_rate": 0.5906680805938495,
  "max_fanout": 1,
  "avg_fanout": 0.9565217391304348,
  "total": 26,
  "offset": 0,
  "limit": 1024,
  "subscriptions_list": [
    {
      "subject": "raft.test-cluster.node1.test-cluster.accept",
      "sid": "1",
      "msgs": 0,
      "cid": 9
    }
  ]
}
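
When there are more subscriptions than one page holds, limit and offset can be combined to walk through all of them. This is only a sketch: `iter_subscriptions` and the caller-supplied `fetch_page` function are hypothetical names, and it assumes that `total` in the response is the overall number of subscriptions, as the sample above suggests.

```python
def iter_subscriptions(fetch_page, limit=1024):
    """Yield every entry of subscriptions_list, one page at a time.

    fetch_page(offset, limit) is supplied by the caller and must return
    one parsed /subsz response that was requested with subs=1.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        items = page.get("subscriptions_list") or []
        yield from items
        offset += len(items)
        # Stop on an empty page or once every entry has been seen.
        if not items or offset >= page.get("total", 0):
            break
```

Keeping the HTTP call outside the generator makes the paging logic easy to exercise with canned responses before pointing it at a real server.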

NATS Streaming also exposes several endpoints that return statistics and other information.

serverz: general statistics for this server node

Open localhost:18222/streaming/serverz.
The main thing to check is role, which can be Leader, Follower, or Candidate. (Raft again.)
Also keep an eye on open_fds and max_fds.

{
  "cluster_id": "test-cluster",
  "server_id": "80cTxp7LklHZZl5OwxW4TF",
  "version": "0.18.0",
  "go": "go1.14.4",
  "state": "CLUSTERED",
  "role": "Leader",
  "now": "2020-09-30T16:35:36.430556512Z",
  "start_time": "2020-09-30T14:36:10.203910734Z",
  "uptime": "1h59m26s",
  "clients": 0,
  "subscriptions": 0,
  "channels": 9,
  "total_msgs": 1002319,
  "total_bytes": 78329272,
  "in_msgs": 819523,
  "in_bytes": 67394943,
  "out_msgs": 819534,
  "out_bytes": 31336342,
  "open_fds": 32,
  "max_fds": 1048576
}
{
  "cluster_id": "test-cluster",
  "server_id": "SVGdOmes52I66DxgT4vqdm",
  "version": "0.18.0",
  "go": "go1.14.4",
  "state": "CLUSTERED",
  "role": "Follower",
  "now": "2020-09-30T16:38:19.612337427Z",
  "start_time": "2020-09-30T14:36:12.185931325Z",
  "uptime": "2h2m7s",
  "clients": 0,
  "subscriptions": 0,
  "channels": 9,
  "total_msgs": 1002319,
  "total_bytes": 78329272,
  "in_msgs": 0,
  "in_bytes": 0,
  "out_msgs": 0,
  "out_bytes": 0,
  "open_fds": 32,
  "max_fds": 1048576
}
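
The two responses above, one per node, can be combined into a quick cluster overview. The sketch below uses hypothetical helper names (`summarize_roles`, `fd_usage`) and reads only fields that appear in the sample output.

```python
def summarize_roles(serverz_list):
    """Count nodes per Raft role from a list of /streaming/serverz responses."""
    counts = {}
    for s in serverz_list:
        role = s.get("role", "unknown")
        counts[role] = counts.get(role, 0) + 1
    return counts


def fd_usage(serverz):
    """Fraction of the file-descriptor limit currently in use."""
    return serverz["open_fds"] / serverz["max_fds"]
```

Since Raft elects a single leader, a reasonable check is that the summary contains exactly one Leader; a cluster that shows only Candidates for an extended period is probably failing to elect one. With the values above, `fd_usage` is 32 / 1048576, far from the limit.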

clientsz: lists the clients

localhost:18225/streaming/clientsz
Here you can see which clients are connected, whether their subscriptions are durable, and settings such as max_inflight and ack_wait.

{
  "cluster_id": "test-cluster",
  "server_id": "SVGdOmes52I66DxgT4vqdm",
  "now": "2020-09-30T18:01:42.995346629Z",
  "offset": 0,
  "limit": 1024,
  "count": 1,
  "total": 1,
  "clients": [
    {
      "id": "nathan01",
      "hb_inbox": "_INBOX.TU1UpmKR8mKPKY9L83f9e6",
      "subscriptions": {
        "testTopic": [
          {
            "client_id": "nathan01",
            "inbox": "_INBOX.TU1UpmKR8mKPKY9L83f9t8",
            "ack_inbox": "_INBOX.80cTxp7LklHZZl5OwxW5Zs",
            "is_durable": false,
            "is_offline": false,
            "max_inflight": 1,
            "ack_wait": 10,
            "last_sent": 1066035,
            "pending_count": 0,
            "is_stalled": false
          }
        ]
      }
    }
  ]
}

channelsz: lists all channels (subjects)

{
  "cluster_id": "test-cluster",
  "server_id": "SVGdOmes52I66DxgT4vqdm",
  "now": "2020-09-30T18:03:20.491138838Z",
  "offset": 0,
  "limit": 1024,
  "count": 9,
  "total": 9,
  "channels": [
    {
      "name": "testTopic",
      "msgs": 1000000,
      "bytes": 78140820,
      "first_seq": 69834,
      "last_seq": 1069833,
      "subscriptions": [
        {
          "client_id": "nathan01",
          "inbox": "_INBOX.NPh3Gq6ocJREA30inok1Is",
          "ack_inbox": "_INBOX.80cTxp7LklHZZl5OwxW5cr",
          "is_durable": false,
          "is_offline": false,
          "max_inflight": 1,
          "ack_wait": 10,
          "last_sent": 1069814,
          "pending_count": 1,
          "is_stalled": true
        }
      ]
    }
  ]
}
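
In the sample above the subscription is marked is_stalled, with pending_count equal to max_inflight. A sketch like the following could surface such subscriptions across all channels. The function name `stalled_subscriptions` is hypothetical, and treating last_seq minus last_sent as the number of messages not yet sent to the subscriber is an interpretation of the field names, not something the output itself states.

```python
def stalled_subscriptions(channelsz):
    """Return (channel, client_id, lag) for subscriptions flagged is_stalled.

    lag is computed as last_seq - last_sent, read here as the number of
    stored messages not yet sent to that subscription (an assumption).
    """
    result = []
    for ch in channelsz.get("channels", []):
        for sub in ch.get("subscriptions") or []:
            if sub.get("is_stalled"):
                lag = ch["last_seq"] - sub["last_sent"]
                result.append((ch["name"], sub["client_id"], lag))
    return result
```

For the response shown above this would report testTopic / nathan01 with a lag of 1069833 - 1069814 = 19.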

storez: lists the streaming store settings

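Open localhost:18225/streaming/storez and you will see output like the JSON below.
It is an in-memory MQ after all, so it cannot spend unlimited resources holding messages that nobody wants. The limits shown here (max_channels aside) apply to each individual channel.
As shown below, a channel can hold up to max_msgs = 1000000 messages and max_bytes = 1024000000 bytes (about 1 GB), while max_age sets how long messages are kept, and so on.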

{
  "cluster_id": "test-cluster",
  "server_id": "9tOlQ2TFjpSopm4erVtvFf",
  "now": "2020-10-01T05:31:03.574407241Z",
  "type": "RAFT_FILE",
  "limits": {
    "max_channels": 100,
    "max_msgs": 1000000,
    "max_bytes": 1024000000,
    "max_age": 0,
    "max_subscriptions": 1000,
    "max_inactivity": 0
  },
  "total_msgs": 1002319,
  "total_bytes": 78331864
}
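
Combining these limits with the per-channel counters from channelsz shows how close a channel is to its cap. The following is a rough sketch with a hypothetical function name, `channel_usage`; it assumes a limit of 0 means "no limit" and reports None in that case.

```python
def channel_usage(channel, limits):
    """Return how full one channel is relative to the store limits.

    channel is one entry of "channels" from channelsz; limits is the
    "limits" object from storez. A limit of 0 is reported as None.
    """
    def ratio(used, cap):
        return used / cap if cap else None

    return {
        "msgs": ratio(channel["msgs"], limits["max_msgs"]),
        "bytes": ratio(channel["bytes"], limits["max_bytes"]),
    }
```

With the numbers shown earlier, testTopic holds 1000000 messages against max_msgs of 1000000, so the msgs ratio is 1.0, while its 78140820 bytes are under 8% of max_bytes. That matches first_seq being 69834 rather than 1, presumably because older messages were dropped once the message limit was reached.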

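With these basics in place, you can pair the endpoints with prometheus-nats-exporter
to monitor each node. Just add a flag to the command for each endpoint you want to monitor,
for example: prometheus-nats-exporter -varz -connz "http://localhost:8222"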

As an aside, for help choosing between an in-memory MQ and a log-based MQ, see this article:
為什麼日誌型訊息佇列效能這麼高? (Why are log-based message queues so fast?)

