iT邦幫忙

2021 iThome 鐵人賽 (Ironman Contest)

DAY 29
Software Development

系統與服務雜談 (Systems and Services Miscellany) series, part 29

Log Agent - Fluent Bit Multiline Parsing


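Fluent Bit recap
Log Agent - Fluent Bit Introduction
Log Agent - Fluent Bit Installation and Common Architecture Patterns
Log Agent - Fluent Bit Service Configuration and Built-in API
Log Agent - Fluent Bit Input Component and a Brief Look at Tail
Log Agent - Fluent Bit Parser Component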

Multiline Parsing

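The Regex Parser introduced yesterday only works for single-line log data,
because Tail hands each line to the parser as soon as it reads it.

So when a log record naturally spans multiple lines,
you need a Multiline Parser, which adds some handling on top of the regex parser.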
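
To make the limitation concrete, here is a minimal sketch in Python. It is only an illustration: Python's re module stands in for Fluent Bit's own regex engine, parse_line_by_line is a hypothetical helper, and the pattern is the start_state regex used later in this article.

```python
import re

# Assumption: Python's re stands in for Fluent Bit's regex engine, and the
# pattern is the start_state regex that appears later in this article.
FIRST_LINE = re.compile(r"(Dec \d+ \d+\:\d+\:\d+)(.*)")


def parse_line_by_line(lines):
    """Treat every line as its own record and report whether the regex matched."""
    return [(line, FIRST_LINE.search(line) is not None) for line in lines]


sample = [
    'Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!',
    "    at com.myproject.module.MyProject.badMethod(MyProject.java:22)",
]
for line, matched in parse_line_by_line(sample):
    print(matched, line)
```

The first line matches, but the indented stack-trace line does not, so a purely line-by-line parser has nothing to attach it to.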

Built-in Multiline Parsers

For logs produced by certain environments, Fluent Bit ships with several built-in Multiline Parsers,
so you don't have to write them yourself.

  • Docker
  • CRI
  • Go
  • Python
  • Java

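We can also define our own Multiline Parser.
The parsers covered in the previous article use [PARSER] as the section name,
whereas a Multiline Parser uses [MULTILINE_PARSER].
It is also recommended not to define PARSER or MULTILINE_PARSER sections directly in fluent-bit.conf.
Put them in a separate parsers.conf or parsers_multiline.conf
and load those files from fluent-bit.conf with parsers_file.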

[SERVICE]
    flush 1
    Daemon off
    log_level info
    parsers_file parsers.conf
    parsers_file parsers_multiline.conf

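A MULTILINE_PARSER then needs a few properties:

  • name
    • The name of the Multiline Parser
  • type
    • Set to regex
  • rule
    • Each rule holds a regex, so the multiline parser knows what a first line looks like and at what point the record ends
    • In the example below, start_state is the name of the initial state; any line matching its regex pattern is the first line of a multiline log
    • Once start_state has matched a first line, the parser looks at the next state, which here is cont
    • It then reads the next line and checks whether it matches start_state; if not, it matches the line against the regex pattern of the current state, cont
    • This continues until a line matches start_state again, which marks the start of the next multiline record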
# rules   |   state name   | regex pattern         | next state name
# --------|----------------|----------------------------------------
    rule     "start_state"   "/(Dec \d+ \d+\:\d+\:\d+)(.*)/"  "cont"
    rule     "cont"          "/^\s+at.*/"                     "cont"
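
The rule walk-through above can be sketched as a tiny state machine. This is only an illustration in Python, not Fluent Bit's actual implementation: group_lines and RULES are hypothetical names, Python's re module stands in for Fluent Bit's regex engine, and a line that matches no rule is simply emitted on its own.

```python
import re

# Assumption: the two patterns are the rules shown above without the
# surrounding slashes; each entry maps state name -> (pattern, next state).
RULES = {
    "start_state": (re.compile(r"(Dec \d+ \d+\:\d+\:\d+)(.*)"), "cont"),
    "cont": (re.compile(r"^\s+at.*"), "cont"),
}


def group_lines(lines):
    """Group raw lines into multiline records using the two rules above."""
    records = []
    current = []   # lines of the record being assembled
    state = None   # state whose pattern the next line is tested against
    for line in lines:
        start_pattern, start_next = RULES["start_state"]
        if start_pattern.search(line):
            # A new first line closes whatever record was open.
            if current:
                records.append("\n".join(current))
            current, state = [line], start_next
        elif state is not None and RULES[state][0].search(line):
            # Continuation line: append it and follow the next-state pointer.
            current.append(line)
            state = RULES[state][1]
        else:
            # Matches neither rule: flush and emit the line by itself.
            if current:
                records.append("\n".join(current))
            records.append(line)
            current, state = [], None
    if current:
        records.append("\n".join(current))
    return records
```

Calling group_lines on a timestamped exception line followed by its indented "at ..." lines returns them as one record, while consecutive timestamped lines each become their own record.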

An example
fluent-bit.conf
For the input's multiline.parser,
we test the built-in Go multiline parser together with our custom parser.
Multiple parsers are separated with commas.

[INPUT]    
    Name        tail    
    Path        /var/log/demo/demo.log
    read_from_head   true
    multiline.parser      multiline-regex-test, go

parsers_multiline.conf

[MULTILINE_PARSER]
    name          multiline-regex-test
    type          regex
    rule      "start_state"   "/(Dec \d+ \d+\:\d+\:\d+)(.*)/"  "cont"
    rule      "cont"          "/^\s+at.*/"                     "cont"

Test log

Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)

Dec 14 06:41:09 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
Dec 14 06:41:10 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
Dec 14 06:41:11 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
    
panic: my panic

goroutine 4 [running]:
panic(0x45cb40, 0x47ad70)
  /usr/local/go/src/runtime/panic.go:542 +0x46c fp=0xc42003f7b8 sp=0xc42003f710 pc=0x422f7c
main.main.func1(0xc420024120)
  foo.go:6 +0x39 fp=0xc42003f7d8 sp=0xc42003f7b8 pc=0x451339
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003f7e0 sp=0xc42003f7d8 pc=0x44b4d1
created by main.main
  foo.go:5 +0x58
panic: my panic

goroutine 4 [running]:
panic(0x45cb40, 0x47ad70)
  /usr/local/go/src/runtime/panic.go:542 +0x46c fp=0xc42003f7b8 sp=0xc42003f710 pc=0x422f7c
main.main.func1(0xc420024120)
  foo.go:6 +0x39 fp=0xc42003f7d8 sp=0xc42003f7b8 pc=0x451339
runtime.goexit()
  /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003f7e0 sp=0xc42003f7d8 pc=0x44b4d1
created by main.main
  foo.go:5 +0x58

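Output
Record [0] is a structured log built from multiple lines: all of those lines are assembled into a single structured record for output,
rather than naively emitting one record per line.
Records [1], [2], and [3] are there to exercise the matching order and the transitions between the two states.
The line in [1] matches start_state.
The next line, [2], matches start_state again, which means [1] can be emitted.
The next line, [3], also matches start_state, which means [2] can be emitted.

The line after that belongs to the Go panic. It matches neither start_state nor cont, so the next configured parser gets a chance to parse it.

Both the Go panic log and the custom Java log are parsed correctly by the multiline parsers.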

fluentd_1  | [0] tail.0: [1634142611.283200049, {"log"=>"Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
fluentd_1  |     at com.myproject.module.MyProject.badMethod(MyProject.java:22)
fluentd_1  |     at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
fluentd_1  |     at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
fluentd_1  |     at com.myproject.module.MyProject.someMethod(MyProject.java:10)
fluentd_1  |     at com.myproject.module.MyProject.main(MyProject.java:6)
fluentd_1  | "}]
fluentd_1  | [1] tail.0: [1634142611.283256734, {"log"=>"Dec 14 06:41:09 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
fluentd_1  | "}]
fluentd_1  | [2] tail.0: [1634142611.283260341, {"log"=>"Dec 14 06:41:10 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
fluentd_1  | "}]
fluentd_1  | [3] tail.0: [1634142611.283263447, {"log"=>"Dec 14 06:41:11 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
fluentd_1  |     at com.myproject.module.MyProject.badMethod(MyProject.java:22)
fluentd_1  |     at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
fluentd_1  |     at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
fluentd_1  |     at com.myproject.module.MyProject.someMethod(MyProject.java:10)
fluentd_1  |     at com.myproject.module.MyProject.main(MyProject.java:6)
fluentd_1  | "}]
fluentd_1  | [4] tail.0: [1634143058.181373085, {"log"=>"panic: my panic
fluentd_1  | 
fluentd_1  | goroutine 4 [running]:
fluentd_1  | panic(0x45cb40, 0x47ad70)
fluentd_1  |   /usr/local/go/src/runtime/panic.go:542 +0x46c fp=0xc42003f7b8 sp=0xc42003f710 pc=0x422f7c
fluentd_1  | main.main.func1(0xc420024120)
fluentd_1  |   foo.go:6 +0x39 fp=0xc42003f7d8 sp=0xc42003f7b8 pc=0x451339
fluentd_1  | runtime.goexit()
fluentd_1  |   /usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc42003f7e0 sp=0xc42003f7d8 pc=0x44b4d1
fluentd_1  | created by main.main
fluentd_1  |   foo.go:5 +0x58
fluentd_1  | "}]

multiline.parser can also be configured in a Filter instead,
so that the Input only has to watch files and collect data.


json_liang
iT邦研究生 4 級 ‧ 2021-10-14 12:43:03

Congratulations on nearly finishing the challenge. This is another high-quality article.

wajika
iT邦新手 5 級 ‧ 2021-12-29 14:14:44

A question: if Docker outputs JSON logs, how should I set up the multiline parser?

雷N iT邦研究生 1 級 ‧ 2021-12-29 21:23:31

Hi,
by "docker output json log",
do you mean something that looks like JSON but is spread over more than one line?
Or is it a single line in which one key's value is itself a JSON structure?

wajika iT邦新手 5 級 ‧ 2021-12-31 13:52:01

I mean the standard JSON log output. How do I combine two consecutive entries into one?

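雷N iT邦研究生 1 級 ‧ 2022-01-03 23:36:54

https://discuss.newrelic.com/t/sending-multiline-logs-using-fluentbit-plugin/146730/7
That thread uses the older approach: Multiline On
with Parser_FirstLine plus a parser.
See zackm's reply there.
Still, it is best to flatten the log where possible; it makes everything easier.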

wajika iT邦新手 5 級 ‧ 2022-01-05 09:35:47

By flattening the log, do you mean outputting it in JSON format?

wajika
iT邦新手 5 級 ‧ 2022-01-05 10:03:32

I deployed fluentd in a Kubernetes cluster and collect the services' stdout logs directly.

Here is a piece of real log:

{"stream":"stdout","docker":{"container_id":"ee1af6bd65a06f0e39c927c3a663d21ad27043b06bbac68dff96a0f584556ca6"},"kubernetes":{"container_name":"XXX-XXX-ali-prd","namespace_name":"app","pod_name":"XXX-XXX-ali-prd-667bbbf97d-jmnph","container_image":"0.0.0.0/dev/XXX-XXX:t0.2.7","container_image_id":"docker-pullable://0.0.0.0/dev/XXX-XXX@sha256:7873030b679c318f05fc408a354ba4eb841017ab73c4200d1b44d129fb1dcbf0","pod_id":"581357fa-0155-11ec-87d1-00163e14c100","host":"ack-prd","labels":{"app":"XXX-XXX-ali-prd","pod-template-hash":"667bbbf97d"},"master_url":"https://0.0.0.0:443/api","namespace_id":"cf40acda-5f56-11ea-bb04-00163e14c100","namespace_labels":{"cattle_io/creator":"norman","field_cattle_io/projectId":"p-qdhxw"}},"message":"u001b[40mu001b[32minfou001b[39mu001b[22mu001b[49m: Microsoft.AspNetCore.Cors.Infrastructure.CorsService[4]n"}


{"stream":"stdout","docker":{"container_id":"ee1af6bd65a06f0e39c927c3a663d21ad27043b06bbac68dff96a0f584556ca6"},"kubernetes":{"container_name":"XXX-XXX-ali-prd","namespace_name":"app","pod_name":"XXX-XXX-ali-prd-667bbbf97d-jmnph","container_image":"0.0.0.0/dev/XXX-XXX:t0.2.7","container_image_id":"docker-pullable://0.0.0.0/dev/XXX-XXX@sha256:7873030b679c318f05fc408a354ba4eb841017ab73c4200d1b44d129fb1dcbf0","pod_id":"581357fa-0155-11ec-87d1-00163e14c100","host":"ack-prd","labels":{"app":"XXX-XXX-ali-prd","pod-template-hash":"667bbbf97d"},"master_url":"https://0.0.0.0:443/api","namespace_id":"cf40acda-5f56-11ea-bb04-00163e14c100","namespace_labels":{"cattle_io/creator":"norman","field_cattle_io/projectId":"p-qdhxw"}},"message":"u001b[40mu001b[32minfou001b[39mu001b[22mu001b[49m: Microsoft.AspNetCore.Routing.EndpointMiddleware[0]n"}

json formatted
https://ithelp.ithome.com.tw/upload/images/20220105/20115317oSz9BSSyiO.png

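If flattening the log is the best approach, as you said, is the log format above what you mean? If so, I have a few questions.

  1. Do the control sequences in the message field, such as u001b, need to be handled?
  2. The real body of the log is the message field, so how do I merge multiple lines? My idea is to parse the message field with a json parser first and then merge based on a distinctive symbol at the start of a line, but that still requires stripping the control sequences first, as in the image below
    https://ithelp.ithome.com.tw/upload/images/20220105/20115317mhkYhGcZYD.png

How would you do it? Is there a better way than mine?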

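雷N iT邦研究生 1 級 ‧ 2022-01-05 10:53:17

I mean that the writer flattens each log record into a single line.
Yes, like what you have in your code block.

The third image, though, looks like it was written as multiple lines.
If there is really no way around it, this is how I handled it before:

rule "cont" "/^(?!\d+-\d+-\d+\s\d+:\d+:\d+.\d+)/" "cont"

I'll try the scenario in your second image when I get home from work.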

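wajika iT邦新手 5 級 ‧ 2022-01-05 13:17:51

I have always wondered why, in everything Google turns up, people seem to handle Docker logs so easily. Has nobody run into this problem? It feels like it should be very common, yet I cannot find a reasonable solution on Google. Everyone glosses over it in a few words.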

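wajika iT邦新手 5 級 ‧ 2022-01-05 13:20:40

This problem has bothered me for a long time. Your article is close to my own thinking (regex should be avoided), but how should the application logging framework write its output so that logs can be collected without regex? I have looked into Filebeat and Logstash as well, and neither offers a good answer.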

wajika iT邦新手 5 級 ‧ 2022-01-05 13:25:02

"The writer flattens each log record into a single line."

Could you explain that in more detail?
