Using Python to extract specific strings from a log file:
Original log: "2024-08-01 09:18:53,093 INFO [stdout] 2024-08-01 09:18:53 INFO (AuthenticationHandler.java:XXX) [AuthenticationHandler.authenticate] ProductName:BPM.JAVA Date:08-01-2024 Time:09:18:53 ProgID:AuthenticationHandler ClientIP:XXX.XX.XXX.XX WebServer:bpm.oitc.com.tw:XXXX UserID:11111 UserName:11111 UserOID: Type:Login TargetID:11111 TargetName:11111 LicenseNotEnoughException"
Fields to extract: Date, Time, UserID (only these three; nothing else is needed)
=> The current export looks like this:
"Filename: server.log.2024-08-01 | Date: 08-01-2024 Time:09:16:48 ProgID:AuthenticationHandler ClientIP:XXX.XX.XXX.XX WebServer:bpm.oitc.com.tw:XXXX UserID:11111 UserName:11111 UserOID: Type:Login TargetID:11111 TargetName:11111 LicenseNotEnoughException | Time: 09:16:48 ProgID:AuthenticationHandler ClientIP:XXX.XX.XXX.XX WebServer:bpm.oitc.com.tw:XXXX UserID:11111 UserName:11111 UserOID: Type:Login TargetID:11111 TargetName:11111 LicenseNotEnoughException | UserID: 11111 UserName:11111 UserOID: Type:Login TargetID:11111 TargetName:11111 LicenseNotEnoughException | Error: LicenseNotEnoughException"
The code is attached.
Your regex is probably too greedy, so it’s capturing everything after Date: and Time: instead of only the exact values you want.
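Since the attached code is not shown here, the following is only a guess at the cause: a minimal sketch that assumes the original pattern looked something like `Date:(.*)`. It uses a shortened sample line and compares that hypothetical greedy capture with one that stops at the first whitespace.

```python
import re

# Shortened sample line; the field layout follows the log shown in the question.
line = ("ProductName:BPM.JAVA Date:08-01-2024 Time:09:18:53 "
        "ProgID:AuthenticationHandler UserID:11111 UserName:11111")

# Hypothetical "before" pattern (an assumption): .* runs to the end of the line.
greedy = re.search(r"Date:(.*)", line).group(1)

# Targeted pattern: \S+ stops at the first whitespace character.
targeted = re.search(r"Date:(\S+)", line).group(1)

print(greedy)    # the date plus everything that follows it
print(targeted)  # only the date value
```

If the real pattern resembles the greedy one, that would explain why each exported field drags the rest of the line along with it.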
You only need targeted groups for the three fields:
import re
log = '2024-08-01 09:18:53,093 INFO [stdout] 2024-08-01 09:18:53 INFO (AuthenticationHandler.java:XXX) [AuthenticationHandler.authenticate] ProductName:BPM.JAVA Date:08-01-2024 Time:09:18:53 ProgID:AuthenticationHandler ClientIP:XXX.XX.XXX.XX WebServer:bpm.oitc.com.tw:XXXX UserID:11111 UserName:11111 UserOID: Type:Login TargetID:11111 TargetName:11111 LicenseNotEnoughException'
pattern = r'Date:(\d{2}-\d{2}-\d{4}).*?Time:(\d{2}:\d{2}:\d{2}).*?UserID:(\d+)'
match = re.search(pattern, log)
if match:
    date, time, userid = match.groups()
    print(f"Date: {date}, Time: {time}, UserID: {userid}")
Output:
Date: 08-01-2024, Time: 09:18:53, UserID: 11111
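Key fix: capture each value with a specific pattern (for example \d{2}-\d{2}-\d{4} for the date) instead of a catch-all .*, and use .*? (non-greedy) only to skip the text between fields.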
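To apply the same pattern to a whole log file and keep the "Filename: ... | Date: ... | Time: ... | UserID: ..." layout from the question, something like the sketch below might work. The function names `extract_fields` and `export_log`, the UTF-8 encoding, and the output path are assumptions for illustration, not taken from the attached code.

```python
import re
from pathlib import Path

# Named groups for the three wanted fields; .*? skips the text in between.
FIELDS = re.compile(
    r"Date:(?P<date>\d{2}-\d{2}-\d{4}).*?"
    r"Time:(?P<time>\d{2}:\d{2}:\d{2}).*?"
    r"UserID:(?P<userid>\d+)"
)

def extract_fields(line):
    """Return 'Date: ... | Time: ... | UserID: ...' for a matching line, else None."""
    m = FIELDS.search(line)
    if m is None:
        return None
    return f"Date: {m['date']} | Time: {m['time']} | UserID: {m['userid']}"

def export_log(log_path, out_path):
    """Write one summary line per matching log line, prefixed with the file name."""
    log_path = Path(log_path)
    # Encoding is an assumption; errors="replace" avoids crashing on odd bytes.
    with log_path.open(encoding="utf-8", errors="replace") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            summary = extract_fields(line)
            if summary:
                dst.write(f"Filename: {log_path.name} | {summary}\n")
```

A call such as export_log("server.log.2024-08-01", "result.txt") would then write one short line per login record; "result.txt" is a placeholder name. Lines that lack any of the three fields are skipped rather than exported partially.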