[gem5][simple-rv-vp] Learning to output gem5 performance statistics


Series: [gem5] gem5 study notes, starting from zero

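Getting performance statistics out of gem5 is very simple: once a gem5 run ends, the statistics are expected to be listed in m5out/stats.txt. Note that if you launch gem5 through gdb --args, this file is not written.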

Use gem5.opt to run the Python configuration file developed earlier in this series, together with the simple firmware.

./gem5/build/RISCV/gem5.opt ./simple-riscv-vp.py --firmware ./firmware/build/simple.elf --cpu-type o3 --l1-icache

Then stop the simulation with Ctrl+C. The stats.txt file should now appear under the m5out folder.
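Since stats.txt is plain text with one "name value # description" entry per line, it is easy to load into a script. Below is a minimal sketch, not an official gem5 tool: the helper names parse_stats and load_stats are my own, and SAMPLE stands in for the real file so the snippet runs without a gem5 output directory.

```python
from pathlib import Path


def parse_stats(text):
    """Map each stat name to its numeric value, skipping non-stat lines."""
    stats = {}
    for line in text.splitlines():
        # Drop the trailing "# description" comment, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        name, value = fields[0], fields[1]
        try:
            stats[name] = float(value)
        except ValueError:
            # Banner lines and other non-numeric lines end up here.
            continue
    return stats


def load_stats(path="m5out/stats.txt"):
    """Read and parse a stats file (hypothetical helper, path as used in this post)."""
    return parse_stats(Path(path).read_text())


# Two lines copied from the run in this post, standing in for the real file.
SAMPLE = """\
system.cpu.cpi                              13.537585                       # CPI: cycles per instruction (core level) ((Cycle/Count))
system.cpu.ipc                               0.073868                       # IPC: instructions per cycle (core level) ((Count/Cycle))
"""

stats = parse_stats(SAMPLE)
print(stats["system.cpu.cpi"], stats["system.cpu.ipc"])
```

If the file contains more than one statistics dump, this sketch keeps only the last value seen for each name.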



CPU CPI & IPC

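This part shows two key CPU metrics: cycles per instruction (CPI, how many cycles each instruction takes on average) and instructions per cycle (IPC, how many instructions complete per cycle). They are simply reciprocals of each other.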

system.cpu.cpi                              13.537585                       # CPI: cycles per instruction (core level) ((Cycle/Count))
system.cpu.ipc                               0.073868                       # IPC: instructions per cycle (core level) ((Count/Cycle))
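The reciprocal relationship can be checked directly against the two numbers above. This is only a quick sanity check with the values copied by hand from stats.txt; the variable names are mine.

```python
cpi = 13.537585  # system.cpu.cpi, copied from stats.txt above
ipc = 0.073868   # system.cpu.ipc, copied from stats.txt above

# IPC recomputed from CPI; it should match the reported IPC
# up to the six decimals that stats.txt prints.
ipc_from_cpi = 1.0 / cpi

print(f"1 / CPI   = {ipc_from_cpi:.6f}")
print(f"reported  = {ipc:.6f}")
print(f"CPI * IPC = {cpi * ipc:.6f}")
```

The product is not exactly 1 because both values are rounded when they are printed.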


branch prediction miss & hit

This part shows the branch predictor statistics, for example the hit ratio of the branch target buffer (BTB).

system.cpu.branchPred.BTBLookups                67281                       # Number of BTB lookups (Count)
system.cpu.branchPred.BTBUpdates                   51                       # Number of BTB updates (Count) 
system.cpu.branchPred.BTBHits                   57242                       # Number of BTB hits (Count)
system.cpu.branchPred.BTBHitRatio            0.850790                       # BTB Hit Ratio (Ratio)
system.cpu.branchPred.BTBMispredicted              43                       # Number BTB mispredictions. No target found or target wrong (Count)
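BTBHitRatio can be reproduced from the raw counters as BTBHits divided by BTBLookups. A small sketch with the values copied by hand from the lines above (variable names are mine):

```python
btb_lookups = 67281          # system.cpu.branchPred.BTBLookups
btb_hits = 57242             # system.cpu.branchPred.BTBHits
reported_ratio = 0.850790    # system.cpu.branchPred.BTBHitRatio

# Hit ratio recomputed from the raw counters.
btb_hit_ratio = btb_hits / btb_lookups
# Lookups that did not find an entry in the BTB.
btb_lookup_misses = btb_lookups - btb_hits

print(f"recomputed hit ratio = {btb_hit_ratio:.6f}")
print(f"lookups without hit  = {btb_lookup_misses}")
```

Note that the number of lookups without a hit is a different thing from BTBMispredicted, which gem5 describes as "No target found or target wrong".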


cache miss & hit

For the icache, this part shows its hit count and its miss count.

system.l1_icache.ReadReq.hits::cpu.inst        173166                       # number of ReadReq hits (Count)
system.l1_icache.ReadReq.hits::total           173166                       # number of ReadReq hits (Count)
system.l1_icache.ReadReq.misses::cpu.inst           54                       # number of ReadReq misses (Count)
system.l1_icache.ReadReq.misses::total             54                       # number of ReadReq misses (Count)
system.l1_icache.ReadReq.missLatency::cpu.inst      2386500                       # number of ReadReq miss ticks (Tick)
system.l1_icache.ReadReq.missLatency::total      2386500                       # number of ReadReq miss ticks (Tick)
system.l1_icache.ReadReq.accesses::cpu.inst       173220                       # number of ReadReq accesses(hits+misses) (Count)
system.l1_icache.ReadReq.accesses::total       173220                       # number of ReadReq accesses(hits+misses) (Count)
system.l1_icache.ReadReq.missRate::cpu.inst     0.000312                       # miss rate for ReadReq accesses (Ratio)
system.l1_icache.ReadReq.missRate::total     0.000312                       # miss rate for ReadReq accesses (Ratio)
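These counters are consistent with each other: accesses = hits + misses, and missRate = misses / accesses. The average miss penalty can also be derived from missLatency. The sketch below uses values copied by hand from the lines above; the conversion to nanoseconds assumes 1 tick = 1 ps, which I believe is gem5's default tick resolution but have not verified for this configuration.

```python
hits = 173166                  # system.l1_icache.ReadReq.hits::total
misses = 54                    # system.l1_icache.ReadReq.misses::total
accesses = 173220              # system.l1_icache.ReadReq.accesses::total
miss_latency_ticks = 2386500   # system.l1_icache.ReadReq.missLatency::total
reported_miss_rate = 0.000312  # system.l1_icache.ReadReq.missRate::total

TICKS_PER_NS = 1000  # ASSUMPTION: 1 tick = 1 ps

miss_rate = misses / accesses
avg_miss_ticks = miss_latency_ticks / misses
avg_miss_ns = avg_miss_ticks / TICKS_PER_NS

print(f"hits + misses == accesses: {hits + misses == accesses}")
print(f"miss rate        = {miss_rate:.6f}")
print(f"avg miss latency = {avg_miss_ticks:.2f} ticks ({avg_miss_ns:.2f} ns if 1 tick = 1 ps)")
```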


TLB miss & hit

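My firmware does not enable paging, so the TLB is never used, and the TLB hit, miss, access, and page-walk counters are all 0. To play with the TLB I will need software that actually uses it. I hope to port xv6-riscv to gem5 later on (xv6-riscv does use paging).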

system.cpu.mmu.dtb.readHits                         0                       # read hits (Count)
system.cpu.mmu.dtb.readMisses                       0                       # read misses (Count)
system.cpu.mmu.dtb.readAccesses                     0                       # read accesses (Count)
system.cpu.mmu.dtb.writeHits                        0                       # write hits (Count)
system.cpu.mmu.dtb.writeMisses                      0                       # write misses (Count)
system.cpu.mmu.dtb.writeAccesses                    0                       # write accesses (Count)
system.cpu.mmu.dtb.hits                             0                       # Total TLB (read and write) hits (Count)
system.cpu.mmu.dtb.misses                           0                       # Total TLB (read and write) misses (Count)
system.cpu.mmu.dtb.accesses                         0                       # Total TLB (read and write) accesses (Count)
system.cpu.mmu.dtb.walker.num_4kb_walks             0                       # Completed page walks with 4KB pages (Count)
system.cpu.mmu.dtb.walker.num_64kb_walks            0                       # Completed page walks with 64KB pages (Count)
system.cpu.mmu.dtb.walker.num_2mb_walks             0                       # Completed page walks with 2MB pages (Count)
system.cpu.mmu.dtb.walker.power_state.pwrStateResidencyTicks::UNDEFINED   1971201000                       # Cumulative time (in ticks) in various power states (Tick)
system.cpu.mmu.itb.readHits                         0                       # read hits (Count)
system.cpu.mmu.itb.readMisses                       0                       # read misses (Count)
system.cpu.mmu.itb.readAccesses                     0                       # read accesses (Count)
system.cpu.mmu.itb.writeHits                        0                       # write hits (Count)
system.cpu.mmu.itb.writeMisses                      0                       # write misses (Count)
system.cpu.mmu.itb.writeAccesses                    0                       # write accesses (Count)
system.cpu.mmu.itb.hits                             0                       # Total TLB (read and write) hits (Count)
system.cpu.mmu.itb.misses                           0                       # Total TLB (read and write) misses (Count)
system.cpu.mmu.itb.accesses                         0                       # Total TLB (read and write) accesses (Count)
system.cpu.mmu.itb.walker.num_4kb_walks             0                       # Completed page walks with 4KB pages (Count)
system.cpu.mmu.itb.walker.num_64kb_walks            0                       # Completed page walks with 64KB pages (Count)
system.cpu.mmu.itb.walker.num_2mb_walks             0                       # Completed page walks with 2MB pages (Count)
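Rather than reading through all of these lines by eye, a script can confirm that the TLB was never touched. This is a rough sketch: nonzero_tlb_counters is a hypothetical helper of mine, and the dictionary holds only a few entries copied from the output above (in practice it would come from parsing the whole stats.txt).

```python
def nonzero_tlb_counters(stats):
    """Return hit/miss/access counters under system.cpu.mmu.* that are not zero."""
    suffixes = ("hits", "misses", "accesses")
    return {
        name: value
        for name, value in stats.items()
        if name.startswith("system.cpu.mmu.")
        and name.lower().endswith(suffixes)
        and value != 0
    }


# A few entries copied from the stats.txt excerpt above.
tlb_stats = {
    "system.cpu.mmu.dtb.readHits": 0,
    "system.cpu.mmu.dtb.misses": 0,
    "system.cpu.mmu.dtb.accesses": 0,
    "system.cpu.mmu.itb.hits": 0,
    "system.cpu.mmu.itb.accesses": 0,
    # Nonzero, but it is a power-state residency time, not a TLB event counter.
    "system.cpu.mmu.dtb.walker.power_state.pwrStateResidencyTicks::UNDEFINED": 1971201000,
}

touched = nonzero_tlb_counters(tlb_stats)
print("TLB untouched" if not touched else f"TLB activity: {touched}")
```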


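There is one problem, though: these statistics count every event since boot, while the part I actually care about may only be the time during which dhrystone runs. How can I filter out the uninteresting part and measure only the region of interest (ROI)?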

Hopefully there is a way to solve this! When I have time, I may try using M5ops for it.
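As a starting point for that experiment: gem5 can reset and dump statistics in the middle of a run (m5.stats.reset() and m5.stats.dump() from the Python config, or the reset-stats and dump-stats m5 ops from the guest), and each dump is written to stats.txt as its own block between "Begin Simulation Statistics" and "End Simulation Statistics" banner lines. I have not yet tried this with simple-rv-vp, so the following is only a sketch of the post-processing side. split_dumps is a hypothetical helper, and the numbers in SAMPLE are made-up placeholders, not measurements.

```python
BEGIN = "---------- Begin Simulation Statistics"
END = "---------- End Simulation Statistics"


def split_dumps(text):
    """Return one list of stat lines per Begin/End block found in stats.txt."""
    dumps = []
    current = None  # None means we are outside any block
    for line in text.splitlines():
        if line.startswith(BEGIN):
            current = []
        elif line.startswith(END):
            if current is not None:
                dumps.append(current)
            current = None
        elif current is not None and line.strip():
            current.append(line)
    return dumps


# Placeholder input with two dumps; the values are illustrative only.
SAMPLE = """\
---------- Begin Simulation Statistics ----------
system.cpu.numCycles    100    # placeholder value
---------- End Simulation Statistics   ----------

---------- Begin Simulation Statistics ----------
system.cpu.numCycles    40     # placeholder value
---------- End Simulation Statistics   ----------
"""

dumps = split_dumps(SAMPLE)
print(f"found {len(dumps)} dump(s)")
for index, lines in enumerate(dumps):
    print(index, lines[0].split()[:2])
```

With a reset at the start of the ROI and a dump at its end, the block produced by that dump would cover only the ROI.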

