[Day 5] 尋找 bottleneck

12th鐵人賽

MaxHarry

2020-09-19 23:58:53

909 瀏覽

分享至

Ruby 速度相對其他語言比較慢一直是個硬傷，找出拖慢速度所在的工具也層出不窮，今天介紹土炮做法及兩款 Ruby 原生的工具。

土炮作法

在介紹工具前，大家應該會土砲紀錄執行方法前的時間1跟執行方法後時間2，然後兩個相減來產生時間差異
比方說我想知道，將字串freeze後再計算，到底會不會更快

def slow_method
  string = 'asdfsfs'
  5000.times do
    string =~ /a/
  end
end

def fast_method
  string = 'asdfsfs'.freeze
  5000.times do
    string =~ /a/
  end
end

time1 = Time.current
slow_method
time2= Time.current

time3 = Time.current
fast_method
time4 = Time.current

p "slow_method 耗時:#{time2 - time1}秒"
p "fast_method 耗時:#{time4 - time3}秒"

output:

"slow_method 耗時:0.097952秒"
"fast_method 耗時:0.096589秒"

可以看出 slow_method確實比較慢

缺點：

設置時間time1~time4會多產生幾行code
效率變慢
如果有調用library，或其他的方法，不知道到底哪個部分拖慢時間，還有佔據的時間比例

底下就來介紹兩款工具 profile 跟 benchmark

profile

profile 使用起來很簡單，只要require就好
可以在檔案裡面 require 'profile'
或是在執行檔案時 -rprofiles

app/services/test.rb

def slow_method
  string = 'asdfsfs'
  5000.times do
    string =~ /a/
  end
end

def fast_method
  string = 'asdfsfs'.freeze
  5000.times do
    string =~ /a/
  end
end

slow_method
fast_method

在CLI執行:

ruby -rprofile app/services/test.rb

output

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 35.34     0.06      0.06        2    31.60    87.49  Integer#times
 27.02     0.11      0.05     5001     0.01     0.03  Object#slow_method
 21.55     0.15      0.04     5001     0.01     0.03  Object#fast_method
 13.94     0.18      0.02    10000     0.00     0.00  String#=~
  0.37     0.18      0.00        1     0.67     0.67  TracePoint#enable
  0.00     0.18      0.00        2     0.00     0.00  IO#set_encoding
  0.00     0.18      0.00        2     0.00     0.00  Module#method_added
  0.00     0.18      0.00        1     0.00     0.00  TracePoint#disable
  0.00     0.18      0.00        1     0.00     0.00  String#freeze
  0.00     0.18      0.00        1     0.00   178.84  #toplevel

self(自己的)跟 total/cumulative(總和/累計)的差異，可以想成
例如一個A方法 (self，父代) 呼叫，另一個B方法(子代)。

%: 執行時間的百分比 [ex. 35.34% 在呼叫 Integer#times]
cumulative seconds: 自己包含子代呼叫的總時間累計
self seconds: 執行時間(不包含子代)
calls 總呼叫次數 [ex. Integer#times 被呼叫2次]
self ms/call: 此方法不含子代每次呼叫所需要的平均時間
total ms/call: 此方法包含子代每次呼叫所需要的平均時間

benchmark

一樣只要require就能用

require 'benchmark'

def slow_method
  string = 'asdfsfs'
  5000.times do
    string =~ /a/
  end
end

def fast_method
  string = 'asdfsfs'.freeze
  5000.times do
    string =~ /a/
  end
end

slow_method
fast_method

Benchmark.bm do |x|
  x.report('slow_method') do
    slow_method
  end

  x.report('fast_method') do
    fast_method
  end
end

output:

              user     system      total        real
slow_method  0.001342   0.000004   0.001346 (  0.001347)
fast_method  0.000851   0.000004   0.000855 (  0.000867)

ps. 時間單位都是秒

輸出參數依序為：

user CPU time
system CPU time
the sum of the user and system CPU times
the elapsed real time.

x.report(attr)，其中attr 是 table_head 不給也可以，反正是按照x.report被呼叫的順序列出的。

自訂欄位

新增total 跟 avg 欄位

#benchmark(caption = "", label_width = nil, format = nil, *labels)

其中 label_width 是表頭寬度，設定太小會對不齊

require 'benchmark'
include Benchmark  # 如果不 include，會找不到 CAPTION 跟 FORMAT 

def slow_method
  string = 'asdfsfs'
  5000.times do
    string =~ /a/
  end
end

def fast_method
  string = 'asdfsfs'.freeze
  5000.times do
    string =~ /a/
  end
end

Benchmark.benchmark(CAPTION, 11, FORMAT, ">total:", ">avg:") do |x|
  ts = x.report("slow_method:")   { slow_method }
  tf = x.report("fast_method:")   { fast_method }
  [ts + tf, (ts + tf) / 2]
end

output:

                  user     system      total        real
slow_method:  0.001153   0.000083   0.001236 (  0.001234)
fast_method:  0.000898   0.000013   0.000911 (  0.000920)
>total:       0.002051   0.000096   0.002147 (  0.002154)
>avg:         0.001025   0.000048   0.001073 (  0.001077)

#bmbm(width = 0)

有時候程式執行時間會被記憶體回收器garbage collection所影響，先執行跟後執行會有差異，使用 bmbm這個方法可以減少這個影響，ruby跑兩次code，第一次是rehearsal，第二次才是真的執行，在執行第二次前會呼叫GC來清運 [GC.start]，GC 運作時間不包含在程式執行時間
官方文件說明：

the first time as a rehearsal in order to get the runtime environment stable, the second time for real. GC.start is executed before the start of each of the real timings; the cost of this is not included in the timings.

require 'benchmark'

def slow_method
  string = 'asdfsfs'
  5000.times do
    string =~ /a/
  end
end

def fast_method
  string = 'asdfsfs'.freeze
  5000.times do
    string =~ /a/
  end
end

Benchmark.bmbm(width = 0) do |x|
  x.report("slow_method:")   { slow_method }
  x.report("fast_method:")   { fast_method }
end

output:

load 'app/services/test.rb'

Rehearsal ------------------------------------------------
slow_method:   0.001144   0.000003   0.001147 (  0.001141)
fast_method:   0.000830   0.000000   0.000830 (  0.000830)
--------------------------------------- total: 0.001977sec

                   user     system      total        real
slow_method:   0.001085   0.000013   0.001098 (  0.001085)
fast_method:   0.001045   0.000019   0.001064 (  0.001087)

如果在 rails console 想知道某一行 sql 執行的狀況，就更簡單了。

Benchmark.measure { Shop.first.customers.all }

output:

  Shop Load (7.1ms)  SELECT `shops`.* FROM `shops` LIMIT 1
  Customer Load (7.5ms)  SELECT `customers`.* FROM `customers` WHERE `customers`.`shop_id` = 1
#<Benchmark::Tms:0x007fb4bf74b0c8 @label="", @real=0.15561799998977222, @cstime=0.0, @cutime=0.0, @stime=0.07000000000000028, @utime=0.009999999999999787, @total=0.08000000000000007>