iT邦幫忙

2022 iThome 鐵人賽

DAY 25
0
Software Development

SQL rookie 之天天魯一下系列 第 25

Day 25 - SQL Aggregate function(1)

  • 分享至 

  • xImage
  •  

大家好!

雙十節前除了準備早起參加升旗外,還要做些什麼呢?

...

練習和研究自己無法想像的aggregate functions 如何?

/images/emoticon/emoticon04.gif

那讓我們開始吧!


什麼情形要用到aggregate function 呢?

以todos 這table 為例:

  create_table "todos", force: :cascade do |t|
    t.integer "userId"
    t.text "title"
    t.boolean "completed"
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
  end

我們可以很直覺地統計「completed 為true 或false 的筆數」、「各user 有幾筆todos」、或進一步計算出「各個user 各自completed 的todos 完成率」、「所有user completed 的todos 平均值」等,在我們練習之前,先來了解aggregate function 是什麼:

根據PostgreSQL documentation

Aggregate functions compute a single result from a set of input values.

翻譯蒟蒻:聚合函數是根據一組輸入值計算並返回單一值

The built-in normal aggregate functions are listed in Table 9-49 and Table 9-50. The built-in ordered-set aggregate functions are listed in Table 9-51 and Table 9-52. Grouping operations, which are closely related to aggregate functions, are listed in Table 9-53. The special syntax considerations for aggregate functions are explained in Section 4.2.7. Consult Section 2.7 for additional introductory information.

以PostgreSQL 來說,aggregate functions 分成四個種類,內建的普通函數,如表9-49(count, max, min, sum, avg, array_agg)、9-50 的標準差、9-51、52 有序集合函數等(後略)

而aggregate function 的基本用法如:

SELECT count(*) FROM sometable;

讓我們先來試試用聚合函數完成以上假想的情境吧!

情境1:completed 為true 或false 的筆數

SELECT completed, count(completed) FROM todos GROUP BY completed

https://ithelp.ithome.com.tw/upload/images/20221010/20150959kLlrIW8KEa.png

情境2:各user 有幾筆todos

SELECT "userId", count(completed) AS completed FROM todos GROUP BY "userId"

https://ithelp.ithome.com.tw/upload/images/20221010/20150959Ig4AQ2QdJ9.png


那我們又能否利用Active Record 裡的方法,但不用SQL 而得到同樣的結果呢?我們來試試看~~

情境1:completed 為true 或false 的筆數

# 這題之前就做過了,簡單~~

irb(main):003:0> Todo.group(:completed).count
   (0.8ms)  SELECT COUNT(*) AS count_all, "todos"."completed" AS todos_completed FROM "todos" GROUP BY "todos"."completed"
{
    false => 111,
     true => 90
}

情境2:各user 有幾筆todos

irb(main):005:0> Todo.select(:userId, :completed).group(:userId, :completed).order(:userId).size
   (1.0ms)  SELECT COUNT(*) AS count_all, "todos"."userId" AS todos_userid, "todos"."completed" AS todos_completed FROM "todos" GROUP BY "todos"."userId", "todos"."completed" ORDER BY "todos"."userId" ASC
{
      [ 1, true ] => 11,
     [ 1, false ] => 10,
     [ 2, false ] => 12,
      [ 2, true ] => 8,
     [ 3, false ] => 13,
      [ 3, true ] => 7,
     [ 4, false ] => 14,
      [ 4, true ] => 6,
     [ 5, false ] => 8,
      [ 5, true ] => 12,
     [ 6, false ] => 14,
      [ 6, true ] => 6,
     [ 7, false ] => 11,
      [ 7, true ] => 9,
     [ 8, false ] => 9,
      [ 8, true ] => 11,
     [ 9, false ] => 12,
      [ 9, true ] => 8,
     [ 10, true ] => 12,
    [ 10, false ] => 8
}

那麼再進一步的情境3、4 又是否能寫出來呢?(對我現在好難,容被翻爛的書再想想 XD)

情境3:所有user completed 的todos 平均值
情境4:各個user 各自completed 的todos 完成率

/images/emoticon/emoticon02.gif

可練習至此令我不禁好奇,aggregate function 回傳的單一結果又代表了什麼?

目前覺得GeeksforGeeks 的解釋還滿不錯的:

In database management an aggregate function is a function where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning.

我想正如同上述的假設情境,若我們列出200 筆todos 時的資料雖然很清楚但卻不具有可讀性,而當我們用進一步利用aggregate functions 來整理時,這些資料才逐漸有了意義


最後在結束前想再比較SQL 和Active Record query 的statement 有何差異:

情境1:completed 為true 或false 的筆數

# By raw SQL
SELECT completed, count(completed) FROM todos GROUP BY completed

# By Active Record
SELECT COUNT(*) AS count_all, "todos"."completed" AS todos_completed FROM "todos" GROUP BY "todos"."completed"

這句的差異應該是SELECT 後的語句:SELECT completed, count(completed) vs. SELECT COUNT(*) , "todos"."completed",raw SQL 直接針對completed 欄位計算,Active Record 則是計算了每個row?

情境2:各user 有幾筆todos

# By raw SQL
SELECT "userId", count(completed) AS completed FROM todos GROUP BY "userId"

# By Active Record
SELECT COUNT(*) AS count_all, "todos"."userId" AS todos_userid, "todos"."completed" AS todos_completed FROM "todos" GROUP BY "todos"."userId", "todos"."completed" ORDER BY "todos"."userId" ASC

這邊可能是自己的Ruby 寫太差啦,一定有更好的寫法;除了SELECT 後的語句 一樣多了一個count(*)外,GROUP BY 似乎也多了"todos"."completed",但若不用就會噴錯,Hmm...?


總之今天的練習就先到這囉,謝謝大家!


上一篇
Day 24 - count vs. size vs. length
下一篇
Day 26 - SQL Aggregate function 練習
系列文
SQL rookie 之天天魯一下30
圖片
  直播研討會
圖片
{{ item.channelVendor }} {{ item.webinarstarted }} |
{{ formatDate(item.duration) }}
直播中

尚未有邦友留言

立即登入留言