等下會用到的SQL示意圖
開始前先介紹什麼是Left outer join
跟 Inner join
LEFT JOIN 可以用來建立左外部連接(上圖的 A + B 部分),查詢的 SQL 敘述句 LEFT JOIN 左側資料表 (table_1) 的所有記錄都會加入到查詢結果中,最終產生一張新的表。outer join
會列出主要表格中每一筆資料,無論它的值在另一個表格中有沒有出現
=> 即使右側資料表 (table_2) 有沒有符合的值,左側資料表(table_1)每一筆還是會顯示出來。
LEFT JOIN 查詢用法 (Example)
這是一個客戶資料表 customers:
C_Id | Name | City | Address Phone |
---|---|---|---|
1 | 張一 | 台北市 XX路100號 | 02-12345678 |
2 | 王二 | 新竹縣 YY路200號 | 03-12345678 |
3 | 李三 | 高雄縣 ZZ路300號 | 07-12345678 |
而這是產品訂單的資料表 orders: |
O_Id | Order_No | C_Id |
---|---|---|
1 | 2572 | 3 |
2 | 7375 | 3 |
3 | 7520 | 1 |
4 | 1054 | 1 |
5 | 1257 | 5 |
現在我們想查詢所有客戶與其訂單狀況的資料,我們可以作一個 LEFT JOIN 的 SQL 查詢: |
SELECT customers.Name, orders.Order_No # 挑出要顯示的欄位
FROM customers
LEFT JOIN orders # orders 資料表往左接上 customers 資料表
ON customers.C_Id=orders.C_Id; # 指定兩資料表連接的條件
其中用點號連接之 XXX.YYY 表示 XXX 資料表中的 YYY 欄位。
Name | Order_No |
---|---|
張一 | 7520 |
張一 | 1054 |
王二 | null |
李三 | 2572 |
李三 | 7375 |
LEFT JOIN 會返回 左側資料表中所有資料列 ,就算沒有符合連接條件,也就是右側資料表中沒有匹配的資料值,就會顯示為 NULL (ex. 王二)。 |
|
剛才連接完成的表,可以看到張一跟李三都有兩筆,這是因為 orders 表有兩筆資料分別對應到他們兩。 | |
Left outer join 參考資料 |
INNER JOIN (內部連接,上圖的 B 部分)為等值連接,必需指定等值連接的條件,而查詢結果只會返回符合連接條件的資料。
這是一個客戶資料表 customers:
C_Id | Name | City | Address Phone |
---|---|---|---|
1 | 張一 | 台北市 XX路100號 | 02-12345678 |
2 | 王二 | 新竹縣 YY路200號 | 03-12345678 |
3 | 李三 | 高雄縣 ZZ路300號 | 07-12345678 |
而這是產品訂單的資料表 orders:
O_Id | Order_No | C_Id |
---|---|---|
1 | 2572 | 3 |
2 | 7375 | 3 |
3 | 7520 | 1 |
4 | 1054 | 1 |
5 | 1257 | 5 |
現在我們想列出所有客戶的訂單編號資料,我們可以作一個 INNER JOIN 的 SQL 查詢: |
SELECT customers.Name, orders.Order_No
FROM customers
INNER JOIN orders
ON customers.C_Id=orders.C_Id;
查詢結果如下:
Name | Order_No |
---|---|
李三 | 2572 |
李三 | 7375 |
張一 | 7520 |
張一 | 1054 |
查詢結果只會返回符合連接條件的資料! | |
=> 因為在右邊資料表找不到王二的訂單,所以新的表就沒有王二了。 |
介紹完這兩個特性之後,我們回頭來看正題
使用前面建立的Product
(商品)跟ProductStyle
(商品樣式)Model
#app/models/product.rb
class Product < ApplicationRecord
has_many :styles, class_name: 'ProductStyle' # 關聯取名為styles
end
#app/models/product_style.rb
class ProductStyle < ApplicationRecord
belongs_to :product
end
Table | Model |
---|---|
products | Product |
product_styles | ProductStyle |
Product.count # => 5 # 共5個商品
ProductStyle.count # => 9 # 共9個商品款式
Product.all.map do |product|
product.styles.first.title
end
Product Load (0.4ms) SELECT "products".* FROM "products"
ProductStyle Load (0.2ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2 [["product_id", 3], ["LIMIT", 1]]
ProductStyle Load (0.2ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2 [["product_id", 4], ["LIMIT", 1]]
ProductStyle Load (0.1ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2 [["product_id", 6], ["LIMIT", 1]]
ProductStyle Load (0.1ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2 [["product_id", 7], ["LIMIT", 1]]
ProductStyle Load (0.1ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" = $1 ORDER BY "product_styles"."id" ASC LIMIT $2 [["product_id", 8], ["LIMIT", 1]]
# => ["香蕉乾", "芒果乾", "商品款式2", "商品款式3", "商品款式4"]
後面款式都亂取名,懶得想名字了
N+1 Query這個效能殺手,造成呼叫 5 + 1次SQL(第一次呼叫products
資料表,每次迴圈呼叫一次product_styles
資料表,共5次)
呼叫SQL的成本是巨大的,該怎麼辦?
Rails透過ORM語法把關聯資料表做預加載,如此一來迴圈內呼叫關聯時,不用再呼叫SQL從DB取資料,改從cache取資料。使用預加載,迴圈越多次程式碼效率提升就越顯著。
Rails 提供四種加載關聯資料表的方法
preload
會生成兩條獨立
的SQL查詢,所以關聯資料表會一起被 query 出來,但不會建立資料表之間的關聯
products = Product.preload(:styles).order("products.id DESC")
Product Load (0.4ms) SELECT "products".* FROM "products" ORDER BY products.id DESC LIMIT $1 [["LIMIT", 11]]
ProductStyle Load (0.2ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" IN ($1, $2, $3, $4, $5) [["product_id", 8], ["product_id", 7], ["product_id", 6], ["product_id", 4], ["product_id", 3]]
preload
會將products
有關的資料表統一加載出來,但不會建立資料表之間的關聯,如果要拿preload
生成的SQL語句來做關聯查詢(where)
或排序(order)
,會出錯:
Product.preload(:styles).where("product_styles.title like ?", 'test')
ActiveRecord::StatementInvalid (PG::UndefinedTable: ERROR: missing FROM-clause entry for table "product_styles")
不過preload
確實可以解決N+1 query
:
Product.preload(:styles).map do |product|
product.styles.first.title
end
Product Load (0.4ms) SELECT "products".* FROM "products"
ProductStyle Load (0.4ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" IN ($1, $2, $3, $4, $5) [["product_id", 3], ["product_id", 4], ["product_id", 6], ["product_id", 7], ["product_id", 8]]
# => ["香蕉乾", "芒果乾", "商品款式2", "商品款式3", "商品款式4"]
eager_load
使用LEFT OUTER JOIN
進行單次查詢,並加載所有關聯數據(這樣資料表之間就有關聯了)
products = Product.eager_load(:styles).order("products.id DESC")
SQL (0.6ms) SELECT DISTINCT products.id AS alias_0, "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" ORDER BY products.id DESC LIMIT $1 [["LIMIT", 11]]
SQL (0.6ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE "products"."id" IN ($1, $2, $3, $4, $5) ORDER BY products.id DESC [["id", 8], ["id", 7], ["id", 6], ["id", 4], ["id", 3]]
可以看到為了加載關聯,eager_load
用了更複雜的SQL語法,雖然速度比preload
慢,但是解決了preload
不能關聯查詢的問題
Product.eager_load(:styles).where("product_styles.title like ?", 'test')
SQL (0.5ms) SELECT DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') LIMIT $1 [["LIMIT", 11]]
SQL (0.4ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') AND "products"."id" = $1 [["id", 4]]
eager_load 同時也可以解決N+1 query
Product.eager_load(:styles).map do |product|
product.styles.first.try(:title)
end
SQL (9.4ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "products"."order_id" AS t0_r6, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id"
eager_load
Product.eager_load(:options)
SQL (0.7ms) SELECT DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" LIMIT $1 [["LIMIT", 11]]
SQL (0.5ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "products"."store_id" AS t0_r6, "options"."id" AS t1_r0, "options"."product_id" AS t1_r1, "options"."title" AS t1_r2, "options"."option1" AS t1_r3, "options"."option2" AS t1_r4, "options"."created_at" AS t1_r5, "options"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" WHERE "products"."id" IN ($1, $2, $3, $4, $5) [["id", 4], ["id", 2], ["id", 7], ["id", 8], ["id", 3]]
第二條SQL會把所有product的id都列出來["id", 4], ["id", 2], ["id", 7], ["id", 8], ["id", 3]
eager_load
Product.eager_load(:options).where(options: {title: '尺寸'})
SQL (42.9ms) SELECT DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" WHERE "options"."title" = $1 LIMIT $2 [["title", "尺寸"], ["LIMIT", 11]]
SQL (0.6ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "products"."store_id" AS t0_r6, "options"."id" AS t1_r0, "options"."product_id" AS t1_r1, "options"."title" AS t1_r2, "options"."option1" AS t1_r3, "options"."option2" AS t1_r4, "options"."created_at" AS t1_r5, "options"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "options" ON "options"."product_id" = "products"."id" WHERE "options"."title" = $1 AND "products"."id" IN ($2, $3) [["title", "尺寸"], ["id", 2], ["id", 3]]
可以看到第一條 SQL 有加上條件["title", "尺寸"]
,篩選出符合的 products (id: 2 跟 3)
所以第二條 SQL 在查詢時,只有搜尋符合條件的["id", 2], ["id", 3]
。
Rails 的 ORM語法雖然方便,但是如果能了解一些SQL,就能更容易找到適當的方法。
每次都要想何時用
preload
?何時用eager_load
? 很麻煩所以Rails 還有
includes
這個大家最常用的方法
ps. eager_load在Rails 4是產生一條SQL,這裡用Rails 5 示範
includes
預設效果是preload
,需要做關聯資料表時又會轉成eager_load
。
preload
:Product.includes(:styles)
Product Load (0.3ms) SELECT "products".* FROM "products" LIMIT $1 [["LIMIT", 11]]
ProductStyle Load (0.4ms) SELECT "product_styles".* FROM "product_styles" WHERE "product_styles"."product_id" IN ($1, $2, $3, $4, $5) [["product_id", 3], ["product_id", 4], ["product_id", 6], ["product_id", 7], ["product_id", 8]]
eager_load
:Product.includes(:styles).where(product_styles: {title: 'test'})
SQL (0.6ms) SELECT DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE "product_styles"."title" = $1 LIMIT $2 [["title", "test"], ["LIMIT", 11]]
SQL (0.4ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE "product_styles"."title" = $1 AND "products"."id" = $2 [["title", "test"], ["id", 4]]
# => #<ActiveRecord::Relation [#<Product id: 4, title: "商品二", price: 1985.0, description: "要買要快", created_at: "2019-09-30 14:38:42", updated_at: "2019-09-30 14:38:42">]>
references
才能達到eager_load
的效果products = Product.includes(:styles).where("product_styles.title like ?", 'test').references(:styles)
SQL (0.5ms) SELECT DISTINCT "products"."id" FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') LIMIT $1 [["LIMIT", 11]]
SQL (0.5ms) SELECT "products"."id" AS t0_r0, "products"."title" AS t0_r1, "products"."price" AS t0_r2, "products"."description" AS t0_r3, "products"."created_at" AS t0_r4, "products"."updated_at" AS t0_r5, "product_styles"."id" AS t1_r0, "product_styles"."title" AS t1_r1, "product_styles"."price" AS t1_r2, "product_styles"."sku" AS t1_r3, "product_styles"."product_id" AS t1_r4, "product_styles"."created_at" AS t1_r5, "product_styles"."updated_at" AS t1_r6 FROM "products" LEFT OUTER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') AND "products"."id" = $1 [["id", 4]]
# => #<ActiveRecord::Relation [#<Product id: 4, title: "商品二", price: 1985.0, description: "要買要快", created_at: "2019-09-30 14:38:42", updated_at: "2019-09-30 14:38:42">]>
ps. 模糊搜尋的includes
不搭配references
會噴錯,各位看倌可以拿掉試試
所以includes
既可以解決N+1 query
的問題,搭配reference
又可以建立關聯查詢,提到N+1 query
推薦用includes
原因就在這裡。
joins
主要用於建立model之間的關係,用INNER JOIN
來關聯數據, “不會”將關聯的資料做預加載
,所以對N+1 query
並無幫助,常用來建立關聯搜尋,
以上面的模糊搜尋為例:
Product.joins(:styles).where("product_styles.title like ?", 'test')
Product Load (0.6ms) SELECT "products".* FROM "products" INNER JOIN "product_styles" ON "product_styles"."product_id" = "products"."id" WHERE (product_styles.title like 'test') LIMIT $1 [["LIMIT", 11]]
# => #<ActiveRecord::Relation [#<Product id: 4, title: "商品二", price: 1985.0, description: "要買要快", created_at: "2019-09-30 14:38:42", updated_at: "2019-09-30 14:38:42">]>
只產生一條SQL。所以joins
可以用來建立資料表關聯
但如果想要joins
處理我們今天第一個問題,會更慘。因為可能產生重複資料,SQL查詢次數比沒加joins
還多 w(゚Д゚)w
原因如下:
Product.joins(:styles).size #=> 9
Product.joins(:styles).uniq.size #=> 5
原來Product.joins(:styles)
實際上是:
ProductStyle.joins(:product).size # => 9
ProductStyle.joins(:product).uniq.size # => 9
Rails提供關聯資料表加載,幫助我們解決N+1 query
這個隱形效能殺手,也可以建立資料表的關聯。
四種該如何使用,以下是簡單的判斷
includes
似乎最方便上面只是概略的描述,實際執行會因為資料表的行數及列數不同,導致效能也不同,還是要依照情況而定。