Day23. 一些讓你看來很強的 ORM - prisma ( Aggregate & Group by) - iT 邦幫忙::一起幫忙解決難題，拯救 IT 人的一天

2024 iThome 鐵人賽

DAY 23

Modern Web

一些讓你看來很強的 ORM - prisma系列第 23 篇

Day23. 一些讓你看來很強的 ORM - prisma ( Aggregate & Group by)

16th鐵人賽

danny101201

團隊GMT(+4)

2024-10-07 22:51:58

353 瀏覽

分享至

今天介紹一個很常會用到的實作 Aggregate ，可以幫助你整合你要的資訊，同時優化 DB query 的效能，這邊將會慢慢介紹，同時 prisma 有 Aggregate 跟 Group by 兩種方法來做聚合的查詢，至於有什麼差別我們接著看下去～

在開始今天正文前，今天的內容會是基於Day21 的內容下去當作範例，model 跟 data都是基於這篇的內容所以還沒看過的讀者記得先看一下～

Aggregate

Aggregate 用來讓你組合多筆資料成一個是 Int 或是 Float 的資料，例如你想查詢所有 user 的平均資料：

 const aggregations = await prismaClient.user.aggregate({
    _avg: {
      age: true
    }
 })

這邊就會根據你 query 的結果去平均所有 user 的 age

{ _avg: { age: 50.12048192771084 } }

甚至進階一點你可以加上 where、orderBy、take 然後再根據你 filter 的結果去平均 age:

Ordered by user (asc)
透過 where 去找出 email 包含 gmail.com 的 data
限制只拿取前10位 user

備註 : 使用 take 記得加上 order by 排序資料喔～

const aggregations = await prismaClient.user.aggregate({
    _avg: {
      age: true
    },
    where: {
      email: {
        contains: 'gmail.com'
      }
    },
    orderBy: {
      age: 'asc'
    },
    take: 10
})

這邊就是根據你塞選的條件最終得到的結果

{ _avg: { age: 11 } }

另外除了 _avg 你也可以用 _count 查看有多少 user 讓你去做平均

 const aggregations = await prismaClient.user.aggregate({
    _avg: {
      age: true
    },
    _count: {
      age: true
    }
  })

{ _avg: { age: 50.12048192771084 }, _count: { age: 83 } }

Aggregate Values Are Nullable

如果當你 aggregate 的欄位他是 nullable 最後 return 的內容他可能會是 0 或是 null，如果是0代表 query 不到資料，null 就是這個欄位沒有值

 const aggregations = await prismaClient.user.aggregate({
    _avg: {
      age: true
    },
    _count: {
      age: true
    }
  })

所以請大家看看 return 的結果

{
  _avg: {
    age: null
  },
  _count: {
    age: 9
  }
}

上面的內容{ _avg: { age: null } } 等同於以下的其中一種情境：

沒有 user 資料
所有 user 的 age 欄位都是 null

所以這邊可以讓你區別 aggregate 的 value 他沒有資料，或是全部 age 的欄位都是 null

GroupBy

別於 Aggregate 只 return 單一一個 Int 或是 float 的欄位，GroupBy 在你 return 的 value 中可以讓你看到更多的欄位，以及優化你 query data 的內容，例如找到特定的 user 在特定的 city 他的平均 age 是多少

這邊簡單舉個 groupBy 的範例，以下是根據 country 去分組，然後根據分組的資料，平均整體的 profileViews

const groupUsers = await prismaClient.user.groupBy({
    by: ['country'],
    _sum: {
      profileViews: true
    }
  })

從這邊 return 的 data 可以看出在 Western Sahara 這個 city 總共 profileViews 有 324 個

[
//..
{ _sum: { profileViews: 324 }, country: 'Western Sahara' }
//..
]

然後 groupBy.by 可以不止 by 一個欄位他可以接受多個

const groupUsers = await prismaClient.user.groupBy({
    by: ['country', 'city'],
    _sum: {
      profileViews: true
    }
  })

這邊就是先根據 country 分組，然後再根據 city 去分類

[
//..
  {
    _sum: { profileViews: 324 },
    country: 'Western Sahara',
    city: 'Amelyhaven'
  }
//..
]

如果你只需要 by 一個欄位的話，可以不用 [] 去包

const groupUsers = await prisma.user.groupBy({
  by: 'country',
})

groupBy 支援 where、having 來幫你 filter data，這邊使用 where 先幫你在 group data 前先 filter data

 const groupUsers = await prismaClient.user.groupBy({
    by: ['country'],
    where: {
      email: {
        contains:'gmail.com'
      }
    },
    _sum: {
      profileViews: true
    }
  })

[
//..
{ _sum: { profileViews: 2399 }, country: 'Montserrat' }
//..
]

having 則是 filter 已經分組好的所有欄位，例如這邊我只想找到 profileViews 的總共數量是大於 4500 的資料

const groupUsers = await prismaClient.user.groupBy({
    by: ['country'],
    where: {
      email: {
        contains: 'gmail.com'
      }
    },
    _sum: {
      profileViews: true
    },
    having: {
      profileViews: {
        _avg: {
         gt: 4500,
        }
      }
    }
  })

[
  { _sum: { profileViews: 4595 }, country: 'Azerbaijan' },
  { _sum: { profileViews: 4665 }, country: 'Czechia' },
  { _sum: { profileViews: 4953 }, country: 'Gabon' },
  { _sum: { profileViews: 4943 }, country: 'Lebanon' }
]

Where VS Having

簡單來說雖然 where 跟 having 都是用來 filter data 但是彼此差別在於 filter 時機

where 是 groupBy 前的塞選條件，用來減少 query data 的大小
having 則是根據 groupBy 後的結果做指標塞選

另外簡單舉個使用 Where 跟 Having 的錯誤用法，這邊主要先排除 country 不是 Sweden 跟 Ghana 的資料，然後再分組

const fd = await prisma.user.groupBy({
  by: ['country'],
  where: {
    country: {
      notIn: ['Sweden', 'Ghana'],
    },
  },
  _sum: {
    profileViews: true,
  },
  having: {
    profileViews: {
      _min: {
        gte: 10,
      },
    },
  },
})

但以下的寫法跟上面的結果是一樣的，而且這邊的 filter 的結果效能還會筆上面全部用 where 的做法來得差，所以不推薦這樣做 having

const groupUsers = await prisma.user.groupBy({
  by: ['country'],
  where: {
    country: {
      not: 'Sweden',
    },
  },
  _sum: {
    profileViews: true,
  },
  having: {
    country: {
      not: 'Ghana',
    },
    profileViews: {
      _min: {
        gte: 10,
      },
    },
  },
})

Order By

Order By 也可以用在 groupBy :

你可以 order 根據 by 的欄位
可以 order aggregate 的結果例如 _sum
使用 take 或是 skip 必須加上 order by

這邊就是根據以上的 demo 排序資料結果

const groupUsers = await prismaClient.user.groupBy({
    by: ['country'],
    where: {
      email: {
        contains: 'gmail.com'
      }
    },
    _sum: {
      profileViews: true
    },
    orderBy: {
      _sum: {
        profileViews: 'asc'
      }
    },
    having: {
      profileViews: {
        _avg: {
          gt: 4500
        }
      }
    }
  })

如此我們就可以排序 profileViews 的內容了

[
  { _sum: { profileViews: 4595 }, country: 'Azerbaijan' },
  { _sum: { profileViews: 4665 }, country: 'Czechia' },
  { _sum: { profileViews: 4943 }, country: 'Lebanon' },
  { _sum: { profileViews: 4953 }, country: 'Gabon' }
]

這邊使用 skip 排除第一筆資料，同時只拿根據 order by 的兩筆內容

const groupUsers = await prismaClient.user.groupBy({
    by: ['country'],
    where: {
      email: {
        contains: 'gmail.com'
      }
    },
    _sum: {
      profileViews: true
    },
    orderBy: {
      _sum: {
        profileViews: 'asc'
      }
    },
    having: {
      profileViews: {
        _avg: {
          gt: 4500
        }
      }
    },
    skip: 2,
    take: 1
  })

[ { _sum: { profileViews: 4943 }, country: 'Lebanon' } ]

常見問題

Q1 我可以在 groupBy() 中使用 select 嗎？

A1 你不能在 groupBy() 中使用 select。因為，by 中包含的所有欄位都會自動返回。

Q2 在 groupBy() 中使用 where 和 having 有什麼不同？

A2 where 在分組之前過濾所有記錄，而 having 過濾整個群組，並支援對聚合欄位值進行過濾，例如該組中特定欄位的平均值或總和。

Q3 groupBy() 和 Aggregate 有什麼不同？

aggregate 和 groupBy() 都以一個或多個唯一欄位值將記錄分組。 groupBy() 允許你在每個群組內聚合資料，例如返回 post 上的平均瀏覽次數，而 aggregate 則不行，只能單純 return 例如 _sum 等等的 Int 或是 Float 的資料。