鼬~~哩賀,我是寫程式的山姆老弟,昨天跟大家一起看了 Rails 的 API-only 模式,今天來看 RailsGuide 的 Multiple Databases,之前沒什麼特別的需求要存多個資料庫,想趁這次機會來看看使用的時機點,還有使用方法,夠夠~
ps. 這個功能是在 Rails 6 之後才有的,如果是 Rails 6 以下的版本,要額外使用 multi_db
gem
Rails 在建立新專案時,會產生一個 config/database.yml
資料庫設定檔,預設是使用 SQLite
,長得下面這樣
# config/database.yml
# SQLite. Versions 3.8.0 and up are supported.
# gem install sqlite3
#
# Ensure the SQLite 3 gem is defined in your Gemfile
# gem "sqlite3"
#
default: &default
adapter: sqlite3
pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
timeout: 5000
development:
<<: *default
database: db/development.sqlite3
# Warning: The database defined as "test" will be erased and
# re-generated from your development database when you run "rake".
# Do not set this db to the same as development or production.
test:
<<: *default
database: db/test.sqlite3
production:
<<: *default
database: db/production.sqlite3
SQLite
本來就不是設計來給 Production 環境使用的,所以通常會改接 Postgres
或 MySQL
從設定檔可以看出,每個環境就是設定一個 database 的設定,也就是單資料庫
而多資料庫是長下面這樣
# config/database.yml
...
production:
primary:
database: db/production.sqlite3
username: root
password: <%= ENV['ROOT_PASSWORD'] %>
adapter: mysql2
primary_replica:
database: my_primary_database
username: root_readonly
password: <%= ENV['ROOT_READONLY_PASSWORD'] %>
adapter: mysql2
replica: true
animals:
database: my_animals_database
username: animals_root
password: <%= ENV['ANIMALS_ROOT_PASSWORD'] %>
adapter: mysql2
migrations_paths: db/animals_migrate
animals_replica:
database: my_animals_database
username: animals_readonly
password: <%= ENV['ANIMALS_READONLY_PASSWORD'] %>
adapter: mysql2
replica: true
通常是在 Production 環境中,設定不只一個資料庫,會有主要的資料庫(primary)、主要的備份資料庫(primary_replica)、另外專屬的資料庫(官方舉例 animals 資料庫)、專屬的資料庫的備份資料庫(animals_replica)
從設定檔的 username 可以看得出來 (username: xxx_readonly)
,備份資料庫(replica)是唯讀(readonly),用來減輕主資料庫的負擔
在一般小專案的情況下,一個資料庫應該就夠用,當使用者越來越多,效能瓶頸在資料庫的時候,你可能先考慮把資料庫加大(如果是雲端平台的話),如果一個資料庫加到最大之後,效能瓶頸還是在資料庫的話,那你可能要考慮使用多個資料庫來分攤效能;又或者是你有特殊的資料,或舊的資料已經存在某資料庫,就把某個 model 指向舊資料庫。
產生 migration 的時候,要多加個 --database
的選項,像是 $ rails g migration CreateDogs name —-database animals
,指定這個 migration 是要給 animals
資料庫的
還有產生 scaffold 的時候也要記得加 --database
,像是 $ rails g scaffold Dog name —-database animals
$ rails g active_record:multi_db
產生 config/initializers/multi_db.rb
檔案,然後把 Rails.application.configure
的註解拿掉
# config/initializers/multi_db.rb
# Multi-db Configuration
#
# This file is used for configuration settings related to multiple databases.
#
# Enable Database Selector
#
# Inserts middleware to perform automatic connection switching.
# The `database_selector` hash is used to pass options to the DatabaseSelector
# middleware. The `delay` is used to determine how long to wait after a write
# to send a subsequent read to the primary.
#
# The `database_resolver` class is used by the middleware to determine which
# database is appropriate to use based on the time delay.
#
# The `database_resolver_context` class is used by the middleware to set
# timestamps for the last write to the primary. The resolver uses the context
# class timestamps to determine how long to wait before reading from the
# replica.
#
# By default Rails will store a last write timestamp in the session. The
# DatabaseSelector middleware is designed as such you can define your own
# strategy for connection switching and pass that into the middleware through
# these configuration options.
#
Rails.application.configure do
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
end
#
# Enable Shard Selector
#
# Inserts middleware to perform automatic shard swapping. The `shard_selector` hash
# can be used to pass options to the `ShardSelector` middleware. The `lock` option is
# used to determine whether shard swapping should be prohibited for the request.
#
# The `shard_resolver` option is used by the middleware to determine which shard
# to switch to. The application must provide a mechanism for finding the shard name
# in a proc. See guides for an example.
#
# Rails.application.configure do
# config.active_record.shard_selector = { lock: true }
# config.active_record.shard_resolver = ->(request) { Tenant.find_by!(host: request.host).shard }
# end
這樣會啟用 ActiveRecord::Middleware::DatabaseSelector
這個 middleware,可以根據 request method 來判斷多資料庫
delay
時間,是指在寫入資料庫後的兩秒內,如果有 GET
或 HEAD
request,還是一樣到 primary
資料庫拿取資料
GET
資料,那就會去 primary
拿,這個叫做 read your own write
primary
資料庫拿資料了,有可能會從 replica
拿設定你的 config/database.yml
資料庫結構
# config/database.yml
...
production:
primary:
database: db/production.sqlite3
username: root
password: <%= ENV['ROOT_PASSWORD'] %>
adapter: mysql2
primary_replica:
database: my_primary_database
username: root_readonly
password: <%= ENV['ROOT_READONLY_PASSWORD'] %>
adapter: mysql2
replica: true
animals:
database: my_animals_database
username: animals_root
password: <%= ENV['ANIMALS_ROOT_PASSWORD'] %>
adapter: mysql2
migrations_paths: db/animals_migrate
animals_replica:
database: my_animals_database
username: animals_readonly
password: <%= ENV['ANIMALS_READONLY_PASSWORD'] %>
adapter: mysql2
replica: true
ps. 通常會需要到這麼複雜的資料庫結構,就是 Production 環境,如果你只是要做實驗的話,可以把 production
改成 development
,方便測試
連接同個資料庫,就要寫在同一個 class,再用繼承的,因為每寫一個連接的設定,就會佔用一個資料庫連接數
# app/models/animals_record.rb
class AnimalsRecord < ApplicationRecord
self.abstract_class = true
connects_to database: { writing: :animals }
end
Dog
再去繼承 AnimalsRecord
# app/models/dog.rb
class Dog < AnimalsRecord
end
可以自訂什麼時候要切換拿 primary 或 replica 的規則,透過,取代掉原本預設的 ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
class MyCookieResolver
# code for your cookie class
end
這個 MyCookieResolver
的寫法,可以參考預設的寫法
然後在設定換掉 Resolver
# config/initializers/multi_db.rb
...
Rails.application.configure do
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context = MyCookieResolver
end
...
如果你要讓 POST 的 request 也去讀 replica 資料庫的話,那 Rails 也有提供這個彈性
# 某個 POST action 的 API,也可以指定去 replica 資料庫拿資料
ActiveRecord::Base.connected_to(role: :reading) do
# Rails will check each query to ensure it's a read query
end
還有 prevent_writes: true
選項,可以強制讓這個 block 裡面的操作是 read only,不會對 replica 資料庫寫入,當然,在資料庫的權限控制的就要避免寫入 replica 資料庫了
Rails 6.0 和 6.1 的預設值有些差別,不過這個差別,在我們這篇沒有提到,可以小小注意一下
透過把資料庫的部分資料,分到另一個資料庫,這就是 Horizontal Sharding (真是粗糙的講法 XD),Rails 也有提供相對應的設定方法
# config/database.yml
...
production:
primary:
database: my_primary_database
adapter: mysql2
primary_replica:
database: my_primary_database
adapter: mysql2
replica: true
primary_shard_one:
database: my_primary_shard_one
adapter: mysql2
primary_shard_one_replica:
database: my_primary_shard_one
adapter: mysql2
replica: true
然後在上層的 model 設定 shard 對應到的資料庫名字
# app/models/application_record.rb
class ApplicationRecord < ActiveRecord::Base
self.abstract_class = true
connects_to shards: {
default: { writing: :primary, reading: :primary_replica },
shard_one: { writing: :primary_shard_one, reading: :primary_shard_one_replica }
}
end
別忘記在 config/initializers/multi_db.rb
把 sharding 的註解拿掉
# config/initializers/multi_db.rb
...
Rails.application.configure do
config.active_record.shard_selector = { lock: true }
config.active_record.shard_resolver = ->(request) { Tenant.find_by!(host: request.host).shard }
# 或是另一個官方提供的範例
config.active_record.shard_resolver = ->(request) {
subdomain = request.subdomain
tenant = Tenant.find_by_subdomain!(subdomain)
tenant.shard
}
end
看完這篇覺得長知識了,原來還有這些設定,平常的小專案真的是不需要用到這樣的功能,不過我猜在實作面上,以上這些設定都只是小事,更麻煩的還是怎麼判斷 DB bottleneck 在哪裡,然後怎麼改 DB 架構、怎麼安全的搬資料,在 production 環境上做這些操作,真的是需要很大的心臟
今天就先這樣囉,到了接近賽季結束的尾聲了,明天會用 Rails 的 Security 作為結尾,我們明天見~