鼬~~哩賀,我是寫程式的山姆老弟,昨天跟大家一起看了 Rails 的 API-only 模式,今天來看 RailsGuide 的 Multiple Databases,之前沒什麼特別的需求要存多個資料庫,想趁這次機會來看看使用的時機點,還有使用方法,夠夠~

ps. 這個功能是在 Rails 6 之後才有的,如果是 Rails 6 以下的版本,要額外使用 multi_db gem
Rails 在建立新專案時,會產生一個 config/database.yml 資料庫設定檔,預設是使用 SQLite,長得下面這樣
# config/database.yml
# SQLite. Versions 3.8.0 and up are supported.
#   gem install sqlite3
#
#   Ensure the SQLite 3 gem is defined in your Gemfile
#   gem "sqlite3"
#
default: &default
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
development:
  <<: *default
  database: db/development.sqlite3
# Warning: The database defined as "test" will be erased and
# re-generated from your development database when you run "rake".
# Do not set this db to the same as development or production.
test:
  <<: *default
  database: db/test.sqlite3
production:
  <<: *default
  database: db/production.sqlite3
SQLite 本來就不是設計來給 Production 環境使用的,所以通常會改接 Postgres 或 MySQL
從設定檔可以看出,每個環境就是設定一個 database 的設定,也就是單資料庫
而多資料庫是長下面這樣
# config/database.yml
...
production:
  primary:
	  database: db/production.sqlite3
		username: root
    password: <%= ENV['ROOT_PASSWORD'] %>
    adapter: mysql2
	primary_replica:
    database: my_primary_database
    username: root_readonly
    password: <%= ENV['ROOT_READONLY_PASSWORD'] %>
    adapter: mysql2
    replica: true
	animals:
    database: my_animals_database
    username: animals_root
    password: <%= ENV['ANIMALS_ROOT_PASSWORD'] %>
    adapter: mysql2
    migrations_paths: db/animals_migrate
  animals_replica:
    database: my_animals_database
    username: animals_readonly
    password: <%= ENV['ANIMALS_READONLY_PASSWORD'] %>
    adapter: mysql2
    replica: true
通常是在 Production 環境中,設定不只一個資料庫,會有主要的資料庫(primary)、主要的備份資料庫(primary_replica)、另外專屬的資料庫(官方舉例 animals 資料庫)、專屬的資料庫的備份資料庫(animals_replica)
從設定檔的 username 可以看得出來 (username: xxx_readonly),備份資料庫(replica)是唯讀(readonly),用來減輕主資料庫的負擔
在一般小專案的情況下,一個資料庫應該就夠用,當使用者越來越多,效能瓶頸在資料庫的時候,你可能先考慮把資料庫加大(如果是雲端平台的話),如果一個資料庫加到最大之後,效能瓶頸還是在資料庫的話,那你可能要考慮使用多個資料庫來分攤效能;又或者是你有特殊的資料,或舊的資料已經存在某資料庫,就把某個 model 指向舊資料庫。
產生 migration 的時候,要多加個 --database 的選項,像是 $ rails g migration CreateDogs name —-database animals,指定這個 migration 是要給 animals 資料庫的
還有產生 scaffold 的時候也要記得加 --database,像是 $ rails g scaffold Dog name —-database animals
$ rails g active_record:multi_db 產生 config/initializers/multi_db.rb
檔案,然後把 Rails.application.configure 的註解拿掉
# config/initializers/multi_db.rb
# Multi-db Configuration
#
# This file is used for configuration settings related to multiple databases.
#
# Enable Database Selector
#
# Inserts middleware to perform automatic connection switching.
# The `database_selector` hash is used to pass options to the DatabaseSelector
# middleware. The `delay` is used to determine how long to wait after a write
# to send a subsequent read to the primary.
#
# The `database_resolver` class is used by the middleware to determine which
# database is appropriate to use based on the time delay.
#
# The `database_resolver_context` class is used by the middleware to set
# timestamps for the last write to the primary. The resolver uses the context
# class timestamps to determine how long to wait before reading from the
# replica.
#
# By default Rails will store a last write timestamp in the session. The
# DatabaseSelector middleware is designed as such you can define your own
# strategy for connection switching and pass that into the middleware through
# these configuration options.
#
Rails.application.configure do
  config.active_record.database_selector = { delay: 2.seconds }
  config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
  config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
end
#
# Enable Shard Selector
#
# Inserts middleware to perform automatic shard swapping. The `shard_selector` hash
# can be used to pass options to the `ShardSelector` middleware. The `lock` option is
# used to determine whether shard swapping should be prohibited for the request.
#
# The `shard_resolver` option is used by the middleware to determine which shard
# to switch to. The application must provide a mechanism for finding the shard name
# in a proc. See guides for an example.
#
# Rails.application.configure do
#   config.active_record.shard_selector = { lock: true }
#   config.active_record.shard_resolver = ->(request) { Tenant.find_by!(host: request.host).shard }
# end
這樣會啟用 ActiveRecord::Middleware::DatabaseSelector 這個 middleware,可以根據 request method 來判斷多資料庫
delay 時間,是指在寫入資料庫後的兩秒內,如果有 GET 或 HEAD request,還是一樣到 primary 資料庫拿取資料
GET 資料,那就會去 primary 拿,這個叫做 read your own write
primary 資料庫拿資料了,有可能會從 replica 拿設定你的 config/database.yml 資料庫結構
# config/database.yml
...
production:
  primary:
	  database: db/production.sqlite3
		username: root
    password: <%= ENV['ROOT_PASSWORD'] %>
    adapter: mysql2
	primary_replica:
    database: my_primary_database
    username: root_readonly
    password: <%= ENV['ROOT_READONLY_PASSWORD'] %>
    adapter: mysql2
    replica: true
	animals:
    database: my_animals_database
    username: animals_root
    password: <%= ENV['ANIMALS_ROOT_PASSWORD'] %>
    adapter: mysql2
    migrations_paths: db/animals_migrate
  animals_replica:
    database: my_animals_database
    username: animals_readonly
    password: <%= ENV['ANIMALS_READONLY_PASSWORD'] %>
    adapter: mysql2
    replica: true
ps. 通常會需要到這麼複雜的資料庫結構,就是 Production 環境,如果你只是要做實驗的話,可以把 production 改成 development,方便測試
連接同個資料庫,就要寫在同一個 class,再用繼承的,因為每寫一個連接的設定,就會佔用一個資料庫連接數
# app/models/animals_record.rb
class AnimalsRecord < ApplicationRecord
  self.abstract_class = true
  connects_to database: { writing: :animals }
end
Dog 再去繼承 AnimalsRecord
# app/models/dog.rb
class Dog < AnimalsRecord
end
可以自訂什麼時候要切換拿 primary 或 replica 的規則,透過,取代掉原本預設的 ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
class MyCookieResolver
  # code for your cookie class
end
這個 MyCookieResolver 的寫法,可以參考預設的寫法

然後在設定換掉 Resolver
# config/initializers/multi_db.rb
...
Rails.application.configure do
  config.active_record.database_selector = { delay: 2.seconds }
  config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
  config.active_record.database_resolver_context = MyCookieResolver
end
...
如果你要讓 POST 的 request 也去讀 replica 資料庫的話,那 Rails 也有提供這個彈性
# 某個 POST action 的 API,也可以指定去 replica 資料庫拿資料
ActiveRecord::Base.connected_to(role: :reading) do
  # Rails will check each query to ensure it's a read query
end
還有 prevent_writes: true 選項,可以強制讓這個 block 裡面的操作是 read only,不會對 replica 資料庫寫入,當然,在資料庫的權限控制的就要避免寫入 replica 資料庫了
Rails 6.0 和 6.1 的預設值有些差別,不過這個差別,在我們這篇沒有提到,可以小小注意一下
透過把資料庫的部分資料,分到另一個資料庫,這就是 Horizontal Sharding (真是粗糙的講法 XD),Rails 也有提供相對應的設定方法
# config/database.yml
...
production:
  primary:
    database: my_primary_database
    adapter: mysql2
  primary_replica:
    database: my_primary_database
    adapter: mysql2
    replica: true
  primary_shard_one:
    database: my_primary_shard_one
    adapter: mysql2
  primary_shard_one_replica:
    database: my_primary_shard_one
    adapter: mysql2
    replica: true
然後在上層的 model 設定 shard 對應到的資料庫名字
# app/models/application_record.rb
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  connects_to shards: {
    default: { writing: :primary, reading: :primary_replica },
    shard_one: { writing: :primary_shard_one, reading: :primary_shard_one_replica }
  }
end
別忘記在 config/initializers/multi_db.rb 把 sharding 的註解拿掉
# config/initializers/multi_db.rb
...
Rails.application.configure do
  config.active_record.shard_selector = { lock: true }
  config.active_record.shard_resolver = ->(request) { Tenant.find_by!(host: request.host).shard }
	# 或是另一個官方提供的範例
	config.active_record.shard_resolver = ->(request) {
	  subdomain = request.subdomain
	  tenant = Tenant.find_by_subdomain!(subdomain)
	  tenant.shard
	}
end
看完這篇覺得長知識了,原來還有這些設定,平常的小專案真的是不需要用到這樣的功能,不過我猜在實作面上,以上這些設定都只是小事,更麻煩的還是怎麼判斷 DB bottleneck 在哪裡,然後怎麼改 DB 架構、怎麼安全的搬資料,在 production 環境上做這些操作,真的是需要很大的心臟
今天就先這樣囉,到了接近賽季結束的尾聲了,明天會用 Rails 的 Security 作為結尾,我們明天見~