在大致介紹完了CASSANDRA/SCYLLA結構之後,順著介紹了一致性與讀寫機制,接下來我們進入講設計資料表的部分。
本章節要說明的materialized views 。
比起MySQL/MSSQL的view,CASSANDRA/SCYLLA的叫做materialized view,materialized
的意思是物理化的、具體化的。
先前我們就提到,在這裡並沒有join
的概念,也因此materialized view與MySQL/MSSQL的view相比,乍看之下有些相似,不過背後的組織方式其實不同。
materialized view其實也就對於原本的單一
一張的table,調整順序和資料組合方式(要知道順序
對於CASSANDRA/SCYLLA非常重要),別於MySQL/MSSQL的view可以是join多張不同的table。
MySQL/MSSQL的view是一種join好的查詢結果,一種虛擬表。
materialized view裡面,每個資料是實質存在的,可以理解為原本table內的column,換個排列方式組織。
因此materialized view的建立,也會影響到原本資料的寫入與更新的效能。所以MySQL/MSSQL的view數量要建幾個都可以,但在CASSANDRA/SCYLLA最好事先評估一下。
以下是SCYLLA官方提供的materialized view 建立語法說明:
create_materialized_view_statement: CREATE MATERIALIZED VIEW [ IF NOT EXISTS ] `view_name` AS
: `select_statement`
: PRIMARY KEY '(' `primary_key` ')'
: WITH `table_options`
範例如下:
CREATE MATERIALIZED VIEW monkeySpecies_by_population AS
SELECT * FROM monkeySpecies
WHERE population IS NOT NULL AND species IS NOT NULL
PRIMARY KEY (population, species)
WITH comment='Allow query by population instead of species';
CREATE TABLE t (
k int,
c1 int,
c2 int,
v1 int,
v2 int,
PRIMARY KEY (k, c1, c2)
);
建立的View,可以任意調動primary key的順序。
CREATE MATERIALIZED VIEW mv1 AS
SELECT * FROM t WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL
PRIMARY KEY (c1, k, c2);
CREATE MATERIALIZED VIEW mv1 AS
SELECT * FROM t WHERE v1 IS NOT NULL AND k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL
PRIMARY KEY (v1, k, c1, c2);
建立的view,primary key的規矩有很重要的兩點
在原本table的column沒有設置primary key的部分,頂多一個
,在view裡面可以設置成primary key。
若原本table的column有設置primary key,就一定要設primary key。
// Error: cannot include both v1 and v2 in the primary key as both are not in the base table primary key
CREATE MATERIALIZED VIEW mv1 AS
SELECT * FROM t WHERE k IS NOT NULL AND c1 IS NOT NULL AND c2 IS NOT NULL AND v1 IS NOT NULL
PRIMARY KEY (v1, v2, k, c1, c2)
// Error: must include k in the primary as it's a base table primary key column
CREATE MATERIALIZED VIEW mv1 AS
SELECT * FROM t WHERE c1 IS NOT NULL AND c2 IS NOT NULL
PRIMARY KEY (c1, c2)
實際下語法看看,回傳了以下的訊息:
"Cannot include more than one non-primary key column 'v2' in materialized view primary key"
而效能評估的部分,比起用反正規化的方式,重開很多類似的table,官方建議還是使用materialized view的做法。一來效能方便比較經濟實惠,二來維護性也較佳。