Postgresql hstore 的運算子,函數及應用介紹

第 11 屆 iThome 鐵人賽
DAY 27
Software Development
以Postgresql為主,聊聊資料庫.系列第 27 篇

11th鐵人賽 postgresql rdbms sql
一級屠豬士
2019-10-12 02:35:23
2471 瀏覽
分享至
Postgresql hstore 的運算子,函數及應用介紹

接續昨天的運算子後續部分.

關於 delete 或說是 remove 的, 有三個
 hstore - text, hstore - text[], hstore - hstore

select ddteam
     , ddteam - '1'::text as "天津風感到寂寞"
     , ddteam - array['2', '3'] as "天津風獨自航行"
     , ddteam - '3=>冬月, 2=>島風'::hstore as "島風你很煩啊"
  from (values (hstore(array['1', '2', '3']
       , array['天津風','島風', '冬月']))) a (ddteam);

+-[ RECORD 1 ]---+-----------------------------------------+
| ddteam         | "1"=>"天津風", "2"=>"島風", "3"=>"冬月"   |
| 天津風感到寂寞   | "2"=>"島風", "3"=>"冬月"                  |
| 天津風獨自航行   | "1"=>"天津風"                             |
| 島風你很煩啊     | "1"=>"天津風"                             |
+----------------+-----------------------------------------+

解構成 array , 在前面講建構式時, 有 hstore(text[]), 裡面的 array
有一維的也有二維的. 有相對應反向的解構函數 hstore_to_array(),
hstore_to_matrix(), 及運算子 %% %#.

select ddteam1
     , hstore_to_array(ddteam1)
     , %% ddteam1 as "%% to array"
     , hstore_to_matrix(ddteam1)
     , %# ddteam1 as "%# to 2D array"
     , ddteam2
     , hstore_to_array(ddteam2)
     , %% ddteam2 as "%% to array"
     , hstore_to_matrix(ddteam2)
     , %# ddteam2 as "%# to matrix"
  from (values( hstore(array['1', '天津風', '2', '島風'])
              , hstore(array[['3', '秋月'], ['4', '涼月']])
              )
        ) a (ddteam1, ddteam2);
     
+-[ RECORD 1 ]-----+----------------------------+
| ddteam1          | "1"=>"天津風", "2"=>"島風" |
| hstore_to_array  | {1,天津風,2,島風}          |
| %% to array      | {1,天津風,2,島風}          |
| hstore_to_matrix | {{1,天津風},{2,島風}}      |
| %# to 2D array   | {{1,天津風},{2,島風}}      |
| ddteam2          | "3"=>"秋月", "4"=>"涼月"   |
| hstore_to_array  | {3,秋月,4,涼月}            |
| %% to array      | {3,秋月,4,涼月}            |
| hstore_to_matrix | {{3,秋月},{4,涼月}}        |
| %# to matrix     | {{3,秋月},{4,涼月}}        |
+------------------+----------------------------+


與record互轉使用 populate_record(record, hstore)

既然是 record , 就是依附於 table.

先來建立一個 table

create table hs1011 (
  col1 integer
, col2 text
, col3 text
);

insert into hs1011 values
(1, '天津風', '島風'),
(2, '秋月', '涼月');


record to hstore:

select hstore(a)
  from hs1011 as a;
+-----------------------------------------------+
|                    hstore                     |
+-----------------------------------------------+
| "col1"=>"1", "col2"=>"天津風", "col3"=>"島風" |
| "col1"=>"2", "col2"=>"秋月", "col3"=>"涼月"   |
+-----------------------------------------------+
(2 rows)

展開為對應的方式
select *
  from populate_record(null::hs1011,
       '"col2"=>"島風", "col3"=>NULL'::hstore
       );

+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
|    ¤ | 島風 | ¤    |
+------+------+------+
(1 row)

select *
  from populate_record(null::hs1011,
       '"col1"=>"1", "col2" => "天津風", "col3"=>"島風"'::hstore
       );
+------+--------+------+
| col1 |  col2  | col3 |
+------+--------+------+
|    1 | 天津風 | 島風 |
+------+--------+------+

可以應用在使用 hstore 型態儲存的資料,轉成另外使用 table 儲存時.

例如我們先建立一個使用 hstore 的 table

create table ddteam (
  dd hstore
);

insert into ddteam values
('"dd1"=> "天津風", "dd2"=>"島風", "dd3"=>"時津風"'::hstore),
(hstore(array['dd1', 'dd2', 'dd3'], array['秋月', '冬月', '涼月']));

select *
  from ddteam;

+-------------------------------------------------+
|                       dd                        |
+-------------------------------------------------+
| "dd1"=>"天津風", "dd2"=>"島風", "dd3"=>"時津風" |
| "dd1"=>"秋月", "dd2"=>"冬月", "dd3"=>"涼月"     |
+-------------------------------------------------+
(2 rows)

這時候我們可以 create 一個 type , 用來與 hstore 裡的 key 對應.

create type dt_type as (dd1 text, dd2 text, dd3 text);

select (populate_record(null::dt_type, dd)).*
  from ddteam;

+--------+------+--------+
|  dd1   | dd2  |  dd3   |
+--------+------+--------+
| 天津風 | 島風 | 時津風 |
| 秋月   | 冬月 | 涼月   |
+--------+------+--------+
(2 rows)

這樣就可以將 hstore 裡的內容展開對應,至於語法為何是後面要加上 * 呢
在前面我們有介紹過 複合資料型態, 又叫 row type , 或是 record type.
https://ithelp.ithome.com.tw/articles/10223641

在昨天介紹 hstore 建構時,有使用到 row(), 就是會產生一個 record.
我們來看 populate_record() , 透過我們指定 dd_type的 record type,
產生出來的是 dt_type 定義的 record.

select pg_typeof(populate_record(null::dt_type, dd))
     , populate_record(null::dt_type, dd)
  from ddteam;
+-----------+----------------------+
| pg_typeof |   populate_record    |
+-----------+----------------------+
| dt_type   | (天津風,島風,時津風) |
| dt_type   | (秋月,冬月,涼月)     |
+-----------+----------------------+
(2 rows)

所以要再用上面的語法將其展開.

我們也可以建立第二個table,定義好名稱與型態,對應hstore的內容.
跟record type 內容幾乎一樣,其實 record type 就是 table 的虛擬啊.

create table ddteam2 (
  dd1 text
, dd2 text
, dd3 text
);

insert into ddteam2
select (populate_record(null::ddteam2, dd)).*
  from ddteam;

select * from ddteam2;
+--------+------+--------+
|  dd1   | dd2  |  dd3   |
+--------+------+--------+
| 天津風 | 島風 | 時津風 |
| 秋月   | 冬月 | 涼月   |
+--------+------+--------+
(2 rows)

這樣我們就完成了 record , hstore 之間的互相轉換.
這時候聰明的你,有想到 hstore 在開發系統時,可以怎樣應用嗎?

我們來把比較運算子完成.

select hs2, hs1
     , hs2 = hs1 as "hs2 = hs1"
     , hs2 #>=# hs1 as "hs2 >= hs1"
     , hs2 #># hs1 as "hs2 > hs1"
     , hs2 #<=# hs1 as "hs2 <= hs1"
     , hs2 #<# hs1 as "hs2 < hs1"
  from (values (hstore(row(1,2,3)), hstore(row(3,2,4)))
       ) a (hs1, hs2);

+-[ RECORD 1 ]---------------------------------+
| hs2        | "f1"=>"3", "f2"=>"2", "f3"=>"4" |
| hs1        | "f1"=>"1", "f2"=>"2", "f3"=>"3" |
| hs2 = hs1  | f                               |
| hs2 >= hs1 | t                               |
| hs2 > hs1  | t                               |
| hs2 <= hs1 | f                               |
| hs2 < hs1  | f                               |
+------------+---------------------------------+

現在已經把運算子都介紹過了.#= 運算子是用在 populate_record() 裡面.
接著來介紹函數.部分函數在前面已經有先介紹了,如建構函數,hstore() 系列,
hstore_to_array(), hstore_to_matrix() 搭配運算子介紹已經先出場了.
有一些函數是官網文件未列出來, 實際上一些運算子就是使用這些函數做實際運算.

select hs1 = hs2
     , hstore_eq(hs1, hs2)
     , hstore_ge(hs1, hs2)
     , hstore_le(hs1, hs2)
  from (values (hstore(row(1,2,3)), hstore(row(1,2,3)))
       ) a (hs1, hs2);

+----------+-----------+-----------+-----------+
| ?column? | hstore_eq | hstore_ge | hstore_le |
+----------+-----------+-----------+-----------+
| t        | t         | t         | t         |
+----------+-----------+-----------+-----------+

在前面透過查詢 pg_catalog.pg_operator 就可以理解到,Postgresql
優雅的體系,運用 operator overloading , 建構強大的資料結構與小夥伴們.

接著介紹一些 hstore 的函數.

akeys(hstore) , return text[]

select akeys('1=>天津風, 2=>島風');
+-------+
| akeys |
+-------+
| {1,2} |
+-------+

skeys(hstore) , return setof text

select skeys('a=>天津風, b=>島風');
+-------+
| skeys |
+-------+
| a     |
| b     |
+-------+

select id, key
  from skeys('a=>天津風, b=>島風')
  with ordinality as t (key, id);
+----+-----+
| id | key |
+----+-----+
|  1 | a   |
|  2 | b   |
+----+-----+
(2 rows)

return 的型態是 set of text, 就是屬於 set returning function,
都能使用 with ordinality 產生序號,在介紹 array 時有分享過
https://ithelp.ithome.com.tw/articles/10222763

avals(hstore) , return text[]

select avals('a=>天津風, b=>島風');

+---------------+
|     avals     |
+---------------+
| {天津風,島風} |
+---------------+

svals(hstore) , return setof text

select svals('a=>天津風, b=>島風');

+--------+
| svals  |
+--------+
| 天津風 |
| 島風   |
+--------+
(2 rows)

select *
  from svals('a=>天津風, b=>島風')
  with ordinality;
+--------+------------+
| svals  | ordinality |
+--------+------------+
| 天津風 |          1 |
| 島風   |          2 |
+--------+------------+
(2 rows)

展示使用 with ordinality 時,不指定欄位名稱的情況.


接下來是轉換到 json / jsonb 的部分

hstore_to_json(hstore) , return json
轉換所有非null的值到 json

select jsonb_pretty(hstore_to_json('"a key"=>1, b=>t, c=>null, 
       d=>12345, e=>012345, null=>xyz,
       f=>1.234, g=>2.345e+4')::jsonb)as js;
+----------------------+
|          js          |
+----------------------+
| {                   +|
|     "b": "t",       +|
|     "c": null,      +|
|     "d": "12345",   +|
|     "e": "012345",  +|
|     "f": "1.234",   +|
|     "g": "2.345e+4",+|
|     "null": "xyz",  +|
|     "a key": "1"    +|
| }                    |
+----------------------+

官方文件寫轉換 non-null 的值為 json string,注意到 c=>null,
null 不是轉換成 "null" , 而是保留 null.
而 null=>xyz, 這裡的null 先是轉換成 "null" 到 hstore 裡面,
當然就是一般的 string, 也轉換進 json了.

hstore_to_jsonb(hstore) , return jsonb
轉換所有非null的值到 jsonb

select jsonb_pretty(hstore_to_jsonb('"a key"=>1, b=>t, c=>null, 
       d=>12345, e=>012345, "null"=>xyz,
       f=>1.234, g=>2.345e+4'))as jsb;

+----------------------+
|         jsb          |
+----------------------+
| {                   +|
|     "b": "t",       +|
|     "c": null,      +|
|     "d": "12345",   +|
|     "e": "012345",  +|
|     "f": "1.234",   +|
|     "g": "2.345e+4",+|
|     "null": "xyz",  +|
|     "a key": "1"    +|
| }                    |
+----------------------+

hstore_to_json_loose(hstore) , return json
boolean 與 數字型態值,不轉換成 string.

select jsonb_pretty(hstore_to_json_loose('"a key"=>1, b=>t, c=>null, 
        d=>12345, e=>012345, f=>1.234, g=>2.345e+4')::jsonb) as js;

+--------------------+
|         js         |
+--------------------+
| {                 +|
|     "b": true,    +|
|     "c": null,    +|
|     "d": 12345,   +|
|     "e": "012345",+|
|     "f": 1.234,   +|
|     "g": 23450,   +|
|     "a key": 1    +|
| }                  |
+--------------------+

注意到 012345 是 string.

slice(hstore, text[]), return hstore

根據 text[]裡的 key,傳出新的 hstore.

select slice('a=>1,b=>2,c=>3'::hstore, ARRAY['b','c','x']);
+--------------------+
|       slice        |
+--------------------+
| "b"=>"2", "c"=>"3" |
+--------------------+

each(hstore) , return setof (key value)

一般的姿勢:
select * from each('a=>1,b=>2');
+-----+-------+
| key | value |
+-----+-------+
| a   | 1     |
| b   | 2     |
+-----+-------+

特別的姿勢:
select * 
 from each('a=>1,b=>2')
 with ordinality;
+-----+-------+------------+
| key | value | ordinality |
+-----+-------+------------+
| a   | 1     |          1 |
| b   | 2     |          2 |
+-----+-------+------------+

exist(hstore, text) , return boolean
是否存在該 key?

select exist('a=>1', 'a');
+-------+
| exist |
+-------+
| t     |
+-------+

defined(hsotre, text), return boolean
檢查該key指向的值是否為 non-null?

select defined('c=>null', 'c')
     , avals('c=>null')
     , (avals('c=>null'))[1]; -- fetch array value;
+---------+--------+-------+
| defined | avals  | avals |
+---------+--------+-------+
| f       | {NULL} | ¤     |
+---------+--------+-------+

接下來是三個 delete 函數,跟三個 - 運算子一樣

select ddteam
     , delete(ddteam , '1'::text) as "天津風感到寂寞"
     , delete(ddteam , array['2', '3']) as "天津風獨自航行"
     , delete(ddteam , '3=>冬月, 2=>島風'::hstore) as "島風你很煩啊"
  from (values (hstore(array['1', '2', '3']
       , array['天津風','島風', '冬月']))) a (ddteam);
+-[ RECORD 1 ]---+-----------------------------------------+
| ddteam       | "1"=>"天津風", "2"=>"島風", "3"=>"冬月"    |
| 天津風感到寂寞 | "2"=>"島風", "3"=>"冬月"                  |
| 天津風獨自航行 | "1"=>"天津風"                             |
| 島風你很煩啊   | "1"=>"天津風"                            |
+----------------+-----------------------------------------+

現在把 hstore 的運算子及函數部分全部介紹完成.

一些操作的運用就是搭配運算子及函數,參考官網文件,就有基本的體會了.
在上面的函數中,each()是非常強大的.

接著來講一些應用.在一些ERP廠商規劃的table,會預先開好保留欄位,
先不使用,留待以後擴充.因為線上的 table 要執行 alter table add column
很容易 lock.所以一般會先預留,這樣就比較方便.
但是,總是會有些意想不到的情況,預留的不夠用.預留太多又容易影響效能.
像 hstore 這樣靈活的資料結構,就可以擔任救援.
若要做統計,透過 each() 就可以很容易計算,不用擔心.
官網上有這樣的用法:

SELECT key, count(*) 
  FROM (SELECT (each(h)).key FROM testhstore) AS stat
  GROUP BY key
  ORDER BY count DESC, key;
    key    | count
-----------+-------
 line      |   883
 query     |   207
 pos       |   203
 ....
 
hstore 雖然是一維, key-value 的 set. 類似 Python 的 Dictionary.
而且所有的資料都是 text. 雖然只是最簡單的資料型態,這樣相對很好做儲存,
先不用管格式是否正確,先用text存起來,後續可以再做過濾修正.
建議可以採用成對配置, raw / clean , 至少放兩對,可做分類使用.
當然可以視業務情況,某些table 放較多 hstore 欄位.

hstore 雖然較為簡單,但是相對速度較快,在大多情況下,已經足夠擔任存放
的功能了.值得試試看.