Postgresql ltree 的應用

第 11 屆 iThome 鐵人賽

DAY 29

Software Development

以Postgresql為主,聊聊資料庫.系列第 29 篇

11th鐵人賽 postgresql rdbms sql

一級屠豬士

2019-10-14 16:12:12

3249 瀏覽

分享至

Postgresql ltree 的應用

接續昨天的.現在來把 C 為 root node 的子樹,接到
'A.B.D'::ltree 之下.
移動之前先模擬一下,也就是說 update 之前先 select 看看.
這次可以運用 || 運算子.

select 'A.B.D'::ltree || path
  from tree1013a
 where path <@ 'C'::ltree;

+-------------+
|  ?column?   |
+-------------+
| A.B.D.C     |
| A.B.D.C.E   |
| A.B.D.C.F   |
| A.B.D.C.E.G |
+-------------+
(4 rows)

一般我們希望能得到通用的解法.

select 'A.B.D'::ltree
    || subpath(path, nlevel('C'::ltree) - 1)
  from tree1013a
 where path <@ 'C'::ltree;

+-------------+
|  ?column?   |
+-------------+
| A.B.D.C     |
| A.B.D.C.E   |
| A.B.D.C.F   |
| A.B.D.C.E.G |
+-------------+
(4 rows)

看來不錯,那動手吧.記得要準備後悔藥,人見人愛,花見花開的 --> begin

update tree1013a
   set path = 'A.B.D'::ltree || subpath(path, nlevel('C'::ltree) - 1)
 where path <@ 'C'::ltree;

UPDATE 4

select *
  from tree1013a;

+----+------+-------------+
| id | node |    path     |
+----+------+-------------+
|  1 | A    | A           |
|  2 | B    | A.B         |
|  4 | D    | A.B.D       |
|  3 | C    | A.B.D.C     |
|  5 | E    | A.B.D.C.E   |
|  6 | F    | A.B.D.C.F   |
|  7 | G    | A.B.D.C.E.G |
+----+------+-------------+
(7 rows)

確認無誤.
commit;

上面是基本的例子的介紹.

Postgresql 的 ltree 在 SQL Server 中也有類似的,從2008版以後, hierarchyid 

https://docs.microsoft.com/en-us/sql/relational-databases/hierarchical-data-sql-server?view=sql-server-ver15

https://docs.microsoft.com/zh-tw/sql/relational-databases/hierarchical-data-sql-server?view=sql-server-ver15

文獻方面
在 2005年時,有一篇探討 Trees in SQL: Nested Sets and Materialized Path

http://www.dbazine.com/oracle/or-articles/tropashko4/

另外 Joe Celko 有出一本專書
Joe Celko's Trees and Hierarchies in SQL for Smarties 
'
至於 Joe Celko 是誰
https://en.wikipedia.org/wiki/Joe_Celko

SQL-89 / SQL-92 (這幾乎是台灣最多人停留在此的標準了,或是還沒學完)的委員之一.

也出了很多關於SQL的書,敝人拜讀上面兩位的作品獲益良多.

另外 Python 的 djanfo 有一個 Package
https://pypi.org/project/django-treebeard/

當然 Python 的生態很豐富,不只這個,其他語言亦同,有興趣的,可以多加研究.

接著回來看 ltree 的應用.
在前面的簡易例子當中,節點跟路徑,為了易於理解,都是使用對應的.
而 ltree 裡面只能存放 A-Za-z0-9_ ,
一般會使用文數字組合的編碼, node 裡才放資料本身.

來看一個例子吧.

create table fleet (
  id smallint primary key
, sh_parentid smallint
, sh_name text
, path ltree
);

create unique index
on fleet
using btree(path);

create index
on fleet
using gist(path);

create or replace function get_node_path(id smallint)
returns ltree as
$code$
select case when f.sh_parentid is null 
        then f.id::text::ltree
        else get_node_path(f.sh_parentid) || f.id::text
       end
  from fleet f
 where f.id = $1;
$code$
language sql;

create or replace function trig_upin_fleet()
returns trigger as
$code$
begin
    if TG_OP = 'UPDATE' then
        if (coalesce(old.sh_parentid, 0) != coalesce(new.sh_parentid, 0)
             or new.id != old.id) then
            update fleet
               set path = get_node_path(id)
             where old.path @> fleet.path;
        end if;
    elsif TG_OP = 'INSERT' then
        update fleet
           set path = get_node_path(new.id)
         where fleet.id = new.id;
    end if;
    return new;
end
$code$
language plpgsql volatile;

create trigger trig_upin_fleet
after insert or update of id, sh_parentid
on fleet for each row
execute procedure trig_upin_fleet();

--
insert into fleet (id, sh_parentid, sh_name) values
(1, null, '第2艦隊'),
(2, 1, '第2水雷戰隊'),
(3, 2, '第16驅逐艦隊'),
(4, 3, '初風'),
(5, 3, '雪風'),
(6, 3, '天津風'),
(7, 3, '時津風'),
(8, 2, '第15驅逐艦隊'),
(9, 8, '夏潮'),
(10, 8, '黑潮'),
(11, 8, '早潮'),
(12, 2, '第18驅逐艦隊'),
(13, 12, '陽炎'),
(14, 12, '不知火');

--
select *
  from fleet;

+----+-------------+--------------+-----------+
| id | sh_parentid |   sh_name    |   path    |
+----+-------------+--------------+-----------+
|  1 |           ¤ | 第2艦隊      | 1         |
|  2 |           1 | 第2水雷戰隊  | 1.2       |
|  3 |           2 | 第16驅逐艦隊 | 1.2.3     |
|  4 |           3 | 初風         | 1.2.3.4   |
|  5 |           3 | 雪風         | 1.2.3.5   |
|  6 |           3 | 天津風       | 1.2.3.6   |
|  7 |           3 | 時津風       | 1.2.3.7   |
|  8 |           2 | 第15驅逐艦隊 | 1.2.8     |
|  9 |           8 | 夏潮         | 1.2.8.9   |
| 10 |           8 | 黑潮         | 1.2.8.10  |
| 11 |           8 | 早潮         | 1.2.8.11  |
| 12 |           2 | 第18驅逐艦隊 | 1.2.12    |
| 13 |          12 | 陽炎         | 1.2.12.13 |
| 14 |          12 | 不知火       | 1.2.12.14 |
+----+-------------+--------------+-----------+
(14 rows)


可以利用 array_agg() 聚合.做層級的展示.

select f.id
     , array_to_string(array_agg(a.sh_name order by a.path), '->') as fullname
  from fleet f
 inner join fleet a
    on (a.path @> f.path)
 group by f.id, f.path, f.sh_name
 order by fullname;

+----+--------------------------------------------+
| id |                  fullname                  |
+----+--------------------------------------------+
|  1 | 第2艦隊                                    |
|  2 | 第2艦隊->第2水雷戰隊                       |
|  8 | 第2艦隊->第2水雷戰隊->第15驅逐艦隊         |
|  9 | 第2艦隊->第2水雷戰隊->第15驅逐艦隊->夏潮   |
| 11 | 第2艦隊->第2水雷戰隊->第15驅逐艦隊->早潮   |
| 10 | 第2艦隊->第2水雷戰隊->第15驅逐艦隊->黑潮   |
|  3 | 第2艦隊->第2水雷戰隊->第16驅逐艦隊         |
|  4 | 第2艦隊->第2水雷戰隊->第16驅逐艦隊->初風   |
|  6 | 第2艦隊->第2水雷戰隊->第16驅逐艦隊->天津風 |
|  7 | 第2艦隊->第2水雷戰隊->第16驅逐艦隊->時津風 |
|  5 | 第2艦隊->第2水雷戰隊->第16驅逐艦隊->雪風   |
| 12 | 第2艦隊->第2水雷戰隊->第18驅逐艦隊         |
| 14 | 第2艦隊->第2水雷戰隊->第18驅逐艦隊->不知火 |
| 13 | 第2艦隊->第2水雷戰隊->第18驅逐艦隊->陽炎   |
+----+--------------------------------------------+
(14 rows)

Time: 32.236 ms

在上面的例子中,先是建立一個SQL Function, 可以透過遞迴方式,將 parentid, 
組成 path. 然後再建立 trigger 呼叫.這樣我們在輸入或是變更時,都能夠將原本
需要透過遞迴方式變動,由Adjacency List 風格,建立 Materialized Path 的儲存.
展示時再透過 array 的 array_agg() 來聚合. 

透過以上的例子,相信大家對階層式資料的處理及 ltree 的運用,能夠靈活巧妙的搭配資料庫的
function 及 trigger 建立有效的處理方式.而不再是迴圈傻傻跑,再搭呆呆測試,或是沒有測試,
沒有靈氣的沙雕寫法了.

轉眼已到29天了,明天我們繼續介紹一些好物.