Log-shipping 與 MariaDB Replication [2]

2017 iT 邦幫忙鐵人賽

DAY 11

自我挑戰組

MariaDB 的使用與管理系列第 11 篇

2017鐵人賽 mariadb logshipping replication binlog

sujunmin

2016-12-11 00:05:13

4123 瀏覽

分享至

(以下文章擷取自筆者的 blog，分享給大家)

上篇筆者說明了一些原理說明，這篇就來做個實驗。

在同一個機器上開新的 Instance

有一個不錯用的工具 mysql_install_db.exe 可以直接開新的 Instance。

筆者用這個先開一個 Instance 在 Port 3307。
設定原來在 3306 的為 Master (server-id=1)，3307 的為 Slave (server-id=2)，都開啟 binlog (log-bin)。
Event 記得設成 Disabled on Slave。

接下來備份 Master

E:\backup>mysqldump -u sujunmin -p --master-data --all-databases --events --routines --gtid > all_db.sql
Enter password: ************

其中幾個重要參數

參數	意義
`master-data`	dump 出來的掛 master ID
`events`	包含 events
`routines`	包含 functions 與 store procedures
`gtid`	產生 `CHANGE MASTER TO master_use_gtid` 語法

觀察一下 all_db.sql 的內容

-- MySQL dump 10.16  Distrib 10.1.17-MariaDB, for Win64 (AMD64)
(中間省略)
--
-- Position to start replication or point-in-time recovery from
--

-- CHANGE MASTER TO MASTER_LOG_FILE='WSTest-bin.000002', MASTER_LOG_POS=3732;

--
-- GTID to start replication from
--

CHANGE MASTER TO MASTER_USE_GTID=slave_pos;
SET GLOBAL gtid_slave_pos='0-1-770';

--
-- Current Database: 'master'
--
(後面省略)

可以看到如果要用舊的方式(兼容 MySQL Replication) 與新的方式。

啟動 Slave 的 Instance。
把 all_db.sql 倒到 Slave (3307) 上面。
在 Master 上開一個 User lsuser 作為 Replication 用，權限是 REPLICATION SLAVE 與 SUPER (這個我不設定會沒權限登入，但是官方網站沒有這個權限)。

在 Slave 上設定 CHANGE MASTER

MariaDB [(none)]> change master to master_host='localhost', master_port=3306, master_user='lsuser', master_password='password';
Query OK, 0 rows affected (0.05 sec)

START SLAVE;

SHOW SLAVE STATUS;

 MariaDB [(none)]> show slave status\G
 *************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: localhost
                Master_User: lsuser
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: WSTest-bin.000002
        Read_Master_Log_Pos: 5429
             Relay_Log_File: WSTest-relay-bin.000002
              Relay_Log_Pos: 5757
      Relay_Master_Log_File: WSTest-bin.000002
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB:
                         (省略)
                 Using_Gtid: Slave_Pos
                Gtid_IO_Pos: 0-1-775
    Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
              Parallel_Mode: conservative
  1 row in set (0.00 sec)

看一下同步狀況
移除 Master 模擬 Master 壞掉了狀態
- sc stop mariadb
- sc delete mariadb
- 移除資料

Slave 沒法連到 Master 了

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
             Slave_IO_State: Reconnecting after a failed master event read
                Master_Host: localhost
                Master_User: lsuser
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: WSTest-bin.000002
        Read_Master_Log_Pos: 8223
             Relay_Log_File: WSTest-relay-bin.000002
              Relay_Log_Pos: 8551
      Relay_Master_Log_File: WSTest-bin.000002
           Slave_IO_Running: Connecting
          Slave_SQL_Running: Yes
                         (省略)
              Last_IO_Errno: 2003
              Last_IO_Error: error reconnecting to master 'lsuser@localhost:3306' - retry-time: 60  retries: 86400  message: Can't connect to MySQL server on'localhost' (10061 "Unknown error")
             Last_SQL_Errno: 0
             Last_SQL_Error:
 Replicate_Ignore_Server_Ids:
           Master_Server_Id: 1
             Master_SSL_Crl:
         Master_SSL_Crlpath:
                 Using_Gtid: Slave_Pos
                Gtid_IO_Pos: 0-1-784
    Replicate_Do_Domain_Ids:
 Replicate_Ignore_Domain_Ids:
              Parallel_Mode: conservative
 1 row in set (0.00 sec)

STOP SLAVE
打開 Slave 的 Event 服務
繼續服務

重建 Master

E:\backup>mysqldump -u sujunmin -p --port 3307 --master-data --all-databases --events --routines --gtid > all_db.sql
Enter password: ************

觀察一下 all_db.sql 的內容

-- MySQL dump 10.16  Distrib 10.1.17-MariaDB, for Win64 (AMD64)
(中間省略)
--
-- Position to start replication or point-in-time recovery from
--

-- CHANGE MASTER TO MASTER_LOG_FILE='WSTest-bin.000003', MASTER_LOG_POS=12441;

--
-- GTID to start replication from
--

CHANGE MASTER TO MASTER_USE_GTID=slave_pos;
SET GLOBAL gtid_slave_pos='0-2-976';

--
-- Current Database: 'master'
--
(後面省略)

啟動 Master 的 Instance。
把 all_db.sql 倒到 Master (3307) 上面。

在 Slave 上設定 CHANGE MASTER 為 current_pos

MariaDB [(none)]> change master to master_host='localhost', master_port=3306, master_user='lsuser', master_password='password', master_use_gtid=current_pos;
Query OK, 0 rows affected (0.05 sec)

START SLAVE;

SHOW SLAVE STATUS;

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                          (省略)
             Last_IO_Errno: 1236
             Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-2-980, which is not in the master's binlog. Since the master's binlog contains GTIDs with higher sequence numbers, it probably means that the slave has diverged due to executing extra erroneous transactions'
            Last_SQL_Errno: 0
            Last_SQL_Error:
Replicate_Ignore_Server_Ids:
          Master_Server_Id: 1
            Master_SSL_Crl:
        Master_SSL_Crlpath:
                Using_Gtid: Current_Pos
               Gtid_IO_Pos: 0-2-980
   Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
              Parallel_Mode: conservative
1 row in set (0.00 sec)

STOP SLAVE;

在 Slave 上設定 CHANGE MASTER 為 slave_pos

MariaDB [(none)]> change master to master_host='localhost', master_port=3306, master_user='lsuser', master_password='password', master_use_gtid=slave_pos;
Query OK, 0 rows affected (0.05 sec)

START SLAVE;

SHOW SLAVE STATUS;

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
            Slave_IO_State: Waiting for master to send event
               Master_Host: localhost
               Master_User: lsuser
               Master_Port: 3306
             Connect_Retry: 60
           Master_Log_File: WSTest-bin.000003
       Read_Master_Log_Pos: 7177
            Relay_Log_File: WSTest-relay-bin.000003
             Relay_Log_Pos: 7466
     Relay_Master_Log_File: WSTest-bin.000003
          Slave_IO_Running: Yes
         Slave_SQL_Running: Yes
                        (省略)
                Using_Gtid: Slave_Pos
               Gtid_IO_Pos: 0-1-1167
   Replicate_Do_Domain_Ids:
 Replicate_Ignore_Domain_Ids:
               Parallel_Mode: conservative
1 row in set (0.00 sec)

這兩個對應 (0-2-976, 0-1-1167) 都幫你做好了
完成

一些想法

如果是 Multi-Master 的這個系統會很好做 (CREATE MASTER to ..., master_use_gtid=slave_pos;)，因為 slave_pos 可以是多個的，例如 gtid_slave_pos='0-1-123,1-2-456,...'
Slave 如果要快速完復，不需要備份還原的話，那麼就只能當 Snapshot 來用 (不能動他讓他產生自己的 binlog)
對於原來的需求 (單一 Master 單一 Slave) 基本上設定一下就完成了
有時候會碰到有些指令沒法傳過去的問題(如建立 FederatedX Table)，這時候 Slave 會卡在 Error 的地方不能過去，所以要手動 bypass 它。
但是不太清楚要 skip 幾個的狀態下，就一個一個看了。

stop slave;
set global sql_slave_skip_counter=1;
start slave;

select sleep(10);

show slave status;

雖然有人說不要太常用 sql_slave_skip_counter，但這個是我覺得重建外最快的方法了，到現在還沒有碰到其他問題。

Log-shipping 與 MariaDB Replication [1]

Audit 的實作 [1]

系列文

MariaDB 的使用與管理共 30 篇

RSS系列文訂閱系列文

22 人訂閱

完整目錄

尚未有邦友留言

立即登入留言

參賽組數

1064 組

團體組數

40 組

累計文章數

22206 篇

完賽人數

600 人

15th鐵人賽 16th鐵人賽 13th鐵人賽 14th鐵人賽 12th鐵人賽 11th鐵人賽鐵人賽 2019鐵人賽 javascript 2018鐵人賽 python 2017鐵人賽 windows php c# windows server linux css react vue.js

MariaDB 的使用與管理系列 第 11 篇