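Starting on Day 18, we have been spending four days evaluating how to port GitLab CI/CD Pipelines to GitHub Actions. If we classify the ways of running a CI/CD job by whether containerization is used, there are four combinations, depending on whether the runner and the job each run in a container.
Today we continue with option 4, where the goal is to have both the self-hosted runner and the job run inside their own containers.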
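This time I want to survey an approach that works in a multi-user environment, so we switch to docker swarm to create several runners at once (k8s is actually the approach GitHub officially recommends, but let's take it one step at a time).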
First, write docker-compose.yml
(see https://github.com/myoung34/docker-github-actions-runner/issues/72):
version: '3'
services:
  runner:
    image: myoung34/github-runner:latest
    restart: unless-stopped
    environment:
      REPO_URL: https://github.com/........./dotnet-cicd-eval
      RUNNER_NAME: containerized_{{.Task.Slot}}
      ACCESS_TOKEN: ..............
      RUNNER_SCOPE: 'repo'
      # one work directory per replica
      RUNNER_WORKDIR: /tmp/runner/work_{{.Task.Slot}}
    volumes:
      - '/var/run/docker.sock:/var/run/docker.sock'
      # Even when I don't bind /tmp/runner, data still shows up in the host's folder for some reason...
      - '/tmp/runner:/tmp/runner'
    deploy:
      replicas: 3
Start the service:
# docker swarm has never been set up on this machine, so initialize it once first (see https://testdriven.io/blog/github-actions-docker/); 10.99.0.6 is this machine's external IP
docker swarm init --advertise-addr 10.99.0.6
docker stack deploy --compose-file=docker-compose.yml actions
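Since `docker swarm init` only needs to run once per machine, the two commands above can be wrapped so that re-running them is harmless. This is a minimal sketch, not part of the original setup: the function name `ensure_swarm` is made up for illustration, and it assumes that `docker info` reports `active` for a node that has already joined a swarm.

```shell
# Initialise Swarm only when this node is not already part of one.
# $1: address to advertise (10.99.0.6 in this post).
# ensure_swarm is a hypothetical helper name.
ensure_swarm() {
  state="$(docker info --format '{{.Swarm.LocalNodeState}}')"
  if [ "$state" = "active" ]; then
    echo "swarm already active, skipping init"
  else
    docker swarm init --advertise-addr "$1"
  fi
}

# Usage (left commented out so that sourcing this file has no side effects):
# ensure_swarm 10.99.0.6 && docker stack deploy --compose-file=docker-compose.yml actions
```

Defining a function instead of running the commands directly keeps the snippet safe to paste into a shell profile or a provisioning script.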
However, when I checked on GitHub, no new runner had been registered.
Time to debug. Take a look at the state of the containers:
ID             NAME                  IMAGE   NODE      DESIRED STATE   CURRENT STATE                     ERROR
q4cmzrm2apmh   actions_runner.1      ...     pi-host   Ready           Rejected 1 second ago             "network sandbox join failed: …"
pyszm0tz2h0g   \_ actions_runner.1   ...     pi-host   Shutdown        Rejected 6 seconds ago            "network sandbox join failed: …"
m43v0by08erv   actions_runner.2      ...     pi-host   Ready           Rejected less than a second ago   "network sandbox join failed: …"
noo8aod6dj77   \_ actions_runner.2   ...     pi-host   Shutdown        Rejected 5 seconds ago            "network sandbox join failed: …"
36ettm7bzh8v   actions_runner.3      ...     pi-host   Ready           Assigned less than a second ago
nwzbw52t4sf0   \_ actions_runner.3   ...     pi-host   Shutdown        Rejected 4 seconds ago            "network sandbox join failed: …"
lglxhh9dean0   \_ actions_runner.3   ...     pi-host   Shutdown        Rejected 9 seconds ago            "network sandbox join failed: …"
The rest of the "network sandbox join failed: …" message is cut off, so use journalctl -u docker --no-pager to check the log:
Sep 23 05:11:27 pi-host dockerd[1041]: time="2023-09-23T05:11:27.500258442+08:00" level=error msg="Failed creating ingress network: network sandbox join failed: subnet sandbox join failed for \"10.0.0.0/24\": error creating vxlan interface: operation not supported"
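Besides journalctl, the truncated ERROR column can usually be read in full by asking Docker not to shorten its output. A small sketch, assuming the stack is named `actions` as in this post; `show_task_errors` is a name invented here, and the format string only uses the Name, CurrentState and Error columns.

```shell
# Show every task of a stack together with its untruncated error message.
# $1: stack name ("actions" in this post).
# show_task_errors is a hypothetical helper name.
show_task_errors() {
  docker stack ps --no-trunc --format '{{.Name}}: {{.CurrentState}}: {{.Error}}' "$1"
}

# Usage:
# show_task_errors actions
```

This only shows what the Swarm scheduler recorded per task. The daemon log from journalctl can still carry extra detail, as the vxlan line here does.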
So the kernel cannot create a vxlan interface? After some searching, it turns out that the Raspberry Pi build of Ubuntu needs the extra modules package installed:
sudo apt install linux-modules-extra-raspi
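To confirm whether the running kernel can see the vxlan module before and after installing the package, a quick check along these lines may help. It is only a sketch: `check_vxlan` is a made-up name, and it assumes `modinfo` is available and that vxlan ships as a kernel module named `vxlan` on this system.

```shell
# Report whether the vxlan kernel module is known to the running kernel.
# check_vxlan is a hypothetical helper name.
check_vxlan() {
  if modinfo vxlan >/dev/null 2>&1; then
    echo "vxlan module available"
  else
    echo "vxlan module missing"
    return 1
  fi
}

# Usage:
# check_vxlan || sudo apt install linux-modules-extra-raspi
```

Note that `modinfo` looks at the modules of the currently running kernel, so after a kernel upgrade the check is only meaningful once the machine has rebooted.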
Once it is installed, shut the stack down first and then reboot:
sudo docker stack rm actions
sudo reboot
Try creating the stack again:
...........$ docker stack deploy --compose-file=docker-compose.yml actions
Creating network actions_default
Creating service actions_runner
...........$ docker stack ps actions
ID             NAME               IMAGE   NODE      DESIRED STATE   CURRENT STATE
o3aio4zl3nig   actions_runner.1   ...     pi-host   Running         Preparing 6 seconds ago
tv0tpp4j30vn   actions_runner.2   ...     pi-host   Running         Preparing 6 seconds ago
v78qtlzuuk4z   actions_runner.3   ...     pi-host   Running         Preparing 6 seconds ago
After about 10 minutes of waiting, CURRENT STATE finally changed from Preparing to Running.....
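Rather than re-running `docker stack ps` by hand during a wait like this, the replica count can be polled until it converges. A rough sketch under a few assumptions: the function name `wait_for_replicas` and the `POLL_SECONDS` variable are invented here, the service is `actions_runner` as shown above, and no other service name starts with the same prefix (the `name` filter matches by prefix).

```shell
# Poll a Swarm service until its replica count matches the expected value.
# $1: service name, $2: expected value such as "3/3", $3: max attempts (default 60).
# wait_for_replicas and POLL_SECONDS are hypothetical names.
wait_for_replicas() {
  service="$1"
  want="$2"
  tries="${3:-60}"
  while [ "$tries" -gt 0 ]; do
    have="$(docker service ls --filter "name=$service" --format '{{.Replicas}}')"
    if [ "$have" = "$want" ]; then
      return 0
    fi
    sleep "${POLL_SECONDS:-10}"
    tries=$((tries - 1))
  done
  return 1
}

# Usage:
# wait_for_replicas actions_runner 3/3 && echo "all runners are up"
```

With the defaults this gives up after roughly ten minutes, which happens to match the wait seen here. Raise the attempt count on slower machines.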
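OK, registration succeeded. Run the workflow:
While it was running, I could see two runners pick up jobs and start working, which is a bit different from how GitLab does it: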
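build
The job succeeded.
validate
It failed because the runner environment does not have .NET installed, so the application could not be executed.
I count this workflow as a success as far as the technical evaluation goes (this is the same as approach 2 from earlier, except that the runners are set up with docker swarm).
All jobs failed.
The cause is that Node.js could not be found inside that container: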
Run actions/checkout@v4
  with:
    repository: **masked**/dotnet-cicd-eval
    token: ***
    ssh-strict: true
    persist-credentials: true
    clean: true
    sparse-checkout-cone-mode: true
    fetch-depth: 1
    fetch-tags: false
    show-progress: true
    lfs: false
    submodules: false
    set-safe-directory: true
/usr/bin/docker exec 1bf5fa192e4402e63b7e4ba56ab7130ee93a59f047d8c5835ddbbe8ed9a0a439 sh -c "cat /etc/*release | grep ^ID"
OCI runtime exec failed: exec failed: unable to start container process: exec: "/__e/node20/bin/node": stat /__e/node20/bin/node: no such file or directory: unknown
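I tried adding an install-node action, but it did not help. After weighing it up I decided to leave this alone for now, since the payoff is too low for the effort.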
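The log does not say why /__e/node20/bin/node is missing. One thing that may be worth checking (my assumption, not something verified in this post) is where the job container's /__e mount comes from. Because the runner reaches the host's Docker daemon through /var/run/docker.sock, bind-mount sources are resolved on the host, not inside the runner container. A sketch for listing the mounts, with `show_mounts` being a made-up helper name:

```shell
# List where each mount of a container comes from (host path -> path in container).
# $1: container ID, for example the one shown in the failing "docker exec" line.
# show_mounts is a hypothetical helper name.
show_mounts() {
  docker inspect --format '{{range .Mounts}}{{println .Source "->" .Destination}}{{end}}' "$1"
}

# Usage (run on the host while the job container still exists):
# show_mounts 1bf5fa192e44
```

If the source path listed for /__e does not exist on the host, or is empty there, that would be consistent with the "no such file or directory" error. It would still need to be confirmed before drawing conclusions.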
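Purely as a technical evaluation, this should still count as a success, because at least we confirmed that docker in docker works, and although the job failed, it really was executed.
These four days of survey at least give me a baseline for when I later help my company adopt GitHub's CI/CD solution.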