Deploying an Express App That Uses RabbitMQ and MongoDB with Docker and Kubernetes

This project is mainly about getting familiar with RabbitMQ, MongoDB, Docker, and Kubernetes.
Below I'll share the whole journey in note form (along with some of the pitfalls I stepped into 😅).

The Big Picture

Every minute, the crawler fetches the YouBike data, splits it into per-station records, and pushes them into a message queue (RabbitMQ).

When the worker notices new items in the queue, it takes them out, processes them one by one, and stores them in a NoSQL database (MongoDB).

The client can call the Express server's API to find out how many YouBikes remain at the three stations nearest to the user.

After testing the whole flow locally, I first practiced building the environment with Docker.

Once that was OK, I practiced building the same environment with Kubernetes, or more precisely with k3s + k3d.

Implementing the Features

The dataset this project uses: YouBike臺北市公共自行車即時資訊 (Taipei City's real-time YouBike data).
The main JSON fields are documented there (you have to open 資料資源下載網址檢視資料 to see them).

Setting Up the Project

mkdir youbike-helper
cd youbike-helper
npm init             # I changed the entry point to app.js
git init
npm install express
touch .gitignore     # add node_modules/ to it
touch app.js
# paste in the example code from https://expressjs.com/en/starter/hello-world.html
nodemon app.js

Doing Things on a Schedule

can print message every minute with node-cron
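
A minimal sketch of the scheduling part, assuming the standard node-cron API (the log line is just a stand-in for the real crawl):

// '* * * * *' fires once at the start of every minute
const cron = require('node-cron');

cron.schedule('* * * * *', () => {
  // the real job fetches the YouBike data here
  console.log('running the crawler...');
});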

Fetching the Data

can fetch online api json data with node-fetch

  • npm install node-fetch@2 (I went with v2 because v3 is ESM-only, so v2 still works with require())
  • For usage, see node-fetch's Common Usage → JSON (a short sketch follows below)
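
A quick sketch of the fetch part; the dataset URL below is a hypothetical placeholder, so substitute the real download URL from the dataset page above:

const fetch = require('node-fetch');   // v2, so it works with require()

// hypothetical placeholder: use the real download URL from the dataset page
const DATA_URL = 'https://<youbike-dataset-download-url>';

(async () => {
  const res = await fetch(DATA_URL);
  const data = await res.json();   // the JSON documented on the dataset page
  console.log(data);
})();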

Message Queue

can publish & consume object with amqplib

  • Install RabbitMQ: brew install rabbitmq
  • Start it: rabbitmq-server
  • npm install amqplib (a publish/consume sketch follows below)
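
Roughly what the publish and consume sides look like with amqplib's promise API; the queue name here is a made-up placeholder, and AMQP_URI comes from my .env (shown later):

const amqp = require('amqplib');

const QUEUE = 'stations';   // hypothetical queue name

// crawler side: push one station record into the queue
async function publish(station) {
  const conn = await amqp.connect(process.env.AMQP_URI);
  const channel = await conn.createChannel();
  await channel.assertQueue(QUEUE);
  // objects must be serialized into a Buffer before sending
  channel.sendToQueue(QUEUE, Buffer.from(JSON.stringify(station)));
}

// worker side: take records out and process them one by one
async function consume() {
  const conn = await amqp.connect(process.env.AMQP_URI);
  const channel = await conn.createChannel();
  await channel.assertQueue(QUEUE);
  channel.consume(QUEUE, (msg) => {
    const info = JSON.parse(msg.content.toString());
    console.log(info);   // store into MongoDB here (next section)
    channel.ack(msg);
  });
}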

Integrating MongoDB

can insert data we want into MongoDB with mongoose

mkdir -p ~/mongos/db7
mongod --port 2777 --dbpath ~/mongos/db7
# mongodb://localhost:2777
  • npm install dotenv --save, then require('dotenv').config() at the top
  • The field with type Date was adapted from this reference

Code

  • model.js: defines the schema
  • Modified worker.js so it can insert new documents
const { Info } = require('./model')

// `info` is a single station record consumed from the queue
const newInfo = {
  station: info['sna'],
  version: 'v1',
  available: info['sbi'],
  location: {
    type: 'Point',
    coordinates: [ Number(info['lat']).toFixed(4), Number(info['lng']).toFixed(4) ]
  },
  // Transform '20220523235035' to '2022-05-23T23:50:35'
  datatime: new Date(`${info['mday'].slice(0,4)}-${info['mday'].slice(4,6)}-${info['mday'].slice(6,8)}T${info['mday'].slice(8,10)}:${info['mday'].slice(10,12)}:${info['mday'].slice(12,14)}`)
};

const theInfo = await Info.create(newInfo);
console.log(theInfo);
console.log('==============================');

Showing the Nearest Stations

can print nearest stations regarding a point with aggregate and $geoNear

  • 🎉 This mostly follows the "Or using aggregate instead" part of this answer 🎉
Info.aggregate(
  [
    {
      '$geoNear': {
        'near': {
          'type': 'Point',
          // TODO: change to the data that users gave us
          'coordinates': [ 121.5312, 25.0299 ]
        },
        'spherical': true,
        'distanceField': 'distance'
      },
    },
    { '$skip': 0 },
    { '$limit': 3 }
  ],
  (err, nearestInfos) => {
    if (err) throw err;
    // console.log(nearestInfos);
    for (const info of nearestInfos) {
      console.log(info['station']);
      console.log(info['available']);
      console.log('==============================');
    }
  }
);
  • The new model.js, updated so that $geoNear works (adapted from this)
const mongoose = require('mongoose');
const { Schema } = mongoose;
require('dotenv').config();

const URI = process.env.MONGODB_URI;
mongoose.connect(`${URI}/demo`);
const db = mongoose.connection;
db.on('error', console.error.bind(console, 'MongoDB connection error:'));

const infoSchema = new Schema({
  station: {
    type: String,
    required: true,
  },
  version: {
    type: String,
    required: true,
  },
  available: {
    type: Number,
    required: true,
  },
  // vvvvv added for $geoNear vvvvv
  location: {
    type: {
      type: String,
      enum: ['Point'],
      default: 'Point',
    },
    coordinates: {
      type: [Number],
    }
  },
  // ^^^^^ added for $geoNear ^^^^^
  datatime: {
    type: Date,
    required: true,
  },
}, {
  strict: 'throw'
});

// vvvvv added for $geoNear vvvvv
infoSchema.index({ location: '2dsphere' });
// ^^^^^ added for $geoNear ^^^^^
const Info = mongoose.model('Info', infoSchema);

module.exports = {
  Info
};
  • The longitude/latitude order in worker.js also has to be swapped, or the query throws an error! (mentioned in this post; a sketch of the full /search flow follows after this snippet)

coordinates: [
  Number(Number(info['lng']).toFixed(4)),
  Number(Number(info['lat']).toFixed(4))
]
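
Putting the pieces together, here is my sketch of what the /search endpoint can look like. The path matches the SEARCH_API_URL in my .env, and the body fields match the axios call shown near the end; it's a reconstruction under those assumptions, not the project's exact code:

const express = require('express');
const { Info } = require('./model');

const app = express();
app.use(express.json());   // parse the JSON body sent from the browser

app.post('/search', async (req, res) => {
  const { longitude, latitude } = req.body;
  const nearestInfos = await Info.aggregate([
    {
      // $geoNear must be the first stage, and coordinates are [lng, lat]
      '$geoNear': {
        'near': { 'type': 'Point', 'coordinates': [ Number(longitude), Number(latitude) ] },
        'spherical': true,
        'distanceField': 'distance'
      }
    },
    { '$limit': 3 }
  ]);
  res.json(nearestInfos);
});

app.listen(3000);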

Actually Showing It to Users

can show to users with ejs and geolocation
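
The browser side is roughly this (a sketch distilled from the index.ejs snippet further down; rendering the result into the page is left out):

// runs in the browser, inside a <script> in index.ejs
navigator.geolocation.getCurrentPosition(async (position) => {
  const response = await axios.post('http://localhost:3000/search', {
    longitude: position.coords.longitude,
    latitude: position.coords.latitude
  });
  console.log(response.data);   // the three nearest stations
});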

Switching to Upsert Instead of Endless Inserts

change to upsert to prevent infinite growth

// const theInfo = await Info.create(newInfo);
// upsert: update the station's document if it exists, insert it otherwise
// (note: findOneAndUpdate returns the pre-update document unless { new: true } is passed)
const theInfo = await Info.findOneAndUpdate({
  station: info['sna'],
  version: 'v1',
}, newInfo, { upsert: true });

Environment Variables

By the way, my .env looks like this:

MONGODB_URI='mongodb://localhost:2777'
AMQP_URI='amqp://localhost'
SEARCH_API_URL='http://localhost:3000/search'

Results

Testing locally:

rabbitmq-server                             # start RabbitMQ
mongod --port 2777 --dbpath ~/mongos/db7    # start MongoDB
nodemon app.js                              # start the Express server

Nice!

Docker

Starting Multiple Docker Containers

can do the same job on docker containers with docker-compose up

  1. First get the Docker daemon running
  2. Write the backend's Dockerfile: touch Dockerfile
FROM node:lts-alpine
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY . /usr/src/app

RUN npm install -g nodemon

EXPOSE 3000
CMD [ "npm", "start" ]
  3. docker-compose.yml: composes the containers into one complete service

I also referred to "How to Set Up RabbitMQ With Docker Compose" and "Docker Compose wait for container X before starting Y".

version: "3.9"

services:
  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: 'rabbitmq'
    ports:
      - 5672:5672
      - 15672:15672
    volumes:
      - ~/.docker-conf/rabbitmq/data/:/var/lib/rabbitmq/
      - ~/.docker-conf/rabbitmq/log/:/var/log/rabbitmq
    healthcheck:
      test: rabbitmq-diagnostics -q ping
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 3s
  mongo:
    image: mongo
    ports:
      - "27017:27017"
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongo localhost:27017/test --quiet
      interval: 2s
      timeout: 3s
      retries: 3
      start_period: 3s
  backend:
    build:
      context: .
    ports:
      - "3000:3000"
    environment:
      - MONGODB_URI=mongodb://mongo:27017
      - AMQP_URI=amqp://guest:guest@rabbitmq:5672
    links:
      - mongo
      - rabbitmq
    depends_on:
      mongo:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
  4. Run it: docker-compose up (if you change the backend code, do docker-compose down and then docker-compose up --build first, otherwise your changes won't be picked up; this one cost me a ton of debugging time 😂)

Kubernetes

deploy to K3s with k3d

Almost all of this follows "Kubernetes Web App Deployment with k3d". That article is excellent!

Creating the Cluster

  • Add /mnt/data under Docker → Preferences… → Resources → File Sharing
sudo mkdir -p /mnt/data
k3d cluster create youbike-helper --servers 1 --agents 2 --port "30000-30005:30000-30005@server:0" --volume /mnt/data:/mnt/data

kubectl config get-contexts   # check the newly created cluster
kubectl cluster-info          # print the control-plane endpoint to confirm the cluster is up

Building the Backend's Docker Image

  • From the reference: build the Docker images corresponding to the application components and push them to Docker Hub, so they become available to our deployment process.
  • The docker build command reads the instructions in the Dockerfile in order and builds the image automatically.
  • Sign up for Docker Hub if you haven't already~
docker build -t youbike-helper-backend .

# docker tag youbike-helper-backend <your_docker_hub_username>/youbike-helper-backend
docker tag youbike-helper-backend kanido386/youbike-helper-backend

# docker push <your_docker_hub_username>/youbike-helper-backend
docker push kanido386/youbike-helper-backend

Setting Up the Database

Persistent Volume

  • From the reference: a pod is ephemeral, so its data is lost if the pod is destroyed. In this example, we are going to use a persistent volume (pv) to hold the application data so it persists beyond the lifetime of the pod.

pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: youbike-helper-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"

On the command line:

kubectl apply -f pv.yaml   # create
kubectl get pv # check

Persistent Volume Claim

  • From the reference: persistent volume claims (pvc) are used by pods to request physical storage.

db-pv-claim.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: youbike-helper-db-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

On the command line:

kubectl apply -f db-pv-claim.yaml   # create
kubectl get pvc

What if you want to remove the PV and PVC? (kubectl delete -f db-pv-claim.yaml first, then kubectl delete -f pv.yaml)

Deployment

  • From the reference:
    • A deployment defines the desired state for a component in our application.
    • In this case, we are declaring the desired state of our database and Kubernetes takes care of sticking to this definition.

db-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: youbike-helper-db-deployment
spec:
  selector:
    matchLabels:
      app: youbike-helper
      tier: db
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  # refers to the pod
  template:
    metadata:
      labels:
        app: youbike-helper
        tier: db
    spec:
      containers:
        - name: youbike-helper-db
          image: mongo
          ports:
            - containerPort: 27017
          volumeMounts:
            # if the mountPath differs from the PersistentVolume's path,
            # the pod gets stuck in CrashLoopBackOff...
            - mountPath: /mnt/data
              name: youbike-helper-db-volume
      volumes:
        - name: youbike-helper-db-volume
          persistentVolumeClaim:
            claimName: youbike-helper-db-pv-claim
  • From the reference: What is a rolling update strategy?
    • The rolling update strategy is a gradual process that allows you to update your Kubernetes system with only a minor effect on performance and no downtime.
    • In this strategy, the Deployment selects a Pod with the old programming, deactivates it, and creates an updated Pod to replace it.

On the command line:

kubectl apply -f db-deployment.yaml
kubectl get deployments
kubectl get pods -o wide # check the state of the pod

Service

  • From the reference:
    • We need a service to make the database available to the backend.
    • This makes the database pod accessible only from within the cluster, and the connection will target port 27017 in the pod.

db-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: youbike-helper-db-service
spec:
  selector:
    app: youbike-helper
    tier: db
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017

On the command line:

kubectl apply -f db-service.yaml
kubectl get services

Confused by all the different ports? In short: port is the port the Service itself listens on inside the cluster, targetPort is the container port the traffic is forwarded to, and nodePort (in the 30000-32767 range) is the port opened on every node so the service can be reached from outside.

(Screenshot from "Kubernetes Services explained | ClusterIP vs NodePort vs LoadBalancer vs Headless Service")

Setting Up RabbitMQ

A Deployment and a Service can live in the same file:

rabbitmq.yaml

## Service
apiVersion: v1
kind: Service
metadata:
  name: my-rabbitmq-service
  labels:
    app: rabbitmq
spec:
  selector:
    app: rabbitmq
  type: NodePort
  ports:
    - name: rabbitmq
      port: 5672
      targetPort: 5672
      nodePort: 30001
    - name: rabbitmq-management
      port: 15672
      targetPort: 15672
      nodePort: 30002
---
## Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-rabbitmq-deployment
  labels:
    app: rabbitmq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
        - name: rabbitmq
          image: rabbitmq:3.9.18-management-alpine
          ports:
            - containerPort: 5672
            - containerPort: 15672
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 3
            tcpSocket:
              port: 5672
          readinessProbe:
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 3
            tcpSocket:
              port: 5672

On the command line:

kubectl apply -f rabbitmq.yaml   # create
kubectl get deployment           # check the deployment
kubectl get service              # check the service
# kubectl get <singular or plural, both seem to work>

Handling the Backend

Environment Variables

  • From the reference, there are 3 ways to provide environment variables when creating the deployment for the backend:
    • defining a variable value directly in the deployment definition
    • getting a variable's value from a separate configuration object called a ConfigMap
    • getting a variable's value from a separate configuration object called a Secret

backend-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-config
data:
  MONGODB_URI: youbike-helper-db-service
  AMQP_URI: my-rabbitmq-service
  • From the reference:
    • Please note, we provide the name of the service created before.
    • That's because Kubernetes handles the name resolution, so the requests will reach the database pod without problems.

On the command line:

kubectl apply -f backend-config.yaml
kubectl get configmap
kubectl describe configmap backend-config

(This project doesn't need one, so I didn't set up a Secret; see the reference article under "Next, let's create a Secret." for how.)

Deployment

backend-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: youbike-helper-backend-deployment
spec:
  selector:
    matchLabels:
      app: youbike-helper
      tier: backend
  replicas: 3
  # refers to the pod
  template:
    metadata:
      labels:
        app: youbike-helper
        tier: backend
    spec:
      containers:
        - name: youbike-helper-backend
          # image: <your_docker_hub_username>/youbike-helper-backend:latest
          image: kanido386/youbike-helper-backend:latest
          env:
            # - name: NODE_ENV
            #   value: "production"
            - name: MONGODB
              valueFrom:
                configMapKeyRef:
                  name: backend-config
                  key: MONGODB_URI
            # I only learned later that it has to be written this way,
            # so the variable names are a bit messy
            - name: MONGODB_URI
              value: mongodb://$(MONGODB)
            - name: AMQP
              valueFrom:
                configMapKeyRef:
                  name: backend-config
                  key: AMQP_URI
            - name: AMQP_URI
              value: amqp://$(AMQP)
            # - name: SECRET
            #   valueFrom:
            #     secretKeyRef:
            #       name: backend-secrets
            #       key: backend_secret
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1000m"
              memory: "1024Mi"
          ports:
            - containerPort: 3000

On the command line:

kubectl apply -f backend-deployment.yaml
kubectl get deployments
# if kubectl get pods shows STATUS as ImagePullBackOff, don't worry!
# it turned into Running after a while (at least for me)

kubectl exec -it [pod] -- /bin/ash   # not a typo, it really is ash 😂

Service

  • From the reference:
    • If we needed to connect to the backend pods directly, how could we find out and keep track of the IP addresses to connect to?
    • Thanks to this abstraction, we don't need to worry about any of this.
    • The service "knows" where to find all the pods that match its selector, and any request that reaches the service will reach one of those pods.
    • Even if a pod dies or even a node crashes, Kubernetes will make sure to keep the desired number of pods, and the service will allow us to reach the required pods.

backend-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: youbike-helper-backend-service
spec:
  selector:
    app: youbike-helper
    tier: backend
  type: NodePort
  ports:
    - name: youbike-helper-backend
      port: 3000
      targetPort: 3000
      nodePort: 30000

On the command line:

kubectl apply -f backend-service.yaml
kubectl get services

Trying Out Ingress

  • From the reference: we can access the application from outside and make sure the requests will reach the correct destination.

ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: youbike-helper-ingress
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: youbike-helper-backend-service
                port:
                  number: 3000
          # - path: /
          #   pathType: Prefix
          #   backend:   # this still has to say "backend"
          #     service:
          #       name: youbike-helper-frontend-service
          #       port:
          #         number: 80
  • From the reference: the ingress controller will start acting as an HTTP reverse proxy.

On the command line:

kubectl apply -f ingress.yaml
kubectl get ingress

(In the end I didn't connect through the Ingress but through the NodePort 😂)
(Connecting via the NodePort like that is actually pretty insecure!)

Testing It Out

  • From the reference: you can play with the application and then check the logs (handy for debugging too!):
kubectl get pods

# get the name of a pod and:
kubectl logs [pod]

http://localhost:30000 loaded the page successfully 🎉
But when I clicked「查找附近的 YouBike 站點!」("find nearby YouBike stations"), nothing happened…
Ah, I had forgotten to update the URL used to call the backend API…
(http://localhost:3000/search should become http://localhost:30000/search, though there's probably a nicer fix 🤔)

Fixing the Problem

Changing the code means going through these steps:

  1. Turn that URL into an environment variable

.env

For local testing (pretending I don't have k3d set up), add:

SEARCH_API_URL='http://localhost:3000/search'

app.js

The HTML apparently can't read environment variables directly, so the backend has to pass them in:

app.get('/', (req, res) => {
  res.render('index', {
    SEARCH_API_URL: process.env.SEARCH_API_URL
  });
});

index.ejs

Following "Accessing EJS variable in Javascript logic", add this inside the <script>:

const response = await axios.post('<%- SEARCH_API_URL %>', {
  longitude: position.coords.longitude,
  latitude: position.coords.latitude
});
  2. Rebuild the Docker image and push it
docker build -t youbike-helper-backend .

# docker tag youbike-helper-backend <your_docker_hub_username>/youbike-helper-backend
docker tag youbike-helper-backend kanido386/youbike-helper-backend

# docker push <your_docker_hub_username>/youbike-helper-backend
docker push kanido386/youbike-helper-backend
  3. Update backend-config.yaml and backend-deployment.yaml
# add to backend-config.yaml:
SEARCH_API_URL: 'http://localhost:30000/search'   # feels like there should be a better way

# add to backend-deployment.yaml:
- name: SEARCH_API_URL
  valueFrom:
    configMapKeyRef:
      name: backend-config
      key: SEARCH_API_URL
  4. Re-apply both of them
kubectl delete -f backend-config.yaml && kubectl apply -f backend-config.yaml
kubectl delete -f backend-deployment.yaml && kubectl apply -f backend-deployment.yaml
  5. Open http://localhost:30000/ again, click「查找附近的 YouBike 站點!」, and it works 😎

Kubernetes Dashboard

The reference article has a walkthrough for the Kubernetes Dashboard, so I won't repeat it here~


I hope you got something out of this article, and see you in the next post 😃