這個 Project 主要是用來熟悉 RabbitMQ、MongoDB、Docker 及 Kubernetes。
綜觀全貌 
Crawler 每分鐘固定抓 YouBike 的資料,切分成一個一個站點資訊以後,丟進 message queue (RabbitMQ) 裡。
Worker 得知 queue 裡有新東西時,就將它們拿出來一一處理,並存進一個 NoSQL database (MongoDB) 裡。
Client 端能透過呼叫 Express Server 的 API,抓取「離使用者最近的三個站點還剩下多少台 YouBike」的資訊。
整個流程在本地端測試完以後,先練習用 Docker 去建置整個環境。
OK 以後,再練習用 Kubernetes,或者更精確來說是 k3s + k3d,來建整個環境。
功能實作 此 Project 需要用到的資料集:YouBike臺北市公共自行車即時資訊JSON 、主要欄位說明 (要點開 資料資源下載網址 → 檢視資料 才看得到)
建 Project 1 2 3 4 5 6 7 8 9 mkdir youbike-helper cd  youbike-helpernpm init                 git init npm install express touch .gitignore         touch app.js nodemon app.js 
定期做事情 
can print message every minute with node-cron 
 
撈資料 
can fetch online api json data with node-fetch 
 
npm install node-fetch@2(選用 v2 的原因 )使用方法可見 node-fetch  的 Common Usage → JSON 
 
Message Queue 
can publish & consume object with amqplib 
 
安裝 RabbitMQ:brew install rabbitmq 
啟動:rabbitmq-server 
npm install amqplib 
整合 MongoDB 
can insert data we want into MongoDB with mongoose 
 
1 2 3 mkdir -p ~/mongos/db7 mongod --port 2777 --dbpath ~/mongos/db7 
npm install dotenv --save → require('dotenv').config()type 為 Date 的部分參考自這個  
 
Code 
model.js:用來定 schema修改 worker.js 讓它能新增 document 
 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 const  { Info } = require ('./model' )const  newInfo = {  station : info['sna' ],   version : 'v1' ,   available : info['sbi' ],   location : {     type : 'Point' ,     coordinates : [ Number (info['lat' ]).toFixed(4 ), Number (info['lng' ]).toFixed(4 ) ]   },      datatime : new  Date (`${info['mday' ].slice(0 ,4 )} -${info['mday' ].slice(4 ,6 )} -${info['mday' ].slice(6 ,8 )} T${info['mday' ].slice(8 ,10 )} :${info['mday' ].slice(10 ,12 )} :${info['mday' ].slice(12 ,14 )} ` ) }; const  theInfo = await  Info.create(newInfo);console .log(theInfo);console .log('==============================' );
秀出最接近的幾個站點 
can print nearest stations regarding a point with aggregate and $geoNear 
 
🎉 主要是參考這篇 「Or using aggregate instead」做到的 🎉 
 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 Info.aggregate(   [     {       '$geoNear' : {         'near' : {           'type' : 'Point' ,                      'coordinates' : [ 121.5312 , 25.0299  ]         },         'spherical' : true ,         'distanceField' : 'distance'        },     },     { '$skip' : 0  },     { '$limit' : 3  }   ],   (err, nearestInfos ) =>  {     if  (err) throw  err;          for  (const  info of  nearestInfos) {       console .log(info['station' ]);       console .log(info['available' ]);       console .log('==============================' );     }   } ); 
新版 model.js,為了能做 $geoNear(參考這篇 修改的) 
 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 const  mongoose = require ('mongoose' );const  { Schema } = mongoose;require ('dotenv' ).config();const  URI = process.env.MONGODB_URI;mongoose.connect(`${URI} /demo` ); const  db = mongoose.connection;db.on('error' , console .error.bind(console , 'MongoDB connection error:' )); const  infoSchema = new  Schema({  station : {     type : String ,     required : true ,   },   version : {     type : String ,     required : true ,   },   available : {     type : Number ,     required : true ,   },      location : {     type : {       type : String ,       enum : ['Point' ],       default : 'Point' ,     },     coordinates : {       type : [Number ],     }   },      datatime : {     type : Date ,     required : true ,   }, }, {   strict : 'throw'  }); infoSchema.index({ location : '2dsphere'  }); const  Info = mongoose.model('Info' , infoSchema);module .exports = {  Info }; 
worker.js 經緯度次序也要對調,不然會噴錯!(這篇 有提到) 
1 2 3 4 coordinates: [   Number (Number (info['lng' ]).toFixed(4 )),   Number (Number (info['lat' ]).toFixed(4 )) ] 
實際秀出來給使用者看 
can show to users with ejs and geolocation 
 
改成 upsert 而不是一直 insert 
change to upsert to prevent infinite growth 
 
1 2 3 4 5 const  theInfo = await  Info.findOneAndUpdate({  station : info['sna' ],   version : 'v1' , }, newInfo, { upsert : true  }); 
環境變數 對了,我的 .env 長這樣:
1 2 3 MONGODB_URI='mongodb://localhost:2777' AMQP_URI='amqp://localhost' SEARCH_API_URL='http://localhost:3000/search' 
成果 在本地端測試:
1 2 3 rabbitmq-server                             mongod --port 2777 --dbpath ~/mongos/db7    nodemon app.js                              
Nice!
Docker 啟動多個 Docker container 
can do the same job on docker containers with docker-compose up 
 
先將 docker daemon 跑起來 
寫 backend 的 Dockerfile:touch Dockerfile 
 
1 2 3 4 5 6 7 8 9 FROM  node:lts-alpineRUN  mkdir -p /usr/src/app WORKDIR  /usr/src/app COPY  . /usr/src/app RUN  npm install -g nodemon EXPOSE  3000 CMD  [ "npm" , "start"  ] 
docker-compose.yml:用來組合多個 container 成為一個完整的服務 
也參考了「How to Set Up RabbitMQ With Docker Compose 」及「Docker Compose wait for container X before starting Y 」
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 version:  "3.9" services:   rabbitmq:      image:  rabbitmq:3-management-alpine      container_name:  'rabbitmq'      ports:        -  5672 :5672        -  15672 :15672      volumes:        -  ~/.docker-conf/rabbitmq/data/:/var/lib/rabbitmq/        -  ~/.docker-conf/rabbitmq/log/:/var/log/rabbitmq      healthcheck:        test:  rabbitmq-diagnostics  -q  ping        interval:  10s        timeout:  5s        retries:  5        start_period:  3s    mongo:      image:  mongo      ports:        -  "27017:27017"      healthcheck:        test:  echo  'db.runCommand("ping").ok'  |  mongo  localhost:27017/test  --quiet        interval:  2s        timeout:  3s        retries:  3        start_period:  3s    backend:      build:        context:  .      ports:        -  "3000:3000"      environment:        -  MONGODB_URI=mongodb://mongo:27017        -  AMQP_URI=amqp://guest:guest@rabbitmq:5672      links:        -  mongo        -  rabbitmq      depends_on:        mongo:          condition:  service_healthy        rabbitmq:          condition:  service_healthy  
跑起來:docker-compose up(後端有改 code 的話,先 docker-compose down 再 docker-compose up --build,不然 code 不會更新到,這個地方花了我超多時間在 debug 😂) 
 
Kubernetes 
deploy to K3s with k3d 
 
幾乎都參考自「Kubernetes Web App Deployment with k3d 」,那篇實在是太優質啦!
建立 Cluster 
將 /mnt/data 加進 Docker → Preferences… → Resources → File Sharing 
 
1 2 3 4 5 sudo mkdir -p /mnt/data k3d cluster create youbike-helper --servers 1 --agents 2 --port "30000-30005:30000-30005@server:0"  --volume /mnt/data:/mnt/data kubectl config get-contexts     kubectl cluster-info            
建後端的 Docker Image 
參考文 :Build the Docker images corresponding to the application components and pushing them to Docker Hub, so they become available to our deployment process.透過 docker build 指令可依序讀取 Dockerfile 中的指令以自動建立 image。 
Docker Hub  註冊一下吧~ 
1 2 3 4 5 6 7 docker build -t youbike-helper-backend . docker tag youbike-helper-backend kanido386/youbike-helper-backend docker push kanido386/youbike-helper-backend 
建置 Database Persistent Volume 
參考文 :A pod is ephemeral (短暫存在的), so its data is lost if the pod is destroyed. In this example, we are going to use a persistent volume  (pv) to hold the application data so it persists beyond the lifetime of the pod. 
pv.yaml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 apiVersion:  v1 kind:  PersistentVolume metadata:   name:  youbike-helper-pv    labels:      type:  local  spec:   storageClassName:  manual    capacity:      storage:  5Gi    accessModes:      -  ReadWriteOnce    hostPath:      path:  "/mnt/data"  
去 command line 跑一下:
1 2 kubectl apply -f pv.yaml    kubectl get pv              
Persistent Volume Claim 
參考文 :persistent volume claim  (pvc) are used by pods to request physical storage. 
db-pv-claim.yaml
1 2 3 4 5 6 7 8 9 10 11 apiVersion:  v1 kind:  PersistentVolumeClaim metadata:   name:  youbike-helper-db-pv-claim  spec:   storageClassName:  manual    accessModes:      -  ReadWriteOnce    resources:      requests:        storage:  3Gi  
去 command line 跑一下:
1 2 kubectl apply -f db-pv-claim.yaml    kubectl get pvc 
想移除 PV 和 PVC 怎麼辦? 
Deployment 
參考文 :
A deployment defines the desired state for a component in our application. 
In this case, we are declaring the desired state of our database and Kubernetes takes care of sticking to this definition. 
 
 
db-deployment.yaml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 apiVersion:  apps/v1 kind:  Deployment metadata:   name:  youbike-helper-db-deployment  spec:   selector:      matchLabels:        app:  youbike-helper        tier:  db    replicas:  1    strategy:      rollingUpdate:        maxSurge:  1        maxUnavailable:  1      type:  RollingUpdate       template:      metadata:        labels:          app:  youbike-helper          tier:  db      spec:        containers:        -  name:  youbike-helper-db          image:  mongo          ports:          -  containerPort:  27017          volumeMounts:  				 				         -  mountPath:  /mnt/data            name:  youbike-helper-db-volume        volumes:        -  name:  youbike-helper-db-volume          persistentVolumeClaim:            claimName:  youbike-helper-db-pv-claim  
參考文 :What is a rolling update strategy?
The rolling update strategy is a gradual process that allows you to update your Kubernetes system with only a minor effect on performance and no downtime . 
In this strategy, the Deployment selects a Pod with the old programming, deactivates it, and creates an updated Pod to replace it. 
 
 
去 command line 跑一下:
1 2 3 kubectl apply -f db-deployment.yaml kubectl get deployments kubectl get pods -o wide    
Service 
參考文 :
We need a service to make the database available to the backend. 
This makes the database pod accessible only from within the cluster and the connection will target the port 27017 in the pod. 
 
 
db-service.yaml
1 2 3 4 5 6 7 8 9 10 11 12 apiVersion:  v1 kind:  Service metadata:   name:  youbike-helper-db-service  spec:   selector:      app:  youbike-helper      tier:  db    ports:      -  protocol:  TCP        port:  27017        targetPort:  27017  
去 command line 跑一下:
1 2 kubectl apply -f db-service.yaml kubectl get services 
各種 port 傻傻分不清楚? 
(截圖自「Kubernetes Services explained | ClusterIP vs NodePort vs LoadBalancer vs Headless Service 」)
弄 RabbitMQ Deployment 和 Service 可寫在一起 rabbitmq.yaml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 apiVersion:  v1 kind:  Service metadata:   name:  my-rabbitmq-service    labels:      app:  rabbitmq  spec:   selector:      app:  rabbitmq    type:  NodePort    ports:      -  name:  rabbitmq        port:  5672        targetPort:  5672        nodePort:  30001      -  name:  rabbitmq-management        port:  15672        targetPort:  15672        nodePort:  30002  --- apiVersion:  apps/v1 kind:  Deployment metadata:   name:  my-rabbitmq-deployment    labels:      app:  rabbitmq  spec:   replicas:  1    selector:      matchLabels:        app:  rabbitmq    template:      metadata:        labels:          app:  rabbitmq      spec:        containers:          -  name:  rabbitmq            image:  rabbitmq:3.9.18-management-alpine            ports:              -  containerPort:  5672              -  containerPort:  15672            resources:              limits:                cpu:  500m                memory:  512Mi              requests:                cpu:  500m                memory:  512Mi            livenessProbe:              initialDelaySeconds:  30              periodSeconds:  10              timeoutSeconds:  5              successThreshold:  1              failureThreshold:  3              tcpSocket:                port:  5672            readinessProbe:              initialDelaySeconds:  10              periodSeconds:  10              timeoutSeconds:  5              successThreshold:  1              failureThreshold:  3              tcpSocket:                port:  5672  
去 command line 跑一下:
1 2 3 4 kubectl apply -f rabbitmq.yaml    kubectl get deployment            kubectl get service               
處理 Backend 環境變數 
參考文 :3 ways to provide some environment variables when creating the deployment for the backend
defining a variable value directly in the deployment definition 
get a variable’s value from a separate configuration object called ConfigMap  
get a variable’s value from a separate configuration object called Secret  
 
 
backend-config.yaml
1 2 3 4 5 6 7 apiVersion:  v1 kind:  ConfigMap metadata:   name:  backend-config  data:   MONGODB_URI:  youbike-helper-db-service    AMQP_URI:  my-rabbitmq-service  
參考文 :
Please note, we provide the name of the service created before. 
That’s because Kubernetes handles the name resolution so the requests will reach the database pod without problems. 
 
 
去 command line 跑一下:
1 2 3 kubectl apply -f backend-config.yaml kubectl get configmap kubectl describe configmap backend-config 
(因為 project 不需要,所以就沒設定 secret 了,設定方式詳見參考文「Next, let’s create a Secret.」)
Deployment backend-deployment.yaml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 apiVersion:  apps/v1 kind:  Deployment metadata:   name:  youbike-helper-backend-deployment  spec:   selector:      matchLabels:        app:  youbike-helper        tier:  backend    replicas:  3       template:      metadata:        labels:          app:  youbike-helper          tier:  backend      spec:        containers:          -  name:  youbike-helper-backend                       image:  kanido386/youbike-helper-backend:latest            env:                                        -  name:  MONGODB                valueFrom:                  configMapKeyRef:                    name:  backend-config                    key:  MONGODB_URI                           -  name:  MONGODB_URI                value:  mongodb://$(MONGODB)              -  name:  AMQP                valueFrom:                  configMapKeyRef:                    name:  backend-config                    key:  AMQP_URI              -  name:  AMQP_URI                value:  amqp://$(AMQP)                                                                             resources:              requests:                cpu:  "500m"                memory:  "512Mi"              limits:                cpu:  "1000m"                memory:  "1024Mi"            ports:              -  containerPort:  3000  
去 command line 跑一下:
1 2 3 4 5 6 kubectl apply -f backend-deployment.yaml kubectl get deployments kubectl exec  -it [pod] /bin/ash    
Service 
參考文 :
If we need to connect to the backend pods directly, how could we find out and keep track of the IP addresses to connect to? 
Thanks to this abstraction, we don’t need to worry about any of this. 
The service “knows” where to find all the pods that matches its selector and any request that reaches the service will reach one of those pods. 
Even if a pod dies or even a node crashes, Kubernetes will make sure to keep the desired number of pods and the service will allow us to reach the required pods. 
 
 
backend-service.yaml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 apiVersion:  v1 kind:  Service metadata:   name:  youbike-helper-backend-service  spec:   selector:      app:  youbike-helper      tier:  backend    type:  NodePort    ports:      -  name:  youbike-helper-backend        port:  3000        targetPort:  3000        nodePort:  30000  
去 command line 跑一下:
1 2 kubectl apply -f backend-service.yaml kubectl get services 
試試 Ingress 
參考文 :We can access the application from outside and make sure the requests will reach the correct destination. 
ingress.yaml
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 apiVersion:  networking.k8s.io/v1 kind:  Ingress metadata:   name:  youbike-helper-ingress    annotations:      ingress.kubernetes.io/ssl-redirect:  "false"  spec:   rules:    -  http:        paths:        -  path:  /          pathType:  Prefix          backend:            service:              name:  youbike-helper-backend-service              port:                number:  3000                                                   
參考文 :The ingress controller will start acting as a HTTP reverse proxy. 
去 command line 跑一下:
1 2 kubectl apply -f ingress.yaml kubectl get ingress 
(但後來我沒靠 Ingress 連,而是靠 NodePort 😂)
進行測試 
參考文 :You can play with the application and then check the logs:(也可以用這個 debug!) 
1 2 3 4 kubectl get pods kubectl logs [pod] 
用 http://localhost:30000 成功連進頁面 🎉http://localhost:3000/search 改成 http://localhost:30000/search,當然或許有更好的改法 🤔)
解掉問題 要改 code 的話,會有以下程序要做:
那部分改為環境變數 
 
.env
for 本地端測試用 (假裝自己沒架 k3d),得加上
1 SEARCH_API_URL='http://localhost:3000/search' 
app.js
html 裡面好像不能吃到環境變數,得靠後端傳入
1 2 3 4 5 app.get('/' , (req, res ) =>  {   res.render('index' , {     SEARCH_API_URL : process.env.SEARCH_API_URL   }); }); 
index.ejs
參考自「Accessing EJS variable in Javascript logic 」,在 <script> 裡面加上:
1 2 3 4 const  response = await  axios.post('<%- SEARCH_API_URL %>' , {  longitude : position.coords.longitude,   latitude : position.coords.latitude }); 
更新 docker image 並 push 上去 
 
1 2 3 4 5 6 7 docker build -t youbike-helper-backend . docker tag youbike-helper-backend kanido386/youbike-helper-backend docker push kanido386/youbike-helper-backend 
修改 backend-config.yaml 及 backend-deployment.yaml 
 
1 2 3 4 5 6 7 8 9 SEARCH_API_URL:  'http://localhost:30000/search'    -  name:  SEARCH_API_URL   valueFrom:      configMapKeyRef:        name:  backend-config        key:  SEARCH_API_URL  
重新 apply 上面那兩個 
 
1 2 kubectl delete -f backend-config.yaml && kubectl apply -f backend-config.yaml kubectl delete -f backend-deployment.yaml && kubectl apply -f backend-deployment.yaml 
再跑跑看 http://localhost:30000/ 然後點「查找附近的 YouBike 站點!」,成功啦 😎 
 
Kubernetes Dashboard 參考文有放「Kubernetes Dashboard 
希望讀完這篇文章的您能夠有所收穫,我們下篇文見啦 😃