I'm Takahiro Kujirai (@opensourcetech), LinuC Evangelist and Open Source Summit Japan volunteer leader.
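Introduction
In this post I try out LivenessProbe, ReadinessProbe, and StartupProbe in Kubernetes.
Each one has the following role.
- LivenessProbe: checks whether the container in the Pod is still responding as expected; if it fails, the kubelet restarts the container (subject to restartPolicy)
- ReadinessProbe: checks whether the container can serve requests at the service level; while it fails, the Pod is taken out of the Service endpoints
- StartupProbe: checks that the container has finished its initial startup, before the two probes above begin
The official documentation is here.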
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
When all three are configured, they run in this order:
the StartupProbe succeeds first, then the LivenessProbe and ReadinessProbe start.
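The ordering above can be sketched as a tiny model. This is only an illustration, not kubelet code: the `ProbeGate` class and its field names are hypothetical, and it assumes all three probes are configured on the container.

```python
from dataclasses import dataclass


@dataclass
class ProbeGate:
    """Hypothetical model of how a startup probe gates the other two probes."""

    has_startup_probe: bool
    startup_succeeded: bool = False

    def active_probes(self) -> list[str]:
        # Until the startup probe succeeds, liveness and readiness are held back.
        if self.has_startup_probe and not self.startup_succeeded:
            return ["startup"]
        # Once startup has succeeded it stops running, and the other two take over.
        return ["liveness", "readiness"]


gate = ProbeGate(has_startup_probe=True)
before = gate.active_probes()   # only the startup probe is running
gate.startup_succeeded = True
after = gate.active_probes()    # liveness and readiness are now running
print(before, after)
```

Without a startup probe, the model goes straight to liveness and readiness, which is why slow-starting containers are the usual reason to add one.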
Trying It Out
I'll use an nginx (web server) Pod as the example and walk through the configuration.
    root@rke2-1:~# cat probe_http.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        test: probetest
      name: probe-http
      namespace: ckad
    spec:
      containers:
      - name: probe-http
        image: nginx
        startupProbe:
          httpGet:
            path: /index.html
            port: 80
            httpHeaders:
            - name: User-Agent
              value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
        livenessProbe:
          httpGet:
            path: /index.html
            port: 80
            httpHeaders:
            - name: User-Agent
              value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
        readinessProbe:
          httpGet:
            path: /50x.html
            port: 80
            httpHeaders:
            - name: User-Agent
              value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
The StartupProbe is this part.
        startupProbe:
          httpGet:
            path: /index.html
            port: 80
            httpHeaders:
            - name: User-Agent
              value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
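- httpGet: run the probe as an HTTP GET request
- path: the file to request, located under the nginx document root (/usr/share/nginx/html/)
- httpHeaders: headers to send with the request (optional)
- initialDelaySeconds: how long to wait after the container starts before running the first probe
- periodSeconds: interval between probe runs
- failureThreshold: number of consecutive failures before the probe is considered failed
- successThreshold: number of consecutive successes before the probe is considered successful (must be 1 for Liveness and Startup)
- scheme: HTTP (default) or HTTPS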
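To make the two thresholds concrete, here is a minimal sketch of how consecutive results could be turned into a probe state. It is a simplified model and not the kubelet's actual implementation; the function name `probe_state` and the `"unknown"` starting state are assumptions made for this example.

```python
def probe_state(results, failure_threshold=3, success_threshold=1, initial="unknown"):
    """Return the probe state after each result (True = the probe request succeeded)."""
    state = initial
    successes = 0
    failures = 0
    history = []
    for ok in results:
        if ok:
            successes += 1
            failures = 0          # a success resets the failure streak
            if successes >= success_threshold:
                state = "success"
        else:
            failures += 1
            successes = 0         # a failure resets the success streak
            if failures >= failure_threshold:
                state = "failure"
        history.append(state)
    return history


# Two failures, then a success, then three failures in a row.
history = probe_state([True, False, False, True, False, False, False])
print(history)
```

With failureThreshold: 3 as in the manifest, isolated failures do not change the state; only the third failure in a row does.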
    root@rke2-1:~# kubectl explain pod.spec.containers.livenessProbe.httpGet
    KIND:       Pod
    VERSION:    v1

    FIELD: httpGet <HTTPGetAction>

    DESCRIPTION:
        HTTPGet specifies the http request to perform.
        HTTPGetAction describes an action based on HTTP Get requests.

    FIELDS:
      host  <string>
        Host name to connect to, defaults to the pod IP. You probably want to set
        "Host" in httpHeaders instead.

      httpHeaders   <[]HTTPHeader>
        Custom headers to set in the request. HTTP allows repeated headers.

      path  <string>
        Path to access on the HTTP server.

      port  <IntOrString> -required-
        Name or number of the port to access on the container. Number must be in
        the range 1 to 65535. Name must be an IANA_SVC_NAME.

      scheme        <string>
        Scheme to use for connecting to the host. Defaults to HTTP.

        Possible enum values:
         - `"HTTP"` means that the scheme used will be http://
         - `"HTTPS"` means that the scheme used will be https://
The LivenessProbe is this part.
It monitors index.html, the same file as the StartupProbe.
        livenessProbe:
          httpGet:
            path: /index.html
            port: 80
            httpHeaders:
            - name: User-Agent
              value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
The ReadinessProbe is this part.
It requests 50x.html, a file in the nginx document root, to check whether the container can respond at the application level.
        readinessProbe:
          httpGet:
            path: /50x.html
            port: 80
            httpHeaders:
            - name: User-Agent
              value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
          successThreshold: 1
Now, let's deploy it.
If a probe fails, the failure shows up
in the Events section of kubectl describe.
    root@rke2-1:~# kubectl apply -f probe_http.yaml
    pod/probe-http created
    root@rke2-1:~# kubectl get pods -n ckad
    NAME         READY   STATUS    RESTARTS   AGE
    probe-http   0/1     Running   0          5s
    root@rke2-1:~# kubectl describe pods probe-http -n ckad
    Name:             probe-http
    Namespace:        ckad
    Priority:         0
    Service Account:  default
    Node:             rke2-3/192.168.1.65
    Start Time:       Tue, 23 Jan 2024 04:48:15 +0000
    Labels:           test=probetest
    Annotations:      cni.projectcalico.org/containerID: 04db1546e5f390a60eb34b5b9bd6608408f6d32b5b735a91f5ade626328fb784
                      cni.projectcalico.org/podIP: 10.42.2.34/32
                      cni.projectcalico.org/podIPs: 10.42.2.34/32
    Status:           Running
    IP:               10.42.2.34
    IPs:
      IP:  10.42.2.34
    Containers:
      probe-http:
        Container ID:   containerd://040b4fef4deb19d3c3c0aa248b72713791b34c394d6adac0f6ec24c05f9d4dd3
        Image:          nginx
        Image ID:       docker.io/library/nginx@sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac
        Port:           <none>
        Host Port:      <none>
        State:          Running
          Started:      Tue, 23 Jan 2024 04:48:17 +0000
        Ready:          True
        Restart Count:  0
        Liveness:       http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
        Readiness:      http-get http://:80/50x.html delay=5s timeout=1s period=5s #success=1 #failure=3
        Startup:        http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lghr5 (ro)
    Conditions:
      Type                        Status
      PodReadyToStartContainers   True
      Initialized                 True
      Ready                       True
      ContainersReady             True
      PodScheduled                True
    Volumes:
      kube-api-access-lghr5:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type    Reason     Age   From               Message
      ----    ------     ----  ----               -------
      Normal  Scheduled  42s   default-scheduler  Successfully assigned ckad/probe-http to rke2-3
      Normal  Pulling    41s   kubelet            Pulling image "nginx"
      Normal  Pulled     40s   kubelet            Successfully pulled image "nginx" in 1.43s (1.43s including waiting)
      Normal  Created    40s   kubelet            Created container probe-http
      Normal  Started    40s   kubelet            Started container probe-http
Next, let's make a probe fail on purpose.
Note: I do this by renaming index.html, the file being monitored.
    root@rke2-1:~# kubectl exec -it probe-http -n ckad -- sh -c "/bin/bash"
    root@probe-http:/# cd /usr/share/nginx/html/
    root@probe-http:/usr/share/nginx/html# ls
    50x.html  index.html
    root@probe-http:/usr/share/nginx/html# mv index.html index.html_old
    root@probe-http:/usr/share/nginx/html# exit
    exit
The LivenessProbe failure was then recorded in Events.
    root@rke2-1:~# kubectl describe pods probe-http -n ckad
    Name:             probe-http
    Namespace:        ckad
    Priority:         0
    Service Account:  default
    Node:             rke2-3/192.168.1.65
    Start Time:       Tue, 23 Jan 2024 04:48:15 +0000
    Labels:           test=probetest
    Annotations:      cni.projectcalico.org/containerID: 04db1546e5f390a60eb34b5b9bd6608408f6d32b5b735a91f5ade626328fb784
                      cni.projectcalico.org/podIP: 10.42.2.34/32
                      cni.projectcalico.org/podIPs: 10.42.2.34/32
    Status:           Running
    IP:               10.42.2.34
    IPs:
      IP:  10.42.2.34
    Containers:
      probe-http:
        Container ID:   containerd://040b4fef4deb19d3c3c0aa248b72713791b34c394d6adac0f6ec24c05f9d4dd3
        Image:          nginx
        Image ID:       docker.io/library/nginx@sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac
        Port:           <none>
        Host Port:      <none>
        State:          Running
          Started:      Tue, 23 Jan 2024 04:48:17 +0000
        Ready:          True
        Restart Count:  0
        Liveness:       http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
        Readiness:      http-get http://:80/50x.html delay=5s timeout=1s period=5s #success=1 #failure=3
        Startup:        http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lghr5 (ro)
    Conditions:
      Type                        Status
      PodReadyToStartContainers   True
      Initialized                 True
      Ready                       True
      ContainersReady             True
      PodScheduled                True
    Volumes:
      kube-api-access-lghr5:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason     Age   From     Message
      ----     ------     ----  ----     -------
      Warning  Unhealthy  4s    kubelet  Liveness probe failed: HTTP probe failed with statuscode: 404
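The 404 in the event counts as a failure because, per the Kubernetes documentation, an HTTP probe succeeds only when the status code is at least 200 and below 400. A minimal sketch of that rule follows; the function name `http_probe_succeeds` is made up for this example and is not part of any Kubernetes API.

```python
def http_probe_succeeds(status_code: int) -> bool:
    """Apply the documented httpGet rule: 200 <= code < 400 means success."""
    return 200 <= status_code < 400


# 200: index.html is served normally.
# 404: index.html was renamed, as in the experiment above.
# 500: a server-side error.
for code in (200, 404, 500):
    outcome = "success" if http_probe_succeeds(code) else "failure"
    print(code, outcome)
```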
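What happens after a LivenessProbe failure depends on the Pod's restartPolicy.
Here it is the default, Always, so the kubelet restarted the container (the Pod itself is not recreated),
and kubectl get shows RESTARTS at 1. Because the restarted container is created fresh from the image, index.html is back and the probes pass again.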
    root@rke2-1:~# kubectl get pods probe-http -n ckad -w
    NAME         READY   STATUS    RESTARTS      AGE
    probe-http   1/1     Running   1 (11s ago)   178m
Conclusion
For details on restartPolicy, see the following.
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/