I'm Takahiro Kujirai (@opensourcetech), LinuC Evangelist and Open Source Summit Japan volunteer leader.
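Introduction
In this post, we try out LivenessProbe, ReadinessProbe, and StartupProbe in Kubernetes.
Each probe does the following.
- LivenessProbe: checks whether the container in the Pod is still working properly. If it fails, the kubelet kills the container, which is then handled according to the Pod's restartPolicy.
- ReadinessProbe: checks whether the container in the Pod is ready to serve requests. While it fails, the Pod is removed from Service endpoints.
- StartupProbe: checks whether the application in the container has finished starting. The two probes above do not run until it succeeds.
The official documentation is here.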
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
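When all three are configured, they run in this order:
StartupProbe succeeds → LivenessProbe/ReadinessProbe begin.
Trying it out
We'll use an nginx (web server) Pod as an example and walk through the configuration.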
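The ordering can be pictured with a small model. The following Python sketch is only an illustration, not how the kubelet is implemented; `active_probes` and its parameters are hypothetical names chosen to show that liveness and readiness checks are held back until the startup probe has succeeded.

```python
from typing import List


def active_probes(has_startup_probe: bool, startup_succeeded: bool) -> List[str]:
    """Return which probes are being run right now (illustrative model only)."""
    if has_startup_probe and not startup_succeeded:
        # Until the startup probe succeeds, the other probes are held back.
        return ["startup"]
    # Once startup has succeeded (or no startup probe is defined),
    # liveness and readiness are the probes that run.
    return ["liveness", "readiness"]


print(active_probes(has_startup_probe=True, startup_succeeded=False))
print(active_probes(has_startup_probe=True, startup_succeeded=True))
```

The first call reports only the startup probe; the second reports liveness and readiness, since the startup probe stops running once it has succeeded.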
root@rke2-1:~# cat probe_http.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: probetest
  name: probe-http
  namespace: ckad
spec:
  containers:
  - name: probe-http
    image: nginx
    startupProbe:
      httpGet:
        path: /index.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
    livenessProbe:
      httpGet:
        path: /index.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
    readinessProbe:
      httpGet:
        path: /50x.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
The StartupProbe is the following part.
startupProbe:
  httpGet:
    path: /index.html
    port: 80
    httpHeaders:
    - name: User-Agent
      value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
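- httpGet: probe using an HTTP GET request
- path: the path to request; here, a file under nginx's document root (/usr/share/nginx/html/)
- httpHeaders: headers to send with the request (optional)
- initialDelaySeconds: how long to wait before running the first probe
- periodSeconds: interval between probes
- failureThreshold: number of consecutive failures after which the probe is considered failed
- successThreshold: number of consecutive successes after which the probe is considered successful (must be 1 for Liveness and Startup)
- scheme: HTTP (default) or HTTPS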
root@rke2-1:~# kubectl explain pod.spec.containers.livenessProbe.httpGet
KIND:       Pod
VERSION:    v1

FIELD: httpGet <HTTPGetAction>

DESCRIPTION:
    HTTPGet specifies the http request to perform.
    HTTPGetAction describes an action based on HTTP Get requests.

FIELDS:
  host <string>
    Host name to connect to, defaults to the pod IP. You probably want to set
    "Host" in httpHeaders instead.

  httpHeaders <[]HTTPHeader>
    Custom headers to set in the request. HTTP allows repeated headers.

  path <string>
    Path to access on the HTTP server.

  port <IntOrString> -required-
    Name or number of the port to access on the container. Number must be in the
    range 1 to 65535. Name must be an IANA_SVC_NAME.

  scheme <string>
    Scheme to use for connecting to the host. Defaults to HTTP.

    Possible enum values:
     - `"HTTP"` means that the scheme used will be http://
     - `"HTTPS"` means that the scheme used will be https://
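To make the roles of failureThreshold and successThreshold more concrete, here is a minimal Python sketch. It is an illustrative model rather than kubelet code: `probe_verdicts` is a hypothetical helper, and it assumes the documented behavior that both thresholds count consecutive results.

```python
from typing import List


def probe_verdicts(results: List[bool], failure_threshold: int = 3,
                   success_threshold: int = 1) -> List[str]:
    """Map raw probe results to verdicts using consecutive-result thresholds."""
    verdicts = []
    state = "unknown"  # no verdict until one of the thresholds is reached
    ok_run = 0         # current streak of successful probes
    fail_run = 0       # current streak of failed probes
    for ok in results:
        if ok:
            ok_run += 1
            fail_run = 0
            if ok_run >= success_threshold:
                state = "success"
        else:
            fail_run += 1
            ok_run = 0
            if fail_run >= failure_threshold:
                state = "failure"
        verdicts.append(state)
    return verdicts


# With failureThreshold: 3, two failures in a row do not change the verdict,
# but a third consecutive failure does.
print(probe_verdicts([True, False, False, True, False, False, False]))
```

With the values used in this post (periodSeconds: 5, failureThreshold: 3), a broken container would be flagged after three consecutive failed probes, that is, on the order of 10 to 15 seconds after it breaks, ignoring timeouts and scheduling jitter.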
The LivenessProbe is the following part.
It monitors the same index.html as the StartupProbe.
livenessProbe:
  httpGet:
    path: /index.html
    port: 80
    httpHeaders:
    - name: User-Agent
      value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
The ReadinessProbe is the following part.
This one monitors the file 50x.html in nginx's document root, checking whether the container can respond at the application level.
readinessProbe:
  httpGet:
    path: /50x.html
    port: 80
    httpHeaders:
    - name: User-Agent
      value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
  successThreshold: 1
Now, let's deploy it.
If a probe fails, the failure is reported
in the Events section of kubectl describe.
root@rke2-1:~# kubectl apply -f probe_http.yaml
pod/probe-http created
root@rke2-1:~# kubectl get pods -n ckad
NAME         READY   STATUS    RESTARTS   AGE
probe-http   0/1     Running   0          5s
root@rke2-1:~# kubectl describe pods probe-http -n ckad
Name: probe-http
Namespace: ckad
Priority: 0
Service Account: default
Node: rke2-3/192.168.1.65
Start Time: Tue, 23 Jan 2024 04:48:15 +0000
Labels: test=probetest
Annotations: cni.projectcalico.org/containerID: 04db1546e5f390a60eb34b5b9bd6608408f6d32b5b735a91f5ade626328fb784
cni.projectcalico.org/podIP: 10.42.2.34/32
cni.projectcalico.org/podIPs: 10.42.2.34/32
Status: Running
IP: 10.42.2.34
IPs:
IP: 10.42.2.34
Containers:
probe-http:
Container ID: containerd://040b4fef4deb19d3c3c0aa248b72713791b34c394d6adac0f6ec24c05f9d4dd3
Image: nginx
Image ID: docker.io/library/nginx@sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 23 Jan 2024 04:48:17 +0000
Ready: True
Restart Count: 0
Liveness: http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
Readiness: http-get http://:80/50x.html delay=5s timeout=1s period=5s #success=1 #failure=3
Startup: http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lghr5 (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-lghr5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 42s default-scheduler Successfully assigned ckad/probe-http to rke2-3
Normal Pulling 41s kubelet Pulling image "nginx"
Normal Pulled 40s kubelet Successfully pulled image "nginx" in 1.43s (1.43s including waiting)
Normal Created 40s kubelet Created container probe-http
Normal Started 40s kubelet Started container probe-http
Next, let's deliberately make a probe fail.
Note: we rename index.html, the file being monitored.
root@rke2-1:~# kubectl exec -it probe-http -n ckad -- sh -c "/bin/bash"
root@probe-http:/# cd /usr/share/nginx/html/
root@probe-http:/usr/share/nginx/html# ls
50x.html  index.html
root@probe-http:/usr/share/nginx/html# mv index.html index.html_old
root@probe-http:/usr/share/nginx/html# exit
exit
As a result, the LivenessProbe failure was recorded in Events.
root@rke2-1:~# kubectl describe pods probe-http -n ckad
Name: probe-http
Namespace: ckad
Priority: 0
Service Account: default
Node: rke2-3/192.168.1.65
Start Time: Tue, 23 Jan 2024 04:48:15 +0000
Labels: test=probetest
Annotations: cni.projectcalico.org/containerID: 04db1546e5f390a60eb34b5b9bd6608408f6d32b5b735a91f5ade626328fb784
cni.projectcalico.org/podIP: 10.42.2.34/32
cni.projectcalico.org/podIPs: 10.42.2.34/32
Status: Running
IP: 10.42.2.34
IPs:
IP: 10.42.2.34
Containers:
probe-http:
Container ID: containerd://040b4fef4deb19d3c3c0aa248b72713791b34c394d6adac0f6ec24c05f9d4dd3
Image: nginx
Image ID: docker.io/library/nginx@sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 23 Jan 2024 04:48:17 +0000
Ready: True
Restart Count: 0
Liveness: http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
Readiness: http-get http://:80/50x.html delay=5s timeout=1s period=5s #success=1 #failure=3
Startup: http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lghr5 (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-lghr5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 4s kubelet Liveness probe failed: HTTP probe failed with statuscode: 404
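The 404 counts as a failure because, according to the Kubernetes documentation, an httpGet probe succeeds only when the response status code is at least 200 and below 400. The Python sketch below models that rule and applies it to the files in the document root after the rename. `http_probe_ok` and `simulate` are hypothetical helper names, and the toy "server" is only a stand-in that assumes nginx answers 200 for an existing file and 404 otherwise.

```python
from typing import Dict, Set


def http_probe_ok(status_code: int) -> bool:
    """An httpGet probe passes for status codes in the range 200-399."""
    return 200 <= status_code < 400


def simulate(files: Set[str], probe_paths: Dict[str, str]) -> Dict[str, bool]:
    """Check each probe path against a toy static file server."""
    outcome = {}
    for probe, path in probe_paths.items():
        # Toy stand-in for nginx: 200 if the file exists, 404 otherwise.
        status = 200 if path.lstrip("/") in files else 404
        outcome[probe] = http_probe_ok(status)
    return outcome


probes = {"liveness": "/index.html", "readiness": "/50x.html"}

# Document root after `mv index.html index.html_old`.
print(simulate({"50x.html", "index.html_old"}, probes))
```

In this model only the liveness probe fails: the readiness probe targets 50x.html, which was not renamed, and the startup probe no longer runs once it has succeeded.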
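What happens after a LivenessProbe failure depends on the Pod's restartPolicy.
Here it is the default, Always, so the kubelet restarted the container,
and kubectl get shows RESTARTS at 1. Because the restarted container is created afresh from the image, index.html is back in place and the Pod returns to READY 1/1.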
root@rke2-1:~# kubectl get pods probe-http -n ckad -w
NAME         READY   STATUS    RESTARTS      AGE
probe-http   1/1     Running   1 (11s ago)   178m
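How restartPolicy shapes the outcome can be summarized in a small Python lookup. This is a simplified sketch based on the general meaning of the three restartPolicy values, not kubelet code; `after_liveness_failure` is a hypothetical name, and the exit code passed in is whatever the killed container happens to return.

```python
def after_liveness_failure(restart_policy: str, exit_code: int) -> str:
    """Describe what follows once a container is killed for a failed liveness probe."""
    if restart_policy == "Always":
        # The default: the container is restarted regardless of its exit code.
        return "restart"
    if restart_policy == "OnFailure":
        # Restarted only when the container exited with a non-zero code.
        return "restart" if exit_code != 0 else "no restart"
    if restart_policy == "Never":
        return "no restart"
    raise ValueError(f"unknown restartPolicy: {restart_policy}")


# The Pod in this post uses the default policy.
print(after_liveness_failure("Always", exit_code=0))
```

With Always, the result is a restart, which matches the RESTARTS count going from 0 to 1 in the output above.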
Conclusion
For details on restartPolicy, see the following.
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/