Opensourcetech Blog

A blog by Opensourcetech about open source technologies such as NGINX, Kubernetes, Zabbix, Neo4j, and Linux.

Trying Out LivenessProbe/ReadinessProbe/StartupProbe (Kubernetes)

This is Takahiro Kujirai (@opensourcetech), LinuC Evangelist and Open Source Summit Japan volunteer leader.


Introduction
In this post, we try out LivenessProbe, ReadinessProbe, and StartupProbe in Kubernetes.

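Each probe does the following.

  • LivenessProbe: checks whether a container in the Pod is still responding as expected; if it fails, the kubelet restarts the container
  • ReadinessProbe: checks whether a container in the Pod can respond at the service level; while it fails, the Pod is removed from Service endpoints
  • StartupProbe: checks that the container has finished starting, before the two probes above are allowed to begin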

The official documentation is here:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

When all three are configured, the StartupProbe has to succeed first.
Only after that do the LivenessProbe and ReadinessProbe start running.


Trying it out
We use an nginx (web server) Pod as the example and walk through the configuration.
root@rke2-1:~# cat probe_http.yaml 
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: probetest
  name: probe-http
  namespace: ckad
spec:
  containers:
  - name: probe-http
    image: nginx
    startupProbe:
      httpGet:
        path: /index.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
    livenessProbe:
      httpGet:
        path: /index.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
    readinessProbe:
      httpGet:
        path: /50x.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1


The StartupProbe is the following part.

    startupProbe:
      httpGet:
        path: /index.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
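  • httpGet: sends an HTTP GET request
  • path: the file to request, relative to the nginx document root (/usr/share/nginx/html/)
  • httpHeaders: headers to send with the request (optional)
  • initialDelaySeconds: how long to wait after the container starts before running the first probe
  • periodSeconds: the interval between probes
  • failureThreshold: the number of consecutive failures after which the probe is considered failed
  • successThreshold: the number of consecutive successes after which the probe is considered successful (must be 1 for liveness and startup probes)
  • scheme: HTTP (default) or HTTPS

The fields available under httpGet can be listed with kubectl explain.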
root@rke2-1:~# kubectl explain pod.spec.containers.livenessProbe.httpGet
KIND:       Pod
VERSION:    v1

FIELD: httpGet <HTTPGetAction>

DESCRIPTION:
    HTTPGet specifies the http request to perform.
    HTTPGetAction describes an action based on HTTP Get requests.
    
FIELDS:
  host  <string>
    Host name to connect to, defaults to the pod IP. You probably want to set
    "Host" in httpHeaders instead.

  httpHeaders   <[]HTTPHeader>
    Custom headers to set in the request. HTTP allows repeated headers.

  path  <string>
    Path to access on the HTTP server.

  port  <IntOrString> -required-
    Name or number of the port to access on the container. Number must be in the
    range 1 to 65535. Name must be an IANA_SVC_NAME.

  scheme        <string>
    Scheme to use for connecting to the host. Defaults to HTTP.
    
    Possible enum values:
     - `"HTTP"` means that the scheme used will be http://
     - `"HTTPS"` means that the scheme used will be https://
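
As a minimal sketch of the scheme and Host header fields described above, the following writes a hypothetical manifest that probes over HTTPS. The file name probe_https.yaml, the Pod name probe-https, port 443, and the Host value www.example.com are all assumptions for illustration; the stock nginx image does not serve TLS without extra configuration, so this Pod would not pass its probe as-is.

```shell
# Write a hypothetical manifest that uses an HTTPS liveness probe.
# Assumption: the container has been configured to serve TLS on port 443.
cat > probe_https.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-https
  namespace: ckad
spec:
  containers:
  - name: probe-https
    image: nginx
    livenessProbe:
      httpGet:
        scheme: HTTPS          # default is HTTP
        path: /index.html
        port: 443
        httpHeaders:
        - name: Host           # set via httpHeaders rather than the host field
          value: www.example.com
      periodSeconds: 5
EOF
```

For HTTPS probes the kubelet skips certificate verification, so a self-signed certificate is enough for this kind of check.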


The LivenessProbe is the following part.
It monitors the same index.html as the StartupProbe.

    livenessProbe:
      httpGet:
        path: /index.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
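
To get a feel for what these numbers mean, here is a rough back-of-the-envelope calculation. It is only a sketch, not the kubelet's exact algorithm: it ignores timeoutSeconds and any timing jitter, and the variable names are made up for this example.

```shell
# Values copied from the probe settings in probe_http.yaml.
initial_delay=5      # initialDelaySeconds
period=5             # periodSeconds
failure_threshold=3  # failureThreshold

# Rough upper bound on how long the container gets to start:
# the initial wait plus failure_threshold probe periods.
startup_budget=$((initial_delay + failure_threshold * period))

# Rough time for the liveness probe to give up on a container
# that has stopped responding: failure_threshold probe periods.
liveness_detection=$((failure_threshold * period))

echo "startup budget: ${startup_budget}s"
echo "liveness detection: ${liveness_detection}s"
```

With the settings used here this prints 20s and 15s. A slow-starting application would likely need a larger failureThreshold or periodSeconds on the StartupProbe.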


The ReadinessProbe is the following part.
This one monitors the file 50x.html in the nginx document root, to check whether the container can respond at the application level.

    readinessProbe:
      httpGet:
        path: /50x.html
        port: 80
        httpHeaders:
        - name: User-Agent
          value: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
      successThreshold: 1
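
The result of the ReadinessProbe ends up in the Pod's Ready condition. As a small sketch, the helper below (pod_ready is a name invented for this example) reads that condition with kubectl's jsonpath output.

```shell
# pod_ready NAME NAMESPACE
# Prints the status of the Pod's Ready condition (True or False).
pod_ready() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Example (needs a cluster where the Pod exists):
#   pod_ready probe-http ckad
```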



Now, let's deploy.
If a probe is failing, the failure shows up
in the Events section of kubectl describe.

root@rke2-1:~# kubectl apply -f probe_http.yaml 
pod/probe-http created

root@rke2-1:~# kubectl get pods -n ckad
NAME         READY   STATUS    RESTARTS   AGE
probe-http   0/1     Running   0          5s


root@rke2-1:~# kubectl describe pods probe-http -n ckad
Name:             probe-http
Namespace:        ckad
Priority:         0
Service Account:  default
Node:             rke2-3/192.168.1.65
Start Time:       Tue, 23 Jan 2024 04:48:15 +0000
Labels:           test=probetest
Annotations:      cni.projectcalico.org/containerID: 04db1546e5f390a60eb34b5b9bd6608408f6d32b5b735a91f5ade626328fb784
                  cni.projectcalico.org/podIP: 10.42.2.34/32
                  cni.projectcalico.org/podIPs: 10.42.2.34/32
Status:           Running
IP:               10.42.2.34
IPs:
  IP:  10.42.2.34
Containers:
  probe-http:
    Container ID:   containerd://040b4fef4deb19d3c3c0aa248b72713791b34c394d6adac0f6ec24c05f9d4dd3
    Image:          nginx
    Image ID:       docker.io/library/nginx@sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 23 Jan 2024 04:48:17 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
    Readiness:      http-get http://:80/50x.html delay=5s timeout=1s period=5s #success=1 #failure=3
    Startup:        http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lghr5 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       True 
  ContainersReady             True 
  PodScheduled                True 
Volumes:
  kube-api-access-lghr5:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  42s   default-scheduler  Successfully assigned ckad/probe-http to rke2-3
  Normal  Pulling    41s   kubelet            Pulling image "nginx"
  Normal  Pulled     40s   kubelet            Successfully pulled image "nginx" in 1.43s (1.43s including waiting)
  Normal  Created    40s   kubelet            Created container probe-http
  Normal  Started    40s   kubelet            Started container probe-http


Let's make a probe fail on purpose.
Note: we rename index.html, the file being monitored.

root@rke2-1:~# kubectl exec -it probe-http -n ckad -- sh -c "/bin/bash"
root@probe-http:/# cd /usr/share/nginx/html/
root@probe-http:/usr/share/nginx/html# ls
50x.html  index.html
root@probe-http:/usr/share/nginx/html# mv index.html index.html_old 
root@probe-http:/usr/share/nginx/html# exit
exit


After that, the LivenessProbe failure was recorded in Events.

root@rke2-1:~# kubectl describe pods probe-http -n ckad
Name:             probe-http
Namespace:        ckad
Priority:         0
Service Account:  default
Node:             rke2-3/192.168.1.65
Start Time:       Tue, 23 Jan 2024 04:48:15 +0000
Labels:           test=probetest
Annotations:      cni.projectcalico.org/containerID: 04db1546e5f390a60eb34b5b9bd6608408f6d32b5b735a91f5ade626328fb784
                  cni.projectcalico.org/podIP: 10.42.2.34/32
                  cni.projectcalico.org/podIPs: 10.42.2.34/32
Status:           Running
IP:               10.42.2.34
IPs:
  IP:  10.42.2.34
Containers:
  probe-http:
    Container ID:   containerd://040b4fef4deb19d3c3c0aa248b72713791b34c394d6adac0f6ec24c05f9d4dd3
    Image:          nginx
    Image ID:       docker.io/library/nginx@sha256:4c0fdaa8b6341bfdeca5f18f7837462c80cff90527ee35ef185571e1c327beac
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 23 Jan 2024 04:48:17 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
    Readiness:      http-get http://:80/50x.html delay=5s timeout=1s period=5s #success=1 #failure=3
    Startup:        http-get http://:80/index.html delay=5s timeout=1s period=5s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lghr5 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       True 
  ContainersReady             True 
  PodScheduled                True 
Volumes:
  kube-api-access-lghr5:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age   From     Message
  ----     ------     ----  ----     -------
  Warning  Unhealthy  4s    kubelet  Liveness probe failed: HTTP probe failed with statuscode: 404
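
Since the describe output is long, it can help to pull out only the probe failures. The filter below is a simple sketch; probe_failures is a name invented here, and the pattern just matches the wording of the Unhealthy event shown above.

```shell
# probe_failures
# Reads `kubectl describe` output on stdin and keeps only the
# Unhealthy events that report a failed probe.
probe_failures() {
  grep -E 'Unhealthy .*probe failed'
}

# Example (needs a cluster where the Pod exists):
#   kubectl describe pods probe-http -n ckad | probe_failures
```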


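What happens after a LivenessProbe failure depends on the Pod's restartPolicy.
Here it is the default, Always, so the kubelet restarts the container (not the whole Pod),
and kubectl get shows RESTARTS as 1.
Because the restarted container is created fresh from the image, index.html is back and the probe succeeds again.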

root@rke2-1:~# kubectl get pods probe-http -n ckad -w
NAME         READY   STATUS    RESTARTS      AGE
probe-http   1/1     Running   1 (11s ago)   178m
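
The RESTARTS column can also be read directly from the Pod status. The helper below is a sketch (restart_count is a name invented for this example) and assumes the Pod has a single container, as in this post, so index 0 is the one we want.

```shell
# restart_count NAME NAMESPACE
# Prints how many times the first container in the Pod has been restarted.
restart_count() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.containerStatuses[0].restartCount}'
}

# Example (needs a cluster where the Pod exists):
#   restart_count probe-http ckad
```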



Conclusion
For details on restartPolicy, see the following.
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/

Opensourcetech by Takahiro Kujirai