Kubernetes: make a DaemonSet node-initialization pod run only once per node

rta7y2nd, posted on 2023-10-17 in Kubernetes

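I want to run an initialization script on every node, and run it only once.
Here I use the following YAML to do some basic initialization on each node. However, as soon as the init script finishes, the pod exits with exit code 0 and the DaemonSet restarts it, so the init script runs over and over again.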

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: test-init-node-cr
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: test-init-node-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: test-init-node-cr
subjects:
- kind: ServiceAccount
  name: test-init-node-sa
  namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-init-node-sa
  namespace: default
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: test-init-node
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: test-init-node
      app.kubernetes.io/component: configurator
  # replicas: 3
  template:
    metadata:
      name: test-init-node
      labels:
        app.kubernetes.io/name: test-init-node
        app.kubernetes.io/component: configurator
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: k8s.amazee.io/node-configured
                operator: DoesNotExist
      hostPID: true
      hostNetwork: true
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      serviceAccount: test-init-node-sa
      containers:
      - name: init
        env:
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        command: 
        - nsenter
        - --mount=/proc/1/ns/mnt
        - --
        - bash
        - -xc
        - |
          echo "starting the magic"
          echo "*   hard  core    unlimited" >>  /etc/security/limits.d/game.conf 
          echo "*   soft  core    unlimited" >>  /etc/security/limits.d/game.conf 

        image: alpine/k8s:1.28.0
        resources:
          requests:
            cpu: 50m
            memory: 50M
        securityContext:
          runAsUser: 0
          privileged: true

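Is there any way to stop the DaemonSet from restarting the pod once it exits? In other words, how can I make sure each node is initialized only once?
I tried adding a preStop hook, but it does not seem to have any effect. The idea is that once the k8s.amazee.io/node-configured label is set, the DaemonSet will no longer schedule a pod onto that node.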

preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - kubectl label node "$MY_NODE_NAME" k8s.amazee.io/node-configured=$(date +%s)

Appending the labeling command to the container command did not work either (that was expected, but I thought it was worth a try):

command: 
        - nsenter
        - --mount=/proc/1/ns/mnt
        - --
        - bash
        - -xc
        - |
          echo "starting the magic"
          echo "*   hard  core    unlimited" >>  /etc/security/limits.d/game.conf 
          echo "*   soft  core    unlimited" >>  /etc/security/limits.d/game.conf 
        - ; 
        - /bin/sh
        - -c
        - kubectl label node "$MY_NODE_NAME" k8s.amazee.io/node-configured=$(date +%s)
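
A note on why the appended items have no effect: entries of a container command are passed as an argument vector and are not parsed by a shell, so the ";" item is not a command separator. Everything after the -c script is simply handed to that script as positional parameters. A minimal sketch that reproduces this locally with plain sh (second-command is only a placeholder, not a real command):

```shell
#!/bin/sh
# Roughly what the exec-form list [sh, -c, <script>, ";", "second-command"] does.
# The ";" is quoted here to mimic the missing shell parsing: it reaches the
# script as $0, and "second-command" arrives as $1 and is never executed.
out=$(sh -c 'echo "script ran, \$0=[$0] \$1=[$1]"' ';' 'second-command')
echo "$out"
```

To chain both steps in one container, the labeling command would have to be part of the script text itself.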

If the pod exits normally, is there a way to stop the DaemonSet from restarting it, so that the initialization happens only once per node?

Answer #1 (h5qlskok)

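The workaround is:
1. Make the script in the DaemonSet aware of whether it has already run, so that the steps can be skipped if the pod restarts.
2. Make the script sleep forever after it has run. The container then never exits, so the DaemonSet pod is not restarted (unless it gets killed, in which case the first point avoids re-running the steps).
In your specific case, you can change the script from: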

echo "starting the magic"
echo "*   hard  core    unlimited" >> /etc/security/limits.d/game.conf
echo "*   soft  core    unlimited" >> /etc/security/limits.d/game.conf

to:

if [ ! -f /etc/game-conf-limits-updated ]
then
    echo "starting the magic"
    echo "*   hard  core    unlimited" >> /etc/security/limits.d/game.conf
    echo "*   soft  core    unlimited" >> /etc/security/limits.d/game.conf

    touch /etc/game-conf-limits-updated
fi

sleep infinity

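/etc/game-conf-limits-updated only exists once the script has run. As said above, sleep infinity keeps the DaemonSet pod from restarting. If it does restart for some reason, the check for /etc/game-conf-limits-updated ([ ! -f /etc/game-conf-limits-updated ]) avoids writing to /etc/security/limits.d/game.conf again, and the script goes back to sleeping forever.
Instead of checking for /etc/game-conf-limits-updated (and writing a file to the node's filesystem), you could also check whether a label is set on the node. That might be more elegant.
Note that the GCP tutorial uses a pod with an init container and a pause container. That is equivalent to my solution with a single container.
The only drawback is that the DaemonSet pod keeps running on your nodes forever.
A better solution would be to configure the nodes directly (for example, by using a custom image for the cluster nodes with the files already updated as you wish), but I don't know whether that is possible on GKE.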
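
The label check mentioned above could look roughly like the sketch below. It assumes kubectl is available in the container image and reuses the label key test-init-node-date as an example; the helper names node_is_initialized and mark_node_initialized are made up for illustration.

```shell
#!/bin/sh
# Assumption: this key must match the one used in the DaemonSet's nodeAffinity.
LABEL_KEY="test-init-node-date"

# Succeeds (exit 0) when the node already carries LABEL_KEY.
# Only needs the "get" verb on nodes, which the ClusterRole in the question grants.
node_is_initialized() {
    kubectl get node "$1" --no-headers --show-labels \
        | awk '{print $NF}' \
        | tr ',' '\n' \
        | grep -q "^${LABEL_KEY}="
}

# Labels the node only when it is not labeled yet, so a container restart
# does not fail on an already existing label.
mark_node_initialized() {
    if node_is_initialized "$1"; then
        echo "node $1 already initialized, skipping"
    else
        kubectl label node "$1" "${LABEL_KEY}=$(date +%s)"
    fi
}
```

Inside the pod this would be called as mark_node_initialized "$MY_NODE_NAME" after the init steps. The key is used as a regular expression by grep, so a key containing dots would match slightly more loosely than intended.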

Answer #2 (vd2z7a6w)

Keeping a DaemonSet pod running works, but it still takes up some resources and feels inelegant.
Following @norbjd's answer, I found this in the GCP tutorial:

initContainers:
  - image: ubuntu:18.04
    name: node-initializer
    command: ["/scripts/entrypoint.sh"]
    env:
      - name: ROOT_MOUNT_DIR
        value: /root
    securityContext:
      privileged: true
containers:
  - image: "gcr.io/google-containers/pause:2.0"
    name: pause

The tutorial uses the pause container from google-containers to keep the pod from restarting. What caught my eye, however, was initContainers.
From the documentation:

Init containers are exactly like regular containers, except:

Init containers always run to completion.
Each init container must complete successfully before the next one starts.

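This gave me an idea: run two containers. The first is an init container (initContainers) that performs all the initialization. The second is a regular container (containers) that runs a command to add a label to the node. That label prevents scheduling, which stops the DaemonSet from creating or restarting any further pods on that node.
By the same logic both could of course be initContainers, but in my case I used one entry under initContainers and one under containers. Regular containers wait for all init containers to finish, so the result is the same.
Working example: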

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: test-init-node-cr
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: test-init-node-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: test-init-node-cr
subjects:
- kind: ServiceAccount
  name: test-init-node-sa
  namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-init-node-sa
  namespace: default
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: test-init-node
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: test-init-node
      app.kubernetes.io/component: configurator
  # replicas: 3
  template:
    metadata:
      name: test-init-node
      labels:
        app.kubernetes.io/name: test-init-node
        app.kubernetes.io/component: configurator
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test-init-node-date
                operator: DoesNotExist
      hostPID: true
      hostNetwork: true
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      serviceAccount: test-init-node-sa
      initContainers:
      - name: init
        command:         
        - nsenter
        - --mount=/proc/1/ns/mnt
        - --
        - bash
        - -xc
        - |
          echo "starting the magic"
          echo "*   hard  core    unlimited" >>  /etc/security/limits.d/game.conf 
          echo "*   soft  core    unlimited" >>  /etc/security/limits.d/game.conf 
          echo "user00   soft  core    unlimited" >>  /etc/security/limits.d/game.conf 
        image: alpine/k8s:1.28.0
        resources:
          requests:
            cpu: 50m
            memory: 50M
        securityContext:
          runAsUser: 0
          privileged: true

      containers: 
      - name: add-label-to-remove-scheduling
        env:
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        command:
        - sh
        - -c
        - |
          kubectl label node "$MY_NODE_NAME" test-init-node-date=$(date +%s) 
        image: alpine/k8s:1.28.0
        resources:
          requests:
            cpu: 50m
            memory: 50M
        securityContext:
          runAsUser: 0
          privileged: true

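Rough explanation:
1. Create the service account and the related permissions.
2. Check whether the test-init-node-date label is set on the node. If it is, skip the node and do nothing.
3. Run the init container, which executes the init script as required.
4. Run the regular container, which adds the test-init-node-date label.
Sample labels: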

kubernetes.io/os=linux
node.kubernetes.io/instance-type=n2d-standard-8
test-init-node-date=1693280374

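This creates a DaemonSet whose pod runs the init container once and then starts a second container that adds the test-init-node-date label. Once the test-init-node-date label is set, the DaemonSet no longer schedules new pods onto that node.
Finally, as norbjd suggests, to guard against the init script being re-run by accident (for example, because someone deleted the label), you can add a safety check before running the script: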

if [ ! -f /etc/game-conf-limits-updated ]
then
    echo "starting the magic"
    echo "*   hard  core    unlimited" >> /etc/security/limits.d/game.conf
    echo "*   soft  core    unlimited" >> /etc/security/limits.d/game.conf

    touch /etc/game-conf-limits-updated
fi
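
As an alternative to the marker file above, each write could be made idempotent, so that a re-run never appends a duplicate line. This is only a sketch: append_once is a hypothetical helper, and the demo writes to a temporary file instead of /etc/security/limits.d/game.conf.

```shell
#!/bin/sh
# append_once LINE FILE: append LINE to FILE unless an identical line exists.
# -x matches whole lines only, -F treats LINE as a literal string (so the
# leading "*" is not read as a regex).
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demo against a temporary file standing in for the real limits file.
conf=$(mktemp)
append_once "*   hard  core    unlimited" "$conf"
append_once "*   soft  core    unlimited" "$conf"
append_once "*   hard  core    unlimited" "$conf"   # repeated run, no duplicate
line_count=$(wc -l < "$conf" | tr -d ' ')
echo "$line_count"
rm -f "$conf"
```

With this approach the script can safely run again after a restart or after someone removes the label, at the cost of one grep per line.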
