Confused by unexpected Kubernetes toleration behavior

9fkzdhlc · posted on 2023-10-17 in Kubernetes

When a pod has the tolerations below:

tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - effect: NoExecute
    operator: Exists
  - effect: NoSchedule
    operator: Exists

Why can it still be scheduled onto a node that carries the taint Runtime=true:NoSchedule?
I also searched the Kubernetes documentation. Normally a toleration entry includes a 'key', so how does the following entry work?

- effect: NoSchedule
  operator: Exists

z9smfwbn 1#

I reproduced this issue.
I tainted node gke-cluster-4-default-pool-8ad24f8f-2ixm with Runtime=true:NoSchedule:

$ kubectl get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
gke-cluster-4-default-pool-8ad24f8f-2ixm   Ready    <none>   10d   v1.26.6-gke.1700
gke-cluster-4-default-pool-8ad24f8f-ncy0   Ready    <none>   10d   v1.26.6-gke.1700
gke-cluster-4-default-pool-8ad24f8f-o537   Ready    <none>   10d   v1.26.6-gke.1700

$ kubectl taint nodes gke-cluster-4-default-pool-8ad24f8f-2ixm Runtime=true:NoSchedule
node/gke-cluster-4-default-pool-8ad24f8f-2ixm tainted

Then I created a deployment without any tolerations, so no pods were scheduled onto the tainted node:

$ kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE                                       NOMINATED NODE   READINESS GATES
nginx-deployment-7f456874f4-95x66   1/1     Running   0          34s   10.24.2.6    gke-cluster-4-default-pool-8ad24f8f-o537   <none>           <none>
nginx-deployment-7f456874f4-9sj68   1/1     Running   0          34s   10.24.0.12   gke-cluster-4-default-pool-8ad24f8f-ncy0   <none>           <none>
nginx-deployment-7f456874f4-f4s98   1/1     Running   0          34s   10.24.0.13   gke-cluster-4-default-pool-8ad24f8f-ncy0   <none>           <none>
nginx-deployment-7f456874f4-zbgp9   1/1     Running   0          34s   10.24.2.7    gke-cluster-4-default-pool-8ad24f8f-o537   <none>           <none>
nginx-deployment-7f456874f4-zs4js   1/1     Running   0          34s   10.24.0.11   gke-cluster-4-default-pool-8ad24f8f-ncy0   <none>           <none>
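
For reference, the answer does not include the deployment manifest itself; a minimal nginx Deployment along the following lines (name and replica count taken from the pod listing above, the rest is an assumed sketch) reproduces this behavior:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx     # no tolerations defined, so the tainted node is avoided
          image: nginx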

Then I added the tolerations you provided, and 2 pods were scheduled onto the tainted node (I deleted the existing deployment and recreated it after adding the tolerations):

$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP           NODE                                       NOMINATED NODE   READINESS GATES
nginx-deployment-6d998db8f-58wr2   1/1     Running   0          6s    10.24.1.7    gke-cluster-4-default-pool-8ad24f8f-2ixm   <none>           <none>
nginx-deployment-6d998db8f-62dcm   1/1     Running   0          6s    10.24.2.8    gke-cluster-4-default-pool-8ad24f8f-o537   <none>           <none>
nginx-deployment-6d998db8f-srcmg   1/1     Running   0          6s    10.24.0.14   gke-cluster-4-default-pool-8ad24f8f-ncy0   <none>           <none>
nginx-deployment-6d998db8f-wv48m   1/1     Running   0          6s    10.24.1.6    gke-cluster-4-default-pool-8ad24f8f-2ixm   <none>           <none>
nginx-deployment-6d998db8f-zbck2   1/1     Running   0          6s    10.24.0.15   gke-cluster-4-default-pool-8ad24f8f-ncy0   <none>           <none>
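
The modified manifest is not shown either; the tolerations from the question would simply be added under spec.template.spec of the same Deployment, roughly like this:

    spec:
      tolerations:
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
        - effect: NoSchedule    # empty key + Exists matches any taint with effect NoSchedule, including Runtime=true:NoSchedule
          operator: Exists
      containers:
        - name: nginx
          image: nginx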

So taints and tolerations are working as expected: a toleration with operator: Exists and an empty key matches every taint with the given effect (and tolerates everything if the effect is empty as well), which is why the pods can land on the tainted node. If the scheduling you observe still does not match this, the problem lies with the workload or the node rather than the tolerations, possibly for one of the following reasons:
1. Insufficient resources such as CPU or memory on the node; check with the kubectl describe command.
2. Other taints or tolerations: double-check whether some additional taint or toleration is preventing the pods from being scheduled.
3. nodeSelector or affinity rules that keep the pods off the node.
For further debugging, please add the kubectl describe output for the pod and the node; example commands follow below. Also attaching a blog written by motoskia for your reference.
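
For example (the node name is taken from the repro above; <pod-name> is a placeholder for whichever pod behaves unexpectedly):

$ kubectl describe node gke-cluster-4-default-pool-8ad24f8f-2ixm   # look at the Taints and Allocated resources sections
$ kubectl describe pod <pod-name>                                   # the Events section shows scheduling errors
$ kubectl get pod <pod-name> -o yaml                                # verify the tolerations actually present on the pod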
