Kubernetes status manager does not normalize ephemeral container statuses

Posted by 4szc88ey 4 months ago in Kubernetes

`EphemeralContainerStatuses` is not normalized in `normalizeStatus()` in the status manager:
kubernetes/pkg/kubelet/status/status_manager.go
Lines 1026 to 1043 in [cade1dd](https://github.com/kubernetes/kubernetes/commit/cade1dddd81eba338df85de7b5d17324a87243b5)
```go
// update container statuses
for i := range status.ContainerStatuses {
	cstatus := &status.ContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}
// Sort the container statuses, so that the order won't affect the result of comparison
sort.Sort(kubetypes.SortedContainerStatuses(status.ContainerStatuses))

// update init container statuses
for i := range status.InitContainerStatuses {
	cstatus := &status.InitContainerStatuses[i]
	normalizeContainerState(&cstatus.State)
	normalizeContainerState(&cstatus.LastTerminationState)
}
// Sort the container statuses, so that the order won't affect the result of comparison
kubetypes.SortInitContainerStatuses(pod, status.InitContainerStatuses)
return status
```

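This does not seem to cause any user-facing problem, because `EphemeralContainerStatuses` is already sorted before it is passed to the status manager, and timestamps are normalized when the patch is created.
However, because the timestamps are not normalized internally, the behavior is still unexpected. When the log verbosity is 3 or higher and a pod has ephemeral containers, the kubelet periodically logs the message "Pod status is inconsistent with cached status for pod, a reconciliation should be triggered".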

This message is emitted here:
kubernetes/pkg/kubelet/status/status_manager.go
Lines 984 to 988 in [cade1dd](https://github.com/kubernetes/kubernetes/commit/cade1dddd81eba338df85de7b5d17324a87243b5)
```go
klog.V(3).InfoS("Pod status is inconsistent with cached status for pod, a reconciliation should be triggered",
	"pod", klog.KObj(pod),
	"statusDiff", cmp.Diff(podStatus, &status))
```

Once `needsReconcile()` returns true, `syncPod()` is called. This does not look very harmful, because `unchanged` ends up being true:

kubernetes/pkg/kubelet/status/status_manager.go
Lines 873 to 881 in [cade1dd](https://github.com/kubernetes/kubernetes/commit/cade1dddd81eba338df85de7b5d17324a87243b5)
```go
newPod, patchBytes, unchanged, err := statusutil.PatchPodStatus(context.TODO(), m.kubeClient, pod.Namespace, pod.Name, pod.UID, pod.Status, mergedStatus)
klog.V(3).InfoS("Patch status for pod", "pod", klog.KObj(pod), "podUID", uid, "patch", string(patchBytes))
```

Even so, the API is called once unnecessarily:

kubernetes/pkg/kubelet/status/status_manager.go
Line 843 in [cade1dd](https://github.com/kubernetes/kubernetes/commit/cade1dddd81eba338df85de7b5d17324a87243b5)
```go
pod, err := m.kubeClient.CoreV1().Pods(status.podNamespace).Get(context.TODO(), status.podName, metav1.GetOptions{})
```

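To avoid this unexpected behavior, it would be better to normalize ephemeral container statuses in the status manager, or at least their timestamps.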

kse8i1jr #2

This is a bug. Could you submit a PR to fix it? If not, I'd be happy to help with it.

nmpmafwu #3

If you're interested, feel free to assign this issue to yourself.

bnlyeluc #4

If you're interested, feel free to assign this issue to yourself.
Thanks.

wh6knrhe #6

/triage accepted
/priority important-longterm
