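**Environmental info:** K3s version: k3s version v1.24.3+k3s1 (990ba0e8), go version go1.18.1
Node CPU architecture, OS, and version: 5 RPi 4s running headless 64-bit Raspbian, each reporting: Linux 5.15.56-v8+ #1575 SMP PREEMPT Fri Jul 22 20:31:26 BST 2022 aarch64 GNU/Linux
Cluster configuration: 3 nodes configured as control plane, 2 nodes configured as workers
**Describe the bug:** The pods coredns-b96499967-ktgtc, local-path-provisioner-7b7dc8d6f5-5cfds, metrics-server-668d979685-9szb9, traefik-7cd4fcff68-gfmhm, and svclb-traefik-aa9f6b38-j27sw are in an Unknown state with 0/1 pods ready. This means the cluster DNS service is not working, so pods cannot resolve internal or external names.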
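One way to confirm the DNS symptom described above is to run a throwaway pod and try a lookup from inside the cluster. This is only a sketch under assumptions: the pod name `dns-check`, the `busybox:1.28` image, and the `KUBECTL` override variable are illustrative choices, not taken from the report.

```shell
#!/bin/sh
# Sketch only. KUBECTL is a hypothetical override for the kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Start a temporary pod, run one nslookup from inside the cluster, then
# delete the pod. The name to resolve defaults to kubernetes.default.
# Assumption: the busybox:1.28 image can be pulled on the nodes.
check_cluster_dns() {
    "$KUBECTL" run dns-check --rm -i --restart=Never \
        --image=busybox:1.28 -- nslookup "${1:-kubernetes.default}"
}
```

Calling `check_cluster_dns` tests an internal name; passing an external hostname as the first argument tests upstream resolution. With CoreDNS down, both lookups would be expected to fail.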
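Steps to reproduce:
- Installed k3s in HA mode following the instructions at https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
**Expected behavior:** Critical pods should be in the Running state with a known status. DNS should also work, which means headless services should work and pods should be able to resolve hostnames inside and outside the cluster.
**Actual behavior:** DNS pods should be running in a known state, pods should be able to resolve hostnames inside and outside the cluster, and headless services should work.
Additional context / logs: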
kubectl -n kube-system get configmap coredns -o go-template={{.data.Corefile}}
.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    hosts /etc/coredns/NodeHosts {
        ttl 60
        reload 15s
        fallthrough
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
import /etc/coredns/custom/*.server
Description of the relevant pods:
kubectl describe pods --namespace=kube-system
Name: coredns-b96499967-ktgtc
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: master0/192.168.0.68
Start Time: Fri, 05 Aug 2022 16:09:38 +0100
Labels: k8s-app=kube-dns
pod-template-hash=b96499967
Annotations: <none>
Status: Running
IP:
IPs: <none>
Controlled By: ReplicaSet/coredns-b96499967
Containers:
coredns:
Container ID: containerd://1a83a59275abdb7b783aa06eb56cb1e5367c1ca196598851c2b7d5154c0a4bb9
Image: rancher/mirrored-coredns-coredns:1.9.1
Image ID: docker.io/rancher/mirrored-coredns-coredns@sha256:35e38f3165a19cb18c65d83334c13d61db6b24905f45640aa8c2d2a6f55ebcb0
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 05 Aug 2022 19:19:19 +0100
Finished: Fri, 05 Aug 2022 19:20:29 +0100
Ready: False
Restart Count: 8
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=2s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/etc/coredns/custom from custom-config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zbbxf (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
custom-config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns-custom
Optional: true
kube-api-access-zbbxf:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 41d (x419 over 41d) kubelet Pod sandbox changed, it will be killed and re-created.
Normal SandboxChanged 64m (x11421 over 42h) kubelet Pod sandbox changed, it will be killed and re-created.
Normal SandboxChanged 2m24s (x139 over 32m) kubelet Pod sandbox changed, it will be killed and re-created.
Name: metrics-server-668d979685-9szb9
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: master0/192.168.0.68
Start Time: Fri, 05 Aug 2022 16:09:38 +0100
Labels: k8s-app=metrics-server
pod-template-hash=668d979685
Annotations: <none>
Status: Running
IP:
IPs: <none>
Controlled By: ReplicaSet/metrics-server-668d979685
Containers:
metrics-server:
Container ID: containerd://cd02643f7d7bc78ea98abdec20558626cfac39f70e1127b2281342dd00905e44
Image: rancher/mirrored-metrics-server:v0.5.2
Image ID: docker.io/rancher/mirrored-metrics-server@sha256:48ecad4fe641a09fa4459f93c7ad29d4916f6b9cf7e934d548f1d8eff96e2f35
Port: 4443/TCP
Host Port: 0/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--kubelet-use-node-status-port
--metric-resolution=15s
State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 05 Aug 2022 19:19:19 +0100
Finished: Fri, 05 Aug 2022 19:20:29 +0100
Ready: False
Restart Count: 8
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get https://:https/livez delay=60s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:https/readyz delay=0s timeout=1s period=2s #success=1 #failure=3
Environment: <none>
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-djqgk (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-djqgk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 41d (x418 over 41d) kubelet Pod sandbox changed, it will be killed and re-created.
Normal SandboxChanged 64m (x11427 over 42h) kubelet Pod sandbox changed, it will be killed and re-created.
Normal SandboxChanged 2m27s (x141 over 32m) kubelet Pod sandbox changed, it will be killed and re-created.
Name: traefik-7cd4fcff68-gfmhm
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: master0/192.168.0.68
Start Time: Fri, 05 Aug 2022 16:10:43 +0100
Labels: app.kubernetes.io/instance=traefik
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=traefik
helm.sh/chart=traefik-10.19.300
pod-template-hash=7cd4fcff68
Annotations: prometheus.io/path: /metrics
prometheus.io/port: 9100
prometheus.io/scrape: true
Status: Running
IP:
IPs: <none>
Controlled By: ReplicaSet/traefik-7cd4fcff68
Containers:
traefik:
Container ID: containerd://779a1596fb204a7577acda97e9fb3f4c5728cf1655071d8e5faad6a8d407d217
Image: rancher/mirrored-library-traefik:2.6.2
Image ID: docker.io/rancher/mirrored-library-traefik@sha256:ad2226527eea71b7591d5e9dcc0bffd0e71b2235420c34f358de6db6d529561f
Ports: 9100/TCP, 9000/TCP, 8000/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
--global.checknewversion
--global.sendanonymoususage
--entrypoints.metrics.address=:9100/tcp
--entrypoints.traefik.address=:9000/tcp
--entrypoints.web.address=:8000/tcp
--entrypoints.websecure.address=:8443/tcp
--api.dashboard=true
--ping=true
--metrics.prometheus=true
--metrics.prometheus.entrypoint=metrics
--providers.kubernetescrd
--providers.kubernetesingress
--providers.kubernetesingress.ingressendpoint.publishedservice=kube-system/traefik
--entrypoints.websecure.http.tls=true
State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 05 Aug 2022 19:19:19 +0100
Finished: Fri, 05 Aug 2022 19:20:29 +0100
Ready: False
Restart Count: 8
Liveness: http-get http://:9000/ping delay=10s timeout=2s period=10s #success=1 #failure=3
Readiness: http-get http://:9000/ping delay=10s timeout=2s period=10s #success=1 #failure=1
Environment: <none>
Mounts:
/data from data (rw)
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jw4qc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
tmp:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-jw4qc:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 41d (x415 over 41d) kubelet Pod sandbox changed, it will be killed and re-created.
Normal SandboxChanged 64m (x11418 over 42h) kubelet Pod sandbox changed, it will be killed and re-created.
Normal SandboxChanged 2m30s (x141 over 32m) kubelet Pod sandbox changed, it will be killed and re-created.
1 answer
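The workaround I found, at least for now, is to manually restart all of the kube-system deployments, which can be found by listing the deployments in that namespace.
If they are all in the same not-ready state, each of them can be restarted from the command line.
Specifically, the coredns, local-path-provisioner, metrics-server, and traefik deployments all needed to be restarted.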
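As a rough sketch of the restart described above: the deployment names below are assumed to be the k3s defaults and should be confirmed with the listing first, and `KUBECTL`, `NAMESPACE`, `DEPLOYMENTS`, and both function names are hypothetical helpers for illustration, not the answerer's exact commands.

```shell
#!/bin/sh
# Sketch only. KUBECTL is a hypothetical override for the kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="kube-system"
# Assumption: default k3s deployment names; confirm with list_deployments.
DEPLOYMENTS="coredns local-path-provisioner metrics-server traefik"

# Show the deployments in the namespace together with their READY counts.
list_deployments() {
    "$KUBECTL" get deployments --namespace "$NAMESPACE"
}

# Trigger a rolling restart of each deployment, then wait up to 120s
# for it to report a completed rollout before moving to the next one.
restart_deployments() {
    for name in $DEPLOYMENTS; do
        "$KUBECTL" rollout restart "deployment/$name" --namespace "$NAMESPACE"
        "$KUBECTL" rollout status "deployment/$name" --namespace "$NAMESPACE" --timeout=120s
    done
}
```

After sourcing the snippet, `list_deployments` shows which deployments are not ready, and `restart_deployments` restarts the four named ones in turn. `kubectl rollout restart` replaces the pods through the deployment's normal rollout, so nothing has to be deleted by hand.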