Kubernetes master stops responding to requests shortly after starting

kognpnkq, posted on 2023-08-03 in Kubernetes

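I have a two-node Kubernetes cluster:

* Master node: Ubuntu Desktop 22.04
* Worker node: Ubuntu Server 22.04

I am using the latest versions of kubectl, kubelet, and kubeadm, all at v1.27.04.
I created the cluster with kubeadm, and I set up the container environment with docker correctly. On the master node I run sudo kubeadm init, and it brings up the control plane successfully for a while: if I run kubectl get nodes, it correctly shows the control plane as Ready.
But after some time, kubectl get nodes starts returning errors like this: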

E0731 21:31:48.146646  281127 memcache.go:265] couldn't get current server API group list: Get "https://192.168.15.80:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

If I run sudo journalctl -u kubelet, I get this:

Jul 31 21:40:56 Orca kubelet[175666]: E0731 21:40:56.189126  175666 pod_workers.go:1294] "Error syncing pod, skipping" err="failed to \"KillPodSandbox\" for \"95fa1492-9967-4bab-989b-a87b401df8fb\" with KillPodSandboxError: \"rpc error: code = Unknown desc = failed to destroy network for sandbox \\\"a0da1751ea78d4ee04151ca7509ad371c52890d03580a638a3b74b5e486167a2\\\": plugin type=\\\"calico\\\" failed (delete): error getting ClusterInformation: Get \\\"https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default\\\": x509: certificate signed by unknown authority (possibly because of \\\"crypto/rsa: verification error\\\" while trying to verify candidate authority certificate \\\"kubernetes\\\")\"" pod="kube-system/coredns-5d78c9869d-gdnbc" podUID=95fa1492-9967-4bab-989b-a87b401df8fb
Jul 31 21:40:56 Orca kubelet[175666]: E0731 21:40:56.189058  175666 kuberuntime_manager.go:1038] "killPodWithSyncResult failed" err="failed to \"KillPodSandbox\" for \"95fa1492-9967-4bab-989b-a87b401df8fb\" with KillPodSandboxError: \"rpc error: code = Unknown desc = failed to destroy network for sandbox \\\"a0da1751ea78d4ee04151ca7509ad371c52890d03580a638a3b74b5e486167a2\\\": plugin type=\\\"calico\\\" failed (delete): error getting ClusterInformation: Get \\\"https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default\\\": x509: certificate signed by unknown authority (possibly because of \\\"crypto/rsa: verification error\\\" while trying to verify candidate authority certificate \\\"kubernetes\\\")\""
Jul 31 21:40:56 Orca kubelet[175666]: E0731 21:40:56.188973  175666 kuberuntime_manager.go:1312] "Failed to stop sandbox" podSandboxID={Type:containerd ID:a0da1751ea78d4ee04151ca7509ad371c52890d03580a638a3b74b5e486167a2}
Jul 31 21:40:56 Orca kubelet[175666]: E0731 21:40:56.188910  175666 remote_runtime.go:205] "StopPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to destroy network for sandbox \"a0da1751ea78d4ee04151ca7509ad371c52890d03580a638a3b74b5e486167a2\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")" podSandboxID="a0da1751ea78d4ee04151ca7509ad371c52890d03580a638a3b74b5e486167a2"
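The kubelet log above repeats the same x509 failure across several components. To see how widespread it is, a small filter like the following might help. It is only a sketch: count_x509_errors is a hypothetical helper name, and it simply counts log lines containing the error string quoted in the output above.

```shell
# Hypothetical helper: counts lines on stdin that contain the x509 error
# seen in the kubelet log. Prints 0 (and still exits 0) if there are none.
count_x509_errors() {
  grep -c 'x509: certificate signed by unknown authority' || true
}

# Possible usage on the master node (not run here):
#   sudo journalctl -u kubelet --no-pager | count_x509_errors
```

A high count alongside the same message from kubectl would suggest a cluster-wide certificate trust problem rather than a single misbehaving pod.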


I also tried joining from the worker node using the output of the following command:
kubeadm token create --print-join-command
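As far as I can tell, my Kubernetes control plane has somehow crashed and is not responding to requests.
I cannot even ping from the master node to the worker node, or vice versa, using their real IPs.
I am running weaver 2.8.1 as a daemon set.
How can I solve this problem?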

Answer #1, by yfwxisqw

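Your problem is most likely caused by a certificate mismatch.
Check whether your $HOME/.kube/config file contains a valid certificate and, if necessary, regenerate it as described on the official troubleshooting page.
Unset the KUBECONFIG environment variable with the following command: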

unset KUBECONFIG

Alternatively, set KUBECONFIG to the default kubeconfig location:

export KUBECONFIG=/etc/kubernetes/admin.conf
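Before changing anything, it can be useful to see which kubeconfig file kubectl would pick up in the current shell. The sketch below assumes the usual lookup order (the KUBECONFIG environment variable if it is set and non-empty, otherwise $HOME/.kube/config); effective_kubeconfig is a hypothetical helper name, and it does not split a colon-separated KUBECONFIG list.

```shell
# Hypothetical helper: prints the kubeconfig path kubectl is expected to use.
# Assumption: KUBECONFIG wins when set and non-empty; otherwise the default
# $HOME/.kube/config is used. A colon-separated list is printed unchanged.
effective_kubeconfig() {
  printf '%s\n' "${KUBECONFIG:-$HOME/.kube/config}"
}

# Possible usage (not run here):
#   ls -l "$(effective_kubeconfig)"
```

If this prints a path left over from an earlier kubeadm init, that stale file is a likely source of the "unknown authority" error.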


Or overwrite the existing kubeconfig for the "admin" user:

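# Back up the current kubeconfig directory, then recreate it from admin.conf
mv "$HOME/.kube" "$HOME/.kube.bak"
mkdir -p "$HOME/.kube"
sudo cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
# Make the copied file owned by the current user
sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"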
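To confirm whether the kubeconfig really carries a different CA than the running cluster, the embedded CA can be compared with the one kubeadm wrote to disk. This is a rough sketch under two assumptions: the kubeconfig embeds the CA as certificate-authority-data (as kubeadm-generated files do), and that data is the byte-for-byte base64 encoding of /etc/kubernetes/pki/ca.crt. kubeconfig_ca_matches is a hypothetical helper name; comparing certificate fingerprints would be a more robust check than a byte comparison.

```shell
# Hypothetical helper: succeeds (exit 0) when the CA embedded in a kubeconfig
# is byte-identical to the given CA file, and fails (non-zero) otherwise.
# Assumption: the first certificate-authority-data entry is the one in use.
kubeconfig_ca_matches() {
  kubeconfig="$1"
  ca_file="$2"
  awk '/certificate-authority-data:/ {print $2; exit}' "$kubeconfig" \
    | base64 -d \
    | cmp -s - "$ca_file"
}

# Possible usage on the master node (not run here):
#   kubeconfig_ca_matches "$HOME/.kube/config" /etc/kubernetes/pki/ca.crt \
#     && echo "CA matches" || echo "CA differs"
```

If the CAs differ, the kubeconfig probably predates the most recent kubeadm init, and copying admin.conf again as shown above should bring it back in line.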
