Kubernetes worker nodes cannot connect to DNS

b4lqfgs4, posted on 2023-06-28 in Kubernetes

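Over the past two days I have read dozens of posts about similar problems, but I have not been able to solve this DNS issue.
Basically, pods on the worker nodes cannot resolve any hostnames because they cannot reach the kube-dns address 10.96.0.10 (connection timed out).
I have included the output of some commands I used while trying to debug this. If anything else would help, please ask in the comments and I will add it quickly.
Here is my setup: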

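  1. Three Ubuntu 22.04 instances
  2. One of them is the control-plane node; the others are worker nodes
  3. I initialized the cluster with: kubeadm init --control-plane-endpoint=94.250.248.250 --cri-socket=unix:///var/run/cri-dockerd.sock
  4. I use Weave as the CNI (I tried Flannel before and had the same problem, so I switched to Weave to see whether it would help; it did not)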

Nodes

NAME                STATUS   ROLES           AGE   VERSION
feedgerald.com      Ready    control-plane   92m   v1.27.3
n1.feedgerald.com   Ready    <none>          90m   v1.27.3
n2.feedgerald.com   Ready    <none>          90m   v1.27.3

Pods

beluc@feedgerald:~/workspace/feedgerald/worker/kubernetes$ kubectl get po --all-namespaces -o wide
NAMESPACE     NAME                                     READY   STATUS             RESTARTS         AGE   IP               NODE                NOMINATED NODE   READINESS GATES
default       dnsutils                                 1/1     Running            0                75m   10.40.0.3        n2.feedgerald.com   <none>           <none>
default       scraper-deployment-56f5fbb68b-67cqq      0/1     Completed          21 (5m24s ago)   86m   10.32.0.3        n1.feedgerald.com   <none>           <none>
default       scraper-deployment-56f5fbb68b-hcrmj      0/1     Completed          21 (5m24s ago)   86m   10.32.0.2        n1.feedgerald.com   <none>           <none>
default       scraper-deployment-56f5fbb68b-m6ltp      0/1     CrashLoopBackOff   21 (67s ago)     86m   10.40.0.2        n2.feedgerald.com   <none>           <none>
default       scraper-deployment-56f5fbb68b-pfvlx      0/1     CrashLoopBackOff   21 (18s ago)     86m   10.40.0.1        n2.feedgerald.com   <none>           <none>
kube-system   coredns-5d78c9869d-g4zzk                 1/1     Running            0                93m   172.17.0.2       feedgerald.com      <none>           <none>
kube-system   coredns-5d78c9869d-xg5fk                 1/1     Running            0                93m   172.17.0.4       feedgerald.com      <none>           <none>
kube-system   etcd-feedgerald.com                      1/1     Running            0                93m   94.250.248.250   feedgerald.com      <none>           <none>
kube-system   kube-apiserver-feedgerald.com            1/1     Running            0                93m   94.250.248.250   feedgerald.com      <none>           <none>
kube-system   kube-controller-manager-feedgerald.com   1/1     Running            0                93m   94.250.248.250   feedgerald.com      <none>           <none>
kube-system   kube-proxy-7f4w2                         1/1     Running            0                92m   92.63.105.188    n2.feedgerald.com   <none>           <none>
kube-system   kube-proxy-jh959                         1/1     Running            0                91m   82.146.44.93     n1.feedgerald.com   <none>           <none>
kube-system   kube-proxy-jwwkt                         1/1     Running            0                93m   94.250.248.250   feedgerald.com      <none>           <none>
kube-system   kube-scheduler-feedgerald.com            1/1     Running            0                93m   94.250.248.250   feedgerald.com      <none>           <none>
kube-system   weave-net-fllvh                          2/2     Running            1 (89m ago)      89m   92.63.105.188    n2.feedgerald.com   <none>           <none>
kube-system   weave-net-kdd9p                          2/2     Running            1 (89m ago)      89m   82.146.44.93     n1.feedgerald.com   <none>           <none>
kube-system   weave-net-x5ksv                          2/2     Running            1 (89m ago)      89m   94.250.248.250   feedgerald.com      <none>           <none>
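
One detail in this listing that may be worth checking: the two coredns pods report 172.17.0.x addresses, while the application pods on the workers have 10.32.x.x and 10.40.x.x addresses. Assuming Weave Net is running with its default allocation range of 10.32.0.0/12 (the question does not show the configured range), the sketch below flags pod IPs that fall outside it. Whether this is related to the timeout is not established by the post. The helper names ip_to_int, in_cidr and outside_cidr are made up for this illustration, and host-network pods such as kube-proxy will legitimately be listed with their node IP.

```shell
# Hypothetical helpers (not part of kubectl or Weave); POSIX sh compatible.

# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if IP $1 lies inside CIDR $2 (for example 10.32.0.0/12).
in_cidr() {
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Read "name ip" pairs on stdin; print those whose IP is outside CIDR $1.
outside_cidr() {
  cidr=$1
  while read -r name ip; do
    [ -n "$ip" ] || continue
    in_cidr "$ip" "$cidr" || echo "$name $ip"
  done
}

# Usage against a live cluster (10.32.0.0/12 is an assumed range):
#   kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.podIP}{"\n"}{end}' \
#     | outside_cidr 10.32.0.0/12
```

With the addresses shown above, 10.40.0.3 (dnsutils) is inside 10.32.0.0/12 and 172.17.0.2 (coredns) is not.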

CoreDNS logs (just in case)

beluc@feedgerald:~/workspace/feedgerald/worker/kubernetes$ kubectl logs -n kube-system coredns-5d78c9869d-g4zzk
.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.10.1
linux/amd64, go1.20, 055b2c3
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:43929->185.60.132.11:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:40076->82.146.59.250:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:36699->185.60.132.11:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:57545->82.146.59.250:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:36760->185.60.132.11:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:53409->188.120.247.2:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:60134->188.120.247.2:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:54812->82.146.59.250:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:44563->188.120.247.2:53: i/o timeout
[ERROR] plugin/errors: 2 2971729299988687576.7504631273068998690. HINFO: read udp 172.17.0.2:36629->188.120.247.2:53: i/o timeout
[ERROR] plugin/errors: 2 checkpoint-api.weave.works.domains. A: read udp 172.17.0.2:35531->188.120.247.2:53: i/o timeout
[ERROR] plugin/errors: 2 checkpoint-api.weave.works. AAAA: read udp 172.17.0.2:33150->82.146.59.250:53: i/o timeout
[ERROR] plugin/errors: 2 checkpoint-api.weave.works. A: read udp 172.17.0.2:42371->185.60.132.11:53: i/o timeout
[ERROR] plugin/errors: 2 checkpoint-api.weave.works. A: read udp 172.17.0.2:44653->185.60.132.11:53: i/o timeout

nslookup on one of the pods

beluc@feedgerald:~/workspace/feedgerald/worker/kubernetes$ kubectl exec -ti dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached

command terminated with exit code 1
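
To narrow down where this lookup fails, one option is to repeat the query against the kube-dns service IP and then directly against a CoreDNS pod IP. The sketch below wraps that in a small helper; dns_probe is a hypothetical name, and it assumes the target pod ships nslookup, as dnsutils does here. If the pod IP answers but the service IP does not, that would point at the service path; if neither answers, traffic between nodes is the more likely suspect. The post itself does not show either result.

```shell
# Hypothetical helper: run nslookup inside a pod against an explicit DNS server.
#   $1 = pod name, $2 = DNS server IP, $3 = name to resolve (optional)
dns_probe() {
  kubectl exec "$1" -- nslookup "${3:-kubernetes.default}" "$2"
}

# Usage with the addresses from the question:
#   dns_probe dnsutils 10.96.0.10   # via the kube-dns service IP
#   dns_probe dnsutils 172.17.0.2   # directly to one CoreDNS pod
```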

Printing resolv.conf on that pod

beluc@feedgerald:~/workspace/feedgerald$ kubectl exec -ti dnsutils -- cat /etc/resolv.conf 
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local DOMAINS
options ndots:5

Showing that kube-dns is running

beluc@feedgerald:~/workspace/feedgerald$ kubectl get svc --all-namespaces
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  97m
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   97m

Here is the iptables configuration (Stack Overflow does not allow a paste this large in a question, so it is on pastebin): https://pastebin.com/raw/XTpWaeCb

Answer 1 (bqjvbblv)

This solved the problem. But I still don't understand why there was a problem in the first place.

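# Run as root: set the default policy of the built-in filter chains to ACCEPT,
# then flush all rules in the filter table.
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -F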
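
The answer does not say which rule was responsible. One cautious way to find out is to save the current rules and look at the default chain policies before flushing anything. The sketch below does the second part; non_accept_policies is a hypothetical helper and the backup path is only an example. A chain reported with DROP would be a candidate explanation, though the post does not confirm it.

```shell
# Hypothetical helper: read `iptables -S` output on stdin and print every
# built-in chain whose default policy is not ACCEPT.
non_accept_policies() {
  awk '$1 == "-P" && $3 != "ACCEPT" { print $2, $3 }'
}

# Usage (as root), before applying the commands above:
#   iptables-save > /root/iptables-before.rules   # backup; path is an example
#   iptables -S | non_accept_policies
# To undo the flush later:
#   iptables-restore < /root/iptables-before.rules
```

Keeping the saved file also makes it possible to compare the rules before and after the flush to see what was removed.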
