Why does the Mesos master disconnect the k8s framework and try to shut down a file descriptor that does not exist?

Posted by 6yt4nkrj on 2021-06-21 in Mesos

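I am trying to follow this tutorial to deploy "Kubernetes on Mesos" on my local machines: k8s is the latest master branch and Mesos is version 0.26.
After starting the Mesos master (IP: 15.242.100.56), the Mesos slave (IP: 15.242.100.16) and k8s (IP: 15.242.100.60), I see the following in the Mesos master log: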

I1228 21:56:55.591568 27255 hierarchical.cpp:344] Added slave 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-S0 (pqsfc016.ftc.rdlabs.hpecorp.net) with cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000] (allocated: )
I1228 21:56:55.591601 27240 replica.cpp:700] Replica learned TRUNCATE action at position 4
I1228 21:56:55.593670 27233 master.cpp:4269] Received update of slave 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-S0 at slave(1)@15.242.100.16:5051 (pqsfc016.ftc.rdlabs.hpecorp.net) with total oversubscribed resources
I1228 21:56:55.594622 27239 hierarchical.cpp:400] Slave 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-S0 (pqsfc016.ftc.rdlabs.hpecorp.net) updated with oversubscribed resources  (total: cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000], allocated: )
I1228 21:57:11.060005 27256 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:40727 with User-Agent='Go-http-client/1.1'
I1228 21:57:12.053403 27244 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:40754 with User-Agent='Go-http-client/1.1'
I1228 21:57:12.084724 27256 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:40771 with User-Agent='Go-http-client/1.1'
I1228 21:57:13.130113 27251 http.cpp:334] HTTP GET for /master/state.json from 15.242.100.60:40779 with User-Agent='Go-http-client/1.1'
I1228 21:57:13.136896 27249 master.cpp:2176] Received SUBSCRIBE call for framework 'Kubernetes' at scheduler(1)@15.242.100.60:49163
I1228 21:57:13.137248 27249 master.cpp:2247] Subscribing framework Kubernetes with checkpointing enabled and capabilities [  ]
E1228 21:57:13.138357 27257 process.cpp:1911] Failed to shutdown socket with fd 17: Transport endpoint is not connected
I1228 21:57:13.138389 27255 hierarchical.cpp:195] Added framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000
I1228 21:57:13.138842 27249 master.cpp:1122] Framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163 disconnected
I1228 21:57:13.138898 27249 master.cpp:2472] Disconnecting framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163
I1228 21:57:13.138943 27249 master.cpp:2496] Deactivating framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163
E1228 21:57:13.138975 27257 process.cpp:1911] Failed to shutdown socket with fd 17: Transport endpoint is not connected
I1228 21:57:13.139091 27249 master.cpp:1146] Giving framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163 7625.14222623576weeks to failover
I1228 21:57:13.139468 27255 hierarchical.cpp:273] Deactivated framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000
W1228 21:57:13.139472 27236 master.cpp:4840] Master returning resources offered to framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 because the framework has terminated or is inactive
I1228 21:57:13.140090 27246 hierarchical.cpp:744] Recovered cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000] (total: cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000], allocated: ) on slave 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-S0 from framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000

My questions are:
(1) Why does the Mesos master disconnect the k8s framework right after subscribing it:

I1228 21:57:13.136896 27249 master.cpp:2176] Received SUBSCRIBE call for framework 'Kubernetes' at scheduler(1)@15.242.100.60:49163
I1228 21:57:13.137248 27249 master.cpp:2247] Subscribing framework Kubernetes with checkpointing enabled and capabilities [  ]
I1228 21:57:13.138389 27255 hierarchical.cpp:195] Added framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000
I1228 21:57:13.138842 27249 master.cpp:1122] Framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163 disconnected
I1228 21:57:13.138898 27249 master.cpp:2472] Disconnecting framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163
I1228 21:57:13.138943 27249 master.cpp:2496] Deactivating framework 5de231c9-993c-4ac7-8ffb-c3fbff2c61cd-0000 (Kubernetes) at scheduler(1)@15.242.100.60:49163

(2) Here is the output of the command sudo lsof -p 27219 -P -n on the Mesos master:

lt-mesos- 27219  nan    0u   CHR  136,2       0t0       5 /dev/pts/2
lt-mesos- 27219  nan    1u   CHR  136,2       0t0       5 /dev/pts/2
lt-mesos- 27219  nan    2u   CHR  136,2       0t0       5 /dev/pts/2
lt-mesos- 27219  nan    3u  0000   0,10         0    8938 anon_inode
lt-mesos- 27219  nan    4u  0000   0,10         0    8938 anon_inode
lt-mesos- 27219  nan    5u  IPv4  85594       0t0     TCP 15.242.100.56:5050 (LISTEN)
lt-mesos- 27219  nan    6w   REG  252,3       360 2099579 /var/lib/mesos/replicated_log/LOG
lt-mesos- 27219  nan    7uW  REG  252,3         0 2099580 /var/lib/mesos/replicated_log/LOCK
lt-mesos- 27219  nan    8u  IPv4 107697       0t0     TCP 15.242.100.56:5050->15.242.100.16:53987 (ESTABLISHED)
lt-mesos- 27219  nan    9u   REG  252,3     65536 2099584 /var/lib/mesos/replicated_log/MANIFEST-000002
lt-mesos- 27219  nan   10u   REG  252,3     65536 2099581 /var/lib/mesos/replicated_log/000004.log
lt-mesos- 27219  nan   11u  IPv4  88952       0t0     TCP 15.242.100.56:59746->15.242.100.16:5051 (ESTABLISHED)
lt-mesos- 27219  nan   12u  IPv4 106756       0t0     TCP 15.242.100.56:5050->15.242.100.60:40727 (ESTABLISHED)
lt-mesos- 27219  nan   13u  IPv4 104980       0t0     TCP 15.242.100.56:5050->15.242.100.60:40754 (ESTABLISHED)
lt-mesos- 27219  nan   14u  IPv4 105876       0t0     TCP 15.242.100.56:5050->15.242.100.60:40771 (ESTABLISHED)
lt-mesos- 27219  nan   15u  IPv4 104981       0t0     TCP 15.242.100.56:5050->15.242.100.60:40779 (ESTABLISHED)
lt-mesos- 27219  nan   16u  IPv4  95212       0t0     TCP 15.242.100.56:5050->15.242.100.60:40780 (ESTABLISHED)

I cannot see which connection file descriptor 17 belongs to, or why the Mesos master tries to shut it down:

E1228 21:57:13.138975 27257 process.cpp:1911] Failed to shutdown socket with fd 17: Transport endpoint is not connected
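
To double-check that fd 17 really is absent from the listing, the lsof output can be filtered on its FD column. The sketch below is a minimal illustration: fd_rows is a hypothetical helper name, not part of lsof or Mesos, and it assumes the default lsof column layout shown above (FD is the fourth column, with a mode suffix such as "u" or "w"). One plausible reading of the log, which the original post does not confirm, is that fd 17 was the master's outbound connection attempt to scheduler(1)@15.242.100.60:49163, which never connected and was already closed by the time lsof ran.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lsof rows whose FD column matches a given number.
# Reads `lsof -p <pid> -P -n` output on stdin.
# Assumes the default column order: COMMAND PID USER FD TYPE ...
fd_rows() {
  local fd="$1"
  awk -v fd="$fd" '{
    n = $4
    sub(/[^0-9].*$/, "", n)        # strip the mode suffix, e.g. "17u" -> "17"
    if (n != "" && n == fd) print  # non-numeric FDs (cwd, txt, header) are skipped
  }'
}
```

It could be used as sudo lsof -p 27219 -P -n | fd_rows 17. For the listing above this would print nothing, since the highest descriptor shown is 16.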

Answer 1 (cs7cruho)

Found the problem. Flushing all iptables rules on the k8s server fixed it:

iptables -F

After that, it works!
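
Since iptables -F drops every rule in the filter table, it may be worth confirming first that the firewall is what blocks the master from reaching the scheduler. The sketch below is an illustration under assumptions, not a method from the original answer: extract_scheduler_endpoint and can_connect are hypothetical helper names, the probe relies on bash's /dev/tcp redirection and the coreutils timeout command, and it assumes the master must be able to open a TCP connection back to the scheduler(1)@ip:port address that appears in its log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the first scheduler ip:port from Mesos master
# log lines read on stdin, e.g. "... at scheduler(1)@15.242.100.60:49163".
extract_scheduler_endpoint() {
  grep -oE 'scheduler\([0-9]+\)@[0-9.]+:[0-9]+' | head -n 1 | cut -d@ -f2
}

# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 3 seconds, non-zero otherwise. Uses bash's /dev/tcp pseudo-device.
can_connect() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

Run on the Mesos master host, can_connect 15.242.100.60 49163 would show whether the scheduler port from the log is reachable. If it is not, listing the rules on the k8s server with sudo iptables -L -n -v before flushing them can help identify the specific rule responsible.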
