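I am using Kafka 0.11.0.3.
I have one Kafka broker and a remote ZooKeeper cluster. I started the Kafka server; it registered its id in ZooKeeper successfully, and I can even list topics with the kafka-topics.sh command. The problem is that I keep seeing the following lines in the Kafka log: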
[2019-01-08 10:51:09,138] WARN Attempting to send response via channel for which there is no open connection, connection id 192.168.0.201:9092-192.168.0.201:58292 (kafka.network.Processor)
[2019-01-08 10:51:09,198] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,226] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,306] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,327] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,382] WARN Attempting to send response via channel for which there is no open connection, connection id 192.168.0.201:9092-192.168.0.201:58296 (kafka.network.Processor)
[2019-01-08 10:51:09,408] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,446] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,559] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2019-01-08 10:51:09,602] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
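The broker tries to connect to port 58292 on the same machine (the one the Kafka server is running on), but the connection cannot be established. I also checked the controller dir in ZooKeeper, and it is empty. Even stranger, when I list the TCP connections on the Kafka server node, I see a large number of TIME_WAIT connections: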
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 192.168.0.201:55572 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56290 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55442 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55512 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56074 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56286 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55460 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55904 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55488 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56308 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55502 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56326 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55960 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55930 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56300 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56004 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55470 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55474 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55432 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55412 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56304 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55858 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55860 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56324 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55388 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56168 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55898 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55820 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55676 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56202 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55756 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56278 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55658 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55628 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56038 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56108 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55988 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55894 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55428 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55424 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56128 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56146 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55884 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56280 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55798 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56120 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55888 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55708 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55696 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56298 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55646 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56150 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55376 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55980 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55556 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56208 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55752 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55982 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55864 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55760 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56056 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56002 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55536 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55576 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55392 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55726 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55426 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55710 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56042 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56264 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55606 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55972 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56176 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55780 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56342 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55534 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55438 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56114 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56068 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55880 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56350 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55970 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55404 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55672 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55454 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55946 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56126 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55538 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56124 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55712 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56084 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55992 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56302 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55984 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55394 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55550 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56094 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55936 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55530 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55868 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56294 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:55876 192.168.0.201:9092 TIME_WAIT -
tcp 0 31 192.168.0.201:57552 192.168.0.204:2181 ESTABLISHED 1015/java
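The only connection that is successfully established is the one to ZooKeeper (the last line). I also checked port 9092 from a remote node, and it is open: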
Starting Nmap 7.01 ( https://nmap.org ) at 2019-01-08 11:32 +0330
Nmap scan report for (192.168.0.201)
Host is up (0.0027s latency).
PORT STATE SERVICE
9092/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 0.08 seconds
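To make the netstat listing above easier to read, the rows can be grouped by state and foreign address. The following is only a minimal sketch: the function name `summarize_netstat` and the three sample rows are illustrative, and it assumes whitespace-separated columns in the order shown in the header (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State, PID/Program name).

```python
from collections import Counter


def summarize_netstat(text):
    """Count TCP rows by (state, foreign address) in netstat-style output."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: tcp 0 0 <local> <foreign> <state> <pid/program>
        # The header row starts with "Proto" and is skipped here.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        counts[(state, foreign)] += 1
    return counts


# A few rows copied from the listing above (illustrative subset).
sample = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 192.168.0.201:55572 192.168.0.201:9092 TIME_WAIT -
tcp 0 0 192.168.0.201:56290 192.168.0.201:9092 TIME_WAIT -
tcp 0 31 192.168.0.201:57552 192.168.0.204:2181 ESTABLISHED 1015/java
"""

for (state, foreign), n in summarize_netstat(sample).most_common():
    print(f"{n:4d}  {state:<12} -> {foreign}")
```

Fed the full listing, this would show how many short-lived connections to 192.168.0.201:9092 are sitting in TIME_WAIT compared with the single ESTABLISHED ZooKeeper connection.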
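Some notes:
The broker worked fine for about 2 months, and the error appeared suddenly.
The ZooKeeper cluster is healthy, since other components (such as HDFS) are using it without errors.
The OS is CentOS 7, and no firewall is enabled.
Here is the Kafka server configuration: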
broker.id=100
listeners=PLAINTEXT://192.168.0.201:9092
num.partitions=24
delete.topic.enable=true
log.dirs=/data/esb
zookeeper.connect=co1:2181,co2:2181
log.retention.hours=168
zookeeper.session.timeout.ms=40000
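For anyone who wants to check reachability of the endpoints named in this configuration, here is a rough sketch that extracts the host/port pairs from `listeners` and `zookeeper.connect`. The helper names (`parse_properties`, `endpoints`, `can_connect`) are made up for illustration, and the parsing assumes plain `key=value` lines, listeners with an explicit host, and no IPv6 literals.

```python
import socket


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def endpoints(props):
    """Return (host, port) pairs for the listeners and the ZooKeeper hosts."""
    result = []
    for listener in props.get("listeners", "").split(","):
        if "://" in listener:
            host, _, port = listener.split("://", 1)[1].rpartition(":")
            result.append((host, int(port)))
    # Drop an optional chroot suffix (e.g. "/kafka") before splitting hosts.
    zk = props.get("zookeeper.connect", "").split("/", 1)[0]
    for entry in zk.split(","):
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep:
            # Assumption: fall back to ZooKeeper's usual client port.
            host, port = entry, "2181"
        result.append((host, int(port)))
    return result


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


config = """\
broker.id=100
listeners=PLAINTEXT://192.168.0.201:9092
zookeeper.connect=co1:2181,co2:2181
"""

for host, port in endpoints(parse_properties(config)):
    print(host, port)
```

Calling `can_connect(host, port)` for each pair from the broker machine would be a quick way to confirm that co1 and co2 resolve and accept connections, and that the listener address is reachable locally.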
What could cause so many connections to end up in TIME_WAIT?