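I am using OpenStack Pike with RabbitMQ 3.6.5 on Erlang 18.3.4.4 as the message queue. One of the nova services (most often nova-scheduler) shows errors like the following in its log. It says the service cannot connect to RabbitMQ, and the error keeps appearing until it is restarted. Instance launches also fail because of this error. The service openstack-nova-start command exits successfully.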
2018-05-31 06:19:44.737 5510 INFO nova.service [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Starting scheduler node (version 16.1.1-1.el7)
2018-05-31 06:19:44.745 5510 DEBUG nova.service [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Creating RPC server for service scheduler start /usr/lib/python2.7/site-packages/nova/service.py:179
2018-05-31 06:19:44.749 5510 DEBUG oslo.messaging._drivers.pool [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Pool creating new connection create /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/pool.py:143
2018-05-31 06:19:44.753 5510 DEBUG oslo.messaging._drivers.impl_rabbit [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] [57700d60-c761-4072-b4a3-d33706864086] Connecting to AMQP server on localhost:5672 __init__ /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py:597
2018-05-31 06:19:44.761 5510 DEBUG nova.scheduler.host_manager [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Found 2 cells: 00000000-0000-0000-0000-000000000000, 4963dd3d-246d-4f9b-9d4b-0cb1ab3032ae _load_cells /usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py:642
2018-05-31 06:19:44.761 5510 DEBUG nova.scheduler.host_manager [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] START:_async_init_instance_info _async_init_instance_info /usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py:421
2018-05-31 06:19:44.763 5510 DEBUG oslo_concurrency.lockutils [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Lock "00000000-0000-0000-0000-000000000000" acquired by "nova.context.get_or_set_cached_cell_and_set_connections" :: waited 0.000s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:270
2018-05-31 06:19:44.764 5510 DEBUG oslo_concurrency.lockutils [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] Lock "00000000-0000-0000-0000-000000000000" released by "nova.context.get_or_set_cached_cell_and_set_connections" :: held 0.001s inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:282
2018-05-31 06:19:49.761 5510 DEBUG oslo.messaging._drivers.impl_rabbit [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] [57700d60-c761-4072-b4a3-d33706864086] Received recoverable error from kombu: on_error /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py:744
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 494, in _ensured
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit return fun(*args,**kwargs)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 569, in __call__
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self.revive(self.connection.default_channel)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 819, in default_channel
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self.connection
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 802, in connection
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self._connection = self._establish_connection()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 757, in _establish_connection
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit conn = self.transport.establish_connection()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 130, in establish_connection
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit conn.connect()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 300, in connect
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self.drain_events(timeout=self.connect_timeout)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 464, in drain_events
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit return self.blocking_read(timeout)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 468, in blocking_read
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit frame = self.transport.read_frame()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 237, in read_frame
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit frame_header = read(7, True)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/amqp/transport.py", line 377, in _read
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit s = recv(n - len(rbuf))
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 354, in recv
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit return self._recv_loop(self.fd.recv, b'', bufsize, flags)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 348, in _recv_loop
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit self._read_trampoline()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 319, in _read_trampoline
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit timeout_exc=socket.timeout("timed out"))
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/greenio/base.py", line 203, in _trampoline
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit mark_as_closed=self._mark_as_closed)
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/hubs/__init__.py", line 162, in trampoline
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit return hub.switch()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit return self.greenlet.switch()
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit timeout: timed out
2018-05-31 06:19:49.761 5510 ERROR oslo.messaging._drivers.impl_rabbit
2018-05-31 06:19:49.763 5510 ERROR oslo.messaging._drivers.impl_rabbit [req-ba58e560-cca2-43ee-a770-5bc94bc14624 - - - - -] [57700d60-c761-4072-b4a3-d33706864086] AMQP server on 127.0.0.1:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
The RabbitMQ log shows TCP connections being accepted and then closed with an error.
=INFO REPORT==== 31-May-2018::06:39:41 === accepting AMQP connection <0.957.0> (127.0.0.1:33456 -> 127.0.0.1:5672)
=ERROR REPORT==== 31-May-2018::06:40:11 === closing AMQP connection <0.948.0> (127.0.0.1:33450 -> 127.0.0.1:5672): {handshake_timeout,frame_header}
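After some investigation, I found that the RabbitMQ error indicates the client did not send a proper frame header, so the connection was closed. However, I cannot find a way to see what is wrong with the frame header, or whether this is caused by some other error. Can anyone tell me how to debug this further? Thanks.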
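One way to separate a broker-side problem from a client-side one is to perform the first step of the handshake by hand. The sketch below is not part of oslo.messaging or kombu; it is a minimal stand-alone probe that assumes the broker speaks AMQP 0-9-1 on the host and port shown in the logs. It sends the 8-byte protocol header and waits for the 7-byte frame header of the broker's first reply, which is the same read that times out in the traceback above (`read(7, True)`). The names `probe_broker`, `parse_frame_header` and `_read_exact` are made up for this example.

```python
import socket
import struct

# AMQP 0-9-1 protocol header: the first bytes a client sends after connecting.
AMQP_PROTOCOL_HEADER = b"AMQP\x00\x00\x09\x01"


def parse_frame_header(header):
    """Split a 7-byte AMQP frame header into (frame type, channel, payload size)."""
    if len(header) != 7:
        raise ValueError("expected 7 bytes, got %d" % len(header))
    # 1-byte type, 2-byte channel, 4-byte payload size, all big-endian.
    return struct.unpack(">BHI", header)


def _read_exact(sock, count):
    """Read exactly `count` bytes, or raise EOFError if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise EOFError("connection closed after %d bytes" % len(data))
        data += chunk
    return data


def probe_broker(host="localhost", port=5672, timeout=5.0):
    """Send the protocol header and return the broker's first frame header.

    Raises socket.timeout if the broker accepts the TCP connection but
    does not answer within `timeout` seconds.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(AMQP_PROTOCOL_HEADER)
        return parse_frame_header(_read_exact(sock, 7))
    finally:
        sock.close()
```

Calling `probe_broker("localhost", 5672)` against a healthy broker should return a method frame on channel 0, that is, a tuple starting with `(1, 0, ...)`. If the call raises `socket.timeout` while the nova service is failing, the broker is accepting connections but not replying, which points away from a malformed frame header on the client side.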
1 Answer
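My guess is that this is a TCP/buffer problem. Try increasing RabbitMQ's memory buffers and the number of TCP connections it allows.
Network settings:
/proc/sys/net/ipv4/tcp_sack, the kernel configuration file for TCP selective acknowledgements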
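Before changing anything, it may help to record the current kernel values. The following is a small sketch that reads a few entries under /proc/sys/net on Linux. Only tcp_sack comes from the answer above; the other entries in `SETTINGS` (buffer sizes and connection backlogs) are an assumed selection for illustration, and `read_net_settings` is a hypothetical helper name.

```python
import os

# Only ipv4/tcp_sack is named in the answer; the rest is an assumed
# selection of buffer and backlog settings that are often inspected.
SETTINGS = (
    "ipv4/tcp_sack",
    "ipv4/tcp_rmem",
    "ipv4/tcp_wmem",
    "ipv4/tcp_max_syn_backlog",
    "core/somaxconn",
)


def read_net_settings(names=SETTINGS, base="/proc/sys/net"):
    """Return {name: list of value strings}, or None where a file is unreadable."""
    values = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path) as handle:
                # Some entries hold several whitespace-separated numbers.
                values[name] = handle.read().split()
        except (IOError, OSError):
            values[name] = None
    return values
```

Printing the result of `read_net_settings()` on the RabbitMQ host gives a baseline to compare against after any tuning, so a change that does not help can be reverted.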