One node of a RabbitMQ cluster occasionally consumes a large amount of memory (until OOM)

relj7zay  posted on 2022-11-08  in  RabbitMQ

Environment:

  • OpenStack Train, deployed by kolla-ansible
  • RabbitMQ 3.7.10 on Erlang 20.2.2
  • Three control nodes (which also run other components)

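Problem:

  • RabbitMQ on node-34 consumed a large amount of memory (30 G) between 04-20 16:31 and 04-20 16:46. I restarted the rabbitmq process manually; otherwise it keeps consuming memory until it triggers the OOM killer, even with vm_memory_high_watermark set to 0.1 (seen on another cluster with the same environment).
  • RabbitMQ on node-33 consumed 15 G of virtual memory, but very little physical memory, between 04-20 16:26 and 04-20 16:28.
  • Restarting only the node-34 rabbitmq process fixed the problem.

Questions:

  • What is the root cause of this problem?
  • How can I fix it for good, without having to restart when the problem occurs?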
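One thing worth checking first is which watermark is actually in effect. The node-34 startup log in this post contains a "Memory high watermark set to ..." line, and the fraction can be recovered from its byte counts. The sketch below does that arithmetic; the function name is made up for illustration. Keep in mind that vm_memory_high_watermark is the threshold at which RabbitMQ starts blocking publishers, not a hard cap on the memory the process can use, so on its own it will not necessarily prevent an OOM kill.

```python
import re

# Line copied from the node-34 startup log in this post.
WATERMARK_LINE = (
    "Memory high watermark set to 618721 MiB (648776025702 bytes) "
    "of 1546802 MiB (1621940064256 bytes) total"
)


def watermark_fraction(line):
    """Return (limit_bytes, total_bytes, limit/total) parsed from the log line."""
    m = re.search(r"\((\d+) bytes\) of \d+ MiB \((\d+) bytes\) total", line)
    if m is None:
        raise ValueError("not a memory high watermark log line")
    limit, total = int(m.group(1)), int(m.group(2))
    return limit, total, limit / total


limit, total, fraction = watermark_fraction(WATERMARK_LINE)
print(f"watermark fraction: {fraction:.2f}")
print(f"watermark limit:    {limit / 2**30:.0f} GiB")
```

For the logged values the fraction works out to 0.4 of total RAM, roughly 604 GiB. If that line reflects the configuration during the incident, a process using 30 G was far below the point where the broker would start throttling publishers.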

Component logs:

node-33 rabbitmq

2022-04-20 16:20:00.731 [info] <0.30576.499> connection <0.30576.499> (1.1.1.45:33314 -> 1.1.1.33:5672 - nova-compute:7:1dab7694-168e-491c-8aa5-5e5a9f993750): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:25:25.678 [info] <0.14459.449> closing AMQP connection <0.14459.449> (1.1.1.32:53356 -> 1.1.1.33:5672 - nova-compute:7:facf1224-83df-4e48-8189-d78213ee5bc2, vhost: '/', user: 'openstack')
2022-04-20 16:25:25.679 [info] <0.21656.462> closing AMQP connection <0.21656.462> (1.1.1.32:58944 -> 1.1.1.33:5672 - nova-compute:7:9c706aca-9db6-4e61-bebd-568a6f282307, vhost: '/', user: 'openstack')
2022-04-20 16:25:25.679 [error] <0.3679.462> Supervisor {<0.3679.462>,rabbit_channel_sup_sup} had child channel_sup started with rabbit_channel_sup:start_link() at undefined exit with reason shutdown in context shutdown_error
2022-04-20 16:25:25.683 [info] <0.13987.330> closing AMQP connection <0.13987.330> (1.1.1.32:35890 -> 1.1.1.33:5672 - nova-compute:7:5fdd2029-8f50-4a81-b861-06b071fffc98, vhost: '/', user: 'openstack')
2022-04-20 16:25:41.101 [info] <0.1613.508> accepting AMQP connection <0.1613.508> (1.1.1.33:54246 -> 1.1.1.33:5672)
2022-04-20 16:25:41.104 [info] <0.1613.508> Connection <0.1613.508> (1.1.1.33:54246 -> 1.1.1.33:5672) has a client-provided name: nova-conductor:24:71983386-ad60-4608-8186-a4aef8644d9d
2022-04-20 16:25:41.104 [info] <0.1613.508> connection <0.1613.508> (1.1.1.33:54246 -> 1.1.1.33:5672 - nova-conductor:24:71983386-ad60-4608-8186-a4aef8644d9d): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:25:42.000 [warning] <0.32.0> lager_error_logger_h dropped 2 messages in the last second that exceeded the limit of 1000 messages/sec
2022-04-20 16:27:36.137 [info] <0.24964.510> accepting AMQP connection <0.24964.510> (1.1.1.33:38314 -> 1.1.1.33:5672)
2022-04-20 16:27:36.141 [info] <0.24964.510> Connection <0.24964.510> (1.1.1.33:38314 -> 1.1.1.33:5672) has a client-provided name: nova-compute:7:be0a8525-b04c-465d-a938-e90599bd54d3
2022-04-20 16:27:36.142 [info] <0.24964.510> connection <0.24964.510> (1.1.1.33:38314 -> 1.1.1.33:5672 - nova-compute:7:be0a8525-b04c-465d-a938-e90599bd54d3): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:34:29.549 [error] <0.2946.6153> closing AMQP connection <0.2946.6153> (1.1.1.35:58822 -> 1.1.1.33:5672 - nova-conductor:21:e037e12d-2911-47d1-90ef-d00c3c288380):
missed heartbeats from client, timeout: 60s
2022-04-20 16:34:30.557 [info] <0.414.521> accepting AMQP connection <0.414.521> (1.1.1.35:38810 -> 1.1.1.33:5672)
2022-04-20 16:34:30.558 [info] <0.414.521> Connection <0.414.521> (1.1.1.35:38810 -> 1.1.1.33:5672) has a client-provided name: nova-conductor:21:e037e12d-2911-47d1-90ef-d00c3c288380
2022-04-20 16:34:30.559 [info] <0.414.521> connection <0.414.521> (1.1.1.35:38810 -> 1.1.1.33:5672 - nova-conductor:21:e037e12d-2911-47d1-90ef-d00c3c288380): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:34:32.117 [error] <0.13587.486> closing AMQP connection <0.13587.486> (1.1.1.31:36248 -> 1.1.1.33:5672 - nova-compute:7:e592f063-ff69-4387-86a5-d552bb43572e):
missed heartbeats from client, timeout: 60s
2022-04-20 16:40:36.440 [error] <0.13109.8083> closing AMQP connection <0.13109.8083> (1.1.1.35:47356 -> 1.1.1.33:5672 - cinder-volume:32:2a7ba690-3b2a-486d-888b-bc4bb19962ee):
missed heartbeats from client, timeout: 60s
2022-04-20 16:40:36.537 [error] <0.31800.280> closing AMQP connection <0.31800.280> (1.1.1.33:60648 -> 1.1.1.33:5672 - cinder-volume:32:c7fddb16-bb8c-4646-8535-980ba5900508):
missed heartbeats from client, timeout: 60s
2022-04-20 16:40:43.139 [error] <0.5296.525> closing AMQP connection <0.5296.525> (1.1.1.33:47884 -> 1.1.1.33:5672 - nova-conductor:24:71983386-ad60-4608-8186-a4aef8644d9d):
missed heartbeats from client, timeout: 60s
========== many more "[info]" and "[error]" "missed heartbeats" entries like the above, until the node-34 rabbitmq process was restarted
2022-04-20 16:46:19.528 [info] <0.16487.538> connection <0.16487.538> (1.1.1.34:51432 -> 1.1.1.33:5672 - nova-scheduler:59:d19d2307-c177-490f-9f36-9709f6f86345): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:46:19.562 [info] <0.20786.0> Mirrored queue 'q-l3-plugin.node-35' in vhost '/': Master <rabbit@node-33.1.1271.0> saw deaths of mirrors <rabbit@node-34.1.1245.0>
2022-04-20 16:46:19.563 [info] <0.23194.0> Mirrored queue 'q-plugin_fanout_3e006483c59744de91c4607550a2ea75' in vhost '/': Master <rabbit@node-33.1.3305.0> saw deaths of mirrors <rabbit@node-34.1.3374.0>

node-34 rabbitmq

2022-04-20 16:20:48.095 [info] <0.20912.3993> Connection <0.20912.3993> (1.1.1.31:55326 -> 1.1.1.34:5672) has a client-provided name: nova-compute:7:a480ff3a-1e36-4797-b1dc-cc1d7eff8d8f
2022-04-20 16:20:49.000 [warning]  lager_file_backend dropped 1 messages in the last second that exceeded the limit of 50 messages/sec
2022-04-20 16:25:25.676 [info] <0.22127.3978> closing AMQP connection <0.22127.3978> (1.1.1.32:52030 -> 1.1.1.34:5672 - nova-compute:7:310dcd66-11e7-485e-8099-1e4ab9e1c05d, vhost: '/', user: 'openstack')
2022-04-20 16:27:36.116 [info] <0.19371.3997> accepting AMQP connection <0.19371.3997> (1.1.1.33:58880 -> 1.1.1.34:5672)
2022-04-20 16:27:36.138 [info] <0.19371.3997> Connection <0.19371.3997> (1.1.1.33:58880 -> 1.1.1.34:5672) has a client-provided name: nova-compute:7:012d81c4-a5eb-4b38-8a39-d577cef8c12a
2022-04-20 16:27:36.142 [info] <0.19371.3997> connection <0.19371.3997> (1.1.1.33:58880 -> 1.1.1.34:5672 - nova-compute:7:012d81c4-a5eb-4b38-8a39-d577cef8c12a): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:34:32.645 [error] <0.25171.3412> closing AMQP connection <0.25171.3412> (1.1.1.35:60358 -> 1.1.1.34:5672 - nova-conductor:23:7f5f17b1-e86e-476a-9047-b57317c02723):
missed heartbeats from client, timeout: 60s
2022-04-20 16:34:33.653 [info] <0.23357.4653> accepting AMQP connection <0.23357.4653> (1.1.1.35:44456 -> 1.1.1.34:5672)
2022-04-20 16:34:33.657 [info] <0.23357.4653> Connection <0.23357.4653> (1.1.1.35:44456 -> 1.1.1.34:5672) has a client-provided name: nova-conductor:23:7f5f17b1-e86e-476a-9047-b57317c02723
2022-04-20 16:34:33.658 [info] <0.23357.4653> connection <0.23357.4653> (1.1.1.35:44456 -> 1.1.1.34:5672 - nova-conductor:23:7f5f17b1-e86e-476a-9047-b57317c02723): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:34:34.484 [error] <0.3180.3713> closing AMQP connection <0.3180.3713> (1.1.1.33:41126 -> 1.1.1.34:5672 - nova-conductor:22:6de6e8f9-10c8-48c6-8aac-55489aa24d9b):
missed heartbeats from client, timeout: 60s
2022-04-20 16:34:35.492 [info] <0.19068.4664> accepting AMQP connection <0.19068.4664> (1.1.1.33:48246 -> 1.1.1.34:5672)
2022-04-20 16:34:35.493 [info] <0.19068.4664> Connection <0.19068.4664> (1.1.1.33:48246 -> 1.1.1.34:5672) has a client-provided name: nova-conductor:22:6de6e8f9-10c8-48c6-8aac-55489aa24d9b
2022-04-20 16:34:35.494 [info] <0.19068.4664> connection <0.19068.4664> (1.1.1.33:48246 -> 1.1.1.34:5672 - nova-conductor:22:6de6e8f9-10c8-48c6-8aac-55489aa24d9b): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:34:37.617 [error] <0.19797.3640> closing AMQP connection <0.19797.3640> (1.1.1.34:38380 -> 1.1.1.34:5672 - nova-conductor:24:af6de2d2-5fb4-43b8-aac7-eb363d60315c):
missed heartbeats from client, timeout: 60s
========== many more "[info]" and "[error]" "missed heartbeats" entries like the above, until this (node-34) rabbitmq process was restarted
2022-04-20 16:45:54.306 [info] <0.7671.7632> accepting AMQP connection <0.7671.7632> (1.1.1.31:38548 -> 1.1.1.34:5672)
2022-04-20 16:45:54.307 [info] <0.7671.7632> Connection <0.7671.7632> (1.1.1.31:38548 -> 1.1.1.34:5672) has a client-provided name: nova-compute:7:355e2e03-3d83-4d95-a8bf-0165643a40fd
2022-04-20 16:45:54.307 [info] <0.7671.7632> connection <0.7671.7632> (1.1.1.31:38548 -> 1.1.1.34:5672 - nova-compute:7:355e2e03-3d83-4d95-a8bf-0165643a40fd): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:45:55.359 [error] <0.22496.6560> closing AMQP connection <0.22496.6560> (1.1.1.33:46352 -> 1.1.1.34:5672 - nova-conductor:24:6341a486-5f31-4180-865e-49d9e6fef1fd):
missed heartbeats from client, timeout: 60s
2022-04-20 16:45:56.367 [info] <0.31031.7657> accepting AMQP connection <0.31031.7657> (1.1.1.33:38992 -> 1.1.1.34:5672)
2022-04-20 16:45:56.368 [info] <0.31031.7657> Connection <0.31031.7657> (1.1.1.33:38992 -> 1.1.1.34:5672) has a client-provided name: nova-conductor:24:6341a486-5f31-4180-865e-49d9e6fef1fd
2022-04-20 16:45:56.368 [info] <0.31031.7657> connection <0.31031.7657> (1.1.1.33:38992 -> 1.1.1.34:5672 - nova-conductor:24:6341a486-5f31-4180-865e-49d9e6fef1fd): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:45:58.115 [warning] <0.32550.7440> closing AMQP connection <0.32550.7440> (1.1.1.35:58740 -> 1.1.1.34:5672 - nova-conductor:22:bbe54f83-b7c4-452f-a120-9a220afc6c59, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
2022-04-20 16:45:59.123 [info] <0.5691.7687> accepting AMQP connection <0.5691.7687> (1.1.1.35:38818 -> 1.1.1.34:5672)
2022-04-20 16:45:59.124 [info] <0.5691.7687> Connection <0.5691.7687> (1.1.1.35:38818 -> 1.1.1.34:5672) has a client-provided name: nova-conductor:22:bbe54f83-b7c4-452f-a120-9a220afc6c59
2022-04-20 16:45:59.125 [info] <0.5691.7687> connection <0.5691.7687> (1.1.1.35:38818 -> 1.1.1.34:5672 - nova-conductor:22:bbe54f83-b7c4-452f-a120-9a220afc6c59): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:46:05.456 [warning] <0.1973.7449> closing AMQP connection <0.1973.7449> (1.1.1.36:52852 -> 1.1.1.34:5672 - nova-compute:7:37b137ea-7671-4ea9-ae86-f243e9a13606, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
2022-04-20 16:46:06.643 [warning] <0.2825.7450> closing AMQP connection <0.2825.7450> (1.1.1.33:33106 -> 1.1.1.34:5672 - nova-conductor:21:7bd7d6dc-7367-4690-aed8-81c3989f5c74, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
2022-04-20 16:46:08.968 [warning] <0.16541.7452> closing AMQP connection <0.16541.7452> (1.1.1.35:59814 -> 1.1.1.34:5672 - nova-conductor:25:2e3dd6d2-13a8-4661-96ad-d4fa6bbd2e72, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
2022-04-20 16:46:13.000 [warning]  lager_file_backend dropped 13 messages in the last second that exceeded the limit of 50 messages/sec
2022-04-20 16:46:13.038 [info] <0.19552.7774> RabbitMQ is asked to stop...
2022-04-20 16:46:13.806 [info] <0.27176.7775> accepting AMQP connection <0.27176.7775> (1.1.1.35:40350 -> 1.1.1.34:5672)
2022-04-20 16:46:13.807 [info] <0.27176.7775> Connection <0.27176.7775> (1.1.1.35:40350 -> 1.1.1.34:5672) has a client-provided name: nova-conductor:22:99215d08-ba56-44a1-be0e-2cfa9935a4c7
2022-04-20 16:46:13.808 [info] <0.27176.7775> connection <0.27176.7775> (1.1.1.35:40350 -> 1.1.1.34:5672 - nova-conductor:22:99215d08-ba56-44a1-be0e-2cfa9935a4c7): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:46:14.112 [info] <0.19552.7774> Stopping RabbitMQ applications and their dependencies in the following order:
    rabbitmq_management
    amqp_client
    rabbitmq_web_dispatch
    cowboy
    cowlib
    rabbitmq_management_agent
    rabbit
    mnesia
    rabbit_common
    os_mon
2022-04-20 16:46:14.113 [info] <0.19552.7774> Stopping application 'rabbitmq_management'
2022-04-20 16:46:14.223 [warning] <0.8143.0> RabbitMQ HTTP listener registry could not find context rabbitmq_management_tls
2022-04-20 16:46:14.237 [info] <0.33.0> Application rabbitmq_management exited with reason: stopped
2022-04-20 16:46:14.237 [info] <0.19552.7774> Stopping application 'amqp_client'
2022-04-20 16:46:14.265 [info] <0.33.0> Application amqp_client exited with reason: stopped
2022-04-20 16:46:14.265 [info] <0.19552.7774> Stopping application 'rabbitmq_web_dispatch'
2022-04-20 16:46:14.282 [info] <0.33.0> Application rabbitmq_web_dispatch exited with reason: stopped
2022-04-20 16:46:14.282 [info] <0.19552.7774> Stopping application 'cowboy'
2022-04-20 16:46:14.293 [warning] <0.25152.7431> closing AMQP connection <0.25152.7431> (1.1.1.34:42738 -> 1.1.1.34:5672 - nova-conductor:25:b31b82a2-e738-4cd9-805e-36c7b520531e, vhost: '/', user: 'openstack'):
client unexpectedly closed TCP connection
2022-04-20 16:46:14.301 [info] <0.19552.7774> Stopping application 'cowlib'
2022-04-20 16:46:14.301 [info] <0.19552.7774> Stopping application 'rabbitmq_management_agent'
2022-04-20 16:46:14.301 [info] <0.33.0> Application cowboy exited with reason: stopped
2022-04-20 16:46:14.302 [info] <0.33.0> Application cowlib exited with reason: stopped
2022-04-20 16:46:14.324 [info] <0.19552.7774> Stopping application 'rabbit'
2022-04-20 16:46:14.324 [info] <0.33.0> Application rabbitmq_management_agent exited with reason: stopped
2022-04-20 16:46:14.326 [info] <0.260.0> Peer discovery backend rabbit_peer_discovery_classic_config does not support registration, skipping unregistration.
2022-04-20 16:46:14.327 [info] <0.8135.0> stopped TCP listener on 1.1.1.34:5672
2022-04-20 16:46:14.337 [error] <0.19192.0> Error on AMQP connection <0.19192.0> (1.1.1.34:53718 -> 1.1.1.34:5672 - barbican-keystone-listener:7:b59ea871-cbb6-462e-b8e7-3454536978dd, vhost: '/', user: 'openstack', state: running), channel 0:
 operation none caused a connection exception connection_forced: "broker forced connection closure with reason 'shutdown'"
========== many more of these "[error]" "operation none caused" entries
2022-04-20 16:46:14.338 [error] <0.18651.0> Error on AMQP connection <0.18651.0> (1.1.1.35:49664 -> 1.1.1.34:5672 - magnum-conductor:112:0a1e307e-90cb-4e5f-bc1c-a721cdb7f83e, vhost: '/', user: 'openstack', state: running), channel 0:
 operation none caused a connection exception connection_forced: "broker forced connection closure with reason 'shutdown'"
2022-04-20 16:46:21.680 [info] <0.33.0> Application lager started on node 'rabbit@node-34'
2022-04-20 16:46:21.685 [info] <0.5.0> Log file opened with Lager
2022-04-20 16:46:25.645 [info] <0.33.0> Application mnesia started on node 'rabbit@node-34'
2022-04-20 16:46:25.649 [info] <0.33.0> Application mnesia exited with reason: stopped
2022-04-20 16:46:25.988 [info] <0.33.0> Application recon started on node 'rabbit@node-34'
2022-04-20 16:46:25.989 [info] <0.33.0> Application inets started on node 'rabbit@node-34'
2022-04-20 16:46:25.989 [info] <0.33.0> Application jsx started on node 'rabbit@node-34'
2022-04-20 16:46:25.989 [info] <0.33.0> Application os_mon started on node 'rabbit@node-34'
2022-04-20 16:46:25.989 [info] <0.33.0> Application crypto started on node 'rabbit@node-34'
2022-04-20 16:46:25.989 [info] <0.33.0> Application cowlib started on node 'rabbit@node-34'
2022-04-20 16:46:26.078 [info] <0.33.0> Application mnesia started on node 'rabbit@node-34'
2022-04-20 16:46:26.078 [info] <0.33.0> Application xmerl started on node 'rabbit@node-34'
2022-04-20 16:46:26.078 [info] <0.33.0> Application asn1 started on node 'rabbit@node-34'
2022-04-20 16:46:26.078 [info] <0.33.0> Application public_key started on node 'rabbit@node-34'
2022-04-20 16:46:26.078 [info] <0.33.0> Application ssl started on node 'rabbit@node-34'
2022-04-20 16:46:26.078 [info] <0.33.0> Application ranch started on node 'rabbit@node-34'
2022-04-20 16:46:26.085 [info] <0.33.0> Application cowboy started on node 'rabbit@node-34'
2022-04-20 16:46:26.085 [info] <0.33.0> Application rabbit_common started on node 'rabbit@node-34'
2022-04-20 16:46:26.088 [info] <0.33.0> Application amqp_client started on node 'rabbit@node-34'
2022-04-20 16:46:26.088 [info] <0.247.0> 
 Starting RabbitMQ 3.7.10 on Erlang 20.2.2
 Copyright (C) 2007-2018 Pivotal Software, Inc.
 Licensed under the MPL.  See http://www.rabbitmq.com/
2022-04-20 16:46:26.089 [info] <0.247.0> 
 node           : rabbit@node-34
 home dir       : /var/lib/rabbitmq
 config file(s) : /etc/rabbitmq/rabbitmq.conf
 cookie hash    : i***Q==
 log(s)         : /var/log/kolla/rabbitmq/rabbit@node-34.log
                : /var/log/kolla/rabbitmq/rabbit@node-34_upgrade.log
 database dir   : /var/lib/rabbitmq/mnesia/rabbit@node-34
2022-04-20 16:46:26.385 [info] <0.331.0> Memory high watermark set to 618721 MiB (648776025702 bytes) of 1546802 MiB (1621940064256 bytes) total
2022-04-20 16:46:26.389 [info] <0.333.0> Enabling free disk space monitoring
2022-04-20 16:46:26.389 [info] <0.333.0> Disk free limit set to 50MB
2022-04-20 16:46:26.392 [info] <0.336.0> Limiting to approx 1048476 file handles (943626 sockets)
2022-04-20 16:46:26.392 [info] <0.337.0> FHC read buffering:  OFF
2022-04-20 16:46:26.392 [info] <0.337.0> FHC write buffering: ON
2022-04-20 16:46:26.400 [info] <0.247.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2022-04-20 16:46:26.410 [info] <0.247.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2022-04-20 16:46:26.450 [info] <0.247.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2022-04-20 16:46:26.450 [info] <0.247.0> Peer discovery backend rabbit_peer_discovery_classic_config does not support registration, skipping registration.
2022-04-20 16:46:26.451 [info] <0.247.0> Priority queues enabled, real BQ is rabbit_variable_queue
2022-04-20 16:46:26.477 [info] <0.454.0> Starting rabbit_node_monitor
2022-04-20 16:46:26.504 [info] <0.247.0> Management plugin: using rates mode 'basic'
2022-04-20 16:46:26.556 [info] <0.486.0> Making sure data directory '/var/lib/rabbitmq/mnesia/rabbit@node-34/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L' for vhost '/' exists
2022-04-20 16:46:26.574 [info] <0.486.0> Starting message stores for vhost '/'
2022-04-20 16:46:26.574 [info] <0.490.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2022-04-20 16:46:26.616 [info] <0.486.0> Started message store of type transient for vhost '/'
2022-04-20 16:46:26.617 [info] <0.493.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2022-04-20 16:46:26.617 [warning] <0.493.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": rebuilding indices from scratch
2022-04-20 16:46:26.618 [info] <0.486.0> Started message store of type persistent for vhost '/'
2022-04-20 16:46:26.627 [info] <0.486.0> Mirrored queue 'q-agent-notifier-port-update_fanout_9407d5931f8a498cb6c0268d585ed732' in vhost '/': Adding mirror on node 'rabbit@node-34': <0.512.0>
2022-04-20 16:46:26.627 [info] <0.486.0> Mirrored queue 'magnum-conductor_fanout_dd3536ae0b8e4efe8329be0454ba75b6' in vhost '/': Adding mirror on node 'rabbit@node-34': <0.516.0>
========== many more "Mirrored queue" entries for different queues

node-35 rabbitmq

2022-04-20 16:34:17.295 [error] <0.13948.7714> closing AMQP connection <0.13948.7714> (1.1.1.34:43322 -> 1.1.1.35:5672 - nova-conductor:23:3ca11891-5442-48ef-9b0f-f616ba13c1e3):
missed heartbeats from client, timeout: 60s
========== many more "[info]" and "[error]" "missed heartbeats" entries like the above, until the node-34 rabbitmq process was restarted
2022-04-20 16:34:57.656 [info] <0.12329.2581> accepting AMQP connection <0.12329.2581> (1.1.1.33:42474 -> 1.1.1.35:5672)
2022-04-20 16:34:57.657 [info] <0.12329.2581> Connection <0.12329.2581> (1.1.1.33:42474 -> 1.1.1.35:5672) has a client-provided name: nova-conductor:25:53cc4527-fa4b-41c0-b9e1-f3d24da7f31b
2022-04-20 16:34:57.658 [info] <0.12329.2581> connection <0.12329.2581> (1.1.1.33:42474 -> 1.1.1.35:5672 - nova-conductor:25:53cc4527-fa4b-41c0-b9e1-f3d24da7f31b): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:34:58.874 [error] <0.7604.103> closing AMQP connection <0.7604.103> (1.1.1.33:38718 -> 1.1.1.35:5672 - nova-conductor:21:b701bc54-c826-47bc-8c7d-803e27265e5f):
missed heartbeats from client, timeout: 60s
2022-04-20 16:34:59.531 [error] <0.24920.2100> closing AMQP connection <0.24920.2100> (1.1.1.35:41208 -> 1.1.1.35:5672 - nova-conductor:23:615ea986-452b-4943-ba4f-7d36d3b1536c):
missed heartbeats from client, timeout: 60s
========== fewer "[error]" entries but many "[info]" entries, until the node-34 rabbitmq process was restarted
2022-04-20 16:46:19.344 [info] <0.17721.2593> connection <0.17721.2593> (1.1.1.33:34162 -> 1.1.1.35:5672 - mod_wsgi:32:06e32cf4-2d5f-468e-9a50-a6c16b5f16bb): user 'openstack' authenticated and granted access to vhost '/'
2022-04-20 16:46:19.489 [info] <0.17996.2593> accepting AMQP connection <0.17996.2593> (1.1.1.33:34172 -> 1.1.1.35:5672)
2022-04-20 16:46:19.511 [info] <0.4731.0> Mirrored queue 'magnum-conductor_fanout_7af2012136fe49e88f5e561d2f03650f' in vhost '/': Slave <rabbit@node-35.3.4731.0> saw deaths of mirrors <rabbit@node-34.1.2887.0>
2022-04-20 16:46:19.511 [info] <0.25105.9> Mirrored queue 'scheduler_fanout_0a2d4e8a46b249018178164758b6736d' in vhost '/': Slave <rabbit@node-35.3.25105.9> saw deaths of mirrors <rabbit@node-34.1.14025.203>

node-33 nova-conductor

2022-04-20 16:34:20.689 23 ERROR oslo.messaging._drivers.impl_rabbit [req-30ed34ff-70e5-4e6f-a09f-a00de95385a3 - - - - -] [5b4dd218-31d1-4eb6-bab1-2b5bc8f1bc11] AMQP server on 1.1.1.35:5672 is unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection
2022-04-20 16:34:21.700 23 INFO oslo.messaging._drivers.impl_rabbit [req-30ed34ff-70e5-4e6f-a09f-a00de95385a3 - - - - -] [5b4dd218-31d1-4eb6-bab1-2b5bc8f1bc11] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 38842.
========== more entries like the above
2022-04-20 16:46:12.294 21 INFO oslo.messaging._drivers.impl_rabbit [req-24421232-3a85-4ff4-a6fe-e2bb58188e65 - - - - -] [7bd7d6dc-7367-4690-aed8-81c3989f5c74] Reconnected to AMQP server on 1.1.1.34:5672 via [amqp] client with port 40340.
2022-04-20 16:46:14.356 25 ERROR oslo.messaging._drivers.impl_rabbit [req-beeac432-5509-4d6f-8709-9a8ea9b3d0ad - - - - -] [58b08ac8-7061-4969-8fae-de5acad3c23b] AMQP server on 1.1.1.34:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds.: ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-20 16:46:14.357 23 ERROR oslo.messaging._drivers.impl_rabbit [req-bd979e87-5a7f-4f6a-af79-9d1ccb99f944 - - - - -] [d5bff161-b9df-44e8-9fe3-292aec5b13f7] AMQP server on 1.1.1.34:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds.: ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-20 16:46:14.357 21 ERROR oslo.messaging._drivers.impl_rabbit [req-ca57569e-3328-48f8-9469-e78d6def839c - - - - -] [98444783-6694-4688-873e-066abf61932c] AMQP server on 1.1.1.34:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds.: ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-20 16:46:14.358 21 ERROR oslo.messaging._drivers.impl_rabbit [req-24421232-3a85-4ff4-a6fe-e2bb58188e65 - - - - -] [7bd7d6dc-7367-4690-aed8-81c3989f5c74] AMQP server on 1.1.1.34:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds.: ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-20 16:46:14.359 23 ERROR oslo.messaging._drivers.impl_rabbit [req-a69ce54a-1739-44b2-b169-12f030a744b1 - - - - -] [84c7fa9d-14d9-4664-be6c-7e6d43ae7e83] AMQP server on 1.1.1.34:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds.: ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-20 16:46:14.360 22 ERROR oslo.messaging._drivers.impl_rabbit [-] [09269999-5a4f-419a-899c-69deb49689fb] AMQP server on 1.1.1.34:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds.: ConnectionResetError: [Errno 104] Connection reset by peer
========== more entries like the above
2022-04-20 16:46:16.422 23 INFO oslo.messaging._drivers.impl_rabbit [req-a69ce54a-1739-44b2-b169-12f030a744b1 - - - - -] [84c7fa9d-14d9-4664-be6c-7e6d43ae7e83] Reconnected to AMQP server on 1.1.1.33:5672 via [amqp] client with port 49066.
2022-04-20 16:46:16.425 25 ERROR oslo.messaging._drivers.impl_rabbit [req-50d0d15a-96a4-48a4-b60e-f2103ca9aa59 - - - - -] Connection failed: [Errno 111] ECONNREFUSED (retrying in 0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED
2022-04-20 16:46:16.556 21 INFO oslo.messaging._drivers.impl_rabbit [req-beeac432-5509-4d6f-8709-9a8ea9b3d0ad - - - - -] [3d76c8aa-8e05-4ce0-a26e-db670ff2e48c] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33400.
2022-04-20 16:46:16.571 24 INFO oslo.messaging._drivers.impl_rabbit [req-beeac432-5509-4d6f-8709-9a8ea9b3d0ad - - - - -] [c7a51125-c7b5-4439-b5d5-36e82309b943] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33418.
2022-04-20 16:46:16.573 22 INFO oslo.messaging._drivers.impl_rabbit [req-30ed34ff-70e5-4e6f-a09f-a00de95385a3 - - - - -] [6de6e8f9-10c8-48c6-8aac-55489aa24d9b] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33410.
2022-04-20 16:46:16.634 21 INFO oslo.messaging._drivers.impl_rabbit [req-24421232-3a85-4ff4-a6fe-e2bb58188e65 - - - - -] [7bd7d6dc-7367-4690-aed8-81c3989f5c74] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33406.
2022-04-20 16:46:19.829 23 INFO oslo.messaging._drivers.impl_rabbit [-] [3132b985-89a4-4c5a-82e2-a0050d14fddf] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33412.
2022-04-20 16:46:20.373 22 INFO oslo.messaging._drivers.impl_rabbit [-] [09269999-5a4f-419a-899c-69deb49689fb] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33376.
2022-04-20 16:46:20.409 24 INFO oslo.messaging._drivers.impl_rabbit [-] [cd614243-a1db-4e57-996d-b17e9b3aea28] Reconnected to AMQP server on 1.1.1.35:5672 via [amqp] client with port 33426.
2022-04-20 16:46:26.326 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2022-04-20 16:46:26.334 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
2022-04-20 16:46:26.340 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 111] ECONNREFUSED (retrying in 0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED
2022-04-20 16:46:26.347 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 111] ECONNREFUSED (retrying in 0 seconds): ConnectionRefusedError: [Errno 111] ECONNREFUSED
2022-04-20 16:46:32.784 23 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: Server unexpectedly closed connection
2022-04-20 16:46:47.707 24 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: Server unexpectedly closed connection
2022-04-20 16:46:47.851 25 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: Server unexpectedly closed connection
2022-04-20 16:47:02.833 25 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: Server unexpectedly closed connection
2022-04-20 16:49:16.486 25 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to consume message from queue: Server unexpectedly closed connection: kombu.exceptions.OperationalError: Server unexpectedly closed connection
2022-04-20 16:49:16.494 25 ERROR oslo.messaging._drivers.impl_rabbit [-] Unable to connect to AMQP server on 1.1.1.35:5672 after inf tries: Server unexpectedly closed connection: kombu.exceptions.OperationalError: Server unexpectedly closed connection
2022-04-20 16:49:16.495 25 ERROR oslo_messaging._drivers.amqpdriver [-] Failed to process incoming message, retrying..: oslo_messaging.exceptions.MessageDeliveryFailure: Unable to connect to AMQP server on 1.1.1.35:5672 after inf tries: Server unexpectedly closed connection
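
To see when the heartbeat timeouts cluster and which services they hit, the broker logs above can be tallied. This is a minimal sketch, assuming the two-line entry format shown in these logs (an "[error] ... closing AMQP connection" line followed by a "missed heartbeats from client" line). The function name is illustrative and the sample lines are copied from the node-33 log.

```python
import re
from collections import Counter

# Matches the first line of a two-line "missed heartbeats" entry.
CLOSING = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ \[error\] "
    r".*closing AMQP connection .*"
    r"\((?P<src>[\d.]+):\d+ -> (?P<dst>[\d.]+):\d+ - (?P<client>[\w.-]+):"
)


def count_missed_heartbeats(lines):
    """Count heartbeat timeouts per minute and per client service name."""
    per_minute, per_client = Counter(), Counter()
    pending = None  # last "[error] closing AMQP connection" match, if any
    for line in lines:
        m = CLOSING.match(line)
        if m:
            pending = m
            continue
        if pending and line.strip().startswith("missed heartbeats from client"):
            per_minute[pending["minute"]] += 1
            per_client[pending["client"]] += 1
        pending = None
    return per_minute, per_client


SAMPLE = """\
2022-04-20 16:25:25.678 [info] <0.14459.449> closing AMQP connection <0.14459.449> (1.1.1.32:53356 -> 1.1.1.33:5672 - nova-compute:7:facf1224-83df-4e48-8189-d78213ee5bc2, vhost: '/', user: 'openstack')
2022-04-20 16:34:29.549 [error] <0.2946.6153> closing AMQP connection <0.2946.6153> (1.1.1.35:58822 -> 1.1.1.33:5672 - nova-conductor:21:e037e12d-2911-47d1-90ef-d00c3c288380):
missed heartbeats from client, timeout: 60s
2022-04-20 16:34:32.117 [error] <0.13587.486> closing AMQP connection <0.13587.486> (1.1.1.31:36248 -> 1.1.1.33:5672 - nova-compute:7:e592f063-ff69-4387-86a5-d552bb43572e):
missed heartbeats from client, timeout: 60s
2022-04-20 16:40:36.440 [error] <0.13109.8083> closing AMQP connection <0.13109.8083> (1.1.1.35:47356 -> 1.1.1.33:5672 - cinder-volume:32:2a7ba690-3b2a-486d-888b-bc4bb19962ee):
missed heartbeats from client, timeout: 60s
""".splitlines()

per_minute, per_client = count_missed_heartbeats(SAMPLE)
for minute, n in sorted(per_minute.items()):
    print(minute, n)
print(dict(per_client))
```

Running the same function over the full log of each node, and comparing the per-minute counts with the memory graph, may help show whether the timeouts start before or after the memory growth.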
hof1towb  #1

I saw a similar problem on RabbitMQ 3.7.9. After upgrading to 3.7.19 and Erlang 21.x, the problem went away.
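
To check whether a node is still below the versions mentioned here, the startup banner in the broker log can be compared against them. A small sketch follows; the helper names are hypothetical, and the thresholds are simply the versions from this answer, not an official minimum.

```python
import re

# Banner line as it appears in the node-34 log of the question.
BANNER = "Starting RabbitMQ 3.7.10 on Erlang 20.2.2"


def parse_versions(banner):
    """Return (rabbitmq_version, erlang_version) as tuples of ints."""
    m = re.search(
        r"Starting RabbitMQ (\d+(?:\.\d+)*) on Erlang (\d+(?:\.\d+)*)", banner
    )
    if m is None:
        raise ValueError("not a RabbitMQ startup banner")

    def to_tuple(text):
        return tuple(int(part) for part in text.split("."))

    return to_tuple(m.group(1)), to_tuple(m.group(2))


def is_older_than_answer(banner, rabbit_min=(3, 7, 19), erlang_min=(21,)):
    """True if either component is older than the versions in the answer."""
    rabbit, erlang = parse_versions(banner)
    return rabbit < rabbit_min or erlang < erlang_min


print(parse_versions(BANNER))
print(is_older_than_answer(BANNER))
```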

gtlvzcf8  #2

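If upgrading does not solve it, you can use quorum queues instead of classic mirrored queues in RabbitMQ. They are recommended for large-scale deployments. The link below gives a more in-depth explanation: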
https://www.rabbitmq.com/quorum-queues.html
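
For illustration, a queue becomes a quorum queue when it is declared as durable with the `x-queue-type` argument set to `quorum`. The sketch below assumes the third-party pika client and a broker running RabbitMQ 3.8 or later (quorum queues do not exist in the 3.7.x series from the question). The host, queue name and helper names are hypothetical. OpenStack services declare their queues through oslo.messaging, so this only shows the declaration itself, not a change to make in a kolla-ansible deployment.

```python
def quorum_queue_declare_kwargs(name):
    """Keyword arguments for queue_declare that request a quorum queue."""
    return {
        "queue": name,
        "durable": True,  # quorum queues must be durable
        "arguments": {"x-queue-type": "quorum"},
    }


def declare_quorum_queue(host, name):
    """Declare a quorum queue on the given broker (needs pika and a live broker)."""
    import pika  # third-party: pip install pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    try:
        channel = connection.channel()
        channel.queue_declare(**quorum_queue_declare_kwargs(name))
    finally:
        connection.close()


# Only builds the arguments; no broker connection is made here.
print(quorum_queue_declare_kwargs("demo.quorum"))
```

Calling `declare_quorum_queue("localhost", "demo.quorum")` needs a reachable broker. An existing classic queue cannot be switched in place, so a queue with the same name has to be deleted and declared again with the new type.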
