RabbitMQ Mirrored-Mode Cluster Crash
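1. Problem
The cluster was deployed without problems and ran normally for a long time. While I was later setting up other services, it suddenly broke. Could too many message queues have caused it?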
[root@controller01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@controller01
[{nodes,[{disc,[rabbit@controller01]},
{ram,[rabbit@controller03,rabbit@controller02]}]},
{running_nodes,[rabbit@controller01]},
{cluster_name,<<"rabbit@controller01">>},
{partitions,[{rabbit@controller01,[rabbit@controller02,
rabbit@controller03]}]},
{alarms,[{rabbit@controller01,[]}]}]
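The cluster status shows a split: the partitions entry lists controller02 and controller03 as partitioned from controller01, and running_nodes contains only controller01. So what happened to the other two nodes?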
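The partition check described above can be scripted. The following is a minimal sketch, assuming the Erlang-term output format that rabbitmqctl cluster_status prints on 3.6.x, as shown in this post. The function name has_partitions is hypothetical, not part of RabbitMQ.

```shell
# Returns 0 (true) when the cluster_status text on stdin contains a
# partitions list that is not empty, and 1 otherwise (empty list, or
# the field is missing, for example after stop_app).
has_partitions() {
    # Drop all whitespace so a term wrapped over several lines is
    # matched as one string, then look for "{partitions,[" followed by
    # anything other than the closing "]".
    tr -d ' \t\n\r' | grep -q '{partitions,\[[^]]'
}

# Example (not executed here):
#   rabbitmqctl cluster_status | has_partitions && echo "partition detected"
```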
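Checking the two dropped nodes with systemctl status rabbitmq-server shows nothing wrong either: the service is active (running) on all three hosts. That is consistent with a network partition, where each node keeps running but no longer sees the others. The logs gave no useful information.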
[root@controller01 ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-01-17 22:37:09 EST; 3 weeks 5 days ago
Main PID: 21937 (beam.smp)
Status: "Initialized"
CGroup: /system.slice/rabbitmq-server.service
├─21937 /usr/lib64/erlang/erts-8.3.5.3/bin/beam.smp -W w -A 96 -P 1048576 -t 5000000 -stbt db -zdbbl ...
├─22170 erl_child_setup 1024
├─22186 inet_gethost 4
└─22187 inet_gethost 4
Jan 17 22:37:08 controller01 systemd[1]: Starting RabbitMQ broker...
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: RabbitMQ 3.6.16. Copyright (C) 2007-2018 Pivotal Softwa...Inc.
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: ## ## Licensed under the MPL. See http://www.rab...com/
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: ## ##
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: ########## Logs: /var/log/rabbitmq/rabbit@controller01.log
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: ###### ## /var/log/rabbitmq/rabbit@controller01....log
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: ##########
Jan 17 22:37:08 controller01 rabbitmq-server[21937]: Starting broker...
Jan 17 22:37:09 controller01 systemd[1]: Started RabbitMQ broker.
Jan 17 22:37:09 controller01 rabbitmq-server[21937]: completed with 0 plugins.
Hint: Some lines were ellipsized, use -l to show in full.
[root@controller02 ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-01-17 22:49:18 EST; 3 weeks 5 days ago
Main PID: 21800 (beam.smp)
Status: "Initialized"
CGroup: /system.slice/rabbitmq-server.service
├─21800 /usr/lib64/erlang/erts-8.3.5.3/bin/beam.smp -W w -A 96 -P 1048576 -t 5000000 -stbt db -zdbbl ...
├─22033 erl_child_setup 1024
├─22049 inet_gethost 4
└─22050 inet_gethost 4
Jan 17 22:49:18 controller02 systemd[1]: Started RabbitMQ broker.
Jan 17 22:49:19 controller02 rabbitmq-server[21800]: completed with 0 plugins.
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: RabbitMQ 3.6.16. Copyright (C) 2007-2018 Pivotal Softwa...Inc.
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: ## ## Licensed under the MPL. See http://www.rab...com/
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: ## ##
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: ########## Logs: /var/log/rabbitmq/rabbit@controller02.log
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: ###### ## /var/log/rabbitmq/rabbit@controller02....log
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: ##########
Jan 17 22:51:43 controller02 rabbitmq-server[21800]: Starting broker...
Jan 17 22:51:44 controller02 rabbitmq-server[21800]: completed with 0 plugins.
Hint: Some lines were ellipsized, use -l to show in full.
[root@controller03 ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2025-01-17 22:49:21 EST; 3 weeks 5 days ago
Main PID: 21384 (beam.smp)
Status: "Initialized"
CGroup: /system.slice/rabbitmq-server.service
├─21384 /usr/lib64/erlang/erts-8.3.5.3/bin/beam.smp -W w -A 96 -P ...
├─21617 erl_child_setup 1024
├─21633 inet_gethost 4
└─21634 inet_gethost 4
Jan 17 22:49:21 controller03 systemd[1]: Started RabbitMQ broker.
Jan 17 22:49:21 controller03 rabbitmq-server[21384]: completed with 0 plugins.
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: RabbitMQ 3.6.16. Copyrig...
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: ## ## Licensed und...
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: ## ##
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: ########## Logs: /var/l...
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: ###### ## /var/l...
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: ##########
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: Starting broker...
Jan 17 22:52:09 controller03 rabbitmq-server[21384]: completed with 0 plugins.
Hint: Some lines were ellipsized, use -l to show in full.
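Because the queues are mirrored, I did not dare to simply stop the service for fear of leaving the data inconsistent, so this calls for careful handling.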
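2. Resolution
The cluster status shows that the cluster definition still exists, so first stop the RabbitMQ application on every node in the cluster.
This command stops the RabbitMQ application, so the node no longer accepts or processes messages, but it leaves the Erlang node and the rabbitmq-server service running.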
[root@controller01 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@controller01
[root@controller02 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@controller02
[root@controller03 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@controller03
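In the transcript above the command is typed on each host by hand. Running it from one place could look like the sketch below. It assumes passwordless ssh from an admin host to the three controllers, which the post does not mention. NODES, RUNNER and for_each_node are hypothetical names introduced for illustration.

```shell
# Assumption: host names match the node names used in this post.
NODES="${NODES:-controller01 controller02 controller03}"
# Assumption: ssh is available for remote execution. Override RUNNER
# (for example RUNNER=echo) for a dry run.
RUNNER="${RUNNER:-ssh}"

# Run the given command on every host in NODES, one host at a time,
# in the order listed. A failure on one host is reported on stderr
# and the loop continues with the next host.
for_each_node() {
    for host in $NODES; do
        echo "== $host =="
        "$RUNNER" "$host" "$@" || echo "command failed on $host" >&2
    done
}

# Example (not executed here):
#   for_each_node rabbitmqctl stop_app
```

Running the hosts sequentially rather than in parallel keeps the stop order predictable, which makes a later restart easier to reason about.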
Check the cluster status again:
[root@controller01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@controller01
[{nodes,[{disc,[rabbit@controller01]},
{ram,[rabbit@controller02,rabbit@controller03]}]},
{alarms,[]}]
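Good, the cluster definition is still there.
Since the cluster had already been built successfully, no cluster setup is needed at this point. Just start the RabbitMQ application on every node so that it accepts and processes messages again.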
[root@controller01 ~]# rabbitmqctl start_app
Starting node rabbit@controller01
[root@controller02 ~]# rabbitmqctl start_app
Starting node rabbit@controller02
Error: unable to connect to node rabbit@controller02: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit@controller02]
rabbit@controller02:
* connected to epmd (port 4369) on controller02
* epmd reports: node 'rabbit' not running at all
no other nodes on controller02
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-85@controller02'
- home dir: /var/lib/rabbitmq
- cookie hash: V+zquSQpuK8W6GX64HNaIQ==
[root@controller03 ~]# rabbitmqctl start_app
Starting node rabbit@controller03
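Node 2 misbehaves: its output is not what we expect.
Hold on, and check the cluster status first to see whether node 2 is the only unhealthy one.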
[root@controller01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@controller01
[{nodes,[{disc,[rabbit@controller01]},
{ram,[rabbit@controller03,rabbit@controller02]}]},
{running_nodes,[rabbit@controller03,rabbit@controller01]},
{cluster_name,<<"rabbit@controller01">>},
{partitions,[]},
{alarms,[{rabbit@controller03,[]},{rabbit@controller01,[]}]}]
It is. Node 3 has now rejoined the running cluster, where previously only node 1 was running.
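Comparing the configured nodes with running_nodes by eye works for three nodes, but it can also be scripted. This is a rough sketch under the same assumption as before, namely the 3.6.x Erlang-term output shown in this post. list_field and missing_nodes are hypothetical helper names.

```shell
# Print the members of a flat list field (disc, ram, running_nodes)
# from cluster_status text on stdin, one member per line.
list_field() {
    flat=$(tr -d ' \t\n\r')            # join wrapped lines, drop spaces
    printf '%s\n' "$flat" |
        sed -n "s/.*{$1,\[\([^]]*\)\].*/\1/p" |
        tr ',' '\n' |
        sed '/^$/d'                    # an empty list yields no lines
}

# Print every configured node (disc or ram) that is absent from
# running_nodes. Prints nothing when all nodes are running.
missing_nodes() {
    status=$(cat)
    running=$(printf '%s\n' "$status" | list_field running_nodes)
    { printf '%s\n' "$status" | list_field disc
      printf '%s\n' "$status" | list_field ram; } |
        while read -r node; do
            printf '%s\n' "$running" | grep -qxF "$node" || echo "$node"
        done
}

# Example (not executed here):
#   rabbitmqctl cluster_status | missing_nodes
```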
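Looking back at node 2's output: it says the rabbit@controller02 node is not running at all, and the suggestion is to start the node.
First check the service status: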
[root@controller02 ~]# systemctl status rabbitmq-server
● rabbitmq-server.service - RabbitMQ broker
Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2025-02-13 01:47:46 EST; 1min 21s ago
Process: 21800 ExecStart=/usr/lib/rabbitmq/bin/rabbitmq-server (code=exited, status=1/FAILURE)
Main PID: 21800 (code=exited, status=1/FAILURE)
Status: "Initialized"
Feb 13 01:47:43 controller02 rabbitmq-server[21800]: ## ##
Feb 13 01:47:43 controller02 rabbitmq-server[21800]: ########## Logs: /var/log/rabbitmq/rabbit@controller02.log
Feb 13 01:47:43 controller02 rabbitmq-server[21800]: ###### ## /var/log/rabbitmq/rabbit@controller02-sasl.log
Feb 13 01:47:43 controller02 rabbitmq-server[21800]: ##########
Feb 13 01:47:43 controller02 rabbitmq-server[21800]: Starting broker...
Feb 13 01:47:45 controller02 rabbitmq-server[21800]: {"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{failed_to_cluster_with,[rabbit@controller01,rabbit@...rmal,[]]}}}"}
Feb 13 01:47:45 controller02 rabbitmq-server[21800]: Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{failed_to_cluster_with,[rabbit@controller01,rabbit@c...."},{rabbit,s
Feb 13 01:47:46 controller02 systemd[1]: rabbitmq-server.service: main process exited, code=exited, status=1/FAILURE
Feb 13 01:47:46 controller02 systemd[1]: Unit rabbitmq-server.service entered failed state.
Feb 13 01:47:46 controller02 systemd[1]: rabbitmq-server.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
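The unit is in the failed state, and the journal shows an application_start_failure with failed_to_cluster_with. So restart the service: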
[root@controller02 ~]# systemctl restart rabbitmq-server
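To avoid restarting a node that is actually healthy, the restart can be guarded by the unit state. This is a small sketch, assuming the unit name rabbitmq-server shown in the systemctl output above. restart_if_failed is a hypothetical helper name.

```shell
UNIT="rabbitmq-server"

# Restart the unit only when systemd reports it as failed.
# A unit that is still active is left untouched.
restart_if_failed() {
    if systemctl is-failed --quiet "$UNIT"; then
        echo "$UNIT is in the failed state, restarting"
        systemctl restart "$UNIT"
    else
        echo "$UNIT is not in the failed state, leaving it alone"
    fi
}

# Example (not executed here), run on the affected host:
#   restart_if_failed
```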
Check the cluster status again:
[root@controller01 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@controller01
[{nodes,[{disc,[rabbit@controller01]},
{ram,[rabbit@controller03,rabbit@controller02]}]},
{running_nodes,[rabbit@controller02,rabbit@controller03,rabbit@controller01]},
{cluster_name,<<"rabbit@controller01">>},
{partitions,[]},
{alarms,[{rabbit@controller02,[]},
{rabbit@controller03,[]},
{rabbit@controller01,[]}]}]
Fixed. The other services that rely on RabbitMQ for messaging are working again without errors.
About the author: 风语者. Enjoys exploring all kinds of technology, currently works in back-end development, and loves both life and work.




