Use systemd-notify(1) shell helper as fallback
|
Currently the external Erlang library `sd_notify` is used to make a
systemd unit with `Type=notify` work correctly. This library contains
some C code and thus cannot be built into an architecture-independent
package. But it is not actually needed, as systemd provides the
systemd-notify(1) helper for shell scripts, which serves exactly the
same purpose. The only caveat is that you need to add
`NotifyAccess=all` to your unit file for this to work.
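A minimal sketch of such a shell fallback, not the code from this commit. The function name `notify_ready` and the strings it prints are hypothetical; `systemd-notify --ready` and the `NOTIFY_SOCKET` environment variable are standard systemd facilities.

```shell
# Hypothetical helper: report readiness to systemd when it is expected.
notify_ready() {
    # systemd sets NOTIFY_SOCKET only for units that expect a notification.
    if [ -z "${NOTIFY_SOCKET:-}" ]; then
        echo "skipped"
        return 0
    fi
    if command -v systemd-notify >/dev/null 2>&1; then
        # The notification is sent by a helper process rather than the
        # main service process, hence the need for NotifyAccess=all.
        systemd-notify --ready && echo "notified"
    else
        echo "systemd-notify not found" >&2
        return 1
    fi
}
```

Outside of systemd (no `NOTIFY_SOCKET`), the helper is a no-op, so the same script can still be started by hand.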
|
Fix usage of uninitialized variable in OCF script
|
binarin/rabbitmq-server-ocf-list-channels-diagnostics
Improve OCF script diagnostics for timed-out 'list_channels'
|
The timeout(1) manpage mentions 124 as another valid return code, in addition to 128 + signal-number.
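The two kinds of return code can be told apart as in the sketch below. The helper name `classify_timeout_rc` and its output strings are hypothetical and only illustrate the check; this is not the OCF script's code.

```shell
# Hypothetical helper: interpret the exit status of a command run
# under timeout(1).
classify_timeout_rc() {
    rc=$1
    if [ "$rc" -eq 124 ]; then
        # timeout(1) itself reports that the time limit was hit.
        echo "timed out"
    elif [ "$rc" -gt 128 ]; then
        # 128 + signal-number: the command was killed by a signal.
        echo "killed by signal $((rc - 128))"
    else
        echo "exited with $rc"
    fi
}
```

For example, `classify_timeout_rc 137` reports signal 9 (SIGKILL).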
|
Currently a time-out when running 'rabbitmqctl list_channels' is
treated as a sign that the current node is unhealthy. But this may not
be the case, as the hanging channel could actually be on some other
node. Given that we currently have more than one bug related to
'list_channels', it makes sense to improve diagnostics here.
This patch doesn't change any behaviour; it only improves logging after
a time-out happens. If time-outs continue to occur (even with the latest
rabbitmq versions or with backported fixes), we could switch to this
improved list_channels and kill rabbitmq only if stuck channels are
located on the current node. But I hope that all related rabbitmq bugs
were already closed.
|
Remove duplicate code in pre_publish and publish functions
|
Reset master score if we decide to restart RabbitMQ on timeout
|
Otherwise the restart might not be triggered even though it is clearly
needed.
|
Add optional prefix for RabbitMQ node FQDNs
|
This allows instantiating multiple rabbit clusters constructed
from prefix-based instances of rabbit nodes.
|
Improve diagnostics in 'rabbitmq-server' script
|
While errors are detected with the '-e' shell option in this script,
the diagnostic messages leave a lot to be desired.
E.g. when trying to write the pid file to a full partition, the only
message in the log is:
    sh: echo: I/O error
which is definitely insufficient.
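One way to give such a failure more context is sketched below. The function name `write_pid_file` and the message wording are hypothetical, not taken from the 'rabbitmq-server' script.

```shell
# Hypothetical helper: write a pid file and say what failed and where.
write_pid_file() {
    pid_file=$1
    pid=$2
    # Testing the command in an 'if' keeps 'set -e' from aborting the
    # script before the explanatory message is printed.
    if ! echo "$pid" > "$pid_file"; then
        echo "Failed to write pid $pid to pid file $pid_file" >&2
        return 1
    fi
}
```

The shell's own error (such as the I/O error above) is still printed; the extra line names the file and the operation that failed.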
|
Update calls to 'rabbitmqctl rotate_logs'
|
After the switch to Lager there is no need to call 'rabbitmqctl
rotate_logs' from logrotate. Also, 'rotate_logs' no longer accepts an
optional argument. Update documentation accordingly.
|
Fix RabbitMQ OCF monitor detection of running master
|
When the monitor detects the node as OCF_RUNNING_MASTER, this status
may be lost while the monitor checks are in progress.
* Rework the prev_rc by the rc_check to fix this.
* Also add an info log if detected as running master.
* Break the monitor check loop early if it is going to exit to be
  restarted by pacemaker.
* Do not recheck the master status and do not update the master score
  if the node was already detected by monitor as OCF_RUNNING_MASTER.
  By that point, the running and healthy master shall not be checked
  against other nodes' uptime, as it is pointless and only takes more
  time and resources for the monitor action to finish.
* Fail early if monitor detected the node as OCF_RUNNING_MASTER but
  the rabbit beam process is not running.
* For OCF_CHECK_LEVEL>20, exclude the current node from the check
  loop, as we already checked it before.
Related Fuel bug:
https://launchpad.net/bugs/1531838
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
|
Conflicts:
src/rabbit.erl
src/rabbit_error_logger_file_h.erl
src/rabbit_sasl_report_file_h.erl
|
Fix 'rabbitmqctl rotate_logs' behaviour
|
When 'rabbitmqctl rotate_logs' is called without any parameters, it
clears logs unconditionally. Given that this form is used in
logrotate config files, this could result in data loss.
This can be reproduced with the following scenario:
1) 'max_size' is set globally in the logrotate config
2) One of the two rabbitmq logs is greater than that limit
3) The daily logrotate run was already performed today, and now we
   are calling it manually. In this case logrotate will copy only the
   file that is bigger than max_size, but 'rabbitmqctl rotate_logs'
   will clear both of them, leading to data loss.
|
Limit number of unique node names for rabbitmqctl
|
It prevents atom table overflow in a long running broker.
Fixes #549
|
Switch to Lager for logging
|
By default, RabbitMQ now logs messages to a single file
($RABBITMQ_LOGS). The $RABBITMQ_SASL_LOGS variable is unused. To
configure how and which messages are logged, it is recommended to use
rabbitmq.config rather than the environment variable.
The old `log_levels` parameter is unsupported and categories are
replaced by Lager extra sinks. If you had in your rabbitmq.config:
    {rabbit, [
        {log_levels, [{connection, info}]}
      ]}
You can now configure Lager like this:
    {lager, [
        {extra_sinks, [
            {rabbit_connection_lager_event, [
                {handlers, [{lager_forwarder_backend, [lager_event, info]}]}
              ]}
          ]}
      ]}
rabbitmq-build.mk from rabbitmq-common is included in the top-level
Makefile. It sets the appropriate compiler options to enable Lager's
lager_transform parse_transform module.
rabbit_log calls are now converted by this parse_transform to direct
calls to lager:log(). To keep backward compatibility with other plugins,
the rabbit_log module still implements all the <level>() functions.
Compared to the parse_transformed calls, the main difference is that
the logged message does not carry the file:line metadata.
Fixes #94.
|
Make number of Ranch acceptors configurable
|
OCF: Fuel bug 1529897
|
* Fix get_status() to catch beam state and output errors
* Fix action_stop() to force name-based matching when there is no
  pidfile and the beam is unresponsive
* Fix proc_stop to use name-based matching if no pidfile is
  found
* Fix proc_stop to retry sending the signal when using the
  name-based match as well
Without this patch, the following situations are possible:
- the beam is running and cannot process signals, but is reported as
  "not running" by get_status(), while in fact it shall be reported as
  a generic error
- which_applications() returned an error, but its output is still
  being parsed for the "what" match, while it shall not be
- action_stop and proc_stop give up when there is no pidfile and the
  beam is running but unresponsive.
The solution is to make get_status return a generic error and
action_stop use the rabbit process name matching for killing it.
Related Fuel bug:
https://bugs.launchpad.net/fuel/+bug/1529897
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
|
Without this fix, the rabbit OCF script cannot make
proc_stop try to kill the pid-less beam process
by name matching, because proc_kill()'s
1st parameter cannot be passed empty.
The fix is to use the "none" value when the pid-less
process must be matched by the service_name instead.
Also, fix proc_kill to deal with multi-process
pid files as well (many pids, space separated).
Related Fuel bugs:
https://launchpad.net/bugs/1529897
https://launchpad.net/bugs/1532723
Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
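The two cases described in this message (a "none" placeholder instead of a pid, and a space-separated pid list) can be sketched as follows. This is an illustration only, not the OCF script's actual proc_kill(); the function name `signal_procs` and its argument order are hypothetical.

```shell
# Hypothetical sketch, not the real proc_kill().
# $1: space-separated pids, or "none" to match by process name
# $2: process name used for name-based matching
# $3: signal name (defaults to TERM)
signal_procs() {
    pids=$1
    name=$2
    sig=${3:-TERM}
    if [ "$pids" = "none" ]; then
        # No pid file: fall back to matching the command line by name.
        pkill "-$sig" -f "$name"
        return $?
    fi
    rc=0
    # Unquoted on purpose: split the pid list on whitespace.
    for pid in $pids; do
        kill "-$sig" "$pid" 2>/dev/null || rc=1
    done
    return $rc
}
```

Passing an explicit "none" avoids the empty first parameter problem, since an empty positional argument is easy to lose when a call is built from unquoted variables.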
|
392: Add cluster-wide resource alarm status to `cluster_status` output
|
After this change `rabbitmqctl cluster_status` will print information
about alarms raised across a cluster.
|
|