Commit message (Collapse)AuthorAgeFilesLines
* Fix cluster membership check for running masterBogdan Dobrelya2016-02-021-0/+5
| | | | The running master is always inside of its own cluster. Fix the cluster membership check when a node is the master.
* Fix uninitialized status_masterBogdan Dobrelya2016-02-021-0/+1
| | | | | | | | Fix the case where multiple nodes may be reported in logs as the running master. Related Fuel bug: https://bugs.launchpad.net/bugs/1540936 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* Merge pull request #597 from binarin/rabbitmq-server-ocf-quiet-curlMichael Klishin2016-02-021-1/+1
|\ | | | | Suppress curl progress indicator in rabbit OCF
| * Suppress curl progress indicator in rabbit OCFAlexey Lebedeff2016-02-021-1/+1
|/ | | | | | | | | | | | | curl is used by the OCF script for fetching definitions (queues etc.), but the output of that invocation shows up as garbage in pacemaker logs: a progress indicator makes no sense in logs. According to the curl manpage, the combination of options "--silent --show-error" should be used; it suppresses only the progress indicator, and errors are still shown. Other short curl options are also replaced with their long counterparts for improved readability.
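The option combination described in the message above can be sketched as a small shell helper. The function name, URL and output path are placeholders for illustration and are not taken from the OCF script.

```shell
# Hypothetical helper: the name and arguments are illustrative only.
fetch_definitions() {
    url="$1"
    out="$2"
    # --silent suppresses the progress meter; --show-error still prints
    # error messages to stderr, so failures remain visible in logs.
    curl --silent --show-error --output "$out" "$url"
}
```

Long options are used here for the same readability reason the commit gives.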
* Merge pull request #594 from rabbitmq/rabbitmq-server-592Michael Klishin2016-02-011-1/+6
|\ | | | | Use -r with sed on Linux
| * Use -r with sed on Linux, fixes #592Michael Klishin2016-02-011-1/+6
|/ | | | | | | | We previously did the same change in #273 (PR: #275), but the file in which it was done was removed in 231e90cacf3daec5f43b3307867129e61496b123. Note that #592 recommends using `-r` unconditionally but that option is not recognised by sed which ships with OS X.
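A minimal sketch of the platform check this message describes, assuming `uname -s` reports `Linux` on Linux hosts and that `-E` is the extended-regex flag accepted by sed on other platforms such as OS X. The variable and function names are illustrative and may differ from the actual script.

```shell
# Choose the extended-regex flag for sed per platform.
case "$(uname -s)" in
    Linux) SED_OPT="-r" ;;   # GNU sed
    *)     SED_OPT="-E" ;;   # assumption: BSD-style sed, e.g. on OS X
esac

# Hypothetical example: strip a trailing run of digits using an ERE ("+").
strip_trailing_digits() {
    printf '%s\n' "$1" | sed $SED_OPT 's/[0-9]+$//'
}
```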
* Merge branch 'rabbitmq-server-541' into stableMichael Klishin2016-01-291-4/+6
|\ | | | | | | | | | | | | | | This is only a part of what #541 is supposed to cover but it already helped in a particular node shutdown lockup we've observed => worth merging earlier. Per discussion with @dcorbacho.
| * Merge branch 'stable' into rabbitmq-server-541Michael Klishin2016-01-271-1/+9
| |\
| * | Increase supervisor timeout to net_ticktime + 10sDiana Corbacho2016-01-271-1/+1
| | |
| * | Introduce timeout in rabbit_channel_supDiana Corbacho2016-01-271-4/+6
| | |
* | | Merge pull request #588 from rabbitmq/rabbitmq-management-117Michael Klishin2016-01-291-4/+12
|\ \ \ | |_|/ |/| | Specify hash algorithm in change_password_hash
| * | Trailing wsMichael Klishin2016-01-281-3/+3
| | |
| * | Specify hash algorithm in change_password_hashDaniil Fedotov2016-01-281-3/+11
|/ /
* | Merge pull request #584 from rabbitmq/rabbitmq-server-581Michael Klishin2016-01-271-1/+9
|\ \ | |/ |/| Unblock receive after 15s
| * Unblock receive after 15sDiana Corbacho2016-01-271-1/+9
| |
* | rabbit.erl: Do not run systemd-notify on WindowsJean-Sébastien Pédron2016-01-271-4/+7
| | | | | | | | This silences a warning logged during RabbitMQ startup.
* | Merge pull request #535 from rabbitmq/rabbitmq-server-307Jean-Sébastien Pédron2016-01-271-0/+1
|\ \ | | | | | | Ignore duplicate down_from_ch
| * | Ignore duplicate down_from_chDaniil Fedotov2016-01-081-0/+1
| | |
* | | Merge branch 'rabbitmq-server-493' into stableMichael Klishin2016-01-276-24/+121
|\ \ \ | |_|/ |/| |
| * | Create directories and files on Windows before conversion to short filenamesJean-Sébastien Pédron2016-01-262-2/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the directory or file does not exist before RabbitMQ starts, we can't let RabbitMQ create it, otherwise, it's created with its short filename, not its long one. With this new correction, we can "escape" all variables instead of only RABBITMQ_BASE. Fixes #493.
| * | rabbitmq-server.bat: Honor RABBITMQ_LOGS=- to log to stdoutJean-Sébastien Pédron2016-01-261-3/+21
| | | | | | | | | | | | | | | | | | | | | | | | Note that at the time of this commit, Lager does not support logging to stdout on Windows. This commit still improves consistency between Unix and Windows. References #493.
| * | Use RABBITMQ_HOME to set the path to RabbitMQ ebin directoryJean-Sébastien Pédron2016-01-264-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Compared to the script's parent directory (stored in TDP0), RABBITMQ_HOME is converted to a short filename to avoid non-ASCII in the path. Fixes #493.
| * | Use short filenames in Windows startup scriptsJean-Sébastien Pédron2016-01-262-17/+50
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Windows, cmd.exe and batch scripts apparently do not support Unicode. However, Windows uses UTF-16 to encode filenames on disk. In batch scripts, filenames are converted to some one-byte-wide charset. Once passed to Erlang and RabbitMQ, those filenames are incorrect. In particular, the management UI is unhappy because the filenames contain invalid UTF-8 characters. Using short filenames makes sure filenames only contain US-ASCII characters. To convert them, we use "for" expansion. At the same time, filenames are made absolute. It works even better than realpath.exe because the latter also converts filenames to another charset again. Fixes #493.
* | Merge pull request #573 from binarin/rabbitmq-server-systemd-notify-zero-depsMichael Klishin2016-01-221-1/+1
|\ \ | | | | | | Use systemd-notify(1) shell helper as fallback
| * | Use systemd-notify(1) shell helper as fallbackAlexey Lebedeff2016-01-221-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | Currently the external Erlang library `sd_notify` is used to make a systemd unit with `Type=notify` work correctly. This library contains some C code and thus cannot be built into an architecture-independent package. But it is not actually needed, as systemd provides the systemd-notify(1) helper for shell scripts, which serves exactly the same purpose. The only requirement is that you add `NotifyAccess=all` to your unit file to make everything work well.
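A rough sketch of how a shell-level fallback of this kind might look. The function name is hypothetical, and the check on `NOTIFY_SOCKET` assumes the variable is exported by systemd for notify-type units. As the message notes, the unit file also needs `NotifyAccess=all`, since the notification comes from a helper process rather than the main service process.

```shell
# Hypothetical helper: report readiness through systemd-notify(1) if possible.
notify_ready() {
    # Outside a notify-type systemd unit there is nothing to report.
    [ -n "${NOTIFY_SOCKET:-}" ] || return 0
    # Skip quietly when the helper is not installed.
    command -v systemd-notify >/dev/null 2>&1 || return 0
    systemd-notify --ready
}
```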
* | Merge pull request #571 from binarin/rabbitmq-server-ocf-shell-quotingMichael Klishin2016-01-211-1/+1
|\ \ | | | | | | Fix usage of uninitialized variable in OCF script
| * | Fix usage of uninitialized variable in OCF scriptAlexey Lebedeff2016-01-211-1/+1
|/ /
* | Merge pull request #563 from ↵Michael Klishin2016-01-211-0/+109
|\ \ | | | | | | | | | | | | binarin/rabbitmq-server-ocf-list-channels-diagnostics Improve OCF script diagnostics for timed-out 'list_channels'
| * | Improve 'list_channels' diagnostics in OCFAlexey Lebedeff2016-01-201-1/+1
| | | | | | | | | The timeout(1) manpage mentions 124 as another valid return code, in addition to 128 + signal-number.
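The two return codes mentioned above could be checked along these lines. This is a sketch only: the function name is made up, and the choice of 137 assumes the command was killed with SIGKILL (128 + 9); other signals would give other values.

```shell
# Hypothetical helper: succeed if "$1" looks like a timeout(1) expiry.
is_timeout_rc() {
    rc="$1"
    # 124: timeout(1) reports that the time limit was reached.
    # 137: 128 + 9, i.e. the command was killed with SIGKILL (assumption).
    [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]
}
```

A caller would capture `$?` right after the `timeout ...` invocation and pass it to the helper before deciding whether to log extra diagnostics.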
| * | Improve rabbitmq OCF script diagnosticsAlexey Lebedeff2016-01-201-0/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently a time-out when running 'rabbitmqctl list_channels' is treated as a sign that the current node is unhealthy. But that may not be the case, as the hanging channel could actually be on some other node. Given that we currently have more than one bug related to 'list_channels', it makes sense to improve diagnostics here. This patch doesn't change any behaviour; it only improves logging after a time-out happens. If time-outs continue to occur (even with the latest rabbitmq versions or with backported fixes), we could switch to this improved list_channels and kill rabbitmq only if stuck channels are located on the current node. But I hope that all related rabbitmq bugs were already closed.
* | | Merge pull request #566 from rabbitmq/rabbitmq-server-319Michael Klishin2016-01-201-32/+20
|\ \ \ | | | | | | | | Remove duplicate code in pre_publish and publish functions
| * | | Remove duplicate code in pre_publish and publish functionsLoïc Hoguin2016-01-201-32/+20
| |/ /
* | | Merge pull request #560 from dmitrymex/reset-master-scoreMichael Klishin2016-01-201-0/+3
|\ \ \ | |/ / |/| | Reset master score if we decide to restart RabbitMQ on timeout
| * | Reset master score if we decide to restart RabbitMQ on timeoutDmitry Mescheryakov2016-01-191-0/+3
|/ / | | | | | | | | Doing otherwise might not trigger the restart while it is clearly needed.
* | Merge pull request #558 from galanoff/stableMichael Klishin2016-01-191-4/+17
|\ \ | | | | | | Add optional prefix for RabbitMQ node FQDNs
| * | Add optional prefix for RabbitMQ node FQDNsKyrylo Galanov2016-01-181-4/+17
| | | | | | | | | This allows instantiating multiple rabbit clusters constructed from prefix-based instances of rabbit nodes.
* | | Merge pull request #527 from binarin/rabbitmq-server-better-startup-diagnosticsJean-Sébastien Pédron2016-01-151-2/+15
|\ \ \ | |/ / |/| | Improve diagnostics in 'rabbitmq-server' script
| * | Improve diagnostics in 'rabbitmq-server' scriptAlexey Lebedeff2015-12-311-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While errors are detected with '-e' shell option in this script, diagnostics messages leave a lot to be desired. E.g. when trying to write pid file to full partition, the only message in log is: sh: echo: I/O error Which is definitely insufficient
* | | Merge pull request #547 from bogdando/bug/1531838Michael Klishin2016-01-151-18/+24
|\ \ \ | | | | | | | | Fix rabbitMQ OCF monitor detection of running master
| * | | Fix rabbitMQ OCF monitor detection of running masterBogdan Dobrelya2016-01-141-18/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the monitor has detected the node as OCF_RUNNING_MASTER, this may be lost while the monitor checks are in progress. * Rework the prev_rc by the rc_check to fix this. * Also add an info log if detected as running master. * Break the monitor check loop early, if it shall be exiting to be restarted by pacemaker. * Do not recheck the master status and do not update the master score, if the node was already detected by the monitor as OCF_RUNNING_MASTER. By that point, the running and healthy master shall not be checked against other nodes' uptime, as it is pointless and only takes more time and resources for the monitor action to finish. * Fail early, if the monitor detected the node as OCF_RUNNING_MASTER, but the rabbit beam process is not running. * For OCF_CHECK_LEVEL>20, exclude the current node from the check loop as we already checked it before. Related Fuel bug: https://launchpad.net/bugs/1531838 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* | | | Merge pull request #543 from binarin/rabbitmq-server-rotate-logs-data-lossMichael Klishin2016-01-154-5/+14
|\ \ \ \ | | | | | | | | | | Fix 'rabbitmqctl rotate_logs' behaviour
| * | | | Fix 'rabbitmqctl rotate_logs' behaviourAlexey Lebedeff2016-01-124-5/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When 'rabbitmqctl rotate_logs' is called without any parameters, it clears logs unconditionally. Given that this form is used in logrotate config files, this could result in data loss. It can be reproduced with the following scenario: 1) 'max_size' is set globally in the logrotate config 2) One of the two rabbitmq logs is greater than that limit 3) The daily logrotate run was already performed today, and now we are calling it manually. In this case logrotate will copy only the file that is bigger than max_size, but 'rabbitmqctl rotate_logs' will clear both of them, leading to data loss.
* | | | | Merge pull request #552 from binarin/rabbitmq-server-549Michael Klishin2016-01-141-3/+26
|\ \ \ \ \ | |_|/ / / |/| | | | Limit number of unique node names for rabbitmqctl
| * | | | Limit number of unique node names for rabbitmqctlAlexey Lebedeff2016-01-141-3/+26
|/ / / / | | | | | | | | | | | | | | | | | | | | It prevents atom table overflow in a long running broker. Fixes #549
* | | | Merge pull request #542 from rabbitmq/rabbitmq-server-528Michael Klishin2016-01-123-10/+21
|\ \ \ \ | |/ / / |/| | | Make number of Ranch acceptors configurable
| * | | Merge branch 'stable' into rabbitmq-server-528Michael Klishin2016-01-125-90/+213
| |\ \ \ | |/ / / |/| | |
* | | | Merge pull request #540 from bogdando/bug/1529897Michael Klishin2016-01-121-43/+77
|\ \ \ \ | | | | | | | | | | OCF: Fuel bug 1529897
| * | | | Fix get_status, action_stop, proc_stop then beam's unresponsiveBogdan Dobrelya2016-01-111-29/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Fix get_status() to catch beam state and output errors * Fix action_stop() to force name-based matching when there is no pidfile and the beam is unresponsive * Fix proc_stop to use name-based matching if no pidfile is found * Fix proc_stop to retry sending the signal when using the name-based match as well. Without this patch, the following situations are possible: - the beam is running and cannot process signals, but is reported "not running" by get_status(), while in fact it shall be reported as a generic error - which_applications() returned an error, yet its output is still being parsed for the "what" match, which it shall not be - action_stop and proc_stop give up when there is no pidfile and the beam is running but unresponsive. The solution is to make get_status return a generic error and action_stop use the rabbit process name matching for killing it. Related Fuel bug: https://bugs.launchpad.net/fuel/+bug/1529897 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
| * | | | Fix proc_kill then there is no pid foundBogdan Dobrelya2016-01-111-14/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without this fix, the rabbit OCF cannot make proc_stop try to kill the pid-less beam process by name matching, because proc_kill()'s first parameter cannot be passed empty. The fix is to use the "none" value when the pid-less process must be matched by the service_name instead. Also, fix proc_kill to deal with multi-process pid files as well (many pids, space separated). Related Fuel bugs: https://launchpad.net/bugs/1529897 https://launchpad.net/bugs/1532723 Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
* | | | | Include alarm information in output for cluster statusJoseph Yiasemides2016-01-111-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | After this change `rabbitmqctl cluster_status` will print information about alarms raised across a cluster.