Commit message | Author | Age | Files | Lines
* Wording (Michael Klishin, 2017-08-01, 2 files, -7/+7)
|
* Merge branch 'master' into rabbitmq-server-1314 (Michael Klishin, 2017-08-01, 15 files, -107/+272)
|\
| |     Conflicts:
| |         test/dynamic_ha_SUITE.erl
| * Merge pull request #1315 from rabbitmq/rabbitmq-server-1310 (Michael Klishin, 2017-08-01, 10 files, -67/+210)
| |\
| | |   Check if vhost supervisor is running when starting mirrors
| | * Restore delays used earlier (Michael Klishin, 2017-08-01, 1 file, -3/+3)
| | |
| | * Make connection tracking handler more defensive (Michael Klishin, 2017-08-01, 1 file, -3/+16)
| | |   When a vhost is deleted, both vhost_deleted and vhost_down will be
| | |   emitted, resulting in double deletion.
| | |   rabbit_networking:close_connection/1 therefore can throw an
| | |   exception about an unknown connection pid. The handler needs to
| | |   account for that.
| | |
| | |   While at it, change log messages to be less alarming and simply
| | |   mention vhost shutdown as opposed to "a database failure".
| | |
| | |   References 7a82b43bf12b737250957081d0b0d84b21b3bf72.
| | |
| | |   Signed-off-by: Luke Bakken <lbakken@pivotal.io>
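The defensive handling this message describes can be sketched roughly as below. This is an illustrative sketch only, not the actual handler code: everything except `rabbit_networking:close_connection/1` (named in the commit message) is a hypothetical name.

```erlang
%% Hypothetical sketch of a vhost_down handler that tolerates
%% connections already closed by the earlier vhost_deleted event.
close_tracked_connections(ConnPids) ->
    lists:foreach(
      fun(Pid) ->
              try
                  rabbit_networking:close_connection(Pid)
              catch
                  %% The connection may already be gone (double deletion);
                  %% an unknown connection pid is expected and safe to skip.
                  _:_ -> ok
              end
      end, ConnPids).
```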
| | * Merge branch 'master' into rabbitmq-server-1310 (Michael Klishin, 2017-07-31, 8 files, -38/+374)
| | |\
| | |/
| |/|
| | |     Conflicts:
| | |         test/vhost_SUITE.erl
| * | Merge pull request #1309 from rabbitmq/rabbitmq-server-1303 (Michael Klishin, 2017-07-31, 6 files, -41/+63)
| |\ \
| | | |   Set queue state to 'stopped' when terminating.
| | * \ Merge branch 'master' into rabbitmq-server-1303 (Michael Klishin, 2017-07-28, 3 files, -4/+4)
| | |\ \
| | * | | Do not try to perform a queue operation in `rabbit_amqqueue:with` if the queue is stopped (Daniil Fedotov, 2017-07-27, 2 files, -5/+4)
| | | | |   The only operation in `with` which can be performed on a stopped
| | | | |   queue is `delete`. It will call `delete_crashed` based on the
| | | | |   `{absent, Q, stopped}` result from `with`.
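The caller-side flow described above could look roughly like this. `delete_queue/1` and `do_delete/1` are hypothetical names for illustration; `rabbit_amqqueue:with`, `delete_crashed`, and the `{absent, Q, stopped}` result come from the commit message itself.

```erlang
%% Sketch: delete is the only operation allowed on a stopped queue.
delete_queue(Name) ->
    case rabbit_amqqueue:with(Name, fun do_delete/1) of
        {error, {absent, Q, stopped}} ->
            %% The queue process is down (its vhost is stopped), so
            %% clean up the queue's on-disk state directly instead.
            rabbit_amqqueue:delete_crashed(Q);
        Other ->
            %% Live queue: do_delete/1 already ran, or another error.
            Other
    end.
```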
| | * | | Merge branch 'master' into rabbitmq-server-1303 (Michael Klishin, 2017-07-26, 3 files, -2203/+96)
| | |\ \ \
| | * | | | Test that list_queues returns stopped state for queues taken down by vhost supervisors (Daniil Fedotov, 2017-07-26, 2 files, -39/+33)
| | | | | |
| | * | | | Set queue state to 'stopped' when terminating (Daniil Fedotov, 2017-07-26, 3 files, -2/+31)
| | | | | |   If a queue process is stopped by a vhost supervision tree, that
| | | | | |   should be visible in the management UI and the `list_queues`
| | | | | |   command output. Setting the queue state to `stopped` is easier
| | | | | |   to reason about than checking the vhost aliveness status on a
| | | | | |   remote node.
| | | | | |
| | | | | |   If the queue is restarted or migrated to a different node, the
| | | | | |   status will be set back to `live`. If a node was not stopped
| | | | | |   normally, the queues will keep the `live` state. If the queues
| | | | | |   cannot be recovered, they should be marked `stopped` so they
| | | | | |   are reported as such to rabbitmqctl and the management UI.
| | | | | |
| | | | | |   Fixes #1303
| | | | | |   [#148409695]
| | | | | * Wording (Michael Klishin, 2017-07-31, 1 file, -2/+2)
| | | | | |
| | | | | * Do not start multiple vhost supervisors for a single vhost (Daniil Fedotov, 2017-07-31, 1 file, -1/+6)
| | | | | |
| | | | | * Test that a node can start with a dead vhost and potential slaves on it (Daniil Fedotov, 2017-07-31, 1 file, -57/+107)
| | | | | |
| | | | | * Consider that the vhost may already be deleted when cleaning vhost limits (Daniil Fedotov, 2017-07-31, 1 file, -1/+6)
| | | | | |   During the vhost delete operation, runtime parameter cleanup
| | | | | |   happens after the vhost is deleted, so `notify_clear` should
| | | | | |   not fail if the vhost does not exist anymore.
| | | | | * Delete vhost data even if the vhost supervisor is not running (Daniil Fedotov, 2017-07-31, 1 file, -5/+7)
| | | | | |   If a vhost supervisor is down, the vhost and all of its data
| | | | | |   can still be deleted. This is useful if an operator wants to
| | | | | |   delete a bad vhost.
| | | | | * Test that a queue master does not fail when adding a dead-vhost node (Daniil Fedotov, 2017-07-31, 1 file, -2/+17)
| | | | | |
| | | | | * Check if vhost supervisor is running when starting slaves (Daniil Fedotov, 2017-07-31, 1 file, -5/+15)
| | | | | |   The node boot process should not fail if slaves cannot be
| | | | | |   created on down vhosts.
| | | | | * Refactor vhost supervisor access functions (Daniil Fedotov, 2017-07-31, 5 files, -34/+36)
| | | | | |   Do not start a vhost supervisor when accessing it. Rename
| | | | | |   functions to be more descriptive.
* | | | | | Fix compile warning (Luke Bakken, 2017-07-31, 1 file, -1/+1)
| | | | | |
* | | | | | Start slave queues after vhost recovery, instead of node start (Daniil Fedotov, 2017-07-31, 4 files, -15/+90)
|/ / / / /    Vhost supervisors can crash and restart without crashing the
| | | | |     node, so the slave queues on these vhosts should be started
| | | | |     after vhost recovery instead of during the node boot process.
| | | | |
| | | | |     Fixes #1314
| | | | |     [#149484151]
* | | | | Merge pull request #1287 from rabbitmq/fix-travis-ci-build (Luke Bakken, 2017-07-31, 2 files, -25/+298)
|\ \ \ \ \
| |_|_|_|/    Travis CI build
|/| | | |
| * | | | Bump Elixir, only test against one OTP version on Travis (Michael Klishin, 2017-07-28, 1 file, -4/+2)
| | | | |
| * | | | Merge branch 'master' into fix-travis-ci-build (Michael Klishin, 2017-07-28, 39 files, -2835/+1578)
| |\ \ \ \
| |/ / / /
|/| | | |
* | | | | Merge branch 'stable' (Michael Klishin, 2017-07-28, 2 files, -3/+3)
|\ \ \ \ \
| * \ \ \ \ Merge pull request #1312 from rabbitmq/rabbitmq-server-1306 (Gerhard Lazu, 2017-07-27, 2 files, -3/+3) [tags: rabbitmq_v3_6_11_rc2, rabbitmq_v3_6_11_rc1]
| |\ \ \ \ \
| | | | | |   Bump DEFAULT_DISTRIBUTION_BUFFER_SIZE to 128 MB
| | * | | | | Bump DEFAULT_DISTRIBUTION_BUFFER_SIZE to 128 MB (Michael Klishin, 2017-07-27, 2 files, -3/+3)
| |/ / / / /    After 24 hours of testing we haven't observed any anomalies
| | | | | |     or regressions. With this change, multicast (mirroring)
| | | | | |     processes should be suspended less frequently, resulting in
| | | | | |     less variable throughput for mirroring (with link throughput
| | | | | |     of 1 GBit/s or greater).
| | | | | |
| | | | | |     Closes #1306.
| | | | | |     [#149220393]
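For operators who want a different value than the new default, the buffer size can be overridden via an environment variable. This is a hedged sketch of a `rabbitmq-env.conf` fragment, not part of the commit itself; the value is illustrative.

```shell
# RABBITMQ_DISTRIBUTION_BUFFER_SIZE is given in kilobytes and maps to
# the Erlang VM flag +zdbbl. 128000 kB corresponds to the 128 MB
# default this commit introduces; raise or lower it to taste.
RABBITMQ_DISTRIBUTION_BUFFER_SIZE=128000
```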
* | | | | | Merge pull request #1311 from rabbitmq/rabbitmq-server-1307 (Michael Klishin, 2017-07-27, 1 file, -0/+10)
|\ \ \ \ \ \
| |_|_|_|_|/    Log a more sensible error message when running on an outdated
|/| | | | |     Erlang version
| * | | | | Correct comment (Michael Klishin, 2017-07-27, 1 file, -1/+1)
| | | | | |
| * | | | | Log this to stderr (Michael Klishin, 2017-07-27, 1 file, -1/+1)
| | | | | |   Per suggestion from @lukebakken.
| | | | | |
| | | | | |   References #1307.
| | | | | |   [Finishes #149635897]
| * | | | | Log a more sensible error message when running on an Erlang version that's too old (Michael Klishin, 2017-07-27, 1 file, -0/+10)
|/ / / / /    Closes #1307.
| | | | |     [#149635897]
* | | | | Naming + export type (kjnilsson, 2017-07-27, 2 files, -2/+2)
| | | | |   rabbit_types:connection_name() is used elsewhere but was not
| | | | |   exported.
* | | | | Merge pull request #1308 from rabbitmq/rabbitmq-server-1305 (Jean-Sébastien Pédron, 2017-07-27, 1 file, -2/+2)
|\ \ \ \ \
| |_|_|_|/    Bump Erlang versions in .travis.yml
|/| | | |
| * | | | Bump Erlang versions in .travis.yml (Michael Klishin, 2017-07-26, 1 file, -2/+2)
| | |_|/    Part of #1305.
| |/| |     [#149563549]
* | | | Clean up test variable queue for the channel operation timeout test (Daniil Fedotov, 2017-07-26, 1 file, -2201/+74)
| | | |   The file included all of the variable queue code, which was
| | | |   changed several times. It doesn't make sense to maintain two
| | | |   identical files, so now all the callbacks are delegated to
| | | |   rabbit_variable_queue.
* | | | Update new style config example as well (Michael Klishin, 2017-07-26, 1 file, -1/+11)
| | | |
* | | | Merge branch 'stable' (Michael Klishin, 2017-07-26, 1 file, -1/+11)
|\ \ \ \
| |/ / /
|/| / /
| |/ /
| * | Improve docs (Michael Klishin, 2017-07-26, 1 file, -1/+11)
| | |
* | | Update rabbitmq-components.mk (Michael Klishin, 2017-07-26, 1 file, -1/+1)
| | |
* | | Merge branch 'stable' (Michael Klishin, 2017-07-26, 0 files, -0/+0)
|\ \ \
| |/ /
| | |     Conflicts:
| | |         rabbitmq-components.mk
| * | Update rabbitmq-components.mk (Michael Klishin, 2017-07-26, 1 file, -1/+1)
| | |
* | | Merge pull request #1304 from rabbitmq/rabbitmq-management-446 (Karl Nilsson, 2017-07-25, 1 file, -1/+5) [tag: rabbitmq_v3_7_0_milestone18]
|\ \ \
| | | |   Report vhost status on vhost info
| * \ \ Merge branch 'master' into rabbitmq-management-446 (Daniil Fedotov, 2017-07-25, 13 files, -150/+638)
| |\ \ \
| |/ / /
|/| | |
* | | | Merge branch 'stable' (Michael Klishin, 2017-07-25, 1 file, -10/+24)
|\ \ \ \
| | |/ /
| |/| |
| * | | Merge pull request #1302 from rabbitmq/stable-gm-mem-usage-during-constant-redelivery (Michael Klishin, 2017-07-25, 1 file, -10/+24) [tag: rabbitmq_v3_6_11_milestone5]
| |\ \ \
| | | | |   Stable GM memory usage during constant redelivery
| | * | | Run garbage collection in GM every 250ms (Jean-Sébastien Pédron, 2017-07-25, 1 file, -7/+22)
| | | | |   In high-throughput scenarios, e.g. `basic.reject` or
| | | | |   `basic.nack`, messages which belong to a mirrored queue and are
| | | | |   replicated within a GM group are quickly promoted to the old
| | | | |   heap. This means that garbage collection happens only when the
| | | | |   Erlang VM is under memory pressure, which might be too late.
| | | | |   When a process is under pressure, garbage collection slows it
| | | | |   down even further, to the point of RabbitMQ nodes running out of
| | | | |   memory and crashing.
| | | | |
| | | | |   To avoid this scenario, we want the GM process to garbage
| | | | |   collect binaries regularly, i.e. every 250ms. The variable queue
| | | | |   does the same for a similar reason: rabbitmq/rabbitmq-server#289
| | | | |
| | | | |   Initially, we wanted to use the number of messages as the
| | | | |   trigger for garbage collection, but we soon discovered that
| | | | |   different workloads (e.g. small vs large messages) would result
| | | | |   in unpredictable and sub-optimal GC schedules.
| | | | |
| | | | |   Before setting `fullsweep_after` to 0, memory usage was 2x
| | | | |   higher (400MB vs 200MB) and throughput was 0.1x lower (18k vs
| | | | |   20k). With this `spawn_opt` setting, the generational collection
| | | | |   algorithm is disabled, meaning that all live data is copied at
| | | | |   every garbage collection:
| | | | |   http://erlang.org/doc/man/erlang.html#spawn_opt-3
| | | | |
| | | | |   The RabbitMQ deployment used for testing this change:
| | | | |
| | | | |   * AWS, c4.2xlarge, bosh-aws-xen-hvm-ubuntu-trusty-go_agent 3421.11
| | | | |   * 3 RabbitMQ nodes running OTP 20.0.1
| | | | |   * 3 durable & auto-delete queues with 3 replicas each
| | | | |   * each queue master was defined on a different RabbitMQ node
| | | | |   * every RabbitMQ node was running 1 queue master & 2 queue slaves
| | | | |   * 1 consumer per queue with QOS 100
| | | | |   * 100 durable messages @ 1KiB each
| | | | |   * `basic.reject` operations
| | | | |
| | | | |   | Node | Message throughput | Memory usage  |
| | | | |   | ---- | ------------------ | ------------- |
| | | | |   | rmq0 | 12K - 20K msg/s    | 400 - 900 MB  |
| | | | |   | rmq1 | 12K - 20K msg/s    | 500 - 1000 MB |
| | | | |   | rmq2 | 12K - 20K msg/s    | 500 - 800 MB  |
| | | | |
| | | | |   [#148892851]
| | | | |
| | | | |   Signed-off-by: Gerhard Lazu <gerhard@rabbitmq.com>
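The periodic-GC pattern this commit describes can be sketched as a small gen_server. Only the 250ms interval and the `{fullsweep_after, 0}` `spawn_opt` come from the message above; the module name, message atom, and state shape are assumptions for illustration, not the actual GM code.

```erlang
-module(periodic_gc_sketch).
-behaviour(gen_server).

-export([start_link/0, init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(FORCE_GC_INTERVAL, 250). %% milliseconds, as in the commit above

start_link() ->
    %% fullsweep_after = 0 disables the generational algorithm: every
    %% collection is a full sweep, so refc binaries are released promptly.
    gen_server:start_link(?MODULE, [], [{spawn_opt, [{fullsweep_after, 0}]}]).

init([]) ->
    erlang:send_after(?FORCE_GC_INTERVAL, self(), force_gc),
    {ok, #{}}.

handle_call(_Req, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.

handle_info(force_gc, State) ->
    garbage_collect(),  %% full sweep of this process's heap
    erlang:send_after(?FORCE_GC_INTERVAL, self(), force_gc),
    {noreply, State}.
```

A time-based trigger like this keeps the GC schedule independent of message size and rate, which is exactly why the commit rejected a message-count trigger.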
| | * | | Remove hibernate from GM (Jean-Sébastien Pédron, 2017-07-24, 1 file, -3/+2)
| |/ / /    We don't want to use the backoff/hibernate feature because we
| | | |     have observed that the GM process is suspended half of the time.
| | | |     We really wanted to replace gen_server2 with gen_server, but it
| | | |     was more important to keep changes in 3.6 to a minimum. GM will
| | | |     eventually be replaced, so switching it from gen_server2 to
| | | |     gen_server would soon be redundant. We simply do not understand
| | | |     some of the gen_server2 trade-offs well enough to feel strongly
| | | |     about this change.
| | | |
| | | |     [#148892851]
| | | |
| | | |     Signed-off-by: Gerhard Lazu <gerhard@rabbitmq.com>
* | | | Fix unused variable (kjnilsson, 2017-07-25, 1 file, -1/+1)
| | | |
* | | | Spelling (Michael Klishin, 2017-07-25, 1 file, -1/+1)
| | | |