| Commit message | Author | Age | Files | Lines |
rabbitmq-server#1346
If the process crashes unexpectedly, the autoheal process gets into a
deadlock in the 'restarting' state, ignoring any new request from the winner.
If the crash happens before the process is registered, no logs are generated.
Thus, we need to link to that process and abort the autoheal if the process
exits with a reason other than normal.
rabbitmq-server#1346
[#150707017]
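A minimal Erlang sketch of the linking approach described above; the module and function names are hypothetical, not the actual rabbit_autoheal code. The caller traps exits and links to the restarting process, so an abnormal exit aborts the autoheal instead of leaving it stuck in 'restarting':

```erlang
%% Hypothetical sketch: link to the restarting process and abort the
%% autoheal when it exits with a reason other than 'normal'.
-module(autoheal_sketch).
-export([await_restart/1]).

await_restart(RestartFun) ->
    process_flag(trap_exit, true),
    Pid = spawn_link(RestartFun),
    receive
        {'EXIT', Pid, normal} -> restart_complete;
        {'EXIT', Pid, Reason} -> {autoheal_aborted, Reason}
    end.
```

Because the processes are linked, a crash before registration still produces an `'EXIT'` signal, so the failure is observed even when no logs were written.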
Do not GC channel-queue metrics on mirror migration
rabbitmq-server#1340
[#150442817]
Read rabbitmq-env.conf a bit earlier to pick up two variables
The `bats` target will clone and install the bats command in a
manner similar to `kerl` or `hex.pm`, i.e. into ERLANG_MK_TMP.
[#150452491]
Signed-off-by: Luke Bakken <lbakken@pivotal.io>
Prior to this change, setting RABBITMQ_DISTRIBUTION_BUFFER_SIZE
and/or RABBITMQ_SCHEDULER_BIND_TYPE would not be added to
SERVER_ERL_ARGS because the latter variable was built before
rabbitmq-env.conf was read.
Fixes #1338
[#150452491]
Signed-off-by: Luke Bakken <lbakken@pivotal.io>
Start/stop windows service using `net` utility instead of `erlsrv`
The `erlsrv` documentation suggests using the Windows system control
tools to manage services. `erlsrv` prints a confusing error when
something goes wrong (for example, when access is denied), so using
`net` is preferred.
Like we do in partition_SUITE.
References #1323.
Clean up orphaned exclusive queues
Bump DEFAULT_DISTRIBUTION_BUFFER_SIZE to 128 MB
After 24 hours of testing we haven't observed any anomalies or
regressions. With this change, multicast (mirroring) processes
should be suspended less frequently, resulting in less variable
throughput for mirroring (with link throughput of 1 Gbit/s or greater).
Closes #1306.
[#149220393]
rabbitmq/stable-gm-mem-usage-during-constant-redelivery
Stable GM memory usage during constant redelivery
In high-throughput scenarios, e.g. `basic.reject` or `basic.nack`,
messages which belong to a mirrored queue and are replicated within a GM
group are quickly promoted to the old heap. This means that garbage
collection happens only when the Erlang VM is under memory pressure,
which might be too late. When a process is under memory pressure, garbage
collection slows it down even further, to the point of RabbitMQ nodes
running out of memory and crashing. To avoid this scenario, we want the
GM process to garbage collect binaries regularly, i.e. every 250ms. The
variable queue does the same for a similar reason:
rabbitmq/rabbitmq-server#289
Initially, we wanted to use the number of messages as the trigger for
garbage collection, but we soon discovered that different workloads
(e.g. small vs. large messages) would result in unpredictable and
sub-optimal GC schedules.
Before setting `fullsweep_after` to 0, memory usage was 2x higher (400MB
vs 200MB) and throughput was 10% lower (18k vs 20k). With this
`spawn_opt` setting, the generational collection algorithm is disabled,
meaning that all live data is copied at every garbage collection:
http://erlang.org/doc/man/erlang.html#spawn_opt-3
The RabbitMQ deployment used for testing this change:
* AWS, c4.2xlarge, bosh-aws-xen-hvm-ubuntu-trusty-go_agent 3421.11
* 3 RabbitMQ nodes running OTP 20.0.1
* 3 durable & auto-delete queues with 3 replicas each
* each queue master was defined on a different RabbitMQ node
* every RabbitMQ node was running 1 queue master & 2 queue slaves
* 1 consumer per queue with QOS 100
* 100 durable messages @ 1KiB each
* `basic.reject` operations
```
| Node | Message throughput | Memory usage |
| ------ | -------------------- | -------------- |
| rmq0 | 12K - 20K msg/s | 400 - 900 MB |
| rmq1 | 12K - 20K msg/s | 500 - 1000 MB |
| rmq2 | 12K - 20K msg/s | 500 - 800 MB |
```
[#148892851]
Signed-off-by: Gerhard Lazu <gerhard@rabbitmq.com>
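A sketch of the `spawn_opt` flag described above; the worker body is illustrative, not the actual GM loop:

```erlang
%% Illustrative only: spawn a process with {fullsweep_after, 0}, which
%% disables generational GC for that process, so every collection is a
%% full sweep and old-heap binary references are released promptly.
-module(gm_spawn_sketch).
-export([start/0]).

start() ->
    spawn_opt(fun() -> receive stop -> ok end end,
              [{fullsweep_after, 0}]).
```

The per-process setting can be confirmed via `erlang:process_info(Pid, garbage_collection)`, which reports the effective `fullsweep_after` value.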
We don't want to use the backoff/hibernate feature because we have
observed that the GM process is suspended half of the time.
We really wanted to replace gen_server2 with gen_server, but it was more
important to keep changes in 3.6 to a minimum. GM will eventually be
replaced, so switching it from gen_server2 to gen_server would soon be
redundant. We simply do not understand some of the gen_server2
trade-offs well enough to feel strongly about this change.
[#148892851]
Signed-off-by: Gerhard Lazu <gerhard@rabbitmq.com>
Do not show queues with deleted vhost.
When the vhost does not exist for a queue that resides on a stopped
node, the queue will not be displayed in the 'list_queues' command
output. When the node comes back up, queues without a vhost will be
deleted.
[Fixes #1300]
[#149292743]
Report a difference between RSS and erlang total memory only if RSS is bigger.
RSS can be either bigger or smaller than the memory reported by
`erlang:memory(total)`. Make sure we only add a positive difference to
the "other" fraction in the memory breakdown.
Related to 0872a15a050176ab7598bc3d1b21a2d5c5af3052
[addresses #1294]
[#149066379]
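A sketch of the clamping described above; the module and function names are illustrative, not the actual vm_memory_monitor code:

```erlang
%% Illustrative helper: only the part of RSS that exceeds what the
%% runtime itself reports is attributed to the "other" fraction; a
%% negative difference is clamped to zero.
-module(mem_breakdown_sketch).
-export([unaccounted/2]).

unaccounted(RssBytes, ErlangTotalBytes) ->
    max(0, RssBytes - ErlangTotalBytes).
```

In practice the arguments would come from the OS (e.g. `ps`-style RSS) and from `erlang:memory(total)`.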
This hopefully further improves commit
6f374fb48a5cc9972ce0bfcf747faa380e1d1e4d.
How the total amount of memory is computed is an important piece of
information: what the runtime reports and what tools such as 'ps'
report will differ.
http://erlang.org/pipermail/erlang-questions/2012-September/069337.html is
a good source of information on some of the discrepancies.
To make it more likely to pass in resource-constrained
environments.
This code runs on server nodes, so we can avoid
using ctl commands.
Add test for set_vm_memory_high_watermark
This is meant to add a missing integration test that wraps up #1285.
It's not a unit test and it can't run in parallel, but
set_disk_free_limit does the same wrong thing. We should either remove
them both or leave them as they are.
[#148470947]
As discussed with @michaelklishin:
We discovered that `erlang:memory(system)` can be almost as large
as OsTotal when swapping is in effect. This means that the total
(processes + system) will be larger than OsTotal, therefore
OsTotal - ErlangTotal cannot be assumed to be non-negative. I think
having some "unaccounted" memory is better than having it
"accounted" as negative.
re #1223
[finishes #148435813]
Only preserve stats for local queues
list_hashes commands
There were three single quotes, so one was extra.
This change handles all non-crash termination cases. The assumption here
is that once an amqqueue_process terminates the master is no longer on
the current node.
[#147753285]
Previously, GC would not clear out metrics for non-local queues,
so stale data remained in the metrics tables.
Simplifies code formatting by using intermediate variables in test suite setup.