Commit message | Author | Age | Files | Lines
* Initial work to permit low priority background tasks to be catered for. (Matthew Sackman, 2009-07-09; 5 files, -78/+236)
|
* merging in from 21087. In testing, observed oscillation… (Matthew Sackman, 2009-07-09; 2 files, -16/+17)
|\    …with lots of queues, give 2 queues the same length such that they can't both fit in memory, and slowly trickle in messages. As each one gets a message, it'll force the other one out to disk (the other one will be in either the hibernating or the low-rate group). This is bad. Therefore, adjusted the conditions under which we bring a queue back in from disk to exclude queues that are either hibernating or low rate (don't forget, even a list_queues will wake up a queue and cause it to report memory). If you have two fast queues, then neither of them will be in the low-rate or hibernating groups, so neither will be a candidate for eviction, and the problem doesn't exist there; instead, if they need more memory and can't fit in RAM, they'll evict themselves to disk rather than anyone else. Also realised that a million queues isn't unreasonable, so the minimum number of tokens in the system should be more like 1e7, if not higher.
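A rough sketch of the adjusted condition (module, function and data shapes invented here, not the real rabbit code): only queues that are neither hibernating nor low rate are candidates for being brought back in from disk.

```erlang
-module(mode_manager_sketch).
-export([pin_candidates/3]).

%% Hypothetical helper: Queues is [{QPid, TokensNeeded}], Hibernating
%% and LowRate are sets of pids. Hibernating/low-rate queues are never
%% candidates to be pulled back into RAM, which avoids the oscillation
%% where two queues keep evicting each other.
pin_candidates(Queues, Hibernating, LowRate) ->
    [Q || {QPid, _Tokens} = Q <- Queues,
          not sets:is_element(QPid, Hibernating),
          not sets:is_element(QPid, LowRate)].
```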
| * *idiot*. (Matthew Sackman, 2009-07-09; 1 file, -12/+11)
| |
* | length is never used in disk_queue, so removed (Matthew Sackman, 2009-07-09; 1 file, -21/+12)
| |
* | minor documentation fix. (Matthew Sackman, 2009-07-09; 1 file, -2/+1)
| |
* | Fixes from removing the non-contiguous sequences support from the disk queue… (Matthew Sackman, 2009-07-09; 1 file, -31/+20)
| |   …that I failed to spot last night, but which apparently came to me during my dreams. I have no idea how the tests managed to pass last night...
* | The mixed queue contains in its queue knowledge of whether the next message… (Matthew Sackman, 2009-07-08; 4 files, -190/+83)
| |   …is on disk or not. It does not use any sequence numbers, nor does it try to correlate queue position with sequence numbers in the disk_queue. Therefore, there is absolutely no reason for the disk_queue to have all the complexity associated with being able to cope with non-contiguous sequence ids. Thus it has all been removed. This has made the disk_queue a good bit simpler, and slightly faster in a few cases too. All tests pass.
* | added requeue_next_n to disk_queue and made use of it in… (Matthew Sackman, 2009-07-08; 3 files, -28/+62)
| |   …mixed_queue:to_disk_only_mode. This function moves the next N messages from the front of the queue to the back, and is MUCH more efficient than calling phantom_deliver and then requeue_with_seqs. This means that a queue which has been sent to disk, converted back to mixed mode, had some minor work done, and then been sent back to disk takes almost no time in transitions beyond the first one. The test of this:
| |   1) declare a durable queue
| |   2) send 100,000 persistent messages to it
| |   3) send 100,000 non-persistent messages to it
| |   4) send 100,000 persistent messages to it
| |   5) now pin it to disk - it'll make two calls to requeue_next_n and should be rather quick, as only the middle 100,000 messages actually have to be written; the other 200,000 don't even get sent between the disk_queue and mixed_queue in either direction. A total of 100,003 calls are necessary for this transition: 2 requeue_next_n, 100,000 tx_publish, 1 tx_commit
| |   6) now unpin it from disk and list the queues to wake it up. The transition to mixed mode is one call, zero reads, and instantaneous
| |   7) now repin it to disk. The mixed queue knows everything is still on disk, so it makes one call to requeue_next_n with N = 300,000. The disk_queue sees this is the whole queue, so it doesn't need to do any work at all and is instant.
| |   All tests pass.
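The core of requeue_next_n can be pictured as a single rotation on an Erlang queue. This is only a sketch, assuming the disk_queue's index is held as a `queue` of sequence-id entries; no message bodies are read or written.

```erlang
-module(requeue_sketch).
-export([requeue_next_n/2]).

%% Move the next N entries from the front of the queue to the back in
%% one pass, mirroring the idea that requeue_next_n avoids a
%% phantom_deliver + requeue_with_seqs round trip per message.
requeue_next_n(N, Q) ->
    {Front, Back} = queue:split(N, Q),
    queue:join(Back, Front).
```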
* | found a bug in the memory reports in combination with hibernation: if… (Matthew Sackman, 2009-07-08; 3 files, -52/+78)
| |   …a process comes out of hibernation, then does under 10 seconds' work before hibernating again, it'll only issue a memory report when it goes back into hibernation, and thus it'll always claim to the queue_mode_manager that it's hibernating. So now, when hibernating, or when receiving the report_memory message, we set state such that when the next normal message comes in, we always send a memory report after that message. This ensures that when a process wakes up and does some real work, the queue_mode_manager will be informed. Applied this, and the ability to hibernate, to the disk_queue too. Plus some minor refactoring and better state field names. All tests pass, and the disk_queue really does hibernate with the binary backoff, as I wanted.
* | Sorted out rabbitmqctl so that it sends pinning commands to the… (Matthew Sackman, 2009-07-07; 3 files, -33/+100)
| |   …queue_mode_manager rather than talking directly to the queues. This means the queues and the queue_mode_manager can't disagree on the mode a queue should be in.
* | lots of tuning and testing. Totally rewrote the to_disk_only_mode in… (Matthew Sackman, 2009-07-07; 6 files, -57/+103)
| |   …mixed_queue so that it does batching. This means it won't just flood the disk_queue with a billion messages, thus exhausting memory; instead it batches and uses tx_commit to demarcate the batches. The conversion therefore happens as quickly as possible without exhausting memory. Dropped the memory alarms to 0.8. This is a good idea because converting queues between modes transiently takes a fair chunk of memory, and leaving the alarms up at 0.95 was proving too high, making the mode transitions exhaust RAM and swap to buggery. However, there is still a problem when going to disk-only mode in the mixed queue where messages in the queue are already on disk: a million calls to phantom_deliver is not a good idea, and locks a CPU core at 100% for a very long time.
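The batching idea above can be sketched as follows. `TxPublishFun` and `TxCommitFun` are stand-ins for the real disk_queue calls (which are not shown here); the point is simply that each batch of at most `BatchSize` publishes is demarcated by one commit, so the disk_queue mailbox stays bounded.

```erlang
-module(batch_sketch).
-export([to_disk_in_batches/4]).

%% Publish Msgs to the (hypothetical) disk queue in fixed-size batches,
%% issuing one tx_commit per batch rather than flooding the mailbox
%% with every message at once.
to_disk_in_batches(Msgs, BatchSize, TxPublishFun, TxCommitFun) ->
    lists:foreach(
      fun (Batch) ->
              lists:foreach(TxPublishFun, Batch),
              ok = TxCommitFun(Batch)
      end,
      chunks(Msgs, BatchSize)).

%% Split a list into sublists of at most N elements.
chunks([], _N) -> [];
chunks(L, N) when length(L) =< N -> [L];
chunks(L, N) -> {H, T} = lists:split(N, L), [H | chunks(T, N)].
```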
* | oh look, another merge in from 21087 (Matthew Sackman, 2009-07-06; 1 file, -7/+8)
|\ \
| |/
| * a) reverted the change to do_send which had come in when I updated… (Matthew Sackman, 2009-07-06; 1 file, -7/+8)
| |   …gen_server2 to R13B1 - this was a change originally made by Matthias to ensure that messages cast to remote nodes are sent in order. b) Added guards and slightly relaxed name/1 so that it works in R11B5. All tests pass in R11B5, and manual testing of the binary backoff hibernation shows that it works too.
* | (another) merge from 21087 (not 21097 as mentioned in previous commit) (Matthew Sackman, 2009-07-06; 1 file, -28/+49)
|\ \
| |/
| * All cosmetic (line length) (Matthew Sackman, 2009-07-06; 1 file, -28/+49)
| |
* | merge from bug21097 (Matthew Sackman, 2009-07-06; 3 files, -110/+173)
|\ \
| |/
| * Pushed the binary exponential timeout / hibernate system into gen_server2.… (Matthew Sackman, 2009-07-06; 2 files, -92/+117)
| |   …Adjusted amqqueue_process to use it. Added documentation. Tested thoroughly with an explicit test module (not added) and the full test suite, all of which passed. Existing tests further up in this bug similarly pass and demonstrate the code is functioning correctly.
| * updating gen_server2 with latest from R13B01 in order to ensure this doesn't… (Matthew Sackman, 2009-07-06; 3 files, -28/+71)
| |   …slip badly behind the shipped version.
* | Added documentation (Matthew Sackman, 2009-07-06; 1 file, -0/+66)
| |
* | testing shows these values work well. The whole thing works pretty well.… (Matthew Sackman, 2009-07-06; 3 files, -4/+4)
| |   …Obviously, converting a mixed queue to disk does take some time, and the values are deliberately set low to save memory, because on this transition the disk_queue mailbox will go insane and eat lots of memory very quickly. But it seems about the right balance. I'll add documentation next.
* | Reworked. Because the disk -> mixed transition doesn't eat up any RAM, there… (Matthew Sackman, 2009-07-03; 8 files, -212/+435)
| |   …is no need for the emergency tokens, nor for the weird doubling, so it's all got much simpler. We hold two queues: one of hibernating queues (ordered by when they hibernated) and a priority_queue of low-rate queues (ordered by the amount of memory allocated to them). We evict to disk from the hibernated queues and then the low-rate queues, in their respective orders. Seems to work. Oh, and disk_queue is now managed by the tokens too.
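A minimal sketch of the eviction order described above, with invented data shapes: hibernating queues as an Erlang `queue` in hibernation order, low-rate queues as a list pre-sorted by allocated memory (the real code uses its own priority_queue module).

```erlang
-module(evict_sketch).
-export([next_victims/3]).

%% Hibernating: queue of {QPid, Tokens}, oldest hibernator first.
%% LowRate: [{Tokens, QPid}] sorted by allocated memory.
%% Returns pids to evict to disk, hibernated first, then low-rate,
%% until Needed tokens have been reclaimed.
next_victims(Needed, _Hibernating, _LowRate) when Needed =< 0 ->
    [];
next_victims(Needed, Hibernating, LowRate) ->
    case queue:out(Hibernating) of
        {{value, {QPid, Tokens}}, Hib1} ->
            [QPid | next_victims(Needed - Tokens, Hib1, LowRate)];
        {empty, _} ->
            case LowRate of
                [] -> [];
                [{Tokens, QPid} | Rest] ->
                    [QPid | next_victims(Needed - Tokens, Hibernating, Rest)]
            end
    end.
```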
* | report memory: (Matthew Sackman, 2009-07-03; 1 file, -5/+5)
| |   a) when we're not hibernating, every 10 seconds
| |   b) immediately prior to hibernating
| |   c) as soon as we stop hibernating
* | wip, dnc. (Matthew Sackman, 2009-07-02; 2 files, -27/+89)
| |
* | cosmetic (Matthew Sackman, 2009-07-02; 1 file, -2/+2)
| |
* | well if we're going to not actually pull messages off disk when going to… (Matthew Sackman, 2009-07-02; 3 files, -84/+29)
| |   …mixed mode, we may as well do it really lazily and not bother with any communication with the disk_queue. We just keep a token in the queue which indicates how many messages we are expecting to get from the disk queue. This makes disk -> mixed almost instantaneous. It also means that initial performance is not brilliant; maybe we need some way for the queue to know that both it and the disk_queue are idle, and to decide to prefetch. Even batching could work well. It's an endless trade-off between making operations happen quickly and getting good performance. Dunno what the third thing is - probably not necessary, as you can't even have both of those, let alone pick 2 from 3!
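The token idea might look roughly like this (illustrative names only; the real mixed_queue state is much richer). The disk -> mixed transition just records a count, and fetches drain that count before touching the in-RAM queue; where this sketch returns the tag `from_disk`, the real queue would call into the disk_queue.

```erlang
-module(lazy_mixed_sketch).
-export([to_mixed_mode/1, fetch/1]).

%% State is {OnDisk, RamQ}: OnDisk counts messages still owed to us by
%% the disk_queue; RamQ holds messages published since the transition.
to_mixed_mode(MsgsStillOnDisk) ->
    {MsgsStillOnDisk, queue:new()}.        %% no reads: instantaneous

fetch({OnDisk, RamQ}) when OnDisk > 0 ->
    {from_disk, {OnDisk - 1, RamQ}};       %% would fetch via disk_queue
fetch({0, RamQ}) ->
    case queue:out(RamQ) of
        {{value, Msg}, RamQ1} -> {Msg, {0, RamQ1}};
        {empty, _}            -> {empty, {0, RamQ}}
    end.
```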
* | Sorted out the timer versus hibernate binary backoff. The trick is to use… (Matthew Sackman, 2009-07-02; 1 file, -10/+17)
| |   …apply_after, not apply_interval, and then, after reporting memory use, not to set a new timer going (but to set a new timer going on every other message, other than timeouts). This means that if nothing is going on, then after a memory report the process can wait as long as it needs to before the hibernate timeout fires.
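The apply_after trick, sketched with invented names (the real code lives in gen_server2 and amqqueue_process): a one-shot `timer:apply_after/4` is armed only by ordinary messages, and the `report_memory` timeout deliberately leaves the timer unarmed, so an idle process is free to reach its hibernate timeout.

```erlang
-module(report_timer_sketch).
-export([init/0, handle_msg/2]).

init() -> #{timer => undefined}.

%% On the report timeout: send the report (elided), do NOT rearm.
handle_msg(report_memory, State) ->
    State#{timer := undefined};
%% On any other message with no timer armed: arm a one-shot timer.
handle_msg(_Msg, State = #{timer := undefined}) ->
    {ok, TRef} = timer:apply_after(10000, erlang, send,
                                   [self(), report_memory]),
    State#{timer := TRef};
%% Timer already armed: nothing to do.
handle_msg(_Msg, State) ->
    State.
```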
* | merge in from 21087. Behaviour is now broken because the timeout can get >… (Matthew Sackman, 2009-07-02; 9 files, -60/+136)
|\ \    …10 seconds, which means the memory_report timer will always fire and reset the timeout - thus the queue process will never hibernate.
| |/
| * Done. In order to keep the code simple, the detection of naptime is done in… (Matthew Sackman, 2009-07-02; 1 file, -6/+45)
| |   …the reply and noreply functions. This means that the now() value there includes computation relating to the last message in. This is maybe not desirable, but the alternative is to wrap all of handle_cast, handle_call and handle_info. Nevertheless, testing shows this works. In the Erlang client:
| |     Conn = amqp_connection:start("guest", "guest", "localhost"),
| |     Chan = lib_amqp:start_channel(Conn),
| |     [begin Q = list_to_binary(integer_to_list(R)),
| |            Q = lib_amqp:declare_queue(Chan, Q)
| |      end || R <- lists:seq(1,1000)],
| |     Props = amqp_util:basic_properties().
| |     [begin Q = list_to_binary(integer_to_list(R)),
| |            ok = lib_amqp:publish(Chan, <<"">>, Q, <<0:(8*1024)>>, Props)
| |      end || _ <- lists:seq(1,1500), R <- lists:seq(1,1000)].
| |   Then, after that lot's gone in, in a shell do:
| |     watch -n 2 "time ./scripts/rabbitmqctl list_queues | tail"
| |   The times for me start off at about 2.3 seconds, then drop rapidly to 1.4 and then 0.2 seconds, and stay there.
| * merge bug21060 into default (Matthias Radestock, 2009-07-01; 2 files, -13/+54)
| |\
| | * better exception tag (Matthias Radestock, 2009-07-01; 1 file, -1/+1)
| | |
| | * fix another off-by-one error (Matthias Radestock, 2009-07-01; 1 file, -1/+1)
| | |
| | * fold (Matthias Radestock, 2009-07-01; 1 file, -9/+6)
| | |
| | * Fix off-by-one error (discovered by Matthias) (Tony Garnock-Jones, 2009-07-01; 1 file, -1/+1)
| | |
| | * cosmetic (Matthias Radestock, 2009-07-01; 1 file, -23/+19)
| | |
| | * Convenience rabbit_basic functions. (Tony Garnock-Jones, 2009-06-22; 2 files, -11/+59)
| |/
| * Updated portfile for 1.6.0 release (Tim Clark, 2009-06-18; 1 file, -4/+4)
| |
| * tabs -> spaces (Matthias Radestock, 2009-06-18; 6 files, -25/+25)
| |
* | When converting to disk mode, use tx_publish and tx_commit instead of… (Matthew Sackman, 2009-07-01; 1 file, -9/+16)
| |   …publish. This massively reduces the number of sync calls to disk_queue - potentially to one, if every message in the queue is non-persistent (or the queue is non-durable).
* | Well after all that pain, simply doing the disk queue tests first seems to… (Matthew Sackman, 2009-07-01; 1 file, -3/+1)
| |   …solve the problems. I don't quite buy this though, as all I was doing was stopping and starting the app, so I don't understand why it was affecting the clustering configuration or causing issues _much_ further down the test line. But still, it seems to be repeatedly passing for me atm.
* | merge, but it still doesn't work. Sometimes it blows up on clustering with… (Matthew Sackman, 2009-06-30; 1 file, -5/+14)
|\ \    …"All replicas on diskfull nodes are not active yet".
| | |
| * | Well, this seems to work. (Matthew Sackman, 2009-06-30; 1 file, -4/+13)
| | |
| * | and now clustering seems to work again... (Matthew Sackman, 2009-06-30; 1 file, -4/+4)
| | |
* | | changed disk -> mixed mode so that messages stay on disk and don't get read.… (Matthew Sackman, 2009-06-30; 3 files, -50/+71)
| | |   …This means the conversion is much faster than it was, which is a good thing, at the cost of slower initial delivery.
* | | just adding timing to the dump test (Matthew Sackman, 2009-06-30; 1 file, -7/+6)
| | |
* | | doh! (Matthew Sackman, 2009-06-30; 1 file, -1/+1)
| | |
* | | mmmm. It maybe sort of works. Needs work though (Matthew Sackman, 2009-06-29; 6 files, -85/+181)
| | |
* | | merging in from 20470 (Matthew Sackman, 2009-06-26; 1 file, -14/+24)
|\ \ \
| * | | Had been thinking about this optimisation for a while, but someone mentioned… (Matthew Sackman, 2009-06-26; 1 file, -16/+26)
| | | |   …it to me yesterday at the Erlang Factory conference. When you sync, you know that everything up to the current length of the file is sync'd. Given that we're always appending, we know that any message below the file length at the last sync is available. Thus when we're reading messages from the current write file, even if the file is dirty, we don't need to sync unless the message we're reading is beyond the length of the file at the last sync. This can be very effective: for example, if there are a few hundred messages in the queue and you're then reading and writing to the queue at the same rate, then rather than doing a sync for every read, we now only sync once per size of queue (altitude, or ramp size). Sure enough, my publish_one_in_one_out_receive(1000) (altitude of 1000, then 5000 @ one in, one out) drops from 6089 calls to fsync to 21, and from 15.4 seconds to 3.6. It's also possible to apply the same optimisation in tx_commit: not only do we now return immediately if the current file is not dirty, or if none of the messages in the txn are in the current file, but we can also return immediately if the current file is dirty and messages are in the current file but they're all below the last sync file size. Surprisingly little extra code was needed.
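The sync-avoidance test reduces to a small predicate. A sketch with invented shapes: the file state is `{IsDirty, LastSyncSize}`, and a read of `Len` bytes at `Offset` only forces an fsync when the file is dirty and the span extends past the length recorded at the last sync.

```erlang
-module(sync_sketch).
-export([needs_sync/3]).

%% Append-only file: data below the last-sync length is already durable
%% and readable, so a clean file never needs a sync, and a dirty file
%% only does when the read reaches beyond LastSyncSize.
needs_sync(_Offset, _Len, {false, _LastSyncSize}) ->
    false;
needs_sync(Offset, Len, {true, LastSyncSize}) ->
    Offset + Len > LastSyncSize.
```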
* | | | some more scaffolding for tokens (Matthew Sackman, 2009-06-24; 2 files, -40/+99)
| | | |
* | | | Changed reports so that we get bytes gained and lost since the last report. (Matthew Sackman, 2009-06-24; 5 files, -99/+136)
| | | |   Also, the sync version of publish is unnecessary, as we were only ever using it in one place, where we threw away the result. Thus even when publishing a message and marking it delivered in one go (as opposed to publish_delivered, which is quite different ;) ), we can make it a cast, not a call, as we don't need the acktag. Also, the memory accounting was wrong for requeue in mixed_queue, because requeue doesn't actually change the memory sizes (memory goes up on (tx_)publish and down on ack/tx_cancel; requeue has no effect, and nor does deliver).
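The gained/lost accounting can be sketched as follows (names invented, sizes in bytes): publishes add to the bytes gained, acks add to the bytes lost, requeue and deliver leave both totals alone, and each report hands back the deltas and resets them.

```erlang
-module(mem_report_sketch).
-export([new/0, on_publish/2, on_ack/2, on_requeue/2, report/1]).

new() -> {0, 0}.                          %% {BytesGained, BytesLost}

on_publish(Size, {G, L}) -> {G + Size, L}.  %% (tx_)publish gains
on_ack(Size, {G, L})     -> {G, L + Size}.  %% ack/tx_cancel loses
on_requeue(_Size, Mem)   -> Mem.            %% requeue: no effect

%% Return the deltas since the last report and reset the counters.
report({G, L}) -> {{G, L}, new()}.
```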