diff options
| author | Matthew Sackman <matthew@lshift.net> | 2010-01-21 15:07:06 +0000 |
|---|---|---|
| committer | Matthew Sackman <matthew@lshift.net> | 2010-01-21 15:07:06 +0000 |
| commit | 4361c141f301b766b5411bd400a4ee240db11138 (patch) | |
| tree | b1e0c65df66e089aecfb7b196229badf87aa674c /src | |
| parent | 686708b6ee829b5c2e9d2863eddbeab05d0ac8b1 (diff) | |
| download | rabbitmq-server-git-4361c141f301b766b5411bd400a4ee240db11138.tar.gz | |
Wrote new documentation for VQ
Diffstat (limited to 'src')
| -rw-r--r-- | src/rabbit_variable_queue.erl | 122 |
1 files changed, 92 insertions, 30 deletions
diff --git a/src/rabbit_variable_queue.erl b/src/rabbit_variable_queue.erl index 42a01577b0..c11489d1b8 100644 --- a/src/rabbit_variable_queue.erl +++ b/src/rabbit_variable_queue.erl @@ -39,6 +39,98 @@ flush_journal/1, status/1]). %%---------------------------------------------------------------------------- +%% Definitions: + +%% alpha: this is a message where both the message itself, and its +%% position within the queue are held in RAM +%% +%% beta: this is a message where the message itself is only held on +%% disk, but its position within the queue is held in RAM. +%% +%% gamma: this is a message where the message itself is only held on +%% disk, but its position is both in RAM and on disk. +%% +%% delta: this is a collection of messages, represented by a single +%% term, where the messages and their position are only held on +%% disk. +%% +%% Note that for persistent messages, the message and its position +%% within the queue are always held on disk, *in addition* to being in +%% one of the above classifications. + +%% Also note that within this code, the term gamma never +%% appears. Instead, gammas are defined by betas who have had their +%% queue position recorded on disk. + +%% In general, messages move q1 -> q2 -> delta -> q3 -> q4, though +%% many of these steps are frequently skipped. q1 and q4 only hold +%% alphas, q2 and q3 hold both betas and gammas (as queues of queues, +%% using the bpqueue module where the block prefix determines whether +%% they're betas or gammas). When a message arrives, its +%% classification is determined. It is then added to the rightmost +%% appropriate queue. + +%% If a new message is determined to be a beta or gamma, q1 is +%% empty. If a new message is determined to be a delta, q1 and q2 are +%% empty (and actually q4 too). + +%% When removing messages from a queue, if q4 is empty then q3 is read +%% directly. If q3 becomes empty then the next segment's worth of +%% messages from delta are read into q3, reducing the size of +%% delta. If the queue is non empty, either q4 or q3 contain +%% entries. It is never permitted for delta to hold all the messages +%% in the queue. + +%% The duration indicated to us by the memory_monitor is used to +%% calculate, given our current ingress and egress rates, how many +%% messages we should hold in RAM. When we need to push alphas to +%% betas or betas to gammas, we favour writing on messages that are +%% further from the head of the queue. This minimises writes to disk, +%% as the messages closer to the tail of the queue stay in the queue +%% for longer, thus do not need to be replaced as quickly by sending +%% other messages to disk. + +%% Whilst messages are pushed to disk and forgotten from RAM as soon +%% as requested by a new setting of the queue RAM duration, the +%% inverse is not true: we only load messages back into RAM as +%% demanded as the queue is read from. Thus only publishes to the +%% queue will take up available spare capacity. + +%% If a queue is full of transient messages, then the transition from +%% betas to deltas will be potentially very expensive as millions of +%% entries must be written to disk by the queue_index module. This can +%% badly stall the queue. In order to avoid this, the proportion of +%% gammas / (betas+gammas) must not be lower than (betas+gammas) / +%% (alphas+betas+gammas). Thus as the queue grows, and the proportion +%% of alphas shrink, the proportion of gammas will grow, thus at the +%% point at which betas and gammas must be converted to deltas, there +%% should be very few betas remaining, thus the transition is fast (no +%% work needs to be done for the gamma -> delta transition). + +%% The conversion of betas to gammas is done on publish, in batches of +%% exactly ?RAM_INDEX_BATCH_SIZE. This value should not be too small, +%% otherwise the frequent operations on the queues of q2 and q3 will +%% not be effectively amortised, nor should it be too big, otherwise a +%% publish will take too long as it attempts to do too much work and +%% thus stalls the queue. Therefore, it must be just right. This +%% approach is preferable to doing work on a new queue-duration +%% because converting all the indicated betas to gammas at that point +%% can be far too expensive, thus requiring batching and segmented +%% work anyway, and furthermore, if we're not getting any publishes +%% anyway then the queue is either being drained or has no +%% consumers. In the latter case, an expensive beta to delta +%% transition doesn't matter, and in the former case the queue's +%% shrinking length makes it unlikely (though not impossible) that the +%% duration will become 0. + +%% In the queue we only keep track of messages that are pending +%% delivery. This is fine for queue purging, but can be expensive for +%% queue deletion: for queue deletion we must scan all the way through +%% all remaining segments in the queue index (we start by doing a +%% purge) and delete messages from the msg_store that we find in the +%% queue index. + +%%---------------------------------------------------------------------------- -record(vqstate, { q1, @@ -89,36 +181,6 @@ %%---------------------------------------------------------------------------- -%% WRONG - UPDATE ME! - -%% Basic premise is that msgs move from q1 -> q2 -> delta -> q3 -> q4 -%% but they can only do so in the right form. q1 and q4 only hold -%% alphas (msgs in ram), q2 and q3 only hold betas (msg on disk, index -%% in ram), and delta is just a count of the number of index entries -%% on disk at that stage (msg on disk, index on disk). -%% -%% When a msg arrives, we decide in which form it should be. It is -%% then added to the right-most appropriate queue, maintaining -%% order. Thus if the msg is to be an alpha, it will be added to q1, -%% unless all of q2, delta and q3 are empty, in which case it will go -%% to q4. If it is to be a beta, it will be added to q2 unless delta -%% is empty, in which case it will go to q3. -%% -%% The major invariant is that if the msg is to be a beta, q1 will be -%% empty, and if it is to be a delta then both q1 and q2 will be empty. -%% -%% When taking msgs out of the queue, if q4 is empty then we read -%% directly from q3, or delta, if q3 is empty. If q3 and delta are -%% empty then we have an invariant that q2 must be empty because q2 -%% can only grow if delta is non empty. -%% -%% A further invariant is that if the queue is non empty, either q4 or -%% q3 contains at least one entry. I.e. we never allow delta to -%% contain all msgs in the queue. Also, if q4 is non empty and delta -%% is non empty then q3 must be non empty. - -%%---------------------------------------------------------------------------- - -ifdef(use_specs). -type(bpqueue() :: any()). |
