author     Kim van der Riet <kpvdr@apache.org>  2013-02-28 16:14:30 +0000
committer  Kim van der Riet <kpvdr@apache.org>  2013-02-28 16:14:30 +0000
commit     9c73ef7a5ac10acd6a50d5d52bd721fc2faa5919 (patch)
tree       2a890e1df09e5b896a9b4168a7b22648f559a1f2 /cpp/design_docs
parent     172d9b2a16cfb817bbe632d050acba7e31401cd2 (diff)
Update from trunk r1375509 through r1450773
git-svn-id: https://svn.apache.org/repos/asf/qpid/branches/asyncstore@1451244 13f79535-47bb-0310-9956-ffa450edef68
Diffstat (limited to 'cpp/design_docs')
-rw-r--r--  cpp/design_docs/broker-acl-work.txt | 152
-rw-r--r--  cpp/design_docs/new-ha-design.txt   | 119
2 files changed, 178 insertions, 93 deletions
diff --git a/cpp/design_docs/broker-acl-work.txt b/cpp/design_docs/broker-acl-work.txt
new file mode 100644
index 0000000000..e89e446a56
--- /dev/null
+++ b/cpp/design_docs/broker-acl-work.txt
@@ -0,0 +1,152 @@
+-*-org-*-
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+The broker is accumulating ACL features and additions. This document
+describes the features and some of the strategies and decisions made
+along the way.
+
+These changes are not coordinated with the Java Broker.
+
+Queue Limit Property Settings
+=============================
+
+Customer Goal: Prevent users from making queues too small or too big
+in memory and on disk.
+
+* Add property limit settings to CREATE QUEUE Acl rules.
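As a sketch of the intended rule syntax (the user name and the numeric bounds here are hypothetical; the limit property names are those defined by this feature), a CREATE QUEUE rule carrying limit properties might read:

```
# Hypothetical Acl file fragment: bob may create queues, but only
# within these size/count bounds; everything else is denied and logged.
acl allow bob@QPID create queue queuemaxsizelowerlimit=65536 queuemaxsizeupperlimit=1048576
acl allow bob@QPID create queue queuemaxcountlowerlimit=10 queuemaxcountupperlimit=5000
acl deny-log all all
```

If the requested queue falls outside the bounds, the allow rule is demoted to a deny as described in this document.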
+
+  User Option      Acl Limit Property       Units
+  ---------------  -----------------------  ---------------------
+  qpid.max_size    queuemaxsizelowerlimit   bytes
+                   queuemaxsizeupperlimit   bytes
+  qpid.max_count   queuemaxcountlowerlimit  messages
+                   queuemaxcountupperlimit  messages
+  qpid.file_size   filemaxsizelowerlimit    pages (64KB per page)
+                   filemaxsizeupperlimit    pages (64KB per page)
+  qpid.file_count  filemaxcountlowerlimit   files
+                   filemaxcountupperlimit   files
+
+* Change rule match behavior to accommodate limit settings
+
+** Normal properties, upon seeing a mismatch, cause the Acl rule
+processor to go on to the next rule. Property limit settings do not
+cause a rule mismatch.
+
+** When a property limit check is violated the effect is to demote an
+allow rule into a deny rule. Property limit checks are ignored in
+deny rules.
+
+Routingkey Wildcard Match
+=========================
+
+Customer Goal: Allow users to bind, unbind, access, and publish with
+wildcards in the routingkey property. A single trailing * wildcard
+match is insufficient.
+
+* Acl rule processing uses the broker's topic exchange match logic
+when matching any exchange rule with a "routingkey" property.
+
+* Acl rule writers get to use the same rich matching logic that the
+broker uses when, for instance, it decides which topic exchange
+binding keys satisfy an incoming message's routing key.
+
+User Name and Domain Name Symbol Substitution
+=============================================
+
+Customer Goal: Create rules that allow users to access resources only
+when the user's name is embedded in the resource name.
+
+* The Acl rule processor defines keywords which are substituted with
+the user's user and domain name.
+
+* User name substitution is allowed in the Acl file anywhere that
+text is supplied for a property value.
+
+In the following table an authenticated user bob@QPID.COM has his
+substitution keywords expanded.
+
+  | Keyword       | Expansion    |
+  |---------------+--------------|
+  | ${userdomain} | bob_QPID_COM |
+  | ${user}       | bob          |
+  | ${domain}     | QPID_COM     |
+
+* User names are normalized by changing asterisk '*' and period '.'
+to underscores. This allows substitution to work with routingkey
+specifications.
+
+* The Acl processor matches ${userdomain} before matching either
+${user} or ${domain}. Rules that specify ${user}_${domain} will
+never match.
+
+Resource Quotas
+===============
+
+The Acl module provides broker command line switches to limit users'
+access to queues and connections.
+
+  | Command Line Option           | Specified Quota          | Default |
+  |-------------------------------+--------------------------+---------|
+  | --max-connections-per-user N  | connections by user name | 0       |
+  | --max-connections-per-IP N    | connections by host name | 0       |
+  | --max-queues-per-user N       | queues by user name      | 0       |
+
+* Allowed values for N are 0..65535.
+
+* An option value of zero (0) disables that limit check.
+
+* Connections per-user are counted using the authenticated user name.
+The user may be logged in from any location but resource counts are
+aggregated under the user's name.
+
+* Connections per-IP are identified by the
+<broker-ip><broker-port>-<client-ip><client-port> tuple. This is the
+same string used by broker management to index connections.
+
+** With this scheme hosts may be identified by several names such as
+localhost, 127.0.0.1, or ::1. A separate counted set of connections
+is allowed for each name.
+
+** Connections per-IP are counted regardless of the credentials
+provided with each connection. A user may be allowed 20 connections
+but if the per-IP limit is 5 then that user may connect from any
+single host only five times.
+
+Acl Management Interface
+========================
+
+* Acl Lookup Query Methods
+
+The Acl module provides two QMF management methods that allow users
+to query the Acl authorization interface.
+
+  Method: Lookup
+    Argument     Type         Direction  Unit  Description
+    ========================================================
+    userId       long-string  I
+    action       long-string  I
+    object       long-string  I
+    objectName   long-string  I
+    propertyMap  field-table  I
+    result       long-string  O
+
+  Method: LookupPublish
+    Argument      Type         Direction  Unit  Description
+    =========================================================
+    userId        long-string  I
+    exchangeName  long-string  I
+    routingKey    long-string  I
+    result        long-string  O
+
+The Lookup method is a general query for any action, object, and set
+of properties.
+The LookupPublish method is the optimized, per-message fastpath query.
+
+In both methods the result is one of: allow, deny, allow-log, or
+deny-log.
+
+Example:
+
+The upstream Jira https://issues.apache.org/jira/browse/QPID-3918 has
+several attachment files that demonstrate how to use the query feature.
+
+  acl-test-01.rules.acl  is the Acl file to run in the qpidd broker.
+  acl-test-01.py         is the test script that queries the Acl.
+  acl-test-01.log        is what the console prints when the test script runs.
+
+The script performs 355 queries using the Acl Lookup query methods.
+
+* Management Properties and Statistics
+
+The following properties and statistics have been added to reflect
+command line settings in effect and Acl quota denial activity.
+
+  Element                Type    Access    Unit  Notes  Description
+  ====================================================================================
+  maxConnections         uint16  ReadOnly               Maximum allowed connections
+  maxConnectionsPerIp    uint16  ReadOnly               Maximum allowed connections per IP
+  maxConnectionsPerUser  uint16  ReadOnly               Maximum allowed connections per user
+  maxQueuesPerUser       uint16  ReadOnly               Maximum allowed queues per user
+  connectionDenyCount    uint64                         Number of connections denied
+  queueQuotaDenyCount    uint64                         Number of queue creations denied
diff --git a/cpp/design_docs/new-ha-design.txt b/cpp/design_docs/new-ha-design.txt
index acca1720b4..df6c7242eb 100644
--- a/cpp/design_docs/new-ha-design.txt
+++ b/cpp/design_docs/new-ha-design.txt
@@ -84,12 +84,6 @@ retry on a single address to fail over.
 Alternatively we will also support configuring a fixed list of broker
 addresses when qpid is run outside of a resource manager.
 
-Aside: Cold-standby is also possible using rgmanager with shared
-storage for the message store (e.g. GFS). If the broker fails, another
-broker is started on a different node and recovers from the
-store. This bears investigation but the store recovery times are
-likely too long for failover.
-
 ** Replicating configuration
 
 New queues and exchanges and their bindings also need to be replicated.
@@ -109,13 +103,9 @@ configuration.
 
 Explicit exchange/queue qpid.replicate argument:
 - none: the object is not replicated
 - configuration: queues, exchanges and bindings are replicated but messages are not.
-- messages: configuration and messages are replicated.
-
-TODO: provide configurable default for qpid.replicate
+- all: configuration and messages are replicated.
 
-[GRS: current prototype relies on queue sequence for message identity
-so selectively replicating certain messages on a given queue would be
-challenging. Selectively replicating certain queues however is trivial.]
+Set configurable default all/configuration/none
 
 ** Inconsistent errors
 
@@ -125,12 +115,13 @@ eliminates the need to stall the whole cluster till an error is resolved.
 We still have to handle inconsistent store errors when store and
 cluster are used together.
 
-We have 2 options (configurable) for handling inconsistent errors,
+We have 3 options (configurable) for handling inconsistent errors,
 on the backup that fails to store a message from primary we can:
 - Abort the backup broker allowing it to be re-started.
 - Raise a critical error on the backup broker but carry on with the message lost.
-We can configure the option to abort or carry on per-queue, we
-will also provide a broker-wide configurable default.
+- Reset and re-try replication for just the affected queue.
+
+We will provide some configurable options in this regard.
 
 ** New backups connecting to primary.
 
@@ -156,8 +147,8 @@ In backup mode, brokers reject normal client connections
 so clients will fail over to the primary. HA admin tools mark their
 connections so they are allowed to connect to backup brokers.
 
-Clients discover the primary by re-trying connection to the client URL
-until the successfully connect to the primary. In the case of a
+Clients discover the primary by re-trying connection to all addresses in the client URL
+until they successfully connect to the primary. In the case of a
 virtual IP they re-try the same address until it is relocated to the
 primary. In the case of a list of IPs the client tries each in turn.
 Clients do multiple retries over a configured period of time
@@ -168,12 +159,6 @@ is a separate broker URL for brokers since they often will connect
 over a different network. The broker URL has to be a list of real
 addresses rather than a virtual address.
 
-Brokers have the following states:
-- connecting: Backup broker trying to connect to primary - loops retrying broker URL.
-- catchup: Backup connected to primary, catching up on pre-existing
-  configuration & messages.
-- ready: Backup fully caught-up, ready to take over as primary.
-- primary: Acting as primary, serving clients.
-
 ** Interaction with rgmanager
 
 rgmanager interacts with qpid via 2 service scripts: backup &
@@ -190,8 +175,8 @@ the primary state. Backups discover the primary, connect and catch up.
 
 *** Failover
 
-primary broker or node fails. Backup brokers see disconnect and go
-back to connecting mode.
+primary broker or node fails. Backup brokers see the disconnect and
+start trying to re-connect to the new primary.
 
 rgmanager notices the failure and starts the primary service on a new
 node. This tells qpidd to go to primary mode. Backups re-connect and
 catch up.
@@ -225,71 +210,30 @@ to become a ready backup.
 
 ** Interaction with the store.
 
-Clean shutdown: entire cluster is shut down cleanly by an admin tool:
-- primary stops accepting client connections till shutdown is complete.
-- backups come fully up to speed with primary state.
-- all shut down marking stores as 'clean' with an identifying UUID.
-
-After clean shutdown the cluster can re-start automatically as all nodes
-have equivalent stores. Stores starting up with the wrong UUID will fail.
-
-Stored status: clean(UUID)/dirty, primary/backup, generation number.
-- All stores are marked dirty except after a clean shutdown.
-- Generation number: passed to backups and incremented by new primary.
-
-After total crash must manually identify the "best" store, provide admin tool.
-Best = highest generation number among stores in primary state.
-
-Recovering from total crash: all brokers will refuse to start as all stores are dirty.
-Check the stores manually to find the best one, then either:
-1. Copy stores:
-   - copy good store to all hosts
-   - restart qpidd on all hosts.
-2. Erase stores:
-   - Erase the store on all other hosts.
-   - Restart qpidd on the good store and wait for it to become primary.
-   - Restart qpidd on all other hosts.
-
-Broker startup with store:
-- Dirty: refuse to start
-- Clean:
-  - Start and load from store.
-  - When connecting as backup, check UUID matches primary, shut down if not.
-- Empty: start ok, no UUID check with primary.
+Needs more detail:
-* Current Limitations
+We want backup brokers to be able to use their stored messages on restart
+so they don't have to download everything from primary.
+This requires a HA sequence number to be stored with the message
+so the backup can identify which messages are in common with the primary.
-(In no particular order at present)
+This will work very similarly to the way live backups can use in-memory
+messages to reduce the download.
-For message replication:
+Need to determine which broker is chosen as initial primary based on currency of
+stores. Probably using stored generation numbers and status flags. Ideally
+automated with rgmanager, or some intervention might be required.
-LM1a - On failover, backups delete their queues and download the full queue state from the
-primary. There was code to use messages already on the backup for re-synchronisation, it
-was removed in early development (r1214490) to simplify the logic while getting basic
-replication working. It needs to be re-introduced.
+* Current Limitations
-LM1b - This re-synchronisation does not handle the case where a newly elected primary is *behind*
-one of the other backups. To address this I propose a new event for resetting the sequence
-that the new primary would send out on detecting that a replicating browser is ahead of
-it, requesting that the replica revert back to a particular sequence number. The replica
-on receiving this event would then discard (i.e. dequeue) all the messages ahead of that
-sequence number and reset the counter to correctly sequence any subsequently delivered
-messages.
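The per-message HA sequence number proposed above (stored with each message so that a restarting backup can work out which of its stored messages it still shares with the primary, keeping those and discarding the rest) can be modelled with a small sketch. This is an illustrative model only, not broker code; the function name and data shapes are invented for the example:

```python
def split_backup_messages(backup, primary_seqs):
    """Partition a backup's stored messages by whether the primary still has them.

    backup: dict mapping HA sequence number -> message body, as recovered
            from the backup's store on restart.
    primary_seqs: set of HA sequence numbers currently on the primary's queue.

    Messages in common can be kept locally (no re-download needed); the
    rest are stale and must be discarded before catch-up begins.
    """
    keep = {seq: msg for seq, msg in backup.items() if seq in primary_seqs}
    stale = {seq: msg for seq, msg in backup.items() if seq not in primary_seqs}
    return keep, stale

# A backup restarts holding sequences 1-3; the primary now holds 2-4,
# so only 2 and 3 are in common and 1 must be discarded.
keep, stale = split_backup_messages({1: "a", 2: "b", 3: "c"}, {2, 3, 4})
# keep == {2: "b", 3: "c"}, stale == {1: "a"}
```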
+(In no particular order at present)
-LM2 - There is a need to handle wrap-around of the message sequence to avoid
-confusing the resynchronisation where a replica has been disconnected
-for a long time, sufficient for the sequence numbering to wrap around.
+For message replication (missing items have been fixed)
 
 LM3 - Transactional changes to queue state are not replicated atomically.
 
-LM4 - Acknowledgements are confirmed to clients before the message has been
-dequeued from replicas or indeed from the local store if that is
-asynchronous.
-
-LM5 - During failover, messages (re)published to a queue before there are
-the requisite number of replication subscriptions established will be
-confirmed to the publisher before they are replicated, leaving them
-vulnerable to a loss of the new primary before they are replicated.
+LM4 - (No worse than store) Acknowledgements are confirmed to clients before the message
+has been dequeued from replicas or indeed from the local store if that is asynchronous.
 
 LM6 - persistence: In the event of a total cluster failure there are
 no tools to automatically identify the "latest" store.
 
@@ -323,21 +267,11 @@ case (b) can be addressed in a simple manner through tooling but case
 (c) would require changes to the broker to allow client to simply
 determine when the command has fully propagated.
 
-LC3 - Queues that are not in the query response received when a
-replica establishes a propagation subscription but exist locally are
-not deleted. I.e. Deletion of queues/exchanges while a replica is not
-connected will not be propagated. Solution is to delete any queues
-marked for propagation that exist locally but do not show up in the
-query response.
-
 LC4 - It is possible on failover that the new primary did not
 previously receive a given QMF event while a backup did (sort of an
 analogous situation to LM1 but without an easy way to detect or
 remedy it).
-LC5 - Need richer control over which queues/exchanges are propagated, and
-which are not.
-
 LC6 - The events and query responses are not fully synchronized.
 
 In particular it *is* possible to not receive a delete event but
@@ -356,12 +290,11 @@ LC6 - The events and query responses are not fully synchronized.
 LC7 Federated links from the primary will be lost in failover, they
 will not be re-connected on the new primary. Federation links to the
 primary can fail over.
 
-LC8 Only plain FIFO queues can be replicated. LVQs and ring queues are not yet supported.
-
 LC9 The "last man standing" feature of the old cluster is not available.
 
 * Benefits compared to previous cluster implementation.
 
+- Allows per queue/exchange control over what is replicated.
 - Does not depend on openais/corosync, does not require multicast.
 - Can be integrated with different resource managers: for example rgmanager, PaceMaker, Veritas.
 - Can be ported to/implemented in other environments: e.g. Java, Windows
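The client failover loop described in this design (clients cycle through every address in their broker URL until the primary accepts, since backup brokers reject normal client connections) can be sketched as follows. The function names and the fake connect routine are illustrative, not the real qpid client API:

```python
import time

def discover_primary(addresses, try_connect, retries=3, delay=0.0):
    """Cycle through broker addresses until one accepts a connection.

    try_connect(addr) returns a connection or raises ConnectionError
    (backup brokers reject normal clients, so the loop moves on).
    Illustrative sketch of the retry loop, not real client code.
    """
    for _ in range(retries):
        for addr in addresses:
            try:
                return addr, try_connect(addr)
            except ConnectionError:
                continue  # not the primary (or unreachable); try the next address
        time.sleep(delay)  # wait before the next full pass over the URL
    raise ConnectionError("no primary found in %r" % (addresses,))

# Example: only the second address is acting as primary.
def fake_connect(addr):
    if addr != "hostB:5672":
        raise ConnectionError(addr)
    return "connection-to-" + addr

addr, conn = discover_primary(["hostA:5672", "hostB:5672"], fake_connect)
# addr == "hostB:5672"
```

With a virtual IP the address list has a single entry, so the same loop degenerates to re-trying one address until it is relocated to the primary.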