summaryrefslogtreecommitdiff
path: root/cpp/src/qpid/cluster/Cluster.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Race condition in cluster+management, inconsistent errors like:Alan Conway2010-07-231-23/+44
| | | | | | | | | | | | "confirmed < (2097+0) but only sent < (2096+0)" Management messages are generated if a managed objects properties have changed since the last update. Properties of the cluster object (members and status) were sometimes being changed outside the delivery context which could create inconsistencies in the cluster. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@966933 13f79535-47bb-0310-9956-ffa450edef68
* Fix bug in cluster with authentication: nodes exit with "unauthorized-access"Alan Conway2010-07-201-1/+1
| | | | | | | | | | Adding a node to a cluster on which authentication is enabled and on which there are existing connections authenticated with mechanisms other than anonymous, may result in nodes exiting the cluster with inconsistent authorisation errors. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@965979 13f79535-47bb-0310-9956-ffa450edef68
* Defer delivery of messages in cluster-unsafe context.Alan Conway2010-07-051-2/+33
| | | | | | | | | | | | | | | Messages enqueued in a cluster-safe context are synchronized across the cluster. However some messages are delivered in a cluster-unsafe context, for example raising a link established event occurs the connection thread of the establishing connection. This fix deferrs such messages by multicasting them so they can be re-delived in a cluster safe context. See https://bugzilla.redhat.com/show_bug.cgi?id=611543 git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@960681 13f79535-47bb-0310-9956-ffa450edef68
* Fix cluster broker crashes when management is active.Alan Conway2010-06-221-1/+12
| | | | | | | | | | | | Cluser brokers were exiting with errors "modified cluster state outside cluster context" and "confirmed < (50+0) but only sent < (49+0)" Fix was to: - delay completion of incoming update till update connection closes. - delay addding new connections to managment until connection is announced. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@956882 13f79535-47bb-0310-9956-ffa450edef68
* Bug 603835 - cluster_tests.test_management failing.Alan Conway2010-06-161-2/+1
| | | | | | | | | | | Clean up connections causing extra connection objects in the mangement agent map. - update connection was not being closed. - connections belonging to members that left the cluster were not fully cleaned up Also fixed test errors making failover_soak fail sporadically. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@955370 13f79535-47bb-0310-9956-ffa450edef68
* This change is a partial fix for the problem of long cluster failovers. ↵Michael Goulish2010-06-151-1/+3
| | | | | | | | | | | This change addresses part of that problem: long newbie-broker catchup times. At this time, it looks as though there are two distinct mechanisms causing these long catchup times. They roughly divide into a population of times less than 20 seconds, and a population of much longer times (up to hundreds of seconds). This fix addresses only the shorter times. In a test of 25 failovers before and after the change, it reduced those times by an average of nearly 50%. A T-test on the times indicates that the likelihood of the observed change happening randomly is less than 2%. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@954895 13f79535-47bb-0310-9956-ffa450edef68
* Fix "mismatched cluster-id" errors during start up.Alan Conway2010-05-251-20/+27
| | | | | | | | | Intermittent failure when starting a persistent cluster with all clean stores. Some brokers fail with: critical Unexpected error: Cluster-ID mismatch. Stores belong to different clusters. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@948143 13f79535-47bb-0310-9956-ffa450edef68
* Fix broker core dump during start-up caused by un-initialized mAgent pointer.Alan Conway2010-05-211-0/+1
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@947081 13f79535-47bb-0310-9956-ffa450edef68
* Delay generating URL in cluster till global constructors to handle ↵Alan Conway2010-05-121-10/+13
| | | | | | multi-protocol URLs. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@943488 13f79535-47bb-0310-9956-ffa450edef68
* Correct brokertest.retry logic.Alan Conway2010-05-061-1/+1
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@941852 13f79535-47bb-0310-9956-ffa450edef68
* Code cleanup Ted Ross2010-04-231-1/+0
| | | | | | | | | | - Removed IdAllocator (it's no longer needed) - Cleaned up the calls to ManagementAgent::addObject to handle durable objects - Removed the deferred call to addObject for durable objects - Removed unneeded calls to self._checkClosed() in qmf.console.Agent git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@937516 13f79535-47bb-0310-9956-ffa450edef68
* QPID-2527: Remove Thread::id member as its uses are better implemented by ↵Andrew Stitcher2010-04-211-3/+3
| | | | | | | | comparison operators. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@936537 13f79535-47bb-0310-9956-ffa450edef68
* Update cluster store status with a single atomic write() operation.Alan Conway2010-04-011-7/+13
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@930055 13f79535-47bb-0310-9956-ffa450edef68
* Cluster: remove un-necessary code for config-seq.Alan Conway2010-03-311-5/+1
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@929616 13f79535-47bb-0310-9956-ffa450edef68
* Joining a cluster: don't push an empty store.Alan Conway2010-03-301-0/+1
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@929278 13f79535-47bb-0310-9956-ffa450edef68
* Fix hanging failover_soak test.Alan Conway2010-03-301-0/+2
| | | | | | | | | Explictly turn off PollableQueue bypass mode during initialize. Implicitly turning it off in PollableQueue::start() led to incorrectly turning off bypass in a PRE_INIT client when an offer was processed. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@929276 13f79535-47bb-0310-9956-ffa450edef68
* Cluster logging improvements: log config changes in the deliver thread.Alan Conway2010-03-301-36/+41
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@929274 13f79535-47bb-0310-9956-ffa450edef68
* Raise ClusterTimer lateness threshold to reduce noisy warnings.Alan Conway2010-03-291-1/+3
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@928779 13f79535-47bb-0310-9956-ffa450edef68
* New cluster member pushes store when joining an active cluster.Alan Conway2010-03-121-24/+67
| | | | | | | | | | | | | | | | | | | Previously a broker with a clean store would not be able to join an active cluster because the shtudown-id did not match. This commit ensures that when a broker joins an active cluster, it always pushes its store regardless of status. Clean/dirty status is only compared when forming an initial cluster. This change required splitting initialization into two phases: PRE_INIT: occurs in the Cluster ctor during early-initialize. This phase determines whether or not to push the store. INIT: occurs after Cluster::initialize and does the remaining initialization chores. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@922412 13f79535-47bb-0310-9956-ffa450edef68
* QPID-2436: Fix cluster update of remote agents.Alan Conway2010-03-081-2/+6
| | | | | | | | | The v2key of cluster agents was not being passed as part of a cluster update. This meant they were not being associated with the correct shadow connections on the updatee. This caused inconsistencies that shut down the new broker. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@920414 13f79535-47bb-0310-9956-ffa450edef68
* Minor cleanup: removed unused parameter of initial-status in cluster.xml.Alan Conway2010-03-051-5/+3
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@919662 13f79535-47bb-0310-9956-ffa450edef68
* Don't generate debug snapshot messages unless debug logging enabled.Alan Conway2010-03-051-7/+6
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@919523 13f79535-47bb-0310-9956-ffa450edef68
* QPID-2412: Support for EXTERNAL mechanism on client-authenticated SSL ↵Gordon Sim2010-03-051-2/+7
| | | | | | | | | | | | connections. On SSL connection where the clients certificate is authenticated (requires the --ssl-require-client-authentication option at present), the clients identity will be taken from that certificate (it will be the CN with any DCs present appended as the domain, e.g. CN=bob,DC=acme,DC=com would result in an identity of bob@acme.com). This will enable the EXTERNAL mechanism when cyrus sasl is in use. The client can still negotiate their desired mechanism. There is a new option on the ssl module (--ssl-sasl-no-dict) that allows the options on ssl connections to be restricted to those that are not vulnerable to dictionary attacks (EXTERNAL being the primary example). git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@919487 13f79535-47bb-0310-9956-ffa450edef68
* Amended cluster error message for decoding failures.Alan Conway2010-03-031-1/+1
| | | | | | | | New message says "Error decoding events, may indicate a broker version mismatch" since version mismatch is the most likely causes of a decoding error. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@918679 13f79535-47bb-0310-9956-ffa450edef68
* Fix cluster abort on shutdown in ClusterTimer::fire.Alan Conway2010-02-261-1/+1
| | | | | | | | The cluster destructor was not deleting the ClusterTimer set on the broker, so this timer had a dangling pointer to the cluster. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@916751 13f79535-47bb-0310-9956-ffa450edef68
* Last member of a cluster always has clean store.Alan Conway2010-02-251-0/+6
| | | | | | | | | | When a cluster is reduced to a single broker, it marks its store as clean regardless of how it is shut down. If we're down to a single member we know we want to use its store to recover as there are no others. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@916475 13f79535-47bb-0310-9956-ffa450edef68
* Consistent connection names across a cluster.Alan Conway2010-02-051-3/+1
| | | | | | | | | - use the same host:port for connections and their shadows. - add shadow property to managment connection to identify shadows. - updated qpid-stat and qpid-cluster to filter on shadow property. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@907123 13f79535-47bb-0310-9956-ffa450edef68
* Synchronize management agent lists during cluster update.Alan Conway2010-02-051-2/+10
| | | | | | | | | - replicate management agent lists during cluster update. - suppress management agent output during update. - on join all members force full output at next periodic processing. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@907030 13f79535-47bb-0310-9956-ffa450edef68
* Fix cluster bug introduced in r905674, incorrect frame-sequence count.Alan Conway2010-02-021-1/+1
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@905794 13f79535-47bb-0310-9956-ffa450edef68
* Cluster: fix update of failover exchange.Alan Conway2010-02-021-3/+3
| | | | | | | During update the cluster was sending an extra update to the failover exchange. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@905676 13f79535-47bb-0310-9956-ffa450edef68
* Cluster: debug snapshots of queue depth at broker join, help find ↵Alan Conway2010-02-021-4/+29
| | | | | | inconsistencies. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@905674 13f79535-47bb-0310-9956-ffa450edef68
* Replace PeriodicTimer with ClusterTimer, which inherits from Timer.Alan Conway2010-01-291-20/+28
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@904656 13f79535-47bb-0310-9956-ffa450edef68
* QPID_2634 Management updates in timer create inconsistencies in a cluster.Alan Conway2010-01-271-8/+9
| | | | | | | | | Cluster plugin provides a PeriodicTimer implementation to the broker which executes tasks in the cluster dispatch thread simultaneously across the cluster. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@903869 13f79535-47bb-0310-9956-ffa450edef68
* Cluster implementation of PeriodicTimer.Alan Conway2010-01-271-1/+18
| | | | | | | | | The cluster implementation multicast periodic-timer controls and executes the task when those controls are delivered, which is in the cluster delivery thread context and so consistent across the cluster. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@903867 13f79535-47bb-0310-9956-ffa450edef68
* In clustered broker: move construction of broker::Connections to the cluster ↵Alan Conway2010-01-271-16/+6
| | | | | | | | | | dispatch thread. Constructing a connection can involve sending management information so needs to be in the cluster dispatch context. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@903864 13f79535-47bb-0310-9956-ffa450edef68
* Fix cluster elder calculation to ensure unique elder.Alan Conway2010-01-271-5/+9
| | | | | | | | Race condition in the previous algorithm allowed several cluster members to consider themselves the elder. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@903826 13f79535-47bb-0310-9956-ffa450edef68
* Cluster-safe assertions.Alan Conway2010-01-201-0/+6
| | | | | | | | | Assert that replicated data structures are modified in a cluster-safe context - in cluster delivery thread or during update. Assertions added to Queue.cpp and SemanticState.cpp. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@901282 13f79535-47bb-0310-9956-ffa450edef68
* Fix broker crash "confirmed N but only sent M" with managed agents running.Alan Conway2010-01-111-1/+1
| | | | | | | | | | | | | | The broker's ManagementAgent caches schemas from managed agents. This cache was not being replicated to new cluster members. If an agent such as sesame was running and connected to a newly-joined broker, that broker could send schema request messages which were not sent by other brokers that had the schema in cache. This resulted in the other brokers exiting with a "confirmed N but only sent M" message. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@897955 13f79535-47bb-0310-9956-ffa450edef68
* Added config-seq counter to track config changes since cluster init.Alan Conway2010-01-061-22/+19
| | | | | | | | Config-seq is recorded persitently to help identify best store when recovering from total failure. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@896538 13f79535-47bb-0310-9956-ffa450edef68
* Exception handling for URL parsing in cluster.Alan Conway2010-01-061-5/+9
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@896537 13f79535-47bb-0310-9956-ffa450edef68
* Don't hold up broker initialization for cluster initialization.Alan Conway2010-01-061-32/+8
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@896536 13f79535-47bb-0310-9956-ffa450edef68
* QPID-2266: error sending update: Enqueue capacity threshold exceededAlan Conway2009-12-111-0/+2
| | | | | | | | | Fix for the problem with a test to verify that messages going to the store have the same headers and content-size for an updatee or a broker that receives the publish directly. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@889813 13f79535-47bb-0310-9956-ffa450edef68
* QPID-2253 - Cluster node shuts down with inconsistent error.Alan Conway2009-12-091-5/+4
| | | | | | | | | | Add a missing memberUpdate on the transition to CATCHUP mode. The inconsistent error was caused because the newly updated member did not have its membership updated and so was missing an failover update message that the existing members sent to a new client. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@888874 13f79535-47bb-0310-9956-ffa450edef68
* Cluster consistency: check for no clean store condition.Alan Conway2009-11-261-0/+1
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@884612 13f79535-47bb-0310-9956-ffa450edef68
* Consistency checks for persistent cluster startup.Alan Conway2009-11-251-21/+12
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@884226 13f79535-47bb-0310-9956-ffa450edef68
* Verify stored cluster-id matches agreed cluster-id when joining a persistent ↵Alan Conway2009-11-241-20/+28
| | | | | | cluster. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@883910 13f79535-47bb-0310-9956-ffa450edef68
* Support for restarting a persistent cluster.Alan Conway2009-11-241-35/+101
| | | | | | | | | | | Option --cluster-size=N: members wait for N members before recovering store. Stores marked as clean/dirty. Automatically recover from clean store on restart. Stores marked with UUID to detect errors. Not yet implemented: consistency checks, manual recovery from all dirty stores. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@883842 13f79535-47bb-0310-9956-ffa450edef68
* Added cluster option --cluster-size.Alan Conway2009-11-181-5/+4
| | | | | | | | --cluster-size=N means that during start-up the cluster waits to have N members before accepting any clients. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@881839 13f79535-47bb-0310-9956-ffa450edef68
* Integrated InitialStatusMap into cluster code.Alan Conway2009-11-171-47/+71
| | | | git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@881423 13f79535-47bb-0310-9956-ffa450edef68
* cluster::InitialStatusMap and unit tests, support for improved cluster join ↵Alan Conway2009-11-171-3/+14
| | | | | | protocol. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk/qpid@881420 13f79535-47bb-0310-9956-ffa450edef68