| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we are (wrongly) marked down, we need to go into the waiting-for-healthy
state and verify that our network interfaces are working before trying to
rejoin the cluster.
- make _is_healthy() check require positive proof of pings working
- do heartbeat checks and updates in this state
- reset the random peers every heartbeat_interval, in case we keep picking
bad ones
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
|
|
|
|
| |
is_unhealthy() will assume they are healthy for some period after we
send our first ping attempt. is_healthy() is now a strict check that we
know they are healthy.
Switch the failure report check to use is_unhealthy(); use is_healthy()
everywhere else, including the waiting-for-healthy pre-boot checks.
Signed-off-by: Sage Weil <sage@inktank.com>
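
The difference between the two checks in the commit above can be illustrated with a small model. This is a minimal sketch in Python, not the OSD's actual C++ code; the `PeerState` class, its field names, and the `grace` parameter are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerState:
    """Hypothetical per-peer heartbeat record (not Ceph's actual struct)."""
    first_tx: Optional[float] = None  # time we sent our first ping, if any
    last_rx: Optional[float] = None   # time of the most recent reply, if any

    def is_healthy(self, now: float, grace: float) -> bool:
        # Strict check: require positive proof, i.e. a reply within the
        # grace window. No reply yet means "not known to be healthy".
        return self.last_rx is not None and now - self.last_rx <= grace

    def is_unhealthy(self, now: float, grace: float) -> bool:
        # Lenient check: a peer we have not pinged yet, or pinged only
        # recently, gets the benefit of the doubt.
        if self.last_rx is not None:
            return now - self.last_rx > grace
        if self.first_tx is None:
            return False
        return now - self.first_tx > grace
```

Under this model a freshly pinged peer is neither healthy nor unhealthy, which is why a failure report would use `is_unhealthy()` while a pre-boot check would use `is_healthy()`.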
|
|
|
|
|
|
| |
If a (say, random) peer goes down, filter it out.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
| |
We will soon be in this method for the waiting-for-healthy state. As
a consequence, we need to remove any down peers.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
|
| |
- always include our neighbors to ensure we have a fully-connected
graph
- include some random neighbors to get at least some min number of peers.
Signed-off-by: Sage Weil <sage@inktank.com>
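
The selection rule in the commit above might look roughly like the following. This is a sketch under stated assumptions: "neighbors" is taken to mean the next and previous up OSD by id (wrapping around), and `choose_heartbeat_peers`, `up_osds`, and `min_peers` are hypothetical names, not Ceph's API.

```python
import random
from typing import List, Optional, Set


def choose_heartbeat_peers(
    whoami: int,
    up_osds: List[int],
    min_peers: int,
    rng: Optional[random.Random] = None,
) -> Set[int]:
    """Pick both ring neighbors, then add random peers up to min_peers."""
    rng = rng or random.Random()
    ordered = sorted(set(up_osds) | {whoami})
    n = len(ordered)
    peers: Set[int] = set()

    # Always include the adjacent OSDs so the peer graph stays connected.
    if n > 1:
        idx = ordered.index(whoami)
        peers.add(ordered[(idx + 1) % n])
        peers.add(ordered[(idx - 1) % n])

    # Top up with random peers until we reach the minimum (or run out).
    candidates = [o for o in ordered if o != whoami and o not in peers]
    rng.shuffle(candidates)
    while len(peers) < min_peers and candidates:
        peers.add(candidates.pop())
    return peers
```

For example, `choose_heartbeat_peers(3, list(range(8)), 4)` always contains OSDs 2 and 4 plus two randomly chosen others.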
|
|
|
|
|
|
| |
For now we still only look at the internal heartbeats.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
| |
sub_want() returns true if this is a new sub; only renew then.
Signed-off-by: Sage Weil <sage@inktank.com>
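
The pattern in the commit above, renewing only when `sub_want()` reports a new subscription, can be sketched as follows. The `MonClientSketch` class and `want_osdmap` helper are hypothetical stand-ins; treating a subscription as "new" when its (name, start) pair is not already registered is an assumption made for this example.

```python
class MonClientSketch:
    """Hypothetical stand-in for a monitor client; not Ceph's MonClient."""

    def __init__(self):
        self.subs = {}
        self.renew_count = 0

    def sub_want(self, what: str, start: int) -> bool:
        # Return True only if this (what, start) pair is not yet registered.
        if self.subs.get(what) == start:
            return False
        self.subs[what] = start
        return True

    def renew_subs(self) -> None:
        # In a real client this would send a message to the monitor.
        self.renew_count += 1


def want_osdmap(monc: MonClientSketch, epoch: int) -> None:
    # Only renew when sub_want() says the subscription is new.
    if monc.sub_want("osdmap", epoch):
        monc.renew_subs()
```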
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
|
| |
This has a slight behavior change in that we ask the mon for the latest
osdmap if our internal heartbeat is failing. That isn't useful yet, but
will be shortly.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
| |
Grr.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Otherwise, the links might be ordered after the in progress
operation tag write. We need the in progress operation tag to
correctly recover from an interrupted merge, split, or col_split.
Fixes: #5180
Backport: cuttlefish, bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
| |\
| | |
| | |
| | |
| | | |
Fixes: #5159
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes: #5152
When iterating through usage entries, and when user id was
provided, we started at the user's first entry and not from
the entry indexed by the request start time.
This commit fixes the issue.
Backport: bobtail
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
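
The fix described above amounts to seeking to the entry keyed by the request start time rather than to the user's first entry. A minimal sketch, assuming a sorted index keyed by (user id, timestamp); the key layout and the `usage_entries` function are hypothetical and do not reflect the actual rgw usage log format.

```python
from bisect import bisect_left
from typing import List, Tuple

Entry = Tuple[str, int]  # (user id, epoch timestamp); hypothetical key layout


def usage_entries(index: List[Entry], user: str, start: int, end: int) -> List[Entry]:
    """Return the user's entries with start <= timestamp < end.

    `index` must be sorted. Seeking to (user, start) skips older entries
    instead of scanning from the user's first entry.
    """
    pos = bisect_left(index, (user, start))
    out: List[Entry] = []
    while pos < len(index):
        entry_user, ts = index[pos]
        if entry_user != user or ts >= end:
            break
        out.append(index[pos])
        pos += 1
    return out
```

Seeking to `(user, 0)` instead would reproduce the described bug: entries older than the requested start time would be visited first.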
|
| | |
| | |
| | |
| | | |
Signed-off-by: Sage Weil <sage@inktank.com>
|
| |\ \
| | | |
| | | | |
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We don't need it after all. If we are in the middle of some proposal,
then we guarantee that said proposal is likely to be retried. If we
haven't yet proposed, then it's even more likely that a client will
eventually retry the message that triggered this proposal.
Basically, this mechanism attempted to fix a non-problem, and was in
fact triggering some unforeseen issues that would have required increasing
the code complexity for no good reason.
Fixes: #5102
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
By finishing these Contexts, we make sure the Contexts they enclose (to be
called once the proposal goes through) will behave as they were initially
planned: for instance, a C_Command() may retry the command if a -EAGAIN
is passed to 'finish_contexts', while a C_Trimmed() will simply set
'going_to_trim' to false.
This aims to fix at least one bug in which Paxos will stop trimming if an
election is triggered while a trim is queued but not yet finished. This
happens because it is the C_Trimmed() context that is responsible for
resetting 'going_to_trim' back to false. By clearing all the contexts on
the proposal list instead of finishing them, we stay forever unable to
trim Paxos again as 'going_to_trim' will stay true till the end of time as
we know it.
Fixes: #4895
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
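
The bug described above can be reproduced in a toy model. This is a sketch only: `PaxosSketch` and its methods are hypothetical, and the callbacks stand in for the C_Trimmed() context named in the commit message.

```python
import errno
from typing import Callable, List


class PaxosSketch:
    """Toy model of a proposal list with completion callbacks."""

    def __init__(self):
        self.going_to_trim = False
        self.proposals: List[Callable[[int], None]] = []

    def queue_trim(self) -> None:
        self.going_to_trim = True
        # The callback is the only place the flag is reset.
        self.proposals.append(self._trimmed)

    def _trimmed(self, result: int) -> None:
        self.going_to_trim = False

    def restart_by_clearing(self) -> None:
        # Old behavior: callbacks are dropped, so the flag stays set.
        self.proposals.clear()

    def restart_by_finishing(self) -> None:
        # New behavior: every callback runs with -EAGAIN before removal.
        pending, self.proposals = self.proposals, []
        for callback in pending:
            callback(-errno.EAGAIN)
```

After `queue_trim()`, calling `restart_by_clearing()` leaves `going_to_trim` stuck at true, whereas `restart_by_finishing()` resets it.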
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | | |
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fix bug introduced in 27381c0c6259ac89f5f9c592b4bfb585937a1cfc.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fix a few bugs introduced by 27381c0c6259ac89f5f9c592b4bfb585937a1cfc:
- check against both front and back cons; either one may have failed.
- close *both* front and back before reopening either. This is
overkill, but slightly simpler code.
- fix leak of con when marking down
- handle race against osdmap update and note_down_osd
Fixes: #5172
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ \ \ \ \
| | | | | |
| | | | | | |
Fix some smaller Python issues
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Cast output of _check_output() to str() to be able to use
str.split().
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | | |
kv_flat_btree_async.cc: fix AioCompletion resource leak
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Call AioCompletion::release() if the completion is no longer needed.
CID 727978 (#1-2 of 2): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "obj_aioc" going out of scope leaks the
storage it points to.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | | |
kv_flat_btree_async.cc: fix AioCompletion resource leak
|
| |/ / / / / /
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Call AioCompletion::release() if the completion is no longer
needed.
CID 727980 (#1-4 of 4): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "aioc" going out of scope leaks
the storage it points to.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
|\ \ \ \ \ \ \
| |_|_|/ / / /
|/| | | | | | |
kv_flat_btree_async.cc: fix AioCompletion resource leak
|
| |/ / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Call AioCompletion::release() if the completion is no longer needed.
CID 727979 (#1-2 of 2): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "a" going out of scope leaks the storage
it points to.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The front hb addr entry may not be present.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ \ \ \ \ \
| |/ / / / /
|/| | | | | |
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If ceph-mon segfaults, the socket file isn't removed.
By adding a remove in post-stop, upstart cleans the run directory properly.
Signed-off-by: Guilhem Lettron <guilhem@lettron.fr>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Now that the default pool flags have changed.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | | |
kv_flat_btree_async.cc: fix AioCompletion resource leak
|
| | |/ / / /
| |/| | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Call AioCompletion::release() if the completion is no longer
needed to free the resources.
CID 727981 (#3 of 3): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "top_aioc" going out of scope leaks the
storage it points to.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
|\ \ \ \ \ \
| |_|/ / / /
|/| | | | | |
kv_flat_btree_async.cc: fix resource leak
|
| |/ / / /
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Call AioCompletion::release() if the completion is no longer
needed to free the resources.
CID 727983 : Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "aioc" going out of scope leaks the
storage it points to.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
|/ / / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Fixes: #5160
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
|\ \ \ \
| | | | |
| | | | | |
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Send ping requests to both the front and back hb addrs for peer osds. If
the front hb addr is not present, do not send it and interpret a reply
as coming from both. This handles the transition from old to new OSDs
seamlessly.
Note both the front and back rx times. Both need to be up to date in order
for the peer to be healthy.
Signed-off-by: Sage Weil <sage@inktank.com>
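
The reply handling described above can be modeled as follows. This is a minimal sketch, not the OSD implementation; `HeartbeatPeer`, its fields, and the `grace` parameter are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatPeer:
    """Hypothetical record of a peer's front and back heartbeat state."""
    has_front: bool                       # False for old peers without a front hb addr
    last_rx_front: Optional[float] = None
    last_rx_back: Optional[float] = None

    def handle_reply(self, now: float, from_front: bool) -> None:
        if not self.has_front:
            # Old peer: a single reply is treated as coming from both.
            self.last_rx_front = now
            self.last_rx_back = now
        elif from_front:
            self.last_rx_front = now
        else:
            self.last_rx_back = now

    def is_healthy(self, now: float, grace: float) -> bool:
        # Both rx times must be present and recent.
        times = (self.last_rx_front, self.last_rx_back)
        return all(t is not None and now - t <= grace for t in times)
```

A new-style peer that answers only on the back interface is therefore not considered healthy, while an old-style peer is judged on its single reply stream.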
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This allows us to get the messenger associated with a connection.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We used to only need to avoid 2 ports; now we need 3. Make it a set so we
don't have this problem later.
Signed-off-by: Sage Weil <sage@inktank.com>
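
Passing the ports to avoid as a set means the number of exclusions no longer matters to the caller. A minimal sketch; `pick_port` is a hypothetical helper and the port numbers in the usage note are example values only.

```python
from typing import Iterable, Optional, Set


def pick_port(candidates: Iterable[int], avoid: Set[int]) -> Optional[int]:
    """Return the first candidate port not in `avoid`, or None if none is free."""
    for port in candidates:
        if port not in avoid:
            return port
    return None
```

For instance, `pick_port(range(6800, 6810), {6800, 6801, 6802})` skips the three ports already in use and returns the next candidate; adding a fourth exclusion later requires no signature change.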
|