| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
merge_log and friends all take care of dirtying the log
as necessary.
Fixes: #5238
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 5deece1d034749bf72b7bd04e4e9c5d97e5ad6ce)
|
| |
|
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ce67c58db7d3e259ef5a8222ef2ebb1febbf7362)
Fixes: #5255
|
|
|
|
|
|
|
|
| |
uint64_t is passed in, but int was extracted. This fails on 32-bit builds.
Fixes: #5220
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit 17029b270dee386e12e5f42c2494a5feffd49b08)
|
|
|
|
|
|
|
|
| |
We were double-incrementing p, both in the for statement and in the
body. While we are here, drop the unnecessary else's.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit eb6d5fcf994d2a25304827d7384eee58f40939af)
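
The double increment described above can be illustrated with a minimal sketch. It is written in Python rather than the original C++, and the names `visit_buggy` and `visit_fixed` are hypothetical; the second `p += 1` stands in for the increment that a C++ `for` statement would apply on top of the one in the body.

```python
def visit_buggy(items):
    """Advance the cursor twice per iteration, as the bug did."""
    seen = []
    p = 0
    while p < len(items):
        seen.append(items[p])
        p += 1  # increment performed in the loop body
        p += 1  # second increment, standing in for the for-statement's own
    return seen


def visit_fixed(items):
    """Advance the cursor exactly once per iteration."""
    seen = []
    p = 0
    while p < len(items):
        seen.append(items[p])
        p += 1
    return seen
```

With the extra increment, every other element is silently skipped.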
|
|
|
|
|
|
| |
This was part of commit 27381c0c6259ac89f5f9c592b4bfb585937a1cfc.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the scenario:
- leader wins, peons lose
- leader sees it is too far behind on paxos and bootstraps
- leader tries to sync with someone, waits for a quorum of the others
- peons sit around forever waiting
The problem is that they never time out because paxos never issues a lease,
which is the normal timeout that lets them detect a leader failure.
Avoid this by starting the lease timeout as soon as we lose the election.
The timeout callback just does a bootstrap and does not rely on any other
state.
I see one possible danger here: there may be some "normal" cases where the
leader takes a long time to issue its first lease that we currently
tolerate, but won't with this new check in place. I hope that raising
the lease interval/timeout or reducing the allowed paxos drift will make
that a non-issue. If it is problematic, we will need a separate explicit
"i am alive" from the leader while it is getting ready to issue the lease
to prevent a live-lock.
Backport: cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit f1ccb2d808453ad7ef619c2faa41a8f6e0077bd9)
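
A rough sketch of the idea in the message above, assuming a toy `Peon` class with an explicit clock argument (both hypothetical, and not the monitor's actual code): the timeout is armed as soon as the election is lost, each lease pushes it out, and the callback does nothing but bootstrap.

```python
class Peon:
    """Toy peon that bootstraps if no lease arrives within lease_timeout."""

    def __init__(self, lease_timeout):
        self.lease_timeout = lease_timeout
        self.deadline = None       # None means no timeout is armed
        self.bootstrapped = False

    def lose_election(self, now):
        # Arm the timeout immediately instead of waiting for the first lease.
        self.deadline = now + self.lease_timeout

    def handle_lease(self, now):
        # Each lease from the leader pushes the deadline out again.
        self.deadline = now + self.lease_timeout

    def tick(self, now):
        # The timeout callback only bootstraps; it relies on no other state.
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.bootstrapped = True
```

A peon that never receives a lease now bootstraps once `lease_timeout` has elapsed since the election, which is also why a leader that is slow to issue its first lease could trip this check.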
|
|
|
|
|
|
|
|
|
|
| |
If the client is not connected, discard the message. They will
reconnect and resend anyway, so there is no point in processing it
twice (now and later).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit fb3cd0c2a8f27a1c8d601a478fd896cc0b609011)
|
|
|
|
|
|
|
| |
This allows us to get the messenger associated with a connection.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 92a558bf0e5fee6d5250e1085427bff22fe4bbe4)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- trim more at a time (by an order of magnitude)
- rename fields to paxos_trim_{min,max}; only trim when there are min items
that are trimmable, and trim at most max items at a time.
- adjust the paxos_service_trim_{min,max} values up by a factor of 2.
Since we are compacting every time we trim, adjusting these up means less
frequent compactions and less overall work for the monitor.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 6b8e74f0646a7e0d31db24eb29f3663fafed4ecc)
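
The min/max policy above can be sketched as follows. This is an illustration under assumptions, not the monitor's code: `plan_trim` and its `keep` parameter (the number of recent versions to retain) are hypothetical, and versions are plain integers.

```python
def plan_trim(first_committed, last_committed, keep, trim_min, trim_max):
    """Return the (start, end) versions to trim, end exclusive, or None."""
    trimmable = (last_committed - keep) - first_committed
    if trimmable < trim_min:
        # Too few trimmable items to justify a trim (and its compaction).
        return None
    # Never trim more than trim_max items in one pass.
    end = first_committed + min(trimmable, trim_max)
    return (first_committed, end)
```

Raising `trim_min` makes trims rarer, and raising `trim_max` lets each one remove more, which is the trade-off the message describes.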
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit a284c9ece85f11d020d492120be66a9f4c997416)
|
|
|
|
|
|
|
| |
Need to pass in cct.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 446e0770c77de5d72858dcf7a95c5b19f642cf98)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Put it on the heap so that we can destroy it before the g_ceph_context
cct that it references. This fixes a crash like
*** Caught signal (Segmentation fault) **
in thread 4034a80
ceph version 0.63-204-gcf9aa7a (cf9aa7a0037e56eada8b3c1bb59d59d0bfe7bba5)
1: ceph-mon() [0x59932a]
2: (()+0xfcb0) [0x4e41cb0]
3: (Mutex::Lock(bool)+0x1b) [0x6235bb]
4: (PerfCountersCollection::remove(PerfCounters*)+0x27) [0x6a0877]
5: (LevelDBStore::~LevelDBStore()+0x1b) [0x582b2b]
6: (LevelDBStore::~LevelDBStore()+0x9) [0x582da9]
7: (main()+0x1386) [0x48db16]
8: (__libc_start_main()+0xed) [0x658076d]
9: ceph-mon() [0x4909ad]
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit df2d06db6f3f7e858bdadcc8cd2b0ade432df413)
|
|
|
|
|
|
|
|
| |
Switch to using regular pointers here. The lifecycle of these services is
very simple such that refcounting is overkill.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c888d1d3f1b77e62d1a8796992e918d12a009b9d)
|
|
|
|
|
|
|
|
| |
This lets us run all the locally-scoped dtors so that leak checking will
work.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3c5706163b72245768958155d767abf561e6d96d)
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 7802292e0a49be607d7ba139b44d5ea1f98e07e6)
|
|
|
|
|
|
|
|
|
|
| |
When we trim items N to M, compact over range (N-1) to M so that the
items in the queue will share bounds and get merged. There is no harm in
compacting over a larger range here when the lower bound is a key that
doesn't exist anyway.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit a47ca583980523ee0108774b466718b303bd3f46)
|
|
|
|
|
|
|
|
| |
If we get behind and multiple adjacent ranges end up in the queue, merge
them so that we fire off compaction on larger ranges.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit f628dd0e4a5ace079568773edfab29d9f764d4f0)
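
A minimal sketch of the queue merging described above, assuming integer keys and ranges that arrive in ascending order; `compact_bounds` and `queue_compaction` are hypothetical names. It also shows why compacting over (N-1) to M matters: consecutive trims then share a bound and collapse into one range.

```python
def compact_bounds(n, m):
    """Range to compact after trimming items n..m: start one key earlier."""
    return (n - 1, m)


def queue_compaction(queue, start, end):
    """Append (start, end), merging with the last queued range if they touch."""
    if queue and start <= queue[-1][1]:
        last_start, last_end = queue.pop()
        start, end = min(last_start, start), max(last_end, end)
    queue.append((start, end))
    return queue
```

Trimming 1..10 and then 11..20 queues (0, 10) and (10, 20), which merge into (0, 20); with the unwidened bounds (1, 10) and (11, 20) the two ranges would stay separate.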
|
|
|
|
|
|
|
|
|
|
|
| |
This will reduce the work that leveldb is asked to do by only triggering
compaction of the keys that were just trimmed.
We may want to further reduce the work by compacting less frequently, but
this is at least a step in that direction.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6da4b20ca53fc8161485c8a99a6b333e23ace30e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow a transaction to describe the compaction of a range of keys. Do this
in a backward compatible way, such that older code will interpret the
compaction of a prefix + range as compaction of the entire prefix. This
allows us to avoid introducing any new feature bits.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ab09f1e5c1305a64482ebbb5a6156a0bb12a63a4)
Conflicts:
src/mon/MonitorDBStore.h
|
|
|
|
|
| |
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit e20c9a3f79ccfeb816ed634ca25de29fc5975ea8)
|
|
|
|
|
|
|
|
|
|
|
| |
We generally do not want to block while compacting a range of leveldb.
Push the blocking+waiting off to a separate thread. (leveldb will do what
it can to avoid blocking internally; no reason for us to wait explicitly.)
This addresses part of #5176.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4af917d4478ec07734a69447420280880d775fa2)
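
The general pattern of pushing blocking work onto a separate thread can be sketched like this. The `AsyncCompactor` class and its `compact_fn` callback are assumptions for illustration; they do not reflect the actual LevelDBStore interface.

```python
import queue
import threading


class AsyncCompactor:
    """Runs compact_fn(start, end) on a background thread."""

    def __init__(self, compact_fn):
        self._compact_fn = compact_fn
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def compact_range_async(self, start, end):
        # Returns immediately; the caller never waits on the compaction.
        self._queue.put((start, end))

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel queued by stop()
                break
            self._compact_fn(*item)

    def stop(self):
        # Let already queued ranges finish, then join the worker.
        self._queue.put(None)
        self._thread.join()
```

Only `stop()` blocks, and only at shutdown; the thread that requests a compaction just enqueues the range and moves on.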
|
|
|
|
|
|
|
|
| |
Some plana have non-world-readable crap in /usr/local/samba. Avoid
/usr/local entirely for that and any similar landmines.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 82211f2197241c4f3d3135fd5d7f0aa776eaeeb6)
|
|
|
|
|
|
| |
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d7e2ab1451e284cd4273cca47eec75e1d323f113)
|
|
|
|
|
|
| |
Fixes: #5216
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't actually need to write out the pg map epoch on every
activate_map as long as:
a) the osd does not trim past the oldest pg map persisted
b) the pg does update the persisted map epoch from time
to time.
To that end, we now keep a reference to the last map persisted.
The OSD already does not trim past the oldest live OSDMapRef.
Second, handle_activate_map will trim if the difference between
the current map and the last_persisted_map is large enough.
Fixes: #4731
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 2c5a9f0e178843e7ed514708bab137def840ab89)
Conflicts:
src/common/config_opts.h
src/osd/PG.cc
- last_persisted_osdmap_ref gets set in the non-static
PG::write_info
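
A sketch of the two conditions above, under stated assumptions: the `PG` class, the `max_stale` threshold, and `oldest_epoch_to_keep` are hypothetical stand-ins, and epochs are plain integers.

```python
class PG:
    """Toy placement group that persists its map epoch only occasionally."""

    def __init__(self, max_stale):
        self.max_stale = max_stale       # hypothetical threshold, in epochs
        self.last_persisted_epoch = 0
        self.writes = 0

    def handle_activate_map(self, epoch):
        # Persist only when the stored epoch has fallen far enough behind.
        if epoch - self.last_persisted_epoch > self.max_stale:
            self.last_persisted_epoch = epoch
            self.writes += 1


def oldest_epoch_to_keep(pgs):
    """The OSD must not trim maps older than any PG's persisted epoch."""
    return min(pg.last_persisted_epoch for pg in pgs)
```

Over 30 consecutive epochs with `max_stale=10`, this PG writes twice instead of 30 times, at the cost of the OSD holding on to a few more old maps.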
|
|
|
|
|
| |
Signed-off-by: Alexandre Marangone <alexandre.marangone@inktank.com>
(cherry picked from commit 851619ab6645967e5d7659d9b0eea63d5c402b15)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New pools won't be full. mon->pgmon()->pg_map.pg_pool_sum[poolid] will
implicitly create an entry for poolid causing register_new_pgs() to assume that
the newly created pgs in the new pool are in fact a result of a split
preventing MOSDPGCreate messages from being sent out.
Fixes: #4813
Backport: cuttlefish
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 0289c445be0269157fa46bbf187c92639a13db46)
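
The implicit-creation pitfall above has a close Python analogue: a `defaultdict` inserts a key on lookup, much as `operator[]` does on a C++ map. The sketch below uses hypothetical helper names and a plain integer per pool in place of the real pool statistics.

```python
from collections import defaultdict


def is_full_buggy(pg_pool_sum, pool_id, limit):
    # Indexing a defaultdict inserts pool_id if it was missing.
    return pg_pool_sum[pool_id] >= limit


def is_full_fixed(pg_pool_sum, pool_id, limit):
    # Check membership first; a pool with no entry cannot be full.
    return pool_id in pg_pool_sum and pg_pool_sum[pool_id] >= limit


pools = defaultdict(int)
pools[1] = 42
is_full_fixed(pools, 7, 100)   # leaves pools untouched
is_full_buggy(pools, 8, 100)   # silently creates an entry for pool 8
```

After the buggy call, later code that asks whether pool 8 is already known gets the wrong answer, which mirrors how the new pool's pgs were mistaken for the result of a split.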
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: #5209
Backport: bobtail, cuttlefish
If the head object wrongfully contains data, but according to the
manifest we don't read from the head, we shouldn't copy the prefetched
data. Also fix the length calculation for that data.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit c5fc52ae0fc851444226abd54a202af227d7cf17)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: #5204
When copying an object we ended up also copying the original
object idtag, which overrode the newly generated one. When
refcount put is called with the wrong idtag the count
doesn't go down.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit b1312f94edc016e604f1d05ccfe2c788677f51d1)
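
A toy model of why the wrong idtag matters, assuming references are tracked as a set of tags; the `RefCount` class is hypothetical and much simpler than the real refcount object class.

```python
class RefCount:
    """Toy reference tracker keyed by tag."""

    def __init__(self):
        self.tags = set()

    def get(self, tag):
        self.tags.add(tag)

    def put(self, tag):
        # Dropping a tag that was never taken leaves the count unchanged.
        self.tags.discard(tag)
        return len(self.tags)
```

If the copy takes its reference under the newly generated tag but later releases it under the original object's tag, the `put` is a no-op and the reference is never dropped.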
|
|
|
|
|
|
|
|
|
|
| |
- do not use invoke-rc.d for upstart
- do not stop daemons on upgrade
- misc other cleanups
This corresponds to the state of master as of cf9aa7a.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a monitor is freshly created and for some reason its initial sync is
aborted, it will end up with an incorrect backup monmap. This monmap is
incorrect in the sense that it will not contain the monitor's names as
it will expect on the next run.
This results from us using the quorum features to encode the monmap
when backing it up, instead of CEPH_FEATURES_ALL.
Fixes: #5203
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 626de387e617db457d6d431c16327c275b0e8a34)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a list-snaps operation on the snapdir, do not assume that the obc for the
head means the object exists. This fixes a race between a head deletion and
a list-snaps that wrongly returns ENOENT, triggered by the DiffIterateStress
test when thrashing OSDs.
Fixes: #5183
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 29e4e7e316fe3f3028e6930bb5987cfe3a5e59ab)
|
|
|
|
|
|
|
|
|
|
|
| |
If we use operator[] on a new int field its value is undefined; avoid
reading it or using |= et al until we initialize it.
Fixes: #4967
Backport: cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(cherry picked from commit 50ac8917f175d1b107c18ecb025af1a7b103d634)
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, the links might be ordered after the in progress
operation tag write. We need the in progress operation tag to
correctly recover from an interrupted merge, split, or col_split.
Fixes: #5180
Backport: cuttlefish, bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5bca9c38ef5187c7a97916970a7fa73b342755ac)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't need it after all. If we are in the middle of some proposal,
then we guarantee that said proposal is likely to be retried. If we
haven't yet proposed, then it's forever more likely that a client will
eventually retry the message that triggered this proposal.
Basically, this mechanism attempted to fix a non-problem, and was in
fact triggering some unforeseen issues that would have required increasing
the code complexity for no good reason.
Fixes: #5102
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit e15d29094503f279d444eda246fc45c09f5535c9)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By finishing these Contexts, we make sure the Contexts they enclose (to be
called once the proposal goes through) will behave as they were initially
planned: for instance, a C_Command() may retry the command if a -EAGAIN
is passed to 'finish_contexts', while a C_Trimmed() will simply set
'going_to_trim' to false.
This aims at fixing at least one bug, in which Paxos will stop trimming if an
election is triggered while a trim is queued but not yet finished. This
happens because it is the C_Trimmed() context that is responsible for
resetting 'going_to_trim' back to false. By clearing all the contexts on
the proposal list instead of finishing them, we stay forever unable to
trim Paxos again as 'going_to_trim' will stay True till the end of time as
we know it.
Fixes: #4895
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 586e8c2075f721456fbd40f738dab8ccfa657aa8)
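
The difference between clearing and finishing the queued contexts can be sketched as below. The `Paxos` class here is a toy stand-in with hypothetical method names; callbacks are plain functions that receive a result code.

```python
import errno


class Paxos:
    """Toy model of a proposal queue with a trim-in-progress flag."""

    def __init__(self):
        self.going_to_trim = False
        self.proposals = []   # completion callbacks for queued proposals

    def queue_trim(self):
        self.going_to_trim = True
        self.proposals.append(self._trimmed)

    def _trimmed(self, result):
        # Plays the role of C_Trimmed(): the only place the flag is reset.
        self.going_to_trim = False

    def election_clear(self):
        # Old behaviour: callbacks are dropped, so the flag stays set forever.
        self.proposals.clear()

    def election_finish(self):
        # New behaviour: run every callback with -EAGAIN so each can clean up.
        pending, self.proposals = self.proposals, []
        for callback in pending:
            callback(-errno.EAGAIN)
```

After `election_clear()` the flag is stuck and no further trim can be queued; after `election_finish()` it is reset and trimming can resume.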
|
|
|
|
|
| |
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 2ff23fe784245f3b86bc98e0434b21a5318e0a7b)
|
|\
| |
| |
| |
| | |
Fixes: #5159
Reviewed-by: Sage Weil <sage@inktank.com>
|
| |
| |
| |
| | |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
| |
| |
| |
| | |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
| |
| |
| |
| | |
Signed-off-by: Samuel Just <sam.just@inktank.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fixes: #5152
When iterating through usage entries, and when user id was
provided, we started at the user's first entry and not from
the entry indexed by the request start time.
This commit fixes the issue.
Backport: bobtail
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
(cherry picked from commit 8b3a04dec8be13559716667d4b16cde9e9543feb)
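
A sketch of seeking to the entry indexed by the request start time, assuming usage entries are kept as a sorted list of `(user, epoch)` tuples; the function name and the data layout are hypothetical.

```python
import bisect


def user_entries(entries, user, start_epoch, end_epoch):
    """Return the user's entries with start_epoch <= epoch < end_epoch.

    entries must be a list of (user, epoch) tuples sorted ascending.
    """
    # Seek to (user, start_epoch), not to the user's first entry.
    i = bisect.bisect_left(entries, (user, start_epoch))
    out = []
    while (i < len(entries)
           and entries[i][0] == user
           and entries[i][1] < end_epoch):
        out.append(entries[i])
        i += 1
    return out
```

Starting the scan at the user's first entry would instead walk through every entry older than the requested window before reaching the relevant ones.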
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
- prepend $local to the $allconf list at the top
- remove $local special case for all case
- fix the type prefix checks to explicitly check for prefixes
Fugly bash, but works!
Backport: cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit c80c6a032c8112eab4f80a01ea18e1fa2c7aa6ed)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We need to do df on the remote host, not locally.
Similarly, the ceph command uses the osd key, which exists remotely; run it there.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d81d0ea5c442699570bd93a90bea0d97a288a1e9)
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We would need to do hostname -s on the remote node, not the local one.
But we already have $host; use it!
Reported-by: Xiaoxi Chen <xiaoxi.chen@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit caa15a34cb5d918c0c8b052cd012ec8a12fca150)
|
| |
| |
| |
| |
| |
| |
| | |
Put these in the cluster log; they are interesting.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 87767fb1fb9a52d11b11f0b641cebbd9998f089e)
|
| |
| |
| |
| |
| |
| |
| |
| | |
Fixes: #5020
Backport: bobtail, cuttlefish
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(cherry picked from commit 72bf5f4813c273210b5ced7f7793bc1bf813690c)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In committed_thru, we use write_pos to reset the header.start value in cases
where seq is past the end of our journalq. It is therefore important that the
journalq be updated atomically with write_pos (that is, under the write_lock).
The call to align_bl() is moved into do_write in order to ensure that write_pos
is adjusted correctly prior to write_bl().
Also, we adjust pos at the end of write_bl() such that pos \in [get_top(),
header.max_size) after write_bl().
Fixes: #5020
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit eaf3abf3f9a7b13b81736aa558c9084a8f07fdbe)
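
The invariant on pos stated above can be sketched as a small helper. `wrap_pos` is a hypothetical name, `top` stands in for the value of get_top(), and the sketch assumes pos is never below `top`.

```python
def wrap_pos(pos, top, max_size):
    """Fold pos back into [top, max_size) after a write.

    The region below top is reserved (for example, for the header),
    so a position that runs off the end wraps to top, not to zero.
    """
    if pos >= max_size:
        pos = top + (pos - top) % (max_size - top)
    return pos
```

For example, with `top=4096` and `max_size=20480`, a position of exactly 20480 wraps to 4096 rather than being left at the end of the journal.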
|
| |
| |
| |
| |
| |
| |
| |
| | |
This will make for a simpler process for
http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c0268e27497a4d8228ef54da9d4ca12f3ac1f1bf)
|