| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|
|
|
|
|
|
| |
See 96621bdb004e539a0186fb592f44d51cf49f1c31.
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\
| |
| |
| |
| | |
mon: Early warning system for monitor stores growing over predefined threshold
Reviewed-by: Sage Weil <sage@inktank.com>
|
If the store's size grows beyond what we believe to be reasonable, we must
let the user know that something fishy may be going on. This is intended to
act as an early warning system for monitors suffering from leveldb
compaction issues. However, in case the monitor's store is simply growing
due to normal cluster behaviour, the warning threshold is adjustable
via 'mon_leveldb_size_warn' (defaulting to 40GB).
Fixes: #5909
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
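The threshold check described above can be sketched roughly as follows. This is a minimal illustration, not the monitor's actual code: the function name, the warning text, and the reading of "40GB" as 40 * 1024^3 bytes are all assumptions.

```python
GIB = 1024 ** 3  # assumption: "40GB" is read here as 40 * 1024^3 bytes


def store_size_warning(store_bytes, warn_threshold_bytes=40 * GIB):
    """Return a warning string if the store is larger than the threshold.

    `warn_threshold_bytes` stands in for the 'mon_leveldb_size_warn'
    option; the message text is hypothetical.
    """
    if store_bytes > warn_threshold_bytes:
        return (
            f"store is larger than expected: "
            f"{store_bytes // GIB} GiB > {warn_threshold_bytes // GIB} GiB"
        )
    return None
```

A cluster whose store legitimately grows large would raise the threshold rather than silence the check.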
|
| |
| |
| |
| |
| |
| |
| | |
... and use it on DataHealthService.cc, instead of building our own
version of the classes' formatted output.
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
|
| |
| |
| |
| | |
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
|
On LevelDBStore, instead of using leveldb's GetApproximateSizes() function,
we now compute the store's raw size from the contents of the store dir
(this means .sst's, .log's, etc). The reason behind this approach is that
GetApproximateSizes() expects us to provide a range of keys for which to
obtain an approximate size, whereas what we really want is the size of the
store, not the size of the data (besides, with the compaction issues we've
been seeing, we wonder how reliable such an approximation would be).
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
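A rough sketch of the "raw size from the store dir" idea: sum the sizes of the regular files found directly in the directory. The function name is hypothetical, and treating the store as a flat directory is an assumption made for illustration.

```python
import os


def raw_store_size(store_dir):
    """Sum the sizes of regular files directly inside store_dir.

    Assumes a flat directory (.sst, .log, etc. files); subdirectories
    and symlinks are ignored in this sketch.
    """
    total = 0
    with os.scandir(store_dir) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

Unlike a key-range estimate, this counts everything on disk, including data that compaction has not yet reclaimed.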
|
| |
| |
| |
| | |
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
|
|\ \
| | |
| | | |
List packages needed for RPM-based distros
|
| | |
| | |
| | |
| | | |
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Backport: dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
|/ /
| |
| |
| |
| |
| | |
Backport: dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
|\ \
| | |
| | |
| | | |
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
hasn't finished yet.
Signed-off-by: Simon Leinen <simon.leinen@switch.ch>
|
|\ \ \ |
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | | |
ceph_rest_api.py: create own default for log_file
Reviewed-by: Sage Weil <sage@inktank.com>
|
common/config thinks the default log_file for non-daemons should be "".
Override that so that the default is
/var/log/ceph/{cluster}-{name}.{pid}.log
since ceph-rest-api is more of a daemon than a client.
Fixes: #6099
Backport: dumpling
Signed-off-by: Dan Mick <dan.mick@inktank.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Fix readdir_r invocation
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
PATH_MAX isn't quite big enough.
Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
|
| |/ / / /
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The buffer needs to be big or else we'll walk all over the stack.
Backport: dumpling, cuttlefish, bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
|
It is possible that we begin the paxos recovery with an uncommitted
value for, say, commit 100. During last/collect we discover 100 has been
committed already. But also, another node provides an uncommitted value
for 101 with the same pn. Currently, we refuse to learn it, because the
pn is not strictly greater than our current uncommitted pn, even though it is
the next last_committed+1 value that we need.
There are two possible fixes here:
- make this a >= as we can accept newer values from the same pn.
- discard our uncommitted value metadata when we commit the value.
Let's do both!
Fixes: #6090
Signed-off-by: Sage Weil <sage@inktank.com>
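The two fixes can be illustrated with a small model. This is a sketch under stated assumptions, not the monitor's Paxos code: the `PeerState` class, its field names, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerState:
    last_committed: int
    uncommitted_v: Optional[int] = None
    uncommitted_pn: int = 0
    uncommitted_value: Optional[bytes] = None


def learn_uncommitted(state, v, pn, value):
    """Accept a peer's uncommitted value for last_committed+1.

    Fix 1: compare with >= so a newer value from the same pn is learned.
    """
    if v != state.last_committed + 1:
        return False
    if pn < state.uncommitted_pn:  # previously rejected when pn <= ours
        return False
    state.uncommitted_v = v
    state.uncommitted_pn = pn
    state.uncommitted_value = value
    return True


def commit(state, v):
    """Record a commit and drop uncommitted metadata it makes stale (fix 2)."""
    state.last_committed = v
    if state.uncommitted_v is not None and state.uncommitted_v <= v:
        state.uncommitted_v = None
        state.uncommitted_pn = 0
        state.uncommitted_value = None
```

Either fix alone lets the value for 101 be learned in the scenario described; applying both also keeps stale metadata from lingering after a commit.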
|
Fixes: #6056
When removing a bucket metadata entry we first unlink the bucket
and then we remove the bucket entrypoint object. Originally
when unlinking the bucket we first overwrote the bucket entrypoint
entry marking it as 'unlinked'. However, this is not really needed
as we're just about to remove it. The original version triggered
a bug, as we needed to propagate the new header version first (which
we didn't do, so the subsequent bucket removal failed).
Reviewed-by: Greg Farnum <greg@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
PGMonitor: pg dump_stuck should respect --format (plain works fine)
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Signed-off-by: Dan Mick <dan.mick@inktank.com>
|
Some distros lack ltp-kernel packages and all we need is
fstress. This just modifies the shell script to download/compile
fstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small/quick
compile and currently only SLES and Debian do not have it already.
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Do not use compilation flags that are invalid for clang
Reviewed-by: Loic Dachary <loic@dachary.org>
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
-Wstrict-null-sentinel and -rdynamic are invalid flags
for the clang compiler.
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | | |
vstart.sh: Allow running multiple cluster instances.
|
This patch adds a few ENV variables, so you can use vstart.sh
multiple times to launch multiple clusters:
CEPH_DIR -> the working directory of the cluster
CEPH_DEV_DIR -> the dev directory of the cluster
CEPH_OUT_DIR -> the output directory of the cluster
CEPH_RGW_PORT -> the default radosgw port to start with
All these new variables are set to default values if not specified,
so the previous behaviour of vstart.sh is unchanged.
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
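The "use the environment value, else fall back to a default" pattern can be sketched as below. The message does not state the defaults, so the ones used here (current directory, `dev` and `out` subdirectories, port 8000) are assumptions for illustration, as is the function name.

```python
import os


def vstart_settings(env=None):
    """Resolve the four variables, falling back to assumed defaults."""
    env = os.environ if env is None else env
    ceph_dir = env.get("CEPH_DIR") or os.getcwd()  # assumed default
    return {
        "CEPH_DIR": ceph_dir,
        # assumed defaults: subdirectories of CEPH_DIR
        "CEPH_DEV_DIR": env.get("CEPH_DEV_DIR") or os.path.join(ceph_dir, "dev"),
        "CEPH_OUT_DIR": env.get("CEPH_OUT_DIR") or os.path.join(ceph_dir, "out"),
        # assumed default port
        "CEPH_RGW_PORT": int(env.get("CEPH_RGW_PORT") or 8000),
    }
```

Giving each cluster its own CEPH_DIR and CEPH_RGW_PORT keeps their data, output, and radosgw listeners from colliding.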
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #2901
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Hopefully this won't break old clients; I can't think of any. We *should*
be picky about our requests.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #2207
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #3660
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #2354
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #5967
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #2914
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
CephFS currently deadlocks under CTDB's ping_pong POSIX locking test
when run concurrently on multiple nodes.
The deadlock is caused by failed removal of a waiting_locks entry when
the waiting lock is merged with an existing lock, e.g.:
Initial MDS state (two clients, same file):
  held_locks -- start: 0, length: 1, client: 4116, pid: 7899, type: 2
                start: 2, length: 1, client: 4110, pid: 40767, type: 2
  waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2
Waiting lock entry 4116@1:1 fires:
  handle_client_file_setlock: start: 1, length: 1,
                              client: 4116, pid: 7899, type: 2
MDS state after lock is obtained:
  held_locks -- start: 0, length: 2, client: 4116, pid: 7899, type: 2
                start: 2, length: 1, client: 4110, pid: 40767, type: 2
  waiting_locks -- start: 1, length: 1, client: 4116, pid: 7899, type: 2
Note that the waiting 4116@1:1 lock entry is merged with the existing
4116@0:1 held lock to become a 4116@0:2 held lock. However, the now
handled 4116@1:1 waiting_locks entry remains.
When handling a lock request, the MDS calls adjust_locks() to merge
the new lock with available neighbours. If the new lock is merged,
then the waiting_locks entry is not located in the subsequent
remove_waiting() call because adjust_locks changed the new lock to
include the old locks.
This fix ensures that the waiting_locks entry is removed prior to
modification during merge.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Greg Farnum <greg@inktank.com>
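The ordering issue can be reproduced with a toy model. This is only a sketch: the `Lock` class and both functions are hypothetical stand-ins, the lock type field is omitted, and the merge logic is a simplification of what adjust_locks() does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    start: int
    length: int
    client: int
    pid: int

    @property
    def end(self):
        return self.start + self.length


def merge_neighbours(held, new):
    """Merge `new` with touching held locks of the same owner (simplified)."""
    start, end = new.start, new.end
    changed = True
    while changed:
        changed = False
        for lock in list(held):
            same_owner = (lock.client, lock.pid) == (new.client, new.pid)
            touches = lock.start <= end and start <= lock.end
            if same_owner and touches:
                held.remove(lock)
                start, end = min(start, lock.start), max(end, lock.end)
                changed = True
    merged = Lock(start, end - start, new.client, new.pid)
    held.append(merged)
    return merged


def grant_waiting(held, waiting, lock):
    """Drop the waiting entry BEFORE merging, while it still matches."""
    if lock in waiting:
        waiting.remove(lock)
    return merge_neighbours(held, lock)
```

Looking up the waiting entry after the merge would search for 4116@0:2, which never matches the queued 4116@1:1, so the stale entry would stay behind.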
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes: #6107
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
rgw: rgw-admin throws an error when an invalid flag is passed
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Fixes: #5820 (http://tracker.ceph.com/issues/5820)
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
osd: add 'osd heartbeat min healthy ratio' tunable
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This was hard-coded to 1/3; make it tunable.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
rgw: Adds --system option help to radosgw-admin
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Signed-off-by: Christophe Courtaut <christophe.courtaut@gmail.com>
|
Some distros lack ltp-kernel packages and all we need is
fstress. This just modifies the shell script to download/compile
fstress from source and copy it to the right location if it doesn't
currently exist where it is expected. It is a very small/quick
compile and currently only SLES and Debian do not have it already.
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sandon Van Ness <sandon@inktank.com>
|