| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Avoid adding a bare-bones dirfrag that has not yet been fetched from
disk to the journal.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
|
|
|
|
| |
If the fragtree is (*^1, 1*^1) and we want the leaves under frag 000*,
get_leaves_under() returns frag 0*, frag 10* and frag 11*. This is
obviously wrong.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
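The case above can be reproduced with a small model. This is only a sketch, not the MDS code: frags are modelled as bit strings ("" for *, "10" for 10*), the `splits` dictionary and both function names are hypothetical, and it assumes the intended result for 000* is the single leaf 0* that contains it.

```python
# Hypothetical model: a frag is a string of bits ("" is the root, "10" is frag 10*).
# `splits` maps a frag to the number of bits it is split by.

def all_leaves(splits, frag=""):
    """Return every leaf of the fragtree rooted at `frag`, in order."""
    bits = splits.get(frag, 0)
    if bits == 0:
        return [frag]  # not split, so it is a leaf
    leaves = []
    for i in range(2 ** bits):
        child = frag + format(i, "0{}b".format(bits))
        leaves.extend(all_leaves(splits, child))
    return leaves


def get_leaves_under(splits, frag):
    """Return the leaves that overlap `frag`: those below it, or the one containing it."""
    return [leaf for leaf in all_leaves(splits)
            if leaf.startswith(frag) or frag.startswith(leaf)]


# The fragtree from the commit message: * split by 1 bit, 1* split by 1 bit.
tree = {"": 1, "1": 1}
print(get_leaves_under(tree, "000"))  # ['0']
```

Filtering the full leaf list is simple rather than efficient; a real implementation would walk down from the branch or leaf that contains the requested frag.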
|
| |
If a client has outdated directory fragment information, it may send a
readdir request for a non-existent directory fragment. In this case, the MDS
finds an approximate directory fragment and sends its contents back to the
client. When it receives a reply whose fragment differs from the
requested one, the client needs to reset the 'readdir offset'.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
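A minimal sketch of the client-side rule described above. The `ReaddirState` class, the `apply_reply` function and the start offset of 0 are illustrative assumptions, not the client's real data structures.

```python
from dataclasses import dataclass

START_OFFSET = 0  # assumed start value; the real client may use a different one


@dataclass
class ReaddirState:
    frag: str    # fragment the client asked for
    offset: int  # position inside that fragment


def apply_reply(state, reply_frag, entries):
    """Update the readdir position from a reply that may cover another fragment."""
    if reply_frag != state.frag:
        # The MDS answered with an approximate fragment, so the old
        # offset is meaningless and must be reset.
        state.frag = reply_frag
        state.offset = START_OFFSET
    state.offset += len(entries)
    return state
```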
|
|
|
|
|
|
|
| |
So we don't lose the latest readdir frag and offset after marking
end of readdir.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
If a client has outdated dirfrag information, a readdir request from
it may specify a non-existent dirfrag. The current way to handle this
case is to reply -EAGAIN and let the client retry. When the client receives
the -EAGAIN reply, it needs to refresh its dirfrag information first,
then re-send the readdir request.
A better way to handle a client request that specifies a non-existent
dirfrag is for the MDS to choose an approximate dirfrag, then handle the
request as normal. When the client receives the readdir reply, it also
updates its dirfrag information.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
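One way to picture "chooses an approximate dirfrag" is sketched below. Frags are modelled as bit strings, and the selection rule (the leaf that contains the requested frag, otherwise the first leaf below it) is an assumption made for illustration; `choose_approx_frag` is a hypothetical name.

```python
def choose_approx_frag(leaves, requested):
    """Pick an existing leaf dirfrag for a request that names `requested`."""
    if requested in leaves:
        return requested  # the client's view was up to date
    for leaf in leaves:
        if requested.startswith(leaf):
            return leaf  # the requested frag was merged into this leaf
    below = sorted(leaf for leaf in leaves if leaf.startswith(requested))
    if below:
        return below[0]  # the requested frag was split; start at its first child
    raise ValueError("leaves do not cover the requested frag")


# The client still believes frag 1* exists, but it has been split into 10* and 11*.
print(choose_approx_frag(["0", "10", "11"], "1"))  # 10
```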
|
|
|
|
|
|
|
| |
Make sure it returns true when all dirfrags under the given frag_t
are found.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
|
|
| |
fragment 'bits' should be negative for the merging case.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
|
|
|
| |
the code that prepares stray dentry assumes that stray inode contains
single dirfrag and we never freeze the stray inode's dirfrag.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
The stale rstat/dirstat check in CDir::merge() is wrong. A dirfrag's
rstat/fragstat is stale if the accounted rstat/fragstat version isn't
equal to the inode's rstat/dirstat version.
For CDir::split(), there is no need to worry about whether the
rstat/fragstat of the original dirfrag is stale. If it's stale, the
rstat/fragstat of the resulting dirfrags is stale too.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
|
|
|
| |
A fragmenting dirfrag does not exist in the object store, so all non-null
dentries should be included when committing a fragmenting dirfrag.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
| |
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
|
|
|
| |
We may merge dirfrags with different frag bits into one dirfrag. When rolling
back a merge operation, the dirfrags should go back to their origin.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
This patch makes the MDS use the following steps to fragment a directory:
1. freeze the old dirfrags
2. journal EFragment::OP_PREPARE
3. store the new dirfrags
4. journal EFragment::OP_COMMIT
5. delete the old dirfrags
6. journal EFragment::OP_FINISH
The newly introduced event EFragment::OP_FINISH indicates that all orphan
frags have been deleted. The new process guarantees that orphan frags can
be properly deleted if the MDS crashes while fragmenting a directory.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
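The value of the extra OP_FINISH event shows up on journal replay. The sketch below is a simplified model under stated assumptions: events are (op, key) tuples, `unfinished_fragments` is a hypothetical helper, and the reading that a PREPARE without COMMIT is rolled back while a COMMIT without FINISH still needs its old dirfrags deleted is an interpretation of the steps above, not the MDS implementation.

```python
def unfinished_fragments(events):
    """Replay (op, key) journal events and return the operations still in flight.

    The result maps each unfinished fragment operation to the last op logged for it.
    """
    state = {}
    for op, key in events:
        if op == "FINISH":
            state.pop(key, None)  # old dirfrags are gone, nothing left to do
        elif op in ("PREPARE", "COMMIT"):
            state[key] = op
        else:
            raise ValueError("unknown op: %r" % (op,))
    return state


journal = [
    ("PREPARE", "dir1"), ("COMMIT", "dir1"), ("FINISH", "dir1"),
    ("PREPARE", "dir2"), ("COMMIT", "dir2"),  # crashed before deleting old frags
    ("PREPARE", "dir3"),                      # crashed before storing new frags
]
print(unfinished_fragments(journal))  # {'dir2': 'COMMIT', 'dir3': 'PREPARE'}
```

Without a FINISH record, replay cannot tell a completed operation from one that committed but left orphan frags behind.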
|
|
|
|
|
|
| |
delete old dirfrags after the EFragment::OP_COMMIT event is logged.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|
|
|
|
|
|
|
|
| |
Start an internal MDS request for the directory fragmentation operation.
With an MDS request, we can easily acquire the locks required by the
operation. (The old way to get locks was 'try lock' style, which is not
reliable.)
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|\
| |
| | |
mon: unit tests to protect against some MonCommands.h typos
|
| |
| |
| |
| |
| |
| |
| |
| | |
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| | |
Contrary to all other classes, this series of commands ( Monitor ) does
not have a common prefix.
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| | |
The general idea is to have a series of commands, in the same order as
they appear in mon/MonCommands.h, as if they were input to the ceph
client. For each command, a valid combination is verified, and at least
one invalid combination is checked to confirm that it produces a
validation error. For instance:
['pg', 'stat']
is a valid command and the validate_command function is expected to
return a value that is not None or {}. The command
['pg', 'stat', 'toomany' ]
is also given to validate_command to check that an error occurs when
an extra argument is given.
The TestArparse class implements a few methods to reduce the verbosity
of the tests. It does not provide many methods: only those that
significantly reduce the verbosity have been implemented. The drawback
of writing too many convenience methods is that the tests become more
difficult to read and maintain.
The signature dictionary is made a global variable so that
it is only extracted once for all classes. It is immutable.
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
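The test pattern described above can be illustrated with a self-contained toy. `toy_validate` and `TOY_SIGNATURES` are hypothetical stand-ins that only match literal words; they are not the validate_command function or the signature dictionary from ceph_argparse, and the empty-dict failure value is an assumption.

```python
# Hypothetical signature table: each signature is a fixed list of words.
TOY_SIGNATURES = (
    ("pg", "stat"),
    ("pg", "getmap"),
)


def toy_validate(signatures, args):
    """Return a non-empty dict if `args` matches a signature exactly, else {}."""
    for sig in signatures:
        if list(sig) == list(args):
            return {"prefix": " ".join(args)}
    return {}


def assert_valid_command(args):
    """Convenience helper in the spirit of the test class: one call per valid case."""
    result = toy_validate(TOY_SIGNATURES, args)
    assert result not in (None, {})


# One valid combination and one validation error per command.
assert_valid_command(["pg", "stat"])
assert toy_validate(TOY_SIGNATURES, ["pg", "stat", "toomany"]) == {}
```

The signature table is a tuple of tuples so that, like the global signature dictionary, it can be shared by all tests without being modified.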
|
| |
| | |
If the / is missing in an EntityAddress, an ArgumentValid exception must
be raised so that it can be caught in the same way other argument
validation exceptions are.
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
All other epochs are validated in the same way
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| | |
Run parse_json_funcsigs against the builtin commands found
in mon/MonCommands.h. Although it does not provide for a full
validation, it will detect some degenerate cases.
It is expected to raise if a space is missing at the end of an option
specification ( see https://github.com/ceph/ceph/pull/585 ) and this
case is tested. New tests of the same kind can be added whenever an
undetected error is found. Ideally a validation function would be
implemented and updated ( with an associated test ) whenever a new
pathological case is found.
The json string given to parse_json_funcsigs is obtained from
the support program get_command_descriptions.
The python-nose dependencies are added to the build requirements in
debian/control and ceph.spec.in because make check should always be run
at build time.
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| | |
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| | |
The get_command_descriptions function is not designed to be tested in
C++ because all the validation happens in pybind/ceph_argparse.py. The
get_command_descriptions program is designed to be used by python unit
tests as a means to get a JSON dump of the content of mon/MonCommands.h:
get_command_descriptions --all
{"cmd000":{"sig":["pg","stat"],"help": ... "avail":"cli,rest"}}
It also provides a way to reproduce and keep track of past errors
( typos etc. ) to ensure the python validation keeps catching them:
get_command_descriptions --pull585
Add /get_command_descriptions to .gitignore so that
git ls-files --exclude-standard --others
does not see it, which is required for
https://github.com/ceph/autobuild-ceph/blob/f018d220f2622a9fc8c86a31e1fa13263790c399/build-ceph.sh#L73
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| | |
The get_command_descriptions function is added to Monitor.h and contains
the code previously inlined in Monitor::handle_command to implement
the get_command_descriptions command. It is intended for tests to
convert command descriptions into JSON, including error cases.
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| |
| |
| |
| |
| | |
to be used by unit test support programs that do not need to be installed
Signed-off-by: Loic Dachary <loic@dachary.org>
|
| |
| | |
The check_SCRIPTS is added in Makefile-env.am to list the tests that do
not require compilation. The scripts listed in check-local and in the
TESTS variable use check_SCRIPTS instead.
The PYTHONPATH environment variable is added to Makefile-env.am and
includes the pybind directory so that python unit tests can load the
libraries from sources.
http://tracker.ceph.com/issues/6274 refs #6274
Reviewed-by: Roald J. van Loon <roaldvanloon@gmail.com>
Signed-off-by: Loic Dachary <loic@dachary.org>
|
|\ \
| | |
| | |
| | |
| | | |
Fixes for purging stray
Reviewed-by: Sage Weil <sage@inktank.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We save old data pools in both inode_backtrace_t::old_pools and
inode_t::old_pools. We have the inode in the cache when purging a
stray, so there is no need to fetch the backtrace to find old data pools.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Don't trim a stray inode from the cache; purge it instead. This keeps
the stray directories at minimum size.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Now we call MDCache::maybe_eval_stray() in MDSCacheObject::put(),
so there is no need to call MDCache::maybe_eval_stray() after
releasing an inode/dentry's reference.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| | |
| | | |
The current method to purge a stray inode is to call
MDCache::maybe_eval_stray() after releasing a reference to the stray
inode/dentry. It's difficult to make this method work correctly, because
there are so many places that can release a reference.
This patch solves the issue by calling MDCache::maybe_eval_stray()
in MDSCacheObject::put(). This avoids adding code that calls
MDCache::maybe_eval_stray() to each place that releases a reference.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
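The design can be sketched as a reference-counted object whose put() notifies the cache, so no caller has to remember to do it. The `CacheObject` and `Cache` classes below are hypothetical, and the condition "stray and no references left" is a simplification assumed for illustration; the real check in MDCache::maybe_eval_stray() is not shown in this log.

```python
class Cache:
    def __init__(self):
        self.evaluated = []  # names of objects handed to stray evaluation

    def maybe_eval_stray(self, obj):
        # Simplified condition (assumption): only unreferenced strays qualify.
        if obj.is_stray and obj.ref == 0:
            self.evaluated.append(obj.name)


class CacheObject:
    def __init__(self, name, is_stray, cache):
        self.name = name
        self.is_stray = is_stray
        self.cache = cache
        self.ref = 0

    def get(self):
        self.ref += 1

    def put(self):
        assert self.ref > 0, "put() without a matching get()"
        self.ref -= 1
        # Single hook: every release path goes through here.
        self.cache.maybe_eval_stray(self)
```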
|
| | |
| | | |
Add a new parameter 'delay' to MDCache::eval_stray(). If 'delay'
is true, MDCache::eval_stray() adds the stray dentry to a delayed
list. Delayed stray dentries are processed in MDCache::trim().
This change is required by a later commit that evaluates strays when a
reference to a cache object is released.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
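A minimal sketch of the 'delay' parameter, assuming a plain list as the delayed queue. `StrayCache` and its `purged` list are hypothetical; the real eval_stray() does far more than record a name.

```python
class StrayCache:
    def __init__(self):
        self.delayed = []  # stray dentries waiting for the next trim()
        self.purged = []   # stand-in for the real evaluation work

    def eval_stray(self, dentry, delay=False):
        if delay:
            # Defer the work instead of doing it in the caller's context.
            if dentry not in self.delayed:
                self.delayed.append(dentry)
            return
        self.purged.append(dentry)

    def trim(self):
        # Process everything that was queued with delay=True.
        pending, self.delayed = self.delayed, []
        for dentry in pending:
            self.eval_stray(dentry)
```

Deferring to trim() keeps the release path cheap, which matters once evaluation is triggered from every reference release.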
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
A deleted directory inode's dirfrags may contain some null dentries.
When touch_dentry_bottom() is called, also move these null dentries
to the tail of the LRU.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There is no reason not to rename an inode out of the stray directory if
the inode's link count > 1.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
We should add the projected parent directory's context to the journal.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
|\ \ \
| | | |
| | | | |
Reviewed-by: Greg Farnum <greg@inktank.com>
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ \ \ \ |
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
mon: warn when pg_num is too low or appears out of whack wrt the cluster size
Reviewed-by: Greg Farnum <greg@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
We need to tweak this since we create relatively few pgs per osd.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If there is a pool that has a high objects/pg ratio relative to the rest
of the cluster, warn, as this suggests this particular pool may have too
few PGs.
Signed-off-by: Sage Weil <sage@inktank.com>
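The check can be sketched as a comparison of each pool's objects/pg against the cluster-wide average. The input layout, the `skewed_pools` name and the threshold of 10 are assumptions made for this example, not values taken from the commit.

```python
def skewed_pools(pools, max_skew=10.0):
    """Return names of pools whose objects/pg exceeds `max_skew` times the cluster average.

    `pools` maps a pool name to {"objects": int, "pg_num": int}.
    """
    total_objects = sum(p["objects"] for p in pools.values())
    total_pgs = sum(p["pg_num"] for p in pools.values())
    if total_objects == 0 or total_pgs == 0:
        return []
    average = total_objects / total_pgs
    flagged = []
    for name, p in sorted(pools.items()):
        if p["pg_num"] == 0:
            continue
        ratio = (p["objects"] / p["pg_num"]) / average
        if ratio > max_skew:
            flagged.append(name)
    return flagged


pools = {
    "metadata": {"objects": 100, "pg_num": 100},
    "data": {"objects": 100000, "pg_num": 8},
}
print(skewed_pools(pools))  # ['data']
```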
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Users need to adjust pg_num *and* pgp_num for split but may forget to do
both.
Signed-off-by: Sage Weil <sage@inktank.com>
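A sketch of the corresponding check, assuming each pool is described by a dictionary with "pg_num" and "pgp_num" keys. The function name and input layout are hypothetical.

```python
def pools_with_pgp_mismatch(pools):
    """Return names of pools where pgp_num was not raised along with pg_num."""
    return sorted(name for name, p in pools.items()
                  if p["pgp_num"] != p["pg_num"])


pools = {
    "data": {"pg_num": 128, "pgp_num": 64},  # split requested, placement not updated
    "rbd": {"pg_num": 64, "pgp_num": 64},
}
print(pools_with_pgp_mismatch(pools))  # ['data']
```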
|