Commit message | Author | Age | Files | Lines
* mds: don't scatter gather update bare-bones dirfragswip-dirfragYan, Zheng2013-09-241-3/+3
| Avoid adding a bare-bones dirfrag that has not yet been fetched from disk to the journal. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* fragtree_t: fix get_leaves_under()Yan, Zheng2013-09-241-1/+1
| If the fragtree is (*^1, 1*^1) and we want the leaves under frag 000*, get_leaves_under() returns frag 0*, frag 10* and frag 11*, which is wrong. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
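The case described in this message can be reproduced with a toy model. The sketch below is not Ceph's fragtree_t: it assumes frags are written as bit-prefix strings ("" for *, "0" for 0*, "10" for 10*), that the tree is a dict mapping a frag to the number of bits it is split by, and that the intended answer for 000* is the single leaf 0* that covers it. The function names are hypothetical.

```python
def leaves(tree, frag=""):
    """Return every leaf frag at or below `frag` in the toy tree."""
    bits = tree.get(frag, 0)
    if bits == 0:
        # Not split any further, so this frag is itself a leaf.
        return [frag]
    out = []
    for i in range(2 ** bits):
        child = frag + format(i, "0{}b".format(bits))
        out.extend(leaves(tree, child))
    return out


def get_leaves_under(tree, frag):
    """Toy version: leaves covering `frag`, or the leaves beneath it."""
    all_leaves = leaves(tree)
    # If a leaf is a prefix of `frag`, that leaf alone covers it.
    for leaf in all_leaves:
        if frag.startswith(leaf):
            return [leaf]
    # Otherwise `frag` is an interior node: collect the leaves below it.
    return [leaf for leaf in all_leaves if leaf.startswith(frag)]


# The fragtree (*^1, 1*^1): the root is split by 1 bit, then 1* by 1 bit.
TREE = {"": 1, "1": 1}
```

With this model, `get_leaves_under(TREE, "000")` yields only `["0"]`, while asking for `"1"` yields `["10", "11"]`.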
* client: handle dirfrag mismatch when processing readdir replyYan, Zheng2013-09-242-11/+36
| If the client has outdated directory fragment information, it may send a readdir request for a non-existent directory fragment. In this case, the MDS finds an approximate directory fragment and sends its contents back to the client. When the client receives a reply whose fragment differs from the requested one, it needs to reset the 'readdir offset'. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* client: use dir_result_t::END as flagYan, Zheng2013-09-241-3/+3
| This way we don't lose the latest readdir frag and offset after marking the end of readdir. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: handle dirfrag mismatch when processing readdir requestYan, Zheng2013-09-241-12/+9
| If the client has outdated dirfrag information, a readdir request from it may specify a non-existent dirfrag. The current way to handle this case is to reply -EAGAIN and let the client retry. When the client receives the -EAGAIN reply, it needs to refresh its dirfrag information first, then re-send the readdir request. A better way to handle a client request that specifies a non-existent dirfrag is for the MDS to choose an approximate dirfrag and then handle the request as normal. When the client receives the readdir reply, it also updates its dirfrag information. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
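One way to picture "choosing an approximate dirfrag" is the sketch below. It is an assumption, not the MDS code: frags are modelled as bit-prefix strings ("0" for 0*, "10" for 10*), the leaf list is assumed to partition the whole directory, and `choose_dirfrag` is a hypothetical helper name.

```python
def choose_dirfrag(leaves, requested):
    """Pick a leaf dirfrag to serve when `requested` may be stale."""
    if requested in leaves:
        return requested
    # The requested frag was merged away: a coarser leaf now covers it.
    for leaf in leaves:
        if requested.startswith(leaf):
            return leaf
    # The requested frag was split: serve the first leaf beneath it.
    under = sorted(leaf for leaf in leaves if leaf.startswith(requested))
    if not under:
        raise ValueError("leaves do not cover frag %r" % requested)
    return under[0]
```

Either way the reply carries a frag that may differ from the one requested, which is the mismatch the client-side commit above has to detect.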
* mds: fix CInode::get_dirfrags_under()Yan, Zheng2013-09-241-8/+0
| Make sure it returns true when all dirfrags under the given frag_t are found. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: fix MDCache::merge_dir()Yan, Zheng2013-09-241-1/+1
| Fragment 'bits' should be negative in the merging case. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: don't fragmentate stray directoryYan, Zheng2013-09-241-2/+2
| The code that prepares a stray dentry assumes that the stray inode contains a single dirfrag, and we never freeze the stray inode's dirfrag. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: properly update rstat/fragstat when fragmentating directoryYan, Zheng2013-09-241-43/+43
| The stale rstat/dirstat check in CDir::merge() is wrong. A dirfrag's rstat/fragstat is stale if the accounted rstat/fragstat version isn't equal to the inode's rstat/dirstat version. For CDir::split(), there is no need to worry about whether the rstat/fragstat of the original dirfrag is stale: if it is stale, the rstat/fragstat of the resulting dirfrags are stale too. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: properly store fragmenting dirfragsYan, Zheng2013-09-241-2/+4
| A fragmenting dirfrag does not exist in the object store, so all non-null dentries should be included when committing a fragmenting dirfrag. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: trim log segment after finishing uncommitted fragmentsYan, Zheng2013-09-244-22/+38
| | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: use MDCache::force_dir_fragment() to rollback fragmentYan, Zheng2013-09-242-3/+18
| We may merge dirfrags with different frag bits into one dirfrag. When rolling back a merge operation, the dirfrags should go back to their original frags. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: delete orphan dirfrags during MDS recoversYan, Zheng2013-09-244-41/+150
| This patch makes the MDS use the following steps to fragment a directory:
|   1. freeze the old dirfrags
|   2. journal EFragment::OP_PREPARE
|   3. store the new dirfrags
|   4. journal EFragment::OP_COMMIT
|   5. delete the old dirfrags
|   6. journal EFragment::OP_FINISH
| The newly introduced event EFragment::OP_FINISH indicates that all orphan frags have been deleted. The new process guarantees that orphan frags can be properly deleted if the MDS crashes while fragmenting a directory.
| Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
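The value of the three journal events is that recovery can tell how far a fragment operation got before a crash. The sketch below is one plausible reading of that idea, not the MDS implementation: the event names come from the message above, while `recovery_action` and the action strings it returns are hypothetical.

```python
def recovery_action(journaled_ops):
    """Decide what recovery should do for one fragment operation.

    `journaled_ops` is the list of EFragment events found in the journal
    for that operation, e.g. ["OP_PREPARE", "OP_COMMIT"].
    """
    if "OP_FINISH" in journaled_ops:
        # Old dirfrags were already deleted; nothing is left to clean up.
        return "nothing"
    if "OP_COMMIT" in journaled_ops:
        # New dirfrags are stored, so the old ones are orphans to delete.
        return "delete_old_dirfrags"
    if "OP_PREPARE" in journaled_ops:
        # The crash happened before commit (assumption: undo the operation).
        return "rollback"
    return "nothing"
```

For example, a journal that ends at OP_COMMIT maps to `"delete_old_dirfrags"`, which is the orphan cleanup this commit is about.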
* mds: delete orphan dirfrags after fragmentating directoryYan, Zheng2013-09-243-26/+115
| Delete old dirfrags after the EFragment::OP_COMMIT event is logged. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* mds: start internal MDS request for fragmentating directoryYan, Zheng2013-09-244-86/+93
| Start an internal MDS request for the directory fragmentation operation. With an MDS request, we can easily acquire the locks required by the operation. (The old way of getting locks was 'try lock' style, which is not reliable.) Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* Merge pull request #588 from dachary/wip-6274Sage Weil2013-09-2313-28/+1223
|\ | | | | mon: unit tests to protect against some MonCommands.h typos
| * ceph_argparse: unit tests for validate_command config-keyLoic Dachary2013-09-231-0/+25
| | | | | | | | | | | | | | | | http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * ceph_argparse: unit tests for validate_command osdLoic Dachary2013-09-231-0/+510
| | | | | | | | | | | | | | | | http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * ceph_argparse: unit tests for validate_command monLoic Dachary2013-09-231-0/+29
| | | | | | | | | | | | | | | | http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * ceph_argparse: unit tests for validate_command mdsLoic Dachary2013-09-231-0/+134
| | | | | | | | | | | | | | | | http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * ceph_argparse: unit tests for misc validate_commandLoic Dachary2013-09-231-0/+97
| | Unlike all other classes, this series of commands (Monitor) does not have a common prefix. http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * ceph_argparse: unit tests for validate_command authLoic Dachary2013-09-231-0/+52
| | | | | | | | | | | | | | | | http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * ceph_argparse: unit tests for validate_command pgLoic Dachary2013-09-231-0/+98
| | | | | | | | | | | | | | | | http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * pybind: ceph_argparse unit tests foundationsLoic Dachary2013-09-231-0/+48
| | The general idea is to have a series of commands, in the same order as they appear in mon/MonCommands.h, as if they were input to the ceph client. For each command a valid combination is verified, and at least one invalid combination is checked to produce a validation error.
| | For instance, ['pg', 'stat'] is a valid command and the validate_command function is expected to return a value that is not None or {}. The command ['pg', 'stat', 'toomany'] is also given to validate_command to check that an error occurs when an extra argument is given.
| | The TestArparse class implements a few methods to reduce the verbosity of the tests. It does not provide many methods: only those that significantly reduce the verbosity have been implemented. The drawback of writing too many convenience methods is that the tests become more difficult to read and maintain.
| | The signature dictionary is made a global variable so that it is only extracted once for all classes. It is immutable.
| | http://tracker.ceph.com/issues/6274 refs #6274
| | Reviewed-by: Dan Mick <dan.mick@inktank.com>
| | Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
| | Signed-off-by: Loic Dachary <loic@dachary.org>
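The testing pattern in this message (one valid combination, one invalid combination per command) can be sketched without the real library. Everything below is a stand-in: `toy_validate` is not ceph_argparse's validate_command, `SIGNATURES` is a hypothetical one-entry signature list, and the two helpers only mirror the "not None or {}" convention the message describes.

```python
# Hypothetical signature table, extracted once and shared by all tests.
SIGNATURES = (("pg", "stat"),)


def toy_validate(signatures, args):
    """Return a non-empty dict when `args` matches a signature exactly."""
    for sig in signatures:
        if tuple(args) == sig:
            return {"prefix": " ".join(sig)}
    # Empty dict signals a validation failure in this toy model.
    return {}


def assert_valid(args):
    """Convenience helper: the command must validate."""
    assert toy_validate(SIGNATURES, args) not in (None, {})


def assert_invalid(args):
    """Convenience helper: the command must be rejected."""
    assert toy_validate(SIGNATURES, args) in (None, {})
```

A test for the 'pg stat' command then reduces to `assert_valid(['pg', 'stat'])` followed by `assert_invalid(['pg', 'stat', 'toomany'])`.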
| * pybind: catch EntityAddress missing /Loic Dachary2013-09-231-1/+4
| | | | | | | | | | | | | | | | | | | | | | If the / is missing in an EntityAddress, an ArgumentValid exception must be raised so that it can be caught in the same way other argument validation exceptions are. http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * mon: validate mon dump epoch as a positive integerLoic Dachary2013-09-231-1/+1
| | | | | | | | | | | | | | | | | | All other epochs are validated in the same way http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * pybind: unit tests for ceph_argparse::parse_json_funcsigsLoic Dachary2013-09-234-0/+47
| | Run parse_json_funcsigs against the builtin commands found in mon/MonCommands.h. Although this does not provide full validation, it will detect some degenerate cases. It is expected to raise if a space is missing at the end of an option specification (see https://github.com/ceph/ceph/pull/585) and this case is tested. New tests of the same kind can be added whenever an undetected error is found. Ideally, a validation function would be implemented and updated (with an associated test) whenever a new pathological case is found. The JSON string given to parse_json_funcsigs is obtained from the support program get_command_descriptions. The python-nose dependencies are added to the build requirements in debian/control and ceph.spec.in because make check should always be run at build time. http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * .gitignore gtags(1) generated filesLoic Dachary2013-09-231-1/+7
| | | | | | | | Signed-off-by: Loic Dachary <loic@dachary.org>
| * mon: get_command_descriptions support programLoic Dachary2013-09-233-0/+122
| | The get_command_descriptions function is not designed to be tested in C++ because all the validation happens in pybind/ceph_argparse.py. The get_command_descriptions program is designed to be used by python unit tests as a means to get a JSON dump of the content of mon/MonCommands.h:
| |   get_command_descriptions --all
| |   {"cmd000":{"sig":["pg","stat"],"help": ... "avail":"cli,rest"}}
| | It also provides a way to reproduce and keep track of past errors (typos etc.) to ensure the python validation keeps catching them:
| |   get_command_descriptions --pull585
| | Add /get_command_descriptions to .gitignore so that
| |   git ls-files --exclude-standard --others
| | does not see it, which is required for https://github.com/ceph/autobuild-ceph/blob/f018d220f2622a9fc8c86a31e1fa13263790c399/build-ceph.sh#L73
| | http://tracker.ceph.com/issues/6274 refs #6274
| | Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
| | Reviewed-by: Dan Mick <dan.mick@inktank.com>
| | Signed-off-by: Loic Dachary <loic@dachary.org>
| * mon: convenience function to convert commands to jsonLoic Dachary2013-09-232-23/+35
| | The get_command_descriptions function is added to Monitor.h and contains the code previously inlined in Monitor::handle_command to implement the get_command_descriptions command. It is intended for tests to convert command descriptions into JSON, including error cases. http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Signed-off-by: Loic Dachary <loic@dachary.org>
| * autotools: set noinst_PROGRAMSLoic Dachary2013-09-231-0/+1
| | | | | | | | | | | | to be used by unit test support programs that do not need to be installed Signed-off-by: Loic Dachary <loic@dachary.org>
| * autotools: group test scripts in check_SCRIPTSLoic Dachary2013-09-233-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check_SCRIPTS is added in Makefile-env.am to list the tests that do not require compilation. The scripts listed in check-local and in the TESTS variable use check_SCRIPTS instead. The PYTHONPATH environment variable is added to Makefile-env.am and includes the pybind directory so that python unit tests can load the libraries from sources. http://tracker.ceph.com/issues/6274 refs #6274 Reviewed-by: Roald J. van Loon <roaldvanloon@gmail.com> Signed-off-by: Loic Dachary <loic@dachary.org>
* | Merge pull request #566 from ceph/wip-purge-straySage Weil2013-09-2311-157/+193
|\ \ | | | | | | | | | | | | Fixes for purging stray Reviewed-by: Sage Weil <sage@inktank.com>
| * | mds: remove dirfrags when purging stray directoryYan, Zheng2013-09-222-16/+18
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: avoid fetching backtrace when purging strayYan, Zheng2013-09-222-100/+55
| | | We save old data pools in both inode_backtrace_t::old_pools and inode_t::old_pools. We have the inode in the cache when purging a stray, so there is no need to fetch the backtrace to find old data pools. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: don't trim stray inode from the cache.Yan, Zheng2013-09-222-7/+17
| | | Don't trim a stray inode from the cache; purge it instead. This keeps the stray directories at minimum size. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: remove unnecessary MDCache::maybe_eval_stray() callsYan, Zheng2013-09-223-13/+0
| | | Now we call MDCache::maybe_eval_stray() in MDSCacheObject::put(), so there is no need to call MDCache::maybe_eval_stray() after releasing an inode/dentry's reference. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: evaluate stray when releasing inode/dentry's referenceYan, Zheng2013-09-228-8/+53
| | | The current method to purge a stray inode is to call MDCache::maybe_eval_stray() after releasing a reference to the stray inode/dentry. It's difficult to make this method work correctly, because there are so many places that can release a reference. This patch solves the issue by calling MDCache::maybe_eval_stray() in MDSCacheObject::put(). This avoids adding code that calls MDCache::maybe_eval_stray() to each place that releases a reference. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
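The design choice here is to hook the evaluation into the one function every reference release goes through. A minimal sketch of that idea follows; it is not the MDS code. `CacheObject`, the module-level `purged` list, and the "purge when the count reaches zero" policy are simplifying assumptions for illustration.

```python
# Names of stray objects that the toy evaluation decided to purge.
purged = []


def maybe_eval_stray(obj):
    """Toy policy: purge a stray object once nothing references it."""
    if obj.is_stray and obj.ref == 0:
        purged.append(obj.name)


class CacheObject:
    """Reference-counted cache object with the evaluation hook in put()."""

    def __init__(self, name, is_stray=False):
        self.name = name
        self.is_stray = is_stray
        self.ref = 0

    def get(self):
        self.ref += 1

    def put(self):
        assert self.ref > 0, "put() without a matching get()"
        self.ref -= 1
        # Single choke point: callers never have to remember this call.
        maybe_eval_stray(self)
```

Because the hook lives in `put()`, a new code path that drops a reference gets the stray evaluation for free.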
| * | mds: allow delay in evaluating strayYan, Zheng2013-09-193-11/+29
| | | Add a new parameter 'delay' to MDCache::eval_stray(). If 'delay' is true, MDCache::eval_stray() adds the stray dentry to a delayed list. Delayed stray dentries are processed in MDCache::trim(). This change is required by a later commit that evaluates strays when a reference to a cache object is released. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: touch dentry bottom recursivelyYan, Zheng2013-09-194-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | Deleted directory inode's dirfrags may contain some null dentries. When touch_dentry_bottom() is called, also move these null dentries to the tail of LRU. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: re-integrate stray when link count >= 1Yan, Zheng2013-09-191-1/+1
| | | | | | | | | | | | | | | | | | | | | no reason not to rename inode out of the stray directory if the inode's link count > 1 Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: fix MDCache::truncate_inode_finish() journalYan, Zheng2013-09-191-2/+3
| | | | | | | | | | | | | | | | | | we should add projected parent directory's context to the journal Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* | | Merge pull request #591 from ceph/wip-miscGregory Farnum2013-09-234-13/+14
|\ \ \ | | | | | | | | Reviewed-by: Greg Farnum <greg@inktank.com>
| * | | common: fix Mutex, Cond no-copy declarationsSage Weil2013-09-112-4/+4
| | | | | | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | | buffer: uninline, constify crc32c()Sage Weil2013-09-092-9/+10
| | | | | | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
* | | | Merge branch 'master' of github.com:ceph/cephGreg Farnum2013-09-233-0/+49
|\ \ \ \
| * \ \ \ Merge pull request #625 from ceph/wip-warn-pgSage Weil2013-09-233-0/+49
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | mon: warn when pg_num is too low or appears out of whack wrt the cluster size Reviewed-by: Greg Farnum <greg@inktank.com>
| | * | | | vstart: set 'mon pg min per osd'Sage Weil2013-09-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to tweak this since we create relatively few pgs per osd. Signed-off-by: Sage Weil <sage@inktank.com>
| | * | | | mon/PGMonitor: health warn if pool has relatively high objects/pgSage Weil2013-09-232-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is a pool that has a high objects/pg relative to the rest of the cluster, warn, as this suggests this particular pool may have too few PGs. Signed-off-by: Sage Weil <sage@inktank.com>
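The check described above compares each pool's objects-per-PG ratio with the cluster-wide average. The sketch below illustrates that comparison only; it is not the PGMonitor code. The function name, the `pools` layout (name mapped to an `(objects, pg_num)` pair) and the `skew` threshold of 10 are assumptions.

```python
def pools_with_high_objects_per_pg(pools, skew=10.0):
    """Return names of pools whose objects/pg exceeds skew x the average."""
    total_objects = sum(objects for objects, _ in pools.values())
    total_pgs = sum(pg_num for _, pg_num in pools.values())
    if total_objects == 0 or total_pgs == 0:
        return []
    average = total_objects / total_pgs
    return sorted(
        name
        for name, (objects, pg_num) in pools.items()
        if pg_num > 0 and objects / pg_num > skew * average
    )
```

A pool holding most of the cluster's objects in a handful of PGs stands out under this test, which is the "too few PGs" hint the warning is meant to give.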
| | * | | | mon/PGMonitor: health warn if pg_num != pgp_numSage Weil2013-09-231-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users need to adjust pg_num *and* pgp_num for split but may forget to do both. Signed-off-by: Sage Weil <sage@inktank.com>
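The condition behind this warning is a plain per-pool comparison. As a minimal sketch (assuming a hypothetical `pools` mapping of pool name to a `(pg_num, pgp_num)` pair, not the PGMonitor data structures):

```python
def pools_with_pg_mismatch(pools):
    """Return names of pools where pg_num and pgp_num disagree."""
    return sorted(
        name
        for name, (pg_num, pgp_num) in pools.items()
        if pg_num != pgp_num
    )
```

A pool whose pg_num was raised to 128 while pgp_num stayed at 64 would be reported, matching the "forgot to do both" situation in the message.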