Commit message  Author  Age  Files  Lines
* fixup "ReplicatedPG: take recovery locks in wait_for_missing_object()"  [wip-6585]  Greg Farnum  2013-10-22  1  -0/+1
| | | | | | | | And the subsequent "ReplicatedPG: add a cookie to get_backfill_read() to identify requester" Forgot to take the lock in the unfound branch, which still adds it to the waiting_for_missing_object list.
* PG: clear out the waiting_for_missing_object list properly  Greg Farnum  2013-10-22  2  -0/+2
| | | | | | | We were previously not erasing empty entries, which was confusing the rw_manager recovery locking. Signed-off-by: Greg Farnum <greg@inktank.com>
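The kind of cleanup this commit describes, erasing a map entry once its waiter list is empty, can be sketched as follows. This is a minimal illustration and not Ceph's code: `WaitMap`, `remove_waiter`, and the string and int stand-ins for object ids and queued ops are all hypothetical.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// Hypothetical stand-ins: object ids are strings, waiting ops are ints.
using WaitMap = std::map<std::string, std::list<int>>;

// Remove one waiter, then erase the map entry if its list became empty,
// so that "key present" always means "someone is still waiting".
inline void remove_waiter(WaitMap& waiting, const std::string& oid, int op) {
  auto it = waiting.find(oid);
  if (it == waiting.end())
    return;
  it->second.remove(op);
  if (it->second.empty())
    waiting.erase(it);
}
```

Any code that treats the presence of a key as "this object has waiters" depends on the final `erase`; without it an empty list looks like a live waiter.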
* fixup ReplicatedPG: PG: drop recovery locks when we successfully recover missing  Greg Farnum  2013-10-22  2  -3/+21
| | | | Should have added these when we took them; whoops!
* ReplicatedPG: add a cookie to get_backfill_read() to identify requester  Greg Farnum  2013-10-22  2  -15/+28
| | | | | | | This way we can tell if we're getting clashing lock requesters, or if the repeat is appropriate. Signed-off-by: Greg Farnum <greg@inktank.com>
* fixup "fixup ReplicatedPG: get backfill read in prep_object_replica_pushes()"  Greg Farnum  2013-10-22  1  -1/+1
|
* ReplicatedPG: take recovery locks in wait_for_missing_object()  Greg Farnum  2013-10-22  1  -0/+2
| | | | | | Need them here too. Signed-off-by: Greg Farnum <greg@inktank.com>
* ReplicatedPG: take recovery locks in recover_primary()  Greg Farnum  2013-10-22  2  -0/+10
| | | | | | This was causing tons of crashes Signed-off-by: Greg Farnum <greg@inktank.com>
* ReplicatedPG: RWTracker: always set backfill_waiting_on_read=true  Greg Farnum  2013-10-22  1  -1/+3
| | | | | | | | | | Setting it true even if we aren't actually waiting doesn't break anything else happening, and lets us assert 1) that we're the only one grabbing the recovery lock, 2) that we have actually taken a recovery lock when dropping one. (We aren't always doing so now, so this should simplify debugging.) Signed-off-by: Greg Farnum <greg@inktank.com>
* fixup ReplicatedPG: get backfill read in prep_object_replica_pushes()  Greg Farnum  2013-10-21  1  -0/+1
| | | | Signed-off-by: Greg Farnum <greg@inktank.com>
* ReplicatedPG: take and drop read locks when doing backfill  Greg Farnum  2013-10-21  1  -18/+33
| | | | | | | | | | | | | All our interfaces are in place, so now we can actually take and drop the locks. We do so in a few different places: 1) Take locks in ReplicatedPG::recover_backfill. This is the main entry into the code path. 1a) Take locks when responding to a pull request. This can only happen on non-primary nodes, so we don't need to worry about the request blocking. 2) Drop the locks in ReplicatedBackend::build_push_op() when it's done reading the object and has pushed it all. Signed-off-by: Greg Farnum <greg@inktank.com>
* PGBackend: add functions to get and drop read locks for recovery purposes  Greg Farnum  2013-10-21  2  -0/+26
| | | | | | | | | | Getting the read locks is never going to block, since PGBackend only does this for pulls, on non-primary PGs. The interface is necessary so that the PGBackend can drop locks, though, as it's responsible for organizing and doing the pushes (thereby knowing when it's done with the reads!). Signed-off-by: Greg Farnum <greg@inktank.com>
* ReplicatedPG: rename start_recovery_ops::started -> ops_begun  Greg Farnum  2013-10-21  2  -13/+12
| | | | | | Get rid of the stupid aliasing we did for git diff convenience. Signed-off-by: Greg Farnum <greg@inktank.com>
* PG: switch the start_recovery_ops interface to specify work to do as a param  Greg Farnum  2013-10-21  4  -18/+40
| | | | | | | | | | We previously inferred whether there was useful work to be done by looking at the number of ops started, but with the upcoming introduction of the rw_manager read locking on backfill, we could start no ops while still having work to do. Switch around the interfaces to specify these as separate pieces of information. Signed-off-by: Greg Farnum <greg@inktank.com>
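The interface change described here, reporting "ops started" and "work still to do" as separate facts, might look roughly like the sketch below. The struct, the function name, and its three integer parameters are hypothetical; only the name `ops_begun` comes from this log, and the real signature is not shown in it.

```cpp
#include <cassert>

// Hypothetical result type: the log only says the two pieces of
// information became separate, not how they are represented.
struct RecoveryResult {
  int ops_begun = 0;            // recovery ops actually started this pass
  bool work_remaining = false;  // true even when nothing could be started
};

// pending: objects that still need recovery
// blocked: pending objects whose read lock is currently unavailable
// max_ops: cap on ops to start in one pass
inline RecoveryResult start_recovery_ops_sketch(int pending, int blocked,
                                                int max_ops) {
  RecoveryResult r;
  int startable = pending - blocked;
  if (startable < 0)
    startable = 0;
  r.ops_begun = startable < max_ops ? startable : max_ops;
  r.work_remaining = pending > 0;
  return r;
}
```

With every pending object blocked on a lock, `ops_begun` is 0 while `work_remaining` stays true, which is the case the old "infer from ops started" approach could not express.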
* ReplicatedPG: implement the RWTracker mechanisms for backfill read locking  Greg Farnum  2013-10-21  2  -3/+31
| | | | | | | | | | We want backfill to take read locks on the objects it's pushing. Add a get_backfill_read(hobject_t) function, a corresponding drop_backfill_read(), and a backfill_waiting_on_read member in ObjState. Check that member when getting a write lock, and in put_write(). Requeue the recovery if necessary, and clean up the backfill block when its read lock is dropped. Signed-off-by: Greg Farnum <greg@inktank.com>
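A toy model of the mechanism this message describes is sketched below. It is not Ceph's RWTracker: `std::string` stands in for `hobject_t`, the class name `RWTrackerSketch` and the `recovery_requeued` flag are hypothetical, and the exact blocking rules are assumptions made for illustration. Only `get_backfill_read`, `drop_backfill_read`, `put_write`, `ObjState`, and `backfill_waiting_on_read` are names taken from the log.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified sketch, NOT Ceph's RWTracker. std::string stands in for
// hobject_t and the lock rules are assumptions for illustration.
struct ObjState {
  int readers = 0;
  int writers = 0;
  bool backfill_waiting_on_read = false;
  bool idle() const {
    return readers == 0 && writers == 0 && !backfill_waiting_on_read;
  }
};

class RWTrackerSketch {
  std::map<std::string, ObjState> states;

public:
  bool recovery_requeued = false;  // hypothetical "requeue recovery" signal

  // Backfill wants a read lock; if a write is in flight, note that it waits.
  bool get_backfill_read(const std::string& oid) {
    ObjState& s = states[oid];
    if (s.writers > 0) {
      s.backfill_waiting_on_read = true;
      return false;
    }
    ++s.readers;
    return true;
  }

  // Dropping the read lock also clears the backfill block.
  void drop_backfill_read(const std::string& oid) {
    auto it = states.find(oid);
    if (it == states.end())
      return;
    ObjState& s = it->second;
    if (s.readers > 0)
      --s.readers;
    s.backfill_waiting_on_read = false;
    if (s.idle())
      states.erase(it);
  }

  // New writes are refused while backfill holds or waits for the read lock.
  bool get_write(const std::string& oid) {
    ObjState& s = states[oid];
    if (s.backfill_waiting_on_read || s.readers > 0)
      return false;
    ++s.writers;
    return true;
  }

  // The last writer leaving wakes a waiting backfill by requeueing recovery.
  void put_write(const std::string& oid) {
    auto it = states.find(oid);
    if (it == states.end() || it->second.writers == 0)
      return;
    ObjState& s = it->second;
    --s.writers;
    if (s.writers == 0 && s.backfill_waiting_on_read)
      recovery_requeued = true;
    if (s.idle())
      states.erase(it);
  }
};
```

The design point the sketch tries to capture is that the waiting flag is checked on both sides: `get_write` refuses new writers so backfill is not starved, and `put_write` notices when the last writer leaves so recovery can be retried.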
* ReplicatedPG: separate RWTracker's waitlist from getting locks  Greg Farnum  2013-10-21  1  -6/+18
| | | | | | | This way we can try and get locks which aren't associated with an OpRequest. Signed-off-by: Greg Farnum <greg@inktank.com>
* common: add an hobject_t::is_min() function  Greg Farnum  2013-10-21  1  -0/+7
| | | | Signed-off-by: Greg Farnum <greg@inktank.com>
* Merge pull request #751 from ceph/wip-6603  Loic Dachary  2013-10-21  2  -0/+3
|\ | | | | a couple trivial leaks
| * common/BackTrace: fix memory leak  [wip-6603]  Sage Weil  2013-10-21  1  -0/+1
| | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * common/cmdparse: fix memory leak  Sage Weil  2013-10-21  1  -0/+2
|/ | | | | | demangle is allocating with malloc() in this case. Signed-off-by: Sage Weil <sage@inktank.com>
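The leak pattern named here, a demangler that returns a malloc()ed buffer, can be illustrated with the standard C++ ABI call. This sketch assumes the demangler in question behaves like `abi::__cxa_demangle` (GCC/Clang); the log does not say which function the fixed code calls, and the helper name `demangle_name` is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle a symbol and release the buffer that abi::__cxa_demangle
// allocates with malloc(). Falls back to the input string on failure.
inline std::string demangle_name(const char* mangled) {
  int status = 0;
  char* buf = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || buf == nullptr)
    return mangled;
  std::string out(buf);
  std::free(buf);  // must be free(), not delete[]; omitting it leaks
  return out;
}
```

Copying into a `std::string` before calling `std::free` keeps ownership in one place, so callers never see the raw buffer.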
* Merge pull request #746 from ceph/wip-6582  Sage Weil  2013-10-18  4  -11/+25
|\ | | | | | | | | Wip 6582 Reviewed-by: Sage Weil <sage@inktank.com>
| * ReplicatedPG: copy: conditionally requeue copy ops when cancelled  [wip-6582]  Greg Farnum  2013-10-18  2  -11/+19
| | | | | | | | | | | | | | | | | | | | | | We may need to requeue copy ops which are cancelled as part of an acting set change but don't change the primary. To support this, add a "requeue" flag to cancel_copy_ops() and copy_ops(), as well as to CopyResults. The CopyCallback is then responsible for requeuing (the higher layers can't do so as they can't know which request actually triggered the copy). Signed-off-by: Greg Farnum <greg@inktank.com>
| * PG: add a requeue_op() function to complement requeue_ops().  Greg Farnum  2013-10-18  2  -0/+6
|/ | | | Signed-off-by: Greg Farnum <greg@inktank.com>
* Merge branch 'next'  Gary Lowell  2013-10-18  2  -1/+7
|\
| * v0.71  [v0.71, last]  Gary Lowell  2013-10-17  2  -1/+7
| |
* | Merge pull request #737 from xarses/6127  Josh Durgin  2013-10-17  1  -0/+2
|\ \ | | | | | | | | | Add Redhat init script option Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
| * | Add Redhat init script option  Andrew Woodward  2013-10-17  1  -0/+2
| | | | | | | | | | | | | | | Resolves: 6127 Signed-off-by: Andrew Woodward <awoodward@mirantis.com>
* | | Merge pull request #738 from ceph/wip-cache-crc  Sage Weil  2013-10-17  1  -12/+17
|\ \ \ | |/ / |/| | | | | | | | fix cached crc, bug #6583 Reviewed-by: Samuel Just <sam.just@inktank.com>
| * | common/buffer: invalidate crc on zero, copy_in  Sage Weil  2013-10-17  1  -0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does not capture users who - calc a crc - use c_str() to modify the buffer content - (re)calc a crc Signed-off-by: Sage Weil <sage@inktank.com>
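The caching behaviour and the gap this message points out can be sketched with a toy buffer. Everything here is hypothetical (the class `CachedCrcBuffer`, its members, and the FNV-1a placeholder used instead of a real CRC); it only illustrates why `zero()` and `copy_in()` can invalidate a cached checksum while writes through a raw `c_str()` pointer cannot be detected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy buffer with a cached checksum; not Ceph's buffer implementation.
class CachedCrcBuffer {
  std::vector<char> data;
  bool crc_valid = false;
  std::uint32_t cached = 0;

  // Placeholder checksum (FNV-1a) standing in for a real CRC.
  static std::uint32_t checksum(const std::vector<char>& d) {
    std::uint32_t h = 2166136261u;
    for (char c : d) {
      h ^= static_cast<unsigned char>(c);
      h *= 16777619u;
    }
    return h;
  }

public:
  int recomputations = 0;  // counts how often the checksum was recomputed

  explicit CachedCrcBuffer(std::size_t n) : data(n, 'x') {}

  std::uint32_t crc() {
    if (!crc_valid) {
      cached = checksum(data);
      crc_valid = true;
      ++recomputations;
    }
    return cached;
  }

  // Mutators that go through the class can invalidate the cache.
  void zero() {
    std::fill(data.begin(), data.end(), '\0');
    crc_valid = false;
  }

  void copy_in(std::size_t off, const char* src, std::size_t len) {
    if (off > data.size() || len > data.size() - off)
      return;  // out of range: ignore in this sketch
    std::memcpy(data.data() + off, src, len);
    crc_valid = false;
  }

  // Raw access: writes through this pointer are invisible to the cache.
  char* c_str() { return data.data(); }
};
```

A caller who computes the checksum, writes through `c_str()`, and computes it again gets the stale cached value back, which is the uncaptured sequence listed in the message.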
| * | common/buffer: fix crc_map types  Sage Weil  2013-10-17  1  -8/+8
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | common/buffer: drop unused fields  Sage Weil  2013-10-17  1  -4/+2
|/ / | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
* | qa/workunits/rest/test.py: fix mds {add,remove}_data_pool test  Sage Weil  2013-10-17  1  -2/+2
| | | | | | | | | | | | Arg name changed from poolid to pool in e2602c54. Signed-off-by: Sage Weil <sage@inktank.com>
* | doc/release-notes: link to the changelog  Sage Weil  2013-10-17  1  -0/+2
| | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
* | doc/release-notes: v0.61.9  Sage Weil  2013-10-17  2  -0/+604
| | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
* | Makefile: fix /sbin vs /usr/sbin behavior  Sage Weil  2013-10-17  4  -6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of telling configure to put things in /sbin, explicitly put the two important items (mkcephfs and mount.fuse.ceph) in /sbin via an automake rule. This unbreaks FreeBSD 9.1 and probably others. Based on patches originally from Alan Somers <asomers@gmail.com>, modified for the current Makefile structure and applied to the specfile too. Fixes: #6456 Signed-off-by: Sage Weil <sage@inktank.com> Tested-by: Alan Somers <asomers@gmail.com>
* | OSD: check for splitting when processing recover/backfill reservations  Samuel Just  2013-10-17  1  -50/+59
| | | | | | | | | | | | Fixes: 6565 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* | Merge pull request #691 from ceph/wip-dirfrag  Gregory Farnum  2013-10-17  17  -236/+611
|\ \ | | | | | | | | | Reviewed-by: Greg Farnum <greg@inktank.com> Partly-Reviewed-by: Sage Weil <sage@inktank.com>
| * | qa/workunits/misc/dirfrag: make it work on ubuntu  Yan, Zheng  2013-10-14  1  -5/+7
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: optimize map element dereference  Yan, Zheng  2013-10-11  1  -17/+24
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | qa/workunits: Add large number of files in a directory test script  Yan, Zheng  2013-10-05  1  -0/+46
| | | | | | | | | | | | | | | | | | Test the directory fragments feature Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: reduce verbosity of dir fragment logging  Yan, Zheng  2013-10-05  1  -6/+6
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: fix bloom filter leaks  Yan, Zheng  2013-10-05  2  -2/+3
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: stop propagating rstat if freezing/frozen dirfrag is encountered.  Yan, Zheng  2013-10-05  2  -5/+4
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: don't scatter gather update bare-bones dirfrags  Yan, Zheng  2013-10-05  1  -3/+3
| | | | | | | | | | | | | | | | | | | | | Avoid adding a bare-bones dirfrag that has not yet been fetched from the disk to the journal. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | fragtree_t: fix get_leaves_under()  Yan, Zheng  2013-10-05  1  -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | If the fragtree is (*^1, 1*^1) and we want the leaves under frag 000*, get_leaves_under() returns frag 0*, frag 10* and frag 11*. This is obviously wrong. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
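One way to reason about the example in this message is to model each fragment as a binary prefix string. The sketch below uses that simplification; it is not `fragtree_t`, the function names are hypothetical, and the assumption that the expected answer for 000* is the single containing leaf 0* is an inference from the message, which only says the three-leaf result is wrong.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True if p is a prefix of s ("0" is a prefix of "000").
inline bool is_prefix(const std::string& p, const std::string& s) {
  return s.compare(0, p.size(), p) == 0;
}

// Leaves relevant to `frag`: leaves nested under it, or the single leaf
// that contains it. Fragments are modelled as binary prefix strings,
// so "0" means 0*, "10" means 10*, and "" means the whole tree.
inline std::vector<std::string> leaves_overlapping(
    const std::vector<std::string>& leaves, const std::string& frag) {
  std::vector<std::string> out;
  for (const auto& leaf : leaves)
    if (is_prefix(leaf, frag) || is_prefix(frag, leaf))
      out.push_back(leaf);
  return out;
}
```

With leaves 0*, 10* and 11* (the tree from the message), a query for 000* yields only 0*, while a query for 1* yields 10* and 11*.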
| * | client: handle dirfrag mismatch when processing readdir reply  Yan, Zheng  2013-10-05  2  -11/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the client has outdated directory fragment information, it may send a readdir request for a non-existent directory fragment. In this case, the MDS finds an approximate directory fragment and sends its contents back to the client. When it receives a reply whose fragment differs from the requested one, the client needs to reset the 'readdir offset'. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
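The client-side rule described here can be sketched in a few lines. The cursor struct, the function name, the integer fragment ids, and the starting offset of 0 are all hypothetical; the real client state and offset conventions are not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical client-side readdir position.
struct ReaddirCursor {
  std::uint32_t frag;    // fragment the client asked for
  std::uint64_t offset;  // position within that fragment
};

// Hypothetical starting offset within a fragment.
constexpr std::uint64_t kFirstOffset = 0;

// Apply a reply: if the MDS answered for a different fragment than the
// one requested, adopt it and restart the offset before advancing.
inline void apply_readdir_reply(ReaddirCursor& cur, std::uint32_t reply_frag,
                                std::uint64_t entries_returned) {
  if (reply_frag != cur.frag) {
    cur.frag = reply_frag;
    cur.offset = kFirstOffset;
  }
  cur.offset += entries_returned;
}
```

Keeping the old offset after a fragment change would index into the wrong fragment, which is why the reset happens before the entries are counted.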
| * | client: use dir_result_t::END as flag  Yan, Zheng  2013-10-05  1  -3/+3
| | | | | | | | | | | | | | | | | | | | | So we don't lose the latest readdir frag and offset after marking end of readdir. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: handle dirfrag mismatch when processing readdir request  Yan, Zheng  2013-10-05  1  -12/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the client has outdated dirfrag information, a readdir request from it may specify a non-existent dirfrag. The current way to handle this case is to reply -EAGAIN and let the client retry. When the client receives the -EAGAIN reply, it needs to refresh its dirfrag information first, then re-send the readdir request. A better way to handle a client request that specifies a non-existent dirfrag is for the MDS to choose an approximate dirfrag and then handle the request as normal. When the client receives the readdir reply, it will also update its dirfrag information. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: fix CInode::get_dirfrags_under()  Yan, Zheng  2013-10-05  1  -8/+0
| | | | | | | | | | | | | | | | | | | | | Make sure it returns true when all dirfrags under the given frag_t are found. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: fix MDCache::merge_dir()  Yan, Zheng  2013-10-05  1  -1/+1
| | | | | | | | | | | | | | | | | | fragment 'bits' should be negative for the merging case. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * | mds: don't fragmentate stray directories  Yan, Zheng  2013-10-05  1  -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The code that prepares and purges stray dentries assumes that we never freeze stray directories, so disable fragmenting stray directories temporarily. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>