Commit message | Author | Age | Files | Lines
* asdfwip-rgw-vstartSage Weil2013-06-211-11/+18
|
* wipSage Weil2013-06-211-0/+34
|
* Merge remote-tracking branch 'gh/wip-mds'Sage Weil2013-06-2114-224/+239
|\ | | | | | | Reviewed-by: Sage Weil <sage@inktank.com>
| * mds: rev protocolSage Weil2013-06-211-1/+1
| | | | | | | | | | | | | | Commit 18b9e63b4df643e1f2fb8f17416089e5d970bf60 changed the OTW lock encoding. Signed-off-by: Sage Weil <sage@inktank.com>
| * mds: kill Server::handle_client_lookup_hash()Yan, Zheng2013-06-212-144/+1
| |     Server::handle_client_lookup_ino() is simpler and more robust. Use it to handle both LOOKUPHASH and LOOKUPINO requests.
| |
| |     Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * mds: use "open-by-ino" helper to handle LOOKUPINO requestYan, Zheng2013-06-212-31/+8
| | | | | | | | | | Fixes #3541 Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| * Merge remote-tracking branch 'yan/wip-mds' into wip-mdsSage Weil2013-06-2014-51/+232
| |\
| | * mds: fix remote wrlock rejoinYan, Zheng2013-06-201-18/+22
| | |     A remote wrlock's target is not always the inode's auth MDS.
| | |
| | |     Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| | * mds: fix race between scatter gather and dirfrag exportYan, Zheng2013-06-207-10/+77
| | |     If we gather dirty scatter lock state while the corresponding dirfrag is being exported, we may receive different dirfrag states from two MDSes, and we need to find out which one is the newest. The solution is to add a new variable, "migrate seq", to the dirfrag and increase it by one whenever the dirfrag's auth MDS changes. When gathering dirty scatter lock state, use "migrate seq" to find the newest dirfrag state.
| | |
| | |     Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
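The "migrate seq" idea in the message above can be pictured with a minimal Python sketch. This is not the MDS code: `DirfragState`, its fields, and `newest_state` are hypothetical names used only to show how a per-dirfrag counter, bumped on every auth change, lets a gatherer pick the newest of two conflicting reports.

```python
from dataclasses import dataclass


@dataclass
class DirfragState:
    """Hypothetical report of one dirfrag's scatter-lock state from one MDS."""
    mds_rank: int
    migrate_seq: int  # assumed: increased by one each time the auth MDS changes
    payload: str      # stand-in for the dirty scatter-lock data


def newest_state(reports):
    """Pick the report with the highest migrate_seq."""
    return max(reports, key=lambda r: r.migrate_seq)


# Two MDSes report the same dirfrag while it is being exported.
old = DirfragState(mds_rank=0, migrate_seq=3, payload="before export")
new = DirfragState(mds_rank=1, migrate_seq=4, payload="after export")
chosen = newest_state([old, new])
```

The order in which the reports arrive does not matter, since only the counter is compared.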
| | * mds: don't journal bare dirfragYan, Zheng2013-06-173-2/+6
| | |     Don't journal a bare dirfrag when starting scatter. Also add debug code for bare dirfrag modification.
| | |
| | |     Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| | * mds: fix cross-authority rename raceYan, Zheng2013-06-174-8/+81
| | |     When doing a cross-authority rename, we need to make sure bystanders have received all messages sent by the inode's original auth MDS before changing the inode's authority. Otherwise lock messages sent by the original and new auth MDS can arrive at bystanders out of order. The fix: the inode's original auth MDS sends notify messages to bystanders, and performs the slave rename only after receiving all bystanders' notify acks.
| | |
| | |     Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| | * mds: try purging stray inode after storing backtraceYan, Zheng2013-06-171-1/+2
| | | | | | | | | | | | | | | | | | | | | Inode is auth pinned and can't be purged while storing backtrace, so we should try purging stray inode after storing backtrace. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| | * mds: handle undefined dirfrags when opening inodeYan, Zheng2013-06-172-4/+34
| | |     When the MDS is in the rejoin stage, cache rejoin messages can add undefined inodes and dirfrags to the cache. These undefined objects can affect "lookup-by-ino" processes. If an undefined dirfrag is encountered, we should fetch it from disk.
| | |
| | |     Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| | * mds: fix frozen check in Server::try_open_auth_dirfrag()Yan, Zheng2013-06-171-1/+1
| | | | | | | | | | | | Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
| | * mds: don't update migrate_seq when importing non-auth capYan, Zheng2013-06-173-7/+9
| | | | | | | | | | | | | | | | | | | | | We use migrate_seq to distinguish old and new auth MDS. So we should not change migrate_seq when importing non-auth cap. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
* | | mds: do not assume segment list is non-empty in standby_trim_segmentsSage Weil2013-06-212-2/+8
|/ /
| |     If we restart standby replay shortly after startup, before we actually have any segments, we can trigger a segfault here:
| |
| |      ceph version 0.64-441-gc39b99c (c39b99cdecceaca77f66eafbcc38387406826406)
| |      1: ceph-mds() [0x975caa]
| |      2: (()+0xfcb0) [0x7fc33b5a5cb0]
| |      3: (MDLog::standby_trim_segments()+0x192) [0x78a932]
| |      4: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x595f69]
| |      5: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x7917b0]
| |      6: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x7c6b38]
| |      7: (Objecter::C_Stat::finish(int)+0xc0) [0x7c7930]
| |      8: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe48) [0x7b2c78]
| |      9: (MDS::handle_core_message(Message*)+0xae8) [0x589858]
| |      10: (MDS::_dispatch(Message*)+0x2f) [0x589a1f]
| |      11: (MDS::ms_dispatch(Message*)+0x1d3) [0x58b4a3]
| |      12: (DispatchQueue::entry()+0x3f1) [0x943861]
| |      13: (DispatchQueue::DispatchThread::entry()+0xd) [0x86e32d]
| |
| |     Fixes: #5333
| |     Signed-off-by: Sage Weil <sage@inktank.com>
| |     Reviewed-by: Greg Farnum <greg@inktank.com>
* | Merge remote-tracking branch 'gh/next'Sage Weil2013-06-205-49/+63
|\ \
| * | qa/workunits/misc/multiple_rsync.sh: wtfSage Weil2013-06-201-2/+2
| | |     2013-06-15T12:55:29.808 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.1
| | |     2013-06-15T12:55:29.808 INFO:teuthology.task.workunit.client.0.err:+ tee a
| | |     2013-06-15T12:55:29.820 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
| | |     2013-06-15T12:56:46.019 INFO:teuthology.task.workunit.client.0.out:
| | |     2013-06-15T12:56:46.020 INFO:teuthology.task.workunit.client.0.out:sent 1452634 bytes received 7485 bytes 19086.52 bytes/sec
| | |     2013-06-15T12:56:46.020 INFO:teuthology.task.workunit.client.0.out:total size is 3205063225 speedup is 2195.07
| | |     2013-06-15T12:56:46.020 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
| | |     2013-06-15T12:56:46.021 INFO:teuthology.task.workunit.client.0.out:4 a
| | |     2013-06-15T12:56:46.022 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
| | |     2013-06-15T12:56:46.022 INFO:teuthology.task.workunit.client.0.err:+ grep 4
| | |     2013-06-15T12:56:46.023 INFO:teuthology.task.workunit.client.0.out:4 a
| | |     2013-06-15T12:56:46.024 INFO:teuthology.task.workunit.client.0.err:+ rsync -auv --exclude local/ /usr/ usr.2
| | |     2013-06-15T12:56:46.024 INFO:teuthology.task.workunit.client.0.err:+ tee a
| | |     2013-06-15T12:56:46.112 INFO:teuthology.task.workunit.client.0.out:sending incremental file list
| | |     2013-06-15T12:57:17.172 INFO:teuthology.task.workunit.client.0.out:
| | |     2013-06-15T12:57:17.174 INFO:teuthology.task.workunit.client.0.out:sent 1452634 bytes received 7485 bytes 46352.98 bytes/sec
| | |     2013-06-15T12:57:17.174 INFO:teuthology.task.workunit.client.0.out:total size is 3205063225 speedup is 2195.07
| | |     2013-06-15T12:57:17.175 INFO:teuthology.task.workunit.client.0.err:+ wc -l a
| | |     2013-06-15T12:57:17.175 INFO:teuthology.task.workunit.client.0.out:3 a
| | |
| | |     Signed-off-by: Sage Weil <sage@inktank.com>
| | |     (cherry picked from commit 21e85f90be3e4915376106dd384f6982086e2311)
| * | qa/workunits/cephtool/test.sh: fix and cleanup several testsSage Weil2013-06-201-22/+16
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | mon: drop deprecated 'stop_cluster'Sage Weil2013-06-201-1/+0
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | mds: make 'mds compat rm_*compat' idempotentSage Weil2013-06-201-2/+2
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | mon: make 'log ...' command wait for commit before replySage Weil2013-06-202-18/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we would just dump the command argument to our local log client and reply immediately, which could lose the message if we then restarted. Instead, commit directly and wait before replying. Also, log as the actual client, not as the monitor processing the message. Fixes: #5409 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
| * | a/workunits/cephtool/test.sh: --no-log-to-stderr when examining stderrSage Weil2013-06-201-3/+3
| | |     We can get random messages on stderr from socket reconnects and such; discard those if we are looking at stderr in the test.
| | |
| | |     Signed-off-by: Sage Weil <sage@inktank.com>
| * | mon: more fix dout use in sync_requester_abort()Sage Weil2013-06-201-1/+1
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | mon: fix raw use of *_dout in sync_requester_abort()Sage Weil2013-06-201-3/+2
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
* | | osdc: re-calculate truncate_size for strip objectsYan, Zheng2013-06-2013-54/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | Feed truncate_size through the striping algorithm so that it reflects the correct per-object offset (as opposed to the file offset). Fixes #5380 Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>
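To see why a file-level truncate_size cannot be used directly as a per-object value, here is a minimal Python sketch of the usual stripe_unit / stripe_count / object_size layout arithmetic. It is an illustration written for this log, not the code from the commit; the function names and the tiny layout values are assumptions.

```python
def file_to_object(offset, stripe_unit, stripe_count, object_size):
    """Map a file offset to (object number, offset inside that object)."""
    stripes_per_object = object_size // stripe_unit
    blockno = offset // stripe_unit
    stripeno = blockno // stripe_count
    stripepos = blockno % stripe_count
    objectsetno = stripeno // stripes_per_object
    objectno = objectsetno * stripe_count + stripepos
    obj_offset = (stripeno % stripes_per_object) * stripe_unit + offset % stripe_unit
    return objectno, obj_offset


def object_truncate_size(objectno, trunc_size, stripe_unit, stripe_count, object_size):
    """Per-object size that corresponds to truncating the file at trunc_size."""
    stripes_per_object = object_size // stripe_unit
    trunc_blockno = trunc_size // stripe_unit
    trunc_stripeno = trunc_blockno // stripe_count
    trunc_objectsetno = trunc_stripeno // stripes_per_object
    objectsetno = objectno // stripe_count
    if objectsetno > trunc_objectsetno:
        return 0            # object lies entirely past the truncation point
    if objectsetno < trunc_objectsetno:
        return object_size  # object lies entirely before it
    trunc_objectno = trunc_objectsetno * stripe_count + trunc_blockno % stripe_count
    rows = trunc_stripeno % stripes_per_object  # complete stripe rows kept in this set
    if objectno < trunc_objectno:
        return (rows + 1) * stripe_unit
    if objectno > trunc_objectno:
        return rows * stripe_unit
    return rows * stripe_unit + trunc_size % stripe_unit


# Toy layout: 4-byte stripe unit, 2 objects per stripe, 8-byte objects.
# Truncating the file at byte 10 leaves 6 bytes in object 0, 4 in object 1
# and nothing in object 2.
sizes = [object_truncate_size(n, 10, 4, 2, 8) for n in range(3)]
```

With these toy values the file offset 10 would be the wrong bound for every object, which is the kind of mismatch the commit message describes.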
* | | doc/release-notes: v0.61.4Sage Weil2013-06-202-0/+863
| | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
* | | Merge pull request #367 from ceph/wip-ceph-cliSage Weil2013-06-197-7/+19
|\ \ \ | | | | | | | | Reviewed-by: Dan Mick <dan.mick@inktank.com>
| * | | init-radosgw: use radosgw --show-config-value to get config valuesSage Weil2013-06-192-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This way we get the correct default values, as reflected by radosgw itself. Signed-off-by: Sage Weil <sage@inktank.com>
| * | | ceph: fix ceph-conf call to get admin socket path for 'daemon <name> ...'Sage Weil2013-06-191-0/+1
| | | | | | | | | | | | | | | | Signed-off-by: Sage Weil <sage@inktank.com>
| * | | ceph-conf: make --show-config-value reflect daemon defaultsSage Weil2013-06-194-3/+14
| | | |     We want DAEMON defaults, but we don't want global_init to do anything else daemonish like print a banner or mkdir /var/run/ceph.
| | | |
| | | |     This lets us use
| | | |
| | | |       ceph-conf -n osd.0 --show-config-value log_file
| | | |
| | | |     to get the default, while
| | | |
| | | |       ceph-conf -n osd.0 log_file
| | | |
| | | |     only reflects what is in the config file.
| | | |
| | | |     Signed-off-by: Sage Weil <sage@inktank.com>
* | | | Merge remote-tracking branch 'upstream/next'Samuel Just2013-06-191-7/+6
|\ \ \ \
| | |/ /
| |/| |
| * | | FileStore: handle observers in constructor/destructorSamuel Just2013-06-191-7/+6
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
| * | | FileStore: apply changes after disabling m_filestore_replica_fadviseSamuel Just2013-06-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> (cherry picked from commit ed8b0e65bde14d0a3a08bc233dee6a997e379dcc)
* | | | FileStore: apply changes after disabling m_filestore_replica_fadviseSamuel Just2013-06-191-0/+1
| |/ /
|/| |
| | |     Signed-off-by: Samuel Just <sam.just@inktank.com>
| | |     Reviewed-by: Dan Mick <dan.mick@inktank.com>
* | | ceph-disk: use unix lock instead of lockfile classSage Weil2013-06-191-3/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lockfile class relies on file system trickery to get safe mutual exclusion. However, the unix syscalls do this for us. More importantly, the unix locks go away when the owning process dies, which is behavior that we want here. Fixes: #5387 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com>
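A minimal Python sketch of the kind of locking the message above describes, assuming "unix lock" means a POSIX advisory lock taken with `fcntl.lockf`. The `FileLock` class and the temporary lock path are hypothetical and are not taken from ceph-disk.

```python
import fcntl
import os
import tempfile


class FileLock:
    """Context manager holding an exclusive advisory lock on a file."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def __enter__(self):
        # The descriptor must be writable for an exclusive lockf() lock.
        self.fd = os.open(self.path, os.O_WRONLY | os.O_CREAT, 0o600)
        fcntl.lockf(self.fd, fcntl.LOCK_EX)  # blocks until the lock is free
        return self

    def __exit__(self, exc_type, exc, tb):
        fcntl.lockf(self.fd, fcntl.LOCK_UN)
        os.close(self.fd)
        self.fd = None
        return False


lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")
with FileLock(lock_path) as lock:
    held = lock.fd is not None  # critical section runs here
```

Because the kernel drops the lock when the owning process exits or its descriptor is closed, a crashed holder cannot leave a stale lock behind, which is the behavior the commit message asks for.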
* | | Merge remote-tracking branch 'gh/next'Sage Weil2013-06-192-9/+18
|\ \ \
| |/ /
| * | ceph-disk: make list_partition behave with unusual device namesAlexandre Maragone2013-06-191-9/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When you get device names like sdaa you do not want to mistakenly conclude that sdaa is a partition of sda. Use /sys/block/$device/$partition existence instead. Fixes: #5211 Backport: cuttlefish Signed-off-by: Alexandre Maragone <alexandre.maragone@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
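The check described above can be sketched in Python as follows. This is an illustration, not the ceph-disk code: `is_partition_of` is a hypothetical helper, and the example builds a throwaway directory tree that only imitates the `/sys/block` layout so that it can run anywhere.

```python
import os
import tempfile


def is_partition_of(device, candidate, sys_block="/sys/block"):
    """True if <sys_block>/<device>/<candidate> exists as a directory."""
    return os.path.isdir(os.path.join(sys_block, device, candidate))


# Fake tree standing in for /sys/block: disk sda with partition sda1,
# plus an unrelated whole disk called sdaa.
fake_sys_block = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_sys_block, "sda", "sda1"))
os.makedirs(os.path.join(fake_sys_block, "sdaa"))

name_prefix_says = "sdaa".startswith("sda")                  # misleading prefix test
sysfs_says = is_partition_of("sda", "sdaa", fake_sys_block)  # existence test
```

The prefix test wrongly suggests that sdaa belongs to sda, while the existence test does not.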
| * | os/FileStore: disable fadvise on XFSSage Weil2013-06-191-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fadvise(DONTNEED) on XFS can break writeback ordering and zeroing; see http://oss.sgi.com/archives/xfs/2013-06/msg00066.html If we detect XFS, turn this option off. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Samuel Just <sam.just@inktank.com>
* | | Merge pull request #364 from dachary/wip-5213athanatos2013-06-193-10/+487
|\ \ \ | | | | | | | | | | | | | | | | unit tests for PGLog::proc_replica_log Reviewed-by: Samuel Just <sam.just@inktank.com>
| * | | unit tests for PGLog::proc_replica_logLoic Dachary2013-06-191-6/+483
| | | |     The tests cover 100% of the LOC of proc_replica_log. They are broken down into 7 cases to enumerate all the situations it must address. Each case is isolated in an independent code block where the conditions are reproduced. All tests are done on omissing and oinfo because they are the only data structures that can be modified by proc_replica_log.
| | | |
| | | |     The first case is a noop and checks that only last_complete gets updated when there are no logs.
| | | |
| | | |     The following case includes entries that are supposed to be ignored (x7, x8 and xa); however, this is not an actual proof that the code ignoring them is actually run: it only shows in the code coverage report. The log entry (1,3) modifies the object x9 but the olog entry (2,3) deletes it: log is authoritative and the object is added to missing. x7 is divergent and ignored. x8 has a more recent version in the log and the olog entry is ignored. xa is past last_backfill and ignored.
| | | |
| | | |     The other cases are variations of the first case with minimal changes to make them easier to understand and adapt. For instance, most of them start with a tail that is the same (object with hash x5 and both at version 1,1).
| | | |
| | | |     http://tracker.ceph.com/issues/5213 refs #5213
| | | |
| | | |     Signed-off-by: Loic Dachary <loic@dachary.org>
| * | | add constness to PGLog::proc_replica_logLoic Dachary2013-06-192-4/+4
| | | |     The function is made const by replacing a single call to log.objects[] with log.objects.find. The olog argument is also const and does not require any change.
| | | |
| | | |     http://tracker.ceph.com/issues/5213 refs #5213
| | | |
| | | |     Signed-off-by: Loic Dachary <loic@dachary.org>
* | | | Merge pull request #366 from dachary/wip-5398athanatos2013-06-191-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | PGLog::rewind_divergent_log must not call mark_dirty_from on end() Reviewed-by: Samuel Just <sam.just@inktank.com>
| * | | | PGLog::rewind_divergent_log must not call mark_dirty_from on end()Loic Dachary2013-06-191-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PGLog::rewind_divergent_log is dereferencing iterator "p" though it is already past the end of its container. When entering the loop for the first time, p is log.log.end() and must not be dereferenced. mark_dirty_from must only be called after p--. It will not rewind past begin() because of the if (p == log.log.begin()) test above. http://tracker.ceph.com/issues/5398 fixes #5398 Signed-off-by: Loic Dachary <loic@dachary.org>
* | | | FileStore: get_index prior to taking fdcache_lock in lfn_unlinkSamuel Just2013-06-191-1/+1
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | We take the fdcache_lock while holding onto index objects elsewhere in the code. Fixes: #5389 Reviewed-by: David Zafman <david.zafman@inktank.com> Signed-off-by: Samuel Just <sam.just@inktank.com>
* | | Merge pull request #342 from ceph/wip-monSage Weil2013-06-1918-214/+185
|\ \ \ | | | | | | | | Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com>
| * | | mon/PaxosService: not active during paxos UPDATING_PREVIOUSSage Weil2013-06-191-1/+1
| | | |     Treat this as an extension of the recovery process, e.g.
| | | |
| | | |      RECOVERING -> ACTIVE
| | | |     or
| | | |      RECOVERING -> UPDATING_PREVIOUS -> ACTIVE
| | | |
| | | |     and we are not active until we get to "the end" in both cases.
| | | |
| | | |     Signed-off-by: Sage Weil <sage@inktank.com>
| * | | mon: simplify statesSage Weil2013-06-193-40/+43
| | | |     - make states mutually exclusive (an enum)
| | | |     - rename locked -> updating_previous
| | | |     - set state prior to begin() to simplify things a bit
| | | |
| | | |     Signed-off-by: Sage Weil <sage@inktank.com>
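The "mutually exclusive states" idea from these mon commits can be pictured with a small Python sketch. The state names come from the commit messages; the `State` enum, the `is_active` helper and the string values are hypothetical and do not reflect the monitor's actual C++ definitions.

```python
from enum import Enum


class State(Enum):
    """Exactly one state holds at a time, unlike independent flag bits."""
    RECOVERING = "recovering"
    UPDATING_PREVIOUS = "updating_previous"  # formerly called "locked"
    ACTIVE = "active"


def is_active(state):
    """Only the final state counts as active."""
    return state is State.ACTIVE


# The longer recovery path named in the commit messages.
recovery_path = [State.RECOVERING, State.UPDATING_PREVIOUS, State.ACTIVE]
active_flags = [is_active(s) for s in recovery_path]
```

With an enum there is no way to be both UPDATING_PREVIOUS and ACTIVE at once, so a check such as `is_active` needs no extra exclusions.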
| * | | mon/Paxos: not readable when LOCKEDSage Weil2013-06-191-0/+1
| | | |     If we are re-proposing a previously accepted value from a previous quorum, we should not consider it readable, because it is possible it was exposed to clients as committed (2/3 accepted) but not recorded to be committed, and we do not want to expose old state as readable when new state was previously readable.
| | | |
| | | |     Signed-off-by: Sage Weil <sage@inktank.com>
| * | | mon/Paxos: cleanup: drop unused PREPARING state bitSage Weil2013-06-192-13/+2
| | | | | | | | | | | | | | | | | | | | | | | | This is never set when we block, and nobody looks at it. Signed-off-by: Sage Weil <sage@inktank.com>