delta/ceph.git - github.com: ceph/ceph.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	ceph.spec.in: package rbd udev rule [wip-3930]	Gary Lowell	2013-01-28	1	-0/+4
\| \| \| \| \| \|	Package udev/50-rbd.rules per bug 3930. Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
*	Merge remote-tracking branch 'yan/wip-mds'	Sage Weil	2013-01-28	20	-546/+1072
\|\ \| \| \| \| \| \|	Reviewed-by: Sage Weil <sage@inktank.com>
\| *	mds: clear inode dirty when slave rename finishes.	Yan, Zheng	2013-01-29	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The inode is linked to a non-auth directory, so remove it from LogSegment's dirty inode list. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: mark export bounds for cross authority directory rename	Yan, Zheng	2013-01-29	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	this guarantees that the importing MDS gets directory fragment's up-to-date fragstat/rstat. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: allow handling slave request in the clientreplay stage	Yan, Zheng	2013-01-29	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	replaying a client request may need to create slave request and the slave MDS can be also in the clientreplay stage. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fix 'discover' handling in the rejoin stage	Yan, Zheng	2013-01-29	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the MDS is in the resolve stage, the current MDCache::handle_discover() only handles 'discover' from MDSs from which it has already received a rejoin acknowledgement. This can cause a circular wait because MDCache::rejoin_gather_finish() fetches reconnected inodes before sending rejoin acknowledgements, and fetching a reconnected inode may trigger 'discover'. The fix is to not delay handling 'discover' from MDSs that are also in the rejoin stage. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: add projected rename's subtree bounds to ESubtreeMap	Yan, Zheng	2013-01-29	1	-3/+15
\| \| \| \| \| \| \| \|	Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fetch missing inodes from disk	Yan, Zheng	2013-01-29	2	-1/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem with fetching missing inodes from replicas is that replicated inodes do not have up-to-date rstat and fragstat. So just fetch missing inodes from disk. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: rejoin remote wrlocks and frozen auth pin	Yan, Zheng	2013-01-29	6	-10/+93
\| \| \| \| \| \| \| \| \| \| \| \|	Includes remote wrlocks and frozen authpin in cache rejoin strong message Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: move variables special to rename into MDRequest::more	Yan, Zheng	2013-01-29	4	-87/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	My previous patches added two pointers (ambiguous_auth_inode and auth_pin_freeze) to class Mutation. Both are used by cross-authority rename, and both point to the renamed inode. Later patches need to add more rename-specific state to MDRequest, so just move them into MDRequest::more. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: properly clear CDir::STATE_COMPLETE when replaying EImportStart	Yan, Zheng	2013-01-29	5	-9/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When replaying EImportStart, we should set/clear the directory's COMPLETE flag according to the flag in the journal entry. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: don't journal opened non-auth inode	Yan, Zheng	2013-01-29	4	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we journal opened non-auth inode, during journal replay, the corresponding entry will add non-auth objects to the cache. But the MDS does not journal all subsequent modifications (rmdir,rename) to these non-auth objects, so the code that manages cache and subtree may get confused. Besides non-auth objects will be trimmed at the resolve stage. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: journal inode's projected parent when doing link rollback	Yan, Zheng	2013-01-29	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Otherwise the journal entry will revert the effect of any on-going rename operation for the inode. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fix for MDCache::disambiguate_imports	Yan, Zheng	2013-01-29	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the resolve stage, if no MDS claims another MDS's ambiguous subtree import, the subtree's dir_auth is undefined. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fix for MDCache::adjust_bounded_subtree_auth	Yan, Zheng	2013-01-29	1	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	After swallowing extra subtrees, subtree bounds may change, so it should re-check. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: don't replace existing slave request	Yan, Zheng	2013-01-29	2	-12/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MDS may receive a client request but find there is an existing slave request. This means another MDS is handling the same request, so we should not replace the slave request with a new client request; just forward the request. The client request may include embedded cap releases, and we need to process them even if the request is forwarded. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: always use {push,pop}_projected_linkage to change linkage	Yan, Zheng	2013-01-29	2	-30/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current code skips using {push,pop}_projected_linkage to modify replica dentry's linkage. This confuses EMetaBlob::add_dir_context() and makes it record out-of-date path when TO_ROOT mode is used. This patch changes the code to always use {push,pop}_projected_linkage to modify dentry's linkage. It makes sure MDCache::create_subtree_map() record correct and up-to-date subtree map. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: send resolve messages after all MDS reach resolve stage	Yan, Zheng	2013-01-29	4	-8/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current code sends resolve messages when resolving MDS set changes. There is no need to send resolve messages when some MDS leave the resolve stage. Sending message while some MDS are replaying is also not very useful. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: split reslove into two sub-stages	Yan, Zheng	2013-01-29	3	-62/+120
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The resolve stage serves to disambiguate the fate of uncommitted slave updates and to resolve subtree authority. The MDS sends a resolve message that claims subtree authority immediately when the resolve stage is entered. When receiving a resolve message, the MDS also processes it immediately. This may cause problems if there are uncommitted slave renames and some of them need rollback later, because slave rename rollback may modify the subtree map. The fix is to split resolve into two sub-stages. The first sub-stage serves to disambiguate slave updates and do slave commit or rollback. After the first sub-stage finishes, the MDS sends resolve messages that claim subtree authority to other MDSs and processes received resolve messages. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fix slave rename rollback	Yan, Zheng	2013-01-29	2	-119/+212
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main issue of old slave rename rollback code is that it assumes all affected objects are in the cache. The assumption is not true when MDS does rollback in the resolve stage. This patch removes the assumption and makes Server::do_rename_rollback() check individual object and roll back change. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: preserve non-auth/unlinked objects until slave commit	Yan, Zheng	2013-01-29	6	-61/+124
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MDS should not trim objects in a non-auth subtree immediately after replaying a slave rename, because the slave rename may require rollback later and these objects are needed for rollback. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: don't journal non-auth rename source directory	Yan, Zheng	2013-01-29	1	-16/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	After replaying a slave rename, non-auth directory that we rename out of will be trimmed. So there is no need to journal it. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: force journal straydn for rename if necessary	Yan, Zheng	2013-01-29	2	-27/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A rename may overwrite an empty directory inode and move it into the stray directory. An MDS that has an auth subtree beneath the overwritten directory needs to journal the stray dentry when handling the rename slave request. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: splits rename force journal check into separate function	Yan, Zheng	2013-01-29	2	-29/+46
\| \| \| \| \| \| \| \| \| \| \| \|	the function will be used by later patch that fixes rename rollback Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fix "had dentry linked to wrong inode" warning	Yan, Zheng	2013-01-29	2	-11/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The reason for the "had dentry linked to wrong inode" warning is that Server::_rename_prepare() adds the destdir to the EMetaBlob before adding the straydir. So during MDS recovery, the destdir is replayed first, and the old inode is directly replaced by the source inode. We can avoid the warning by adding the straydir first. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: don't set xlocks on dentries done when early reply rename	Yan, Zheng	2013-01-29	3	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	_rename_finish() does not send dentry link/unlink message to replicas. We should prevent dentries that are modified by the rename operation from getting new replicas while the rename operation is committing. So don't set xlocks on dentries "done". Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: properly set error_dentry for discover reply	Yan, Zheng	2013-01-28	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If MDCache::handle_discover() receives a 'discover path' request but cannot find the base inode, it should properly set the 'error_dentry' to make sure MDCache::handle_discover_reply() checks the correct object's wait queue. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: introduce XSYN to SYNC lock state transition	Yan, Zheng	2013-01-28	4	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the lock is in the XSYN state, Locker::simple_sync() first tries to change the lock state to EXCL. If that fails, it just returns. So Locker::simple_sync() does not guarantee that the lock state eventually changes to SYNC. This issue can cause a replica that requests a read lock to hang. The fix is to introduce an intermediate state for the XSYN to SYNC transition. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
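The idea of an intermediate state can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not the MDS lock code: `ToyLock`, `LockState::XSYN_SYNC`, `request_sync`, `blocker_released` and the `blockers` counter are all hypothetical names, and the real lock state machine has many more states and rules. It only shows why parking a lock in a transitional state guarantees the transition finishes once whatever blocked it goes away.

```cpp
#include <cassert>

// Hypothetical, simplified lock states; not the actual Ceph definitions.
enum class LockState { XSYN, XSYN_SYNC, SYNC };

struct ToyLock {
    LockState state = LockState::XSYN;
    int blockers = 0;  // stand-in for whatever prevents an immediate transition
};

// Ask for SYNC. If the transition cannot finish now, park the lock in the
// intermediate state instead of returning with the state unchanged.
void request_sync(ToyLock& l) {
    if (l.state == LockState::XSYN)
        l.state = (l.blockers == 0) ? LockState::SYNC : LockState::XSYN_SYNC;
}

// Called when a blocker goes away; completes a pending transition.
void blocker_released(ToyLock& l) {
    assert(l.blockers > 0);
    --l.blockers;
    if (l.state == LockState::XSYN_SYNC && l.blockers == 0)
        l.state = LockState::SYNC;
}
```

Because the pending request is recorded in the state itself, a caller waiting for SYNC does not depend on anyone retrying the request later.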
\| *	mds: allow journaling multiple root inodes in EMetaBlob	Yan, Zheng	2013-01-28	2	-27/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases (rename, rmdir, subtree map), we may need journal multiple root inodes (/, mdsdir) in one EMetaBlob. This patch modifies EMetaBlob format to support journaling multiple root inodes. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: lock remote inode's primary dentry during rename	Yan, Zheng	2013-01-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 1203cd2110 (mds: allow open_remote_ino() to open xlocked dentry) makes Server::handle_client_rename() xlocks remote inodes' primary dentry so witness MDS can open xlocked dentry. But I added remote inodes' projected primary dentries to the xlock list. This is wrong because projected dentries are invisible for path traverse. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: check deleted directory in Server::rdlock_path_xlock_dentry	Yan, Zheng	2013-01-28	3	-21/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit b03eab22e4 (mds: forbid creating file in deleted directory) is not complete: mknod, mkdir and symlink are missed. Moving the check into Server::rdlock_path_xlock_dentry() fixes the issue. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
\| *	mds: fix end check in Server::handle_client_readdir()	Yan, Zheng	2013-01-28	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 1174dd3188 (don't retry readdir request after issuing caps) introduced a bug that wrongly marks 'end' in the readdir reply. The code that touches existing dentries re-uses an iterator, and that iterator is also used for checking whether readdir has reached the end. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
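The bug pattern described in this entry, one iterator serving both a side pass and the end-of-listing check, can be sketched outside of Ceph. This is an illustrative sketch only: `Dir`, `ReaddirReply` and `readdir_chunk` are hypothetical names, and a `std::map` stands in for a directory. It shows the safe shape, where the side pass gets its own iterator so the end check still reflects the listing position.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a directory: name -> inode number. Not a Ceph type.
using Dir = std::map<std::string, int>;

struct ReaddirReply {
    std::vector<std::string> names;
    bool end = false;  // true only if the listing reached the end of the directory
};

// Return up to `max` entries starting at position `offset`.
ReaddirReply readdir_chunk(const Dir& dir, std::size_t offset, std::size_t max) {
    assert(max > 0);
    ReaddirReply reply;
    auto it = dir.begin();
    for (std::size_t i = 0; i < offset && it != dir.end(); ++i) ++it;
    while (it != dir.end() && reply.names.size() < max) {
        reply.names.push_back(it->first);
        ++it;
    }
    // Separate iterator for the "touch" pass. Reusing `it` for this loop would
    // leave it at dir.end() and make every reply claim end == true.
    for (auto touch = dir.begin(); touch != dir.end(); ++touch) {
        (void)touch->second;  // placeholder for per-entry bookkeeping
    }
    reply.end = (it == dir.end());
    return reply;
}
```

With a three-entry directory and `max` of 2, the first chunk reports `end == false` and the second reports `end == true`, which is the behavior the shared iterator would have broken.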
* \|	doc: fix overly-big fixed-width text in Firefox	Ross Turk	2013-01-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Changed font size for <pre> elements to be 15pt instead of 1.5em - Firefox seems to render 1.1em a bit bigger than other browsers. Signed-off-by: Ross Turk <ross@inktank.com>
* \|	rbd-fuse: fix warning	Sage Weil	2013-01-28	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
* \|	doc: Removed indep, and clarified explanation.	John Wilkins	2013-01-28	1	-6/+16
\| \| \| \| \| \| \| \|	Signed-off-by: John Wilkins <john.wilkins@inktank.com>
* \|	Merge remote-tracking branch 'gh/next'	Sage Weil	2013-01-28	1	-5/+3
\|\ \
\| * \|	rgw: fix crash when missing content-type in POST object	Yehuda Sadeh	2013-01-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: #3941 This fixes a crash when handling S3 POST request and content type is not provided. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
\| * \|	mds: fix end check in Server::handle_client_readdir()	Yan, Zheng	2013-01-23	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 1174dd3188 (don't retry readdir request after issuing caps) introduced a bug that wrongly marks 'end' in the readdir reply. The code that touches existing dentries re-uses an iterator, and that iterator is also used for checking whether readdir has reached the end. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com>
* \| \|	Merge branch 'master' of https://github.com/ceph/ceph	John Wilkins	2013-01-28	23	-92/+278
\|\ \ \
\| * \ \	Merge branch 'wip-pool-delete'	Josh Durgin	2013-01-28	14	-60/+52
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
\| \| * \| \|	doc: update ceph man page link	Josh Durgin	2013-01-28	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not the wiki anymore, and the man page needed to be regenerated. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
\| \| * \| \|	ceph, rados: update pool delete docs and usage	Josh Durgin	2013-01-28	7	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
\| \| * \| \|	qa: fix mon pool_ops workunit	Josh Durgin	2013-01-28	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use ! for clarity when commands are supposed to fail. Check a few other cases that should fail, and correct deleting non-existent pools. Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
\| \| * \| \|	qa/workunits/rbd/copy.sh: use non-deprecated --image-format option	Sage Weil	2013-01-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	--format is deprecated. Signed-off-by: Sage Weil <sage@inktank.com>
\| \| * \| \|	mon: safety interlock for pool deletion	Sage Weil	2013-01-26	4	-20/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Require that the pool name be passed twice along with a force option before we irreversibly delete an entire pool of objects. Signed-off-by: Sage Weil <sage@inktank.com>
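The interlock rule stated here, the pool name given twice plus a force option, amounts to a small predicate. The sketch below is an assumption-laden illustration, not the monitor's code: `pool_delete_confirmed` and its parameters are hypothetical, and `force_flag` stands for whether the force option (the revert entry in this log names `--yes-i-really-really-mean-it`) was supplied.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: deletion is allowed only when the pool name is given
// twice, identically, and the force flag is set. Rejecting an empty name is
// an extra assumption of this sketch.
bool pool_delete_confirmed(const std::string& pool,
                           const std::string& pool_again,
                           bool force_flag) {
    if (pool.empty()) return false;
    return pool == pool_again && force_flag;
}
```

Requiring the name twice guards against a mistyped pool name, while the flag guards against an accidental invocation; either check alone would let one of those mistakes through.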
\| \| * \| \|	Revert "mon: implement safety interlock for deleting pools"	Sage Weil	2013-01-26	2	-43/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit c993ac9b1fa4037f4cc2674455728ee38a7c978b. This is too hard to test. Requiring the pool name twice along with --yes-i-really-really-mean-it should be sufficient. Signed-off-by: Sage Weil <sage@inktank.com>
\| * \| \| \|	Merge branch 'wip-osd-down-out'	Sage Weil	2013-01-28	6	-4/+126
\| \|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Samuel Just <sam.just@inktank.com>
\| \| * \| \| \|	mon: set limit so that we do not an entire down subtree out	Sage Weil	2013-01-28	2	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new configurable 'mon osd down out subtree limit' so that you can prevent marking out an entire subtree. If for example an entire rack is down, do not mark anything in it out. If less than the whole rack is down, everything is fair game. Set the default to 'rack'. Signed-off-by: Sage Weil <sage@inktank.com>
\| \| * \| \| \|	osdmap: implement subtree_is_down() and containing_subtree_is_down()	Sage Weil	2013-01-28	2	-0/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement two methods to see if an entire subtree is down, and if the containing parent node of type T of a given node is completely down. Signed-off-by: Sage Weil <sage@inktank.com>
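The two checks named in this entry can be sketched on a toy hierarchy. This is a minimal sketch under stated assumptions, not the OSDMap implementation: the `Tree` struct and its fields are hypothetical, the function signatures are invented for illustration, and treating an empty bucket as down is a choice made here. Following the CRUSH convention, negative ids are buckets and non-negative ids are OSDs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy hierarchy, not the real CRUSH/OSDMap types.
struct Tree {
    std::map<int, std::vector<int>> children;  // bucket id -> child ids
    std::map<int, int> parent;                 // child id -> parent bucket id
    std::map<int, std::string> type;           // bucket id -> type name
    std::set<int> down_osds;                   // OSD ids currently down
};

// True if every OSD beneath `id` is down. An empty bucket counts as down,
// which is an assumption of this sketch.
bool subtree_is_down(const Tree& t, int id) {
    if (id >= 0) return t.down_osds.count(id) > 0;
    auto it = t.children.find(id);
    if (it == t.children.end()) return true;
    for (int child : it->second)
        if (!subtree_is_down(t, child)) return false;
    return true;
}

// Walk up from `id` to the nearest ancestor of the given type and report
// whether that ancestor's whole subtree is down. Returns false if there is
// no ancestor of that type.
bool containing_subtree_is_down(const Tree& t, int id, const std::string& type) {
    int cur = id;
    while (true) {
        auto p = t.parent.find(cur);
        if (p == t.parent.end()) return false;
        cur = p->second;
        auto ty = t.type.find(cur);
        if (ty != t.type.end() && ty->second == type)
            return subtree_is_down(t, cur);
    }
}
```

A monitor could consult a check like `containing_subtree_is_down(t, osd, "rack")` before marking an OSD out: if the whole rack is down, nothing in it is marked out, which matches the 'mon osd down out subtree limit' behavior described in the neighboring entry.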
\| \| * \| \| \|	crush: implement get_children(), get_immediate_parent_id()	Sage Weil	2013-01-28	2	-4/+41
\| \| \| \|_\|/ \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>