| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Package udev/50-rbd.rules per bug 3930.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
|
|\
| |
| |
| | |
Reviewed-by: Sage Weil <sage@inktank.com>
|
| |
| |
| |
| |
| |
| |
| | |
The inode is linked to a non-auth directory, so remove it from LogSegment's
dirty inode list.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
This guarantees that the importing MDS gets the directory fragment's
up-to-date fragstat/rstat.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
Replaying a client request may need to create a slave request, and the slave
MDS can also be in the clientreplay stage.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If the MDS is in the resolve stage, current MDCache::handle_discover() only handles
'discover' from MDSs from which it has already received a rejoin acknowledgement. This can
cause a circular wait because MDCache::rejoin_gather_finish() fetches reconnected
inodes before sending rejoin acknowledgements, and fetching a reconnected inode may
trigger 'discover'. The fix is to not delay handling 'discover' from MDSs that are
also in the rejoin stage.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| | |
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The problem with fetching missing inodes from replicas is that replicated inodes
do not have up-to-date rstat and fragstat. So just fetch missing inodes from
disk.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| | |
Include remote wrlocks and frozen authpins in the cache rejoin strong message.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
My previous patches added two pointers (ambiguous_auth_inode and
auth_pin_freeze) to class Mutation. They are both used by cross
authority rename, and both point to the renamed inode. Later patches
need to add more rename-specific state to MDRequest, so just move them
into MDRequest::more.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
When replaying EImportStart, we should set/clear the directory's COMPLETE
flag according to the flag in the journal entry.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If we journal an opened non-auth inode, then during journal replay the corresponding
entry will add non-auth objects to the cache. But the MDS does not journal all
subsequent modifications (rmdir, rename) to these non-auth objects, so the code
that manages the cache and subtrees may get confused. Besides, non-auth objects will
be trimmed at the resolve stage.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
Otherwise the journal entry will revert the effect of any on-going
rename operation for the inode.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
In the resolve stage, if no MDS claims other MDS's disambiguous subtree
import, the subtree's dir_auth is undefined.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
After swallowing extra subtrees, subtree bounds may change, so they
should be re-checked.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The MDS may receive a client request, but find there is an existing
slave request. This means another MDS is handling the same request, so
we should not replace the slave request with a new client request;
just forward the request.
The client request may include embedded cap releases; we need to process
them even if the request is forwarded.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Current code skips using {push,pop}_projected_linkage to modify a replica
dentry's linkage. This confuses EMetaBlob::add_dir_context() and makes
it record an out-of-date path when TO_ROOT mode is used. This patch changes
the code to always use {push,pop}_projected_linkage to modify a dentry's
linkage. It makes sure MDCache::create_subtree_map() records a correct and
up-to-date subtree map.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Current code sends resolve messages when the set of resolving MDSs changes.
There is no need to send resolve messages when some MDSs leave the
resolve stage. Sending messages while some MDSs are replaying is also
not very useful.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The resolve stage serves to disambiguate the fate of uncommitted slave
updates and to resolve subtree authority. The MDS sends a resolve message
that claims subtree authority immediately when the resolve stage is entered.
When receiving a resolve message, the MDS also processes it immediately.
This may cause problems if there are uncommitted slave renames and some of
them need rollback later, because slave rename rollback may modify the
subtree map.
The fix is to split resolve into two sub-stages. The first sub-stage serves
to disambiguate slave updates and do slave commit or rollback. After the
first sub-stage finishes, the MDS sends resolve messages that claim
subtree authority to other MDSs and processes received resolve messages.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The main issue with the old slave rename rollback code is that it assumes
all affected objects are in the cache. The assumption is not true
when the MDS does rollback in the resolve stage. This patch removes the
assumption and makes Server::do_rename_rollback() check each individual
object and roll back its change.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The MDS should not trim objects in a non-auth subtree immediately after
replaying a slave rename, because the slave rename may require rollback
later and these objects are needed for rollback.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| | |
After replaying a slave rename, the non-auth directory that we rename out of will
be trimmed. So there is no need to journal it.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Rename may overwrite an empty directory inode and move it into the stray
directory. An MDS that has an auth subtree beneath the overwritten directory
needs to journal the stray dentry when handling the rename slave request.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| | |
The function will be used by a later patch that fixes rename rollback.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The reason for the "had dentry linked to wrong inode" warning is that
Server::_rename_prepare() adds the destdir to the EMetaBlob before
adding the straydir. So when the MDS recovers, the destdir is replayed
first. The old inode is directly replaced by the source inode.
We can avoid the warning by adding the straydir first.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
_rename_finish() does not send dentry link/unlink messages to replicas.
We should prevent dentries that are modified by the rename operation
from getting new replicas while the rename operation is committing.
So don't set xlocks on dentries "done".
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If MDCache::handle_discover() receives a 'discover path' request but
cannot find the base inode, it should properly set the 'error_dentry'
to make sure MDCache::handle_discover_reply() checks the correct object's
wait queue.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If the lock is in the XSYN state, Locker::simple_sync() first tries changing
the lock state to EXCL. If it fails to change the lock state to EXCL, it just
returns. So Locker::simple_sync() does not guarantee that the lock state
eventually changes to SYNC. This issue can cause a replica that requests
a read lock to hang. The fix is to introduce an intermediate state for the XSYN
to SYNC transition.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
In some cases (rename, rmdir, subtree map), we may need to journal multiple
root inodes (/, mdsdir) in one EMetaBlob. This patch modifies the EMetaBlob
format to support journaling multiple root inodes.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit 1203cd2110 (mds: allow open_remote_ino() to open xlocked dentry)
makes Server::handle_client_rename() xlock remote inodes' primary
dentries so the witness MDS can open the xlocked dentry. But I added remote inodes'
projected primary dentries to the xlock list. This is wrong because
projected dentries are invisible to path traversal.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Commit b03eab22e4 (mds: forbid creating file in deleted directory)
is incomplete: mknod, mkdir and symlink were missed. Moving the check
into Server::rdlock_path_xlock_dentry() fixes the issue.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the readdir reply.
The code that touches existing dentries re-uses an iterator, and the
iterator is used for checking whether readdir has reached the end.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
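The class of bug described in this entry can be illustrated with a small sketch. This is not the MDS code: the function names, the list-of-names directory and the integer cursor are all hypothetical, and the way the cursor gets clobbered here is only one assumed way such a bug can arise. It shows how reusing the cursor that decides 'end' for a second pass can make the reply claim the listing is complete when it is not.

```python
def readdir_reused_cursor(entries, max_entries):
    """Hypothetical buggy variant: one cursor serves two purposes."""
    pos = 0
    batch = []
    # First pass: collect up to max_entries names for the reply.
    while pos < len(entries) and len(batch) < max_entries:
        batch.append(entries[pos])
        pos += 1
    # Second pass (stand-in for "touching" dentries) reuses the same cursor
    # and walks it over the whole listing, clobbering the position.
    pos = 0
    while pos < len(entries):
        pos += 1
    # 'end' is now always True, even if the batch was truncated.
    end = pos == len(entries)
    return batch, end


def readdir_separate_cursor(entries, max_entries):
    """Hypothetical fixed variant: the second pass uses its own loop variable."""
    pos = 0
    batch = []
    while pos < len(entries) and len(batch) < max_entries:
        batch.append(entries[pos])
        pos += 1
    for _name in batch:  # touching dentries no longer disturbs 'pos'
        pass
    end = pos == len(entries)
    return batch, end
```

With four entries and a batch limit of two, the first variant reports the end of the directory while the second correctly reports that more entries remain.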
|
| |
| |
| |
| |
| |
| | |
Changed font size for <pre> elements to be 15pt instead of 1.5em - Firefox seems to render 1.1em a bit bigger than other browsers.
Signed-off-by: Ross Turk <ross@inktank.com>
|
| |
| |
| |
| | |
Signed-off-by: Sage Weil <sage@inktank.com>
|
| |
| |
| |
| | |
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
|
|\ \ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fixes: #3941
This fixes a crash when handling an S3 POST request where the content type
is not provided.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the readdir reply.
The code that touches existing dentries re-uses an iterator, and the
iterator is used for checking whether readdir has reached the end.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
|
|\ \ \ |
|
| |\ \ \
| | | | |
| | | | |
| | | | | |
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
It's not the wiki anymore, and the man page needed to be regenerated.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Use ! for clarity when commands are supposed to fail.
Check a few other cases that should fail, and correct deleting
non-existent pools.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
--format is deprecated.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Require that the pool name be passed twice along with a force option
before we irreversibly delete an entire pool of objects.
Signed-off-by: Sage Weil <sage@inktank.com>
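The confirmation rule in this entry can be sketched as a small predicate. This is an illustration only, not the monitor's implementation: the function name and argument layout are hypothetical, and the flag spelling is borrowed from the `--yes-i-really-really-mean-it` option mentioned elsewhere in this log.

```python
# Assumption: the force option is the flag quoted in another entry of this log.
FORCE_FLAG = "--yes-i-really-really-mean-it"


def may_delete_pool(pool_name, pool_name_again, flags):
    """Return True only if the name was given twice and the force flag is set."""
    if not pool_name:
        return False
    if pool_name != pool_name_again:
        return False
    return FORCE_FLAG in flags
```

Requiring all three pieces makes an accidental deletion unlikely: a mistyped name fails the equality check, and a command copied without the flag is refused.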
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This reverts commit c993ac9b1fa4037f4cc2674455728ee38a7c978b.
This is too hard to test. Requiring the pool name twice along with
--yes-i-really-really-mean-it should be sufficient.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | | |
Reviewed-by: Samuel Just <sam.just@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Add new configurable 'mon osd down out subtree limit' so that you can
prevent marking out an entire subtree. If for example an entire rack is
down, do not mark anything in it out. If less than the whole rack is down,
everything is fair game.
Set the default to 'rack'.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Implement two methods to see if an entire subtree is down, and if the
containing parent node of type T of a given node is completely down.
Signed-off-by: Sage Weil <sage@inktank.com>
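A minimal sketch of the two checks, assuming a toy hierarchy held in plain dictionaries. The names `subtree_is_down` and `containing_subtree_is_down`, the `children`/`node_type` maps and the set of down OSDs are hypothetical and do not reflect the actual OSDMap or CRUSH interfaces.

```python
def subtree_is_down(children, down_osds, node):
    """True if every leaf (OSD) beneath 'node' is in down_osds."""
    kids = children.get(node)
    if not kids:
        # A node without children is treated as a leaf, i.e. an OSD.
        return node in down_osds
    return all(subtree_is_down(children, down_osds, k) for k in kids)


def containing_subtree_is_down(children, node_type, down_osds, node, wanted_type):
    """True if the ancestor of 'node' with type wanted_type is entirely down."""
    # Derive the parent links from the children map.
    parent = {k: p for p, kids in children.items() for k in kids}
    cur = node
    while cur is not None:
        if node_type.get(cur) == wanted_type:
            return subtree_is_down(children, down_osds, cur)
        cur = parent.get(cur)
    return False  # no ancestor of the requested type


children = {
    "root": ["rack1", "rack2"],
    "rack1": ["osd.0", "osd.1"],
    "rack2": ["osd.2"],
}
node_type = {"root": "root", "rack1": "rack", "rack2": "rack"}
```

With `osd.0` and `osd.1` both down, the rack containing `osd.0` counts as completely down, which is the situation in which a 'rack' subtree limit would stop those OSDs from being marked out.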
|
| | | |_|/
| | |/| |
| | | | |
| | | | | |
Signed-off-by: Sage Weil <sage@inktank.com>
|