Commit message log

Fixes: #4363
Backport: argonaut, bobtail
When listing objects in a namespace, don't iterate through all the
objects; only go through the ones that start with the namespace
prefix.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
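
A minimal sketch of the approach (container and names are illustrative, not the actual RGW bucket-index code): with a sorted index, seek straight to the prefix with lower_bound() and stop at the first key that no longer matches, instead of scanning every entry.

    #include <iostream>
    #include <map>
    #include <string>

    // List only the keys that start with `prefix` from a sorted index,
    // rather than iterating over every entry and filtering.
    void list_namespace(const std::map<std::string, std::string>& index,
                        const std::string& prefix) {
      // Seek directly to the first key >= prefix.
      for (auto it = index.lower_bound(prefix); it != index.end(); ++it) {
        // The map is sorted, so the first key that no longer carries
        // the prefix marks the end of the namespace: stop there.
        if (it->first.compare(0, prefix.size(), prefix) != 0)
          break;
        std::cout << it->first << "\n";
      }
    }

    int main() {
      std::map<std::string, std::string> index = {
          {"_ns1_a", ""}, {"_ns1_b", ""}, {"_ns2_a", ""}, {"plain", ""}};
      list_namespace(index, "_ns1_");  // prints _ns1_a and _ns1_b only
    }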
This fixes bad entries in a user's bucket list that may have occurred
due to issue #4039. Syntax:
$ radosgw-admin user check --uid=<uid> [--fix]
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Fixes: #4039
A user's list of buckets was being modified even if the bucket already
existed. This fix removes the newly created directory object and
makes sure that the user info's data points at the correct bucket.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Fixes: #4011
When completing a multipart upload, we also need to unlink the
parts from the bucket index. Originally we used to remove the parts;
however, nowadays the parts live on, as we just point the object
manifest at them. So we don't remove the objects, but we do need
to remove them from the bucket index.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
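
In sketch form (the types below are hypothetical stand-ins, not the RGW code): the part objects stay on disk because the assembled object's manifest references them; only their bucket-index entries are dropped.

    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>

    // Hypothetical stand-ins for the bucket index and the object
    // manifest; the real RGW types are far richer.
    struct BucketIndex {
      std::set<std::string> entries;
      void unlink(const std::string& name) { entries.erase(name); }
    };
    struct Manifest {
      std::vector<std::string> parts;
      void add_part(const std::string& name) { parts.push_back(name); }
    };

    // On multipart-complete: point the manifest at each part and drop
    // the part's bucket-index entry. The part objects themselves are
    // NOT deleted -- the assembled object's manifest references them.
    void complete_multipart(BucketIndex& index, Manifest& manifest,
                            const std::vector<std::string>& parts) {
      for (const auto& part : parts) {
        manifest.add_part(part);  // the data lives on via the manifest
        index.unlink(part);       // but the part is no longer listable
      }
    }

    int main() {
      BucketIndex index{{"obj", "obj.part1", "obj.part2"}};
      Manifest manifest;
      complete_multipart(index, manifest, {"obj.part1", "obj.part2"});
      std::cout << index.entries.size() << "\n";  // 1: only "obj" remains
    }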
Checks the specified bucket for the #4011 symptoms and optionally fixes
the issue. Syntax:
radosgw-admin bucket check --bucket=<bucket> [--fix]
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Add a radosgw-admin option to remove an object from the bucket index.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
This marks a PG for immediate scrub or repair. Adjust the sched_scrub()
code so that we handle these PGs even when should_schedule_scrub is
false (e.g., because the load is high). When we explicitly request a
scrub or repair, we then go through the normal scrub reservation process
to avoid unduly impacting cluster performance.
This is particularly helpful on argonaut, where the final scrub
finalization step blocks writes to the PG, and overlapping scrubs can
exacerbate the problem.
Signed-off-by: Sage Weil <sage@inktank.com>
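
A simplified sketch of that control flow (names are illustrative; the real sched_scrub() is considerably more involved): an explicit request bypasses the load check but still takes the reservation path.

    #include <iostream>

    struct PG {
      bool must_scrub = false;  // set when scrub/repair was explicitly requested
    };

    // Stand-ins for the load heuristic and the reservation process.
    bool load_is_low() { return false; }      // pretend the load is high
    bool reserve_scrub(PG&) { return true; }  // pretend reservation succeeds

    bool sched_scrub(PG& pg) {
      // An explicit request overrides the load-based check...
      if (!pg.must_scrub && !load_is_low())
        return false;
      // ...but still goes through the normal reservation path, so even
      // a requested scrub cannot unduly impact cluster performance.
      return reserve_scrub(pg);
    }

    int main() {
      PG pg;
      std::cout << sched_scrub(pg) << "\n";  // 0: load high, nothing requested
      pg.must_scrub = true;
      std::cout << sched_scrub(pg) << "\n";  // 1: explicit request still runs
    }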
Move the duplicated code that reaches into info.history.last_scrub_stamp
into a helper so we can control when we queue the PG for scrub.
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to detect links which remain in an erroneous snap
collection.
Signed-off-by: Samuel Just <sam.just@inktank.com>
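
The invariant, sketched (hypothetical types): one hard link from the primary collection plus one per snap collection, so any extra link points at a stray snap-collection entry.

    #include <cstddef>
    #include <iostream>
    #include <set>
    #include <string>

    // An object should have exactly one link from its primary
    // collection plus one per snap collection that holds it.
    bool links_consistent(std::size_t nlinks,
                          const std::set<std::string>& snapcolls) {
      return nlinks == 1 + snapcolls.size();
    }

    int main() {
      std::set<std::string> snapcolls = {"snap_1", "snap_4"};
      // 3 links (1 + 2 snaps) is consistent; 4 means a stray link
      // remains in some erroneous snap collection.
      std::cout << links_consistent(3, snapcolls) << " "
                << links_consistent(4, snapcolls) << "\n";  // 1 0
    }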
During _scan_list, check the snapcollections corresponding to the
object_info attr on the object. Report inconsistencies during
scrub_finalize.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
./include/encoding.h: In function 'void encode(int64_t, ceph::bufferlist&, uint64_t)':
./include/encoding.h:101:1: warning: narrowing conversion of 'v' from 'int64_t {aka long int}' to '__le64 {aka long long unsigned int}' inside { } is ill-formed in C++11 [-Wnarrowing]
Signed-off-by: Sage Weil <sage@inktank.com>
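
For illustration, a reduced reproduction of the warning and the kind of explicit cast that fixes it:

    #include <cstdint>

    // Reduced reproduction: brace-initializing an unsigned 64-bit field
    // from a signed value is a narrowing conversion under C++11.
    void encode_le(int64_t v) {
      // struct { uint64_t v; } e{v};  // warns: -Wnarrowing
      struct { uint64_t v; } e{static_cast<uint64_t>(v)};  // explicit: OK
      (void)e;  // silence unused-variable warnings
    }

    int main() { encode_le(-1); }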
Signed-off-by: Samuel Just <sam.just@inktank.com>
We need to introduce some new fields here, so to maintain compatibility
we'll need to first bring the 48.* series up to the current encoding.
Signed-off-by: Samuel Just <sam.just@inktank.com>
If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we can't repair.
See #3705.
Signed-off-by: Sage Weil <sage@inktank.com>
If an unlink is interrupted between removing the file
and updating the subdir attribute, the attribute will
overestimate the number of files in the directory. This
is by design: at worst we will merge the collection later
than intended, but closing the gap would require a second
subdir xattr update. However, this can in extreme cases
result in a collection with subdirectories but no objects.
FileStore::_destroy_collection would therefore see an
erroneous -ENOTEMPTY.
prep_delete allows the CollectionIndex implementation to
clean up state prior to removal.
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit fdc5e5d1877d7d7ed3851b9ec01f884559748249)
Conflicts:
src/os/HashIndex.cc
src/os/HashIndex.h
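
A rough sketch of what such a prep_delete hook does (a std::filesystem stand-in, not the FileStore/HashIndex code): remove now-empty subdirectories bottom-up so the final collection removal cannot hit -ENOTEMPTY.

    #include <filesystem>
    #include <vector>

    namespace fs = std::filesystem;

    // Before removing a collection, clear out subdirectories that hold
    // no objects, so the final rmdir cannot fail with -ENOTEMPTY even
    // when interrupted unlinks left the subdir counts overestimated.
    void prep_delete(const fs::path& collection) {
      std::vector<fs::path> dirs;
      for (const auto& entry : fs::recursive_directory_iterator(collection))
        if (entry.is_directory())
          dirs.push_back(entry.path());
      // Deepest paths last in pre-order, so iterate in reverse to
      // remove children before their parents.
      for (auto it = dirs.rbegin(); it != dirs.rend(); ++it)
        if (fs::is_empty(*it))
          fs::remove(*it);
    }

    int main() {
      fs::create_directories("coll/DIR_A/DIR_3");
      prep_delete("coll");
      fs::remove("coll");  // succeeds: no stray empty subdirs remain
    }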
Fixes: #3802
Backport: argonaut, bobtail
When using the S3 API and x-amz-metadata-directive is
set to COPY, we used to copy the complete metadata of the
source object. However, this shouldn't include the source ACLs.
Conflicts:
src/rgw/rgw_rados.cc
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit ccfefe3097a51b49885f2ed5d9334e85b497d963)
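
In miniature (attribute names are hypothetical): a metadata COPY carries over the source attributes except the ACL entry, which the destination gets from the request instead.

    #include <iostream>
    #include <map>
    #include <string>

    // Illustrative attr name; RGW stores ACLs as an xattr on the object.
    static const std::string ACL_ATTR = "user.rgw.acl";

    // COPY should carry over the source attrs *except* the ACL attr --
    // the destination object gets its own ACL from the request.
    std::map<std::string, std::string>
    copy_attrs(const std::map<std::string, std::string>& src) {
      std::map<std::string, std::string> dst = src;
      dst.erase(ACL_ATTR);  // never inherit the source object's ACLs
      return dst;
    }

    int main() {
      auto dst = copy_attrs({{"user.rgw.content_type", "text/plain"},
                             {ACL_ATTR, "private"}});
      std::cout << dst.size() << "\n";  // 1: content type copied, ACL dropped
    }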
The previous logic was both complicated and incorrect. Consequently,
we have been tending to drop snapcollection links in some cases. This
has resulted in clones incorrectly not being trimmed. This patch
replaces the logic with something less efficient but hopefully a bit
clearer.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 0f42c37359d976d1fe90f2d3b877b9b0268adc0b)
Verify that the PG is still RECOVERING or BACKFILL when we take the pg
lock in the recovery thread. This prevents a crash from an invalid
state machine event when the recovery queue races with a PG state change
(e.g., due to peering).
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
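
The pattern, sketched with standard primitives (the real code uses the PG lock and its recovery state machine): re-check the state after taking the lock, since it may have changed while the work item sat in the queue.

    #include <iostream>
    #include <mutex>

    enum class PgState { Recovering, Backfill, Peering };

    struct PG {
      std::mutex lock;
      PgState state = PgState::Recovering;
    };

    void do_recovery(PG& pg) {
      std::lock_guard<std::mutex> l(pg.lock);
      // The PG may have been queued while recovering, then repeered
      // before this thread ran; bail out instead of feeding the state
      // machine an invalid event.
      if (pg.state != PgState::Recovering && pg.state != PgState::Backfill) {
        std::cout << "state changed, dropping recovery op\n";
        return;
      }
      // ... proceed with recovery under the lock ...
    }

    int main() {
      PG pg;
      pg.state = PgState::Peering;  // raced: peering began after queueing
      do_recovery(pg);
    }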
Fixes: #3722
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which would
result in the completion never being called unless the buffers
were merged before its original read completed. This would cause
a hang in any higher level waiting for a read to complete.
The existing loop went backwards (using a forward iterator),
but stopped when the iterator reached the beginning of the map,
or when a waiter belonged to the left BufferHead.
If the first list of waiters should have been moved to the right
BufferHead, it was skipped because at that point the iterator
was at the beginning of the map, which was the main condition
of the loop.
Restructure the waiters-moving loop to go forward in the map instead,
so it's harder to make an off-by-one error.
Possibly-fixes: #3286
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
(cherry picked from commit 2e862f4d183d8b57b43b0777737886f18f68bf00)
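
A simplified model of the fixed loop (hypothetical structures; ObjectCacher's real bookkeeping is richer): waiters keyed by offset, with everything at or beyond the split point moved to the right BufferHead by a forward walk from lower_bound().

    #include <iostream>
    #include <list>
    #include <map>
    #include <string>

    // Simplified: read waiters keyed by offset within a BufferHead.
    using Waiters = std::map<long, std::list<std::string>>;

    // On splitting a BufferHead at `split_off`, move every waiter at or
    // beyond the split point to the right-hand BufferHead. Iterating
    // forward from lower_bound() avoids the off-by-one the backwards
    // walk hit when the map's first entry also had to move.
    void split_waiters(Waiters& left, Waiters& right, long split_off) {
      auto it = left.lower_bound(split_off);
      while (it != left.end()) {
        right[it->first].swap(it->second);
        it = left.erase(it);
      }
    }

    int main() {
      Waiters left = {{0, {"w0"}}, {4096, {"w1"}}, {8192, {"w2"}}};
      Waiters right;
      split_waiters(left, right, 4096);
      std::cout << left.size() << " " << right.size() << "\n";  // 1 2
    }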
Commit c34e38bcdc0460219d19b21ca7a0554adf7f7f84 meant to do this but got
the wrong number of zeros.
Signed-off-by: Sage Weil <sage@inktank.com>
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which uses a snapshot as a reference, we should write this
file before we take the snap. We normally ignore current/ contents anyway.
On non-btrfs file systems, however, we should only write this file *after*
we do a full sync, and we should then fsync(2) it before we continue
(and potentially trim anything from the journal).
This fixes a serious bug that could cause data loss and corruption after
a power loss event. For a 'kill -9' or crash, however, there was little
risk, since the writes were still captured by the host's cache.
Fixes: #3721
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 28d59d374b28629a230d36b93e60a8474c902aa5)
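
The non-btrfs ordering, sketched with plain POSIX calls (heavily simplified; the real FileStore logic is more involved):

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>

    // Data must be durable before op_seq advances, and op_seq itself
    // must be durable before the journal may be trimmed past it.
    int commit_op_seq(const char* op_seq_path, uint64_t seq) {
      sync();                             // 1. full sync of prior data writes
      int fd = open(op_seq_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
      if (fd < 0)
        return -1;
      char buf[32];
      int len = snprintf(buf, sizeof(buf), "%llu\n", (unsigned long long)seq);
      if (write(fd, buf, len) != len ||   // 2. record the new replay point
          fsync(fd) < 0) {                // 3. fsync(2) before continuing
        close(fd);
        return -1;
      }
      close(fd);
      return 0;                           // 4. only now is a journal trim safe
    }

    int main() { return commit_op_seq("op_seq", 42) ? 1 : 0; }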
We were using a single cond and only signalling one waiter. That means
that if the flusher and several logging threads are waiting and we hit
a limit, the logger could signal another logger instead of the flusher,
and we could deadlock.
Similarly, if the flusher empties the queue, it might signal only a single
logger, and that logger could re-signal the flusher, and the other logger
could wait forever.
Instead, break the single cond into two: one for loggers, and one for the
flusher. Always signal the (one) flusher, and always broadcast to all
loggers.
Backport: bobtail, argonaut
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 813787af3dbb99e42f481af670c4bb0e254e4432)
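
The shape of the fix, sketched with standard primitives (Ceph uses its own Mutex/Cond types):

    #include <condition_variable>
    #include <deque>
    #include <mutex>
    #include <string>
    #include <utility>

    // Loggers and the flusher wait on separate conditions, so a wakeup
    // meant for the flusher can never be absorbed by another logger.
    struct Log {
      std::mutex m;
      std::condition_variable cond_loggers;  // loggers waiting for space
      std::condition_variable cond_flusher;  // flusher waiting for entries
      std::deque<std::string> q;
      std::size_t max = 100;

      void submit_entry(std::string e) {
        std::unique_lock<std::mutex> l(m);
        while (q.size() >= max) {
          cond_flusher.notify_one();   // always wake the (one) flusher
          cond_loggers.wait(l);
        }
        q.push_back(std::move(e));
        cond_flusher.notify_one();
      }

      void flush_thread_entry() {
        std::unique_lock<std::mutex> l(m);
        for (;;) {
          while (q.empty())
            cond_flusher.wait(l);
          std::deque<std::string> t;
          t.swap(q);                   // take the whole queue
          cond_loggers.notify_all();   // broadcast: wake every logger
          l.unlock();
          /* ... write t out without holding the lock ... */
          l.lock();
        }
      }
    };

    int main() { Log log; log.submit_entry("hello"); }

Always waking the single flusher while broadcasting to all loggers is the key asymmetry: no wakeup can land on a thread that cannot make progress.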
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garbling the crash dump output.
Backport: bobtail, argonaut
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 43cba617aa0247d714632bddf31b9271ef3a1b50)
The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backticks altogether.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6c7b667badc5e7608b69c533a119a2afc062e257)
When running "service ceph status -a", a version number was never
returned for remote hosts, only for the local host. This was because
the command to query the version number didn't use the do_cmd
function, which is responsible for running the command over SSH
when needed.
Modify the ceph init.d script to use do_cmd for querying the
Ceph version.
Signed-off-by: Travis Rhoden <trhoden@gmail.com>
(cherry picked from commit 60fdb6fda6233b01dae4ed8a34427d5960840b84)
This is what we were (wrongly) doing before, so there are no memory
utilization surprises.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 78286b1403a5e0f14f95fe6b92f2fdb163e909f1)
<facepalm>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4de7748b72d4f90eb1197a70015c199c15203354)
We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can have a race like:
queue has 1 item, max is 1.
A: enter submit_entry, signal cond, wait on condition
B: enter submit_entry, signal cond, wait on condition
C: flush wakes up, flushes 1 previous item
A: retakes lock, enqueues something, exits
B: retakes lock, condition fails, waits
-> C is never woken up as there are 2 items waiting
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 50914e7a429acddb981bc3344f51a793280704e6)
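
The fixed pattern in miniature (standard primitives, simplified): the notify happens in the same locked interval that modifies the queue, so no wakeup can advertise state that does not exist yet.

    #include <condition_variable>
    #include <deque>
    #include <mutex>

    std::mutex m;
    std::condition_variable cond;
    std::deque<int> q;
    const std::size_t max_items = 1;

    // Signal while the lock is held and the queue has just changed.
    // Signalling in a separate interval -- before enqueueing, as in the
    // race above -- lets a waiter consume a wakeup for a queue state
    // that never materializes, leaving the flusher asleep forever.
    void submit_entry(int v) {
      std::unique_lock<std::mutex> l(m);
      cond.wait(l, [] { return q.size() < max_items; });
      q.push_back(v);
      cond.notify_all();  // same lock interval as the modification
    }

    int main() { submit_entry(1); }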
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Fixes: #3649
verify_swift_token returns a bool and not an int.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Fixes bug #3184, where the ceph-fuse client segfaults if cephx is
enabled but no keyring file is present. This was due to the
client->init() return value not getting checked.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
(cherry picked from commit 47983df4cbd31f299eef896b4612d3837bd7c7bd)
Otherwise our default keyring location, or any other similarly formatted
location, will be taken as the actual location for the keyring and fail.
Reported-by: tziOm (at) #ceph
Fixes: #3276
Backport: argonaut
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
(cherry picked from commit 7ef0df25e001bfae303feb3ae36514608767b1f2)
Fix for bug #3451. Use the commit count and sha1 from git describe to
construct a release string for rpm packages.
Conflicts:
configure.ac
This is a partial fix for bug #3471. Enable building of the debuginfo
package. Some distributions enable this automatically by installing
additional rpm macros; on others it needs to be explicitly added to the
spec file.
Fixes: #3565
Originally ops were using static structures, but that
has since changed. Switch the Swift auth handler to do
the same.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
The original implementation broke whenever data exceeded
the chunk size. Also, don't keep cache entries for objects that
exceed the chunk size, as the cache is not designed for
them. Increase the chunk size to 512k.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
This fixes a regression introduced in
17e4c0df44781f5ff1d74f3800722452b6a0fc58. The original
patch fixed an error leak; however, it also removed the
operation's send_response() call.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit f86522cdfcd81b2d28c581ac8b8de6226bc8d1a4)
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit 98a04d76ebffa61c3ba4b033cdd57ac57b2f29f3)
Conflicts:
src/rgw/rgw_op.cc
src/rgw/rgw_op.h
Fixes: #3452
When we read object info, don't try to convert mtime to
UTC; it's already in UTC.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
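
The trap in miniature: a time_t that is already UTC must not be round-tripped through a local-time conversion, which silently shifts it by the timezone offset.

    #include <ctime>
    #include <iostream>

    int main() {
      time_t mtime = 1356048000;  // already seconds since epoch, in UTC
      char buf[64];
      // Correct: format with gmtime() and use the value as-is.
      strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S GMT", gmtime(&mtime));
      std::cout << buf << "\n";

      // Wrong: mktime() treats the broken-down time as *local* time,
      // re-offsetting a value that was never local to begin with.
      time_t shifted = mktime(gmtime(&mtime));
      // Prints "shifted" in any non-UTC timezone.
      std::cout << (shifted == mtime ? "same" : "shifted") << "\n";
    }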
Don't try to parse beyond the GMT or UTC token. Some clients use
special date formatting. If we end up misparsing the date,
it'll fail in the authorization, so we don't need to be too
restrictive.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
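
Roughly the idea, using POSIX strptime() (a sketch, not the RGW parser): match the standard HTTP date up to the GMT token and simply ignore whatever a client appends after it.

    #include <time.h>
    #include <cstring>
    #include <iostream>

    // strptime() returns a pointer past the matched portion, so any
    // trailing decoration after GMT is left unparsed rather than
    // failing the whole request.
    bool parse_http_date(const char* s, struct tm* out) {
      memset(out, 0, sizeof(*out));
      const char* rest = strptime(s, "%a, %d %b %Y %H:%M:%S GMT", out);
      return rest != nullptr;  // trailing text after GMT is ignored
    }

    int main() {
      struct tm t;
      // Some clients append extra formatting after the timezone token.
      std::cout << parse_http_date(
          "Tue, 04 Dec 2012 17:10:30 GMT; client-extra", &t) << "\n";  // 1
    }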
If the given device is already mounted at the target location, do not
mount --move it again and create a bunch of duplicate entries in
/etc/mtab and the kernel mount table.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c435d314caeb5424c1f4482ad02f8a085317ad5b)