| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Allow librbd users to create in-memory cache pools, and open images using
those caches. This lets you control the total amount of memory consumed
for some number of open images. It also lets you individually control
the writeback behavior for individual images (e.g., have some write-thru,
some write-back).
This doesn't let you specify max_dirty limits on a per-image basis, though,
unless you give that image its own cache.
Note that doing rbd_open on an image when 'rbd cache' is true is
equivalent to creating a separate cache for that image using the
'rbd cache *' tunables. This API is meant to be used in lieu of the
'rbd cache*' options.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
Put the ObjectCacher and its lock into a separate CacheCtx object. This
will eventually let us share it between images.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
This will allow us to share a single cache between different users.
Also use a pointer rather than a reference.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
Make ObjectCacher users specify the cache size for each ObjectCacher
instance. This avoids the confusing config namespace for the object
cache (client_oc_*), and also will make it possible to eventually have
cache sizes that vary between (say) RBD images.
- drop unused client_oc_max_sync_write
- add rbd_cache_max_size, max_dirty, target_dirty config values (these are
the defaults for each image)
We probably want to add librbd calls to specify the cache size on a
per-image basis? Alternatively, we should make it possible to share a
cache pool between multiple images in some explicit way.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
Once upon a time the caller would do this, but none of those have survived,
and this makes more sense.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| |
If the max_dirty config is 0, switch to write-thru mode, which will
explicitly flush and wait on the range we just dirtied.
Closes: #2335
Signed-off-by: Sage Weil <sage@newdream.net>
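A minimal sketch of the write-thru switch described above, assuming a toy cache that only counts bytes. ToyCache and its fields are hypothetical and stand in for the real ObjectCacher state; the synchronous flush is modeled as a counter update.

```cpp
#include <cassert>
#include <cstdint>

// Toy cache (hypothetical): tracks dirty bytes and bytes flushed inline.
struct ToyCache {
  uint64_t max_dirty;  // 0 selects write-thru mode
  uint64_t dirty;      // bytes currently dirty in the cache
  uint64_t flushed;    // bytes flushed synchronously by write()

  void write(uint64_t len) {
    dirty += len;
    if (max_dirty == 0) {
      // write-thru: flush the range we just dirtied and wait for it
      flushed += len;
      dirty -= len;
    }
  }
};
```

With max_dirty == 0 no dirty data remains after write() returns; with any other value the data stays dirty until a later flush.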
| |
Similar to C_SafeCond, but assume finisher already holds the relevant lock.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
A common pattern is to search for the first buffer intersecting or
following an object offset. Use a helper for that.
Signed-off-by: Sage Weil <sage@newdream.net>
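A sketch of the search pattern described above, assuming buffers live in a std::map keyed by start offset and do not overlap. Buf, BufMap and first_at_or_after are hypothetical names, not the ObjectCacher interface.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical cached buffer covering [start, start + length).
struct Buf {
  uint64_t start;
  uint64_t length;
  uint64_t end() const { return start + length; }
};

// Buffers keyed by start offset; assumed non-overlapping.
using BufMap = std::map<uint64_t, Buf>;

// Return the first buffer that intersects or follows `offset`,
// or m.cend() if there is none.
inline BufMap::const_iterator first_at_or_after(const BufMap& m,
                                                uint64_t offset) {
  auto p = m.lower_bound(offset);  // first buffer starting at or after offset
  if (p != m.begin() && (p == m.end() || p->first > offset)) {
    auto prev = std::prev(p);
    if (prev->second.end() > offset)  // previous buffer still covers offset
      return prev;
  }
  return p;
}
```

The step back to the previous entry is the part callers tend to repeat by hand, which is why a helper is useful.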
| |
Add ability to flush a range of an object, or a vector of ObjectExtents. Flush
any buffers that intersect the specified range, or the entire object if len==0.
Signed-off-by: Sage Weil <sage@newdream.net>
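A sketch of the range selection described above, assuming dirty buffers are kept in a std::map keyed by start offset. DirtyBuf and buffers_to_flush are hypothetical; the real code flushes the buffers rather than returning their offsets.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical dirty buffer covering [start, start + length).
struct DirtyBuf {
  uint64_t start;
  uint64_t length;
};

// Collect the start offsets of buffers intersecting [offset, offset + len).
// As in the commit message, len == 0 selects the entire object.
inline std::vector<uint64_t> buffers_to_flush(
    const std::map<uint64_t, DirtyBuf>& bufs, uint64_t offset, uint64_t len) {
  std::vector<uint64_t> out;
  const bool whole_object = (len == 0);
  for (const auto& kv : bufs) {
    const DirtyBuf& b = kv.second;
    const uint64_t end = b.start + b.length;
    if (whole_object || (b.start < offset + len && end > offset))
      out.push_back(kv.first);
  }
  return out;
}
```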
| |
This gives us access to the original ObjectExtent (useful later), and
simplifies the callers.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
We do three things here:
- Wait for the dirty limit to drop _after_ writing into the cache. This
means that an active thread can always provide its dirty data to the
cache for potential writing without waiting (a small win). It's also
helpful later... (see below, and next commit)
- Don't wait for other waiters. If another thread is dirtying 1MB and is
waiting for it, don't wait for them too. This prevents two threads
writing 1MB at a time with a limit of 1MB from serializing: both can
dirty their 1MB and initiate a flush, and then once 1/2 of that has
flushed one of them will be allowed to proceed.
- Update the flusher to add the dirty_waiting bytes to the amount to
write so that the OPs will indeed be parallel.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
If you give it a nonsensical loc, it will fail check_item_loc() (false) and
then error out on insert_item().
Reported-by: Sam Just <sam.just@inktank.com>
Signed-off-by: Sage Weil <sage@newdream.net>
| |
We don't adjust the internal hierarchy structure (currently). This is a
bit confusing, so describe the semantics in some detail.
Signed-off-by: Sage Weil <sage@newdream.net>
| |
- drop useless cur for check_item_loc
- comment the checks we're doing so the code is understandable
- use name_exists instead of broken get_item_id != 0 check
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Reviewed-by: Greg Farnum <greg@inktank.com>
| | |
If the weight doesn't change it should be a no-op.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| | |
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| | |
This is less ugly than converting the quantized value back to a float and
comparing that.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
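A sketch of comparing in the quantized domain, assuming the usual CRUSH 16.16 fixed-point weight encoding (0x10000 represents 1.0). quantize_weight and weight_changes are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// CRUSH stores weights as 16.16 fixed point, so 0x10000 represents 1.0.
inline int32_t quantize_weight(float w) {
  return static_cast<int32_t>(w * 0x10000);
}

// Hypothetical helper: true if applying `requested` would change the stored
// fixed-point weight. Quantizing the request once avoids converting the
// stored value back to a float.
inline bool weight_changes(int32_t stored, float requested) {
  return quantize_weight(requested) != stored;
}
```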
| | |
Thanks Greg!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| | |
'osd crush set ...' is better, use that instead.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
This operation will add/update/move the item to the specified location.
It is idempotent and much more useful than 'osd crush add ...'.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Similar to --add-item, except it will move, rename, or reweight the item if
it is already present in the map.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
This is similar to insert_item(), except it will succeed if the item is
already there, and will move an item to the specified location if it is
not. It returns 0 for no change, 1 if a change was made. It also makes
sure the weight and name match.
Signed-off-by: Sage Weil <sage@newdream.net>
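A toy model of the return-value contract described above (0 for no change, 1 if a change was made). ToyItem, ToyMap and toy_update_item are hypothetical; the real CrushWrapper signature and hierarchy handling differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model, not the real CrushWrapper: each item has one host and a weight.
struct ToyItem {
  std::string name;
  std::string host;
  int weight;
};

using ToyMap = std::map<int, ToyItem>;

// Add, move, rename or reweight `id` so that it matches the request.
// Returns 0 if nothing had to change, 1 if the map was modified.
inline int toy_update_item(ToyMap& m, int id, const std::string& name,
                           const std::string& host, int weight) {
  auto p = m.find(id);
  if (p == m.end()) {
    m[id] = ToyItem{name, host, weight};  // not present yet: insert
    return 1;
  }
  ToyItem& cur = p->second;
  if (cur.name == name && cur.host == host && cur.weight == weight)
    return 0;                             // already as requested
  cur = ToyItem{name, host, weight};      // move, rename and/or reweight
  return 1;
}
```

Calling it twice with the same arguments returns 1 and then 0, which is what makes the operation safe to repeat.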
| | |
The check_item_loc() method will take an item and position and tell you if
it matches the item's current location. The matching is identical to that
used for insert_item, in that a specific location constraint match means
success, even if a less specific one does not match (e.g., rack=wrongrack,
host=correcthost will return true).
Signed-off-by: Sage Weil <sage@newdream.net>
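A toy illustration of the matching rule described above, where the most specific constraint decides the result. The type list, Loc and toy_check_item_loc are hypothetical and simplified; the real method works on the CRUSH bucket hierarchy.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Loc = std::map<std::string, std::string>;  // bucket type -> bucket name

// Hypothetical bucket types, ordered from most to least specific.
static const std::vector<std::string> kTypes = {"host", "rack", "root"};

// The most specific type named in `wanted` decides the result; broader
// constraints are not consulted.
inline bool toy_check_item_loc(const Loc& actual, const Loc& wanted) {
  for (const auto& type : kTypes) {
    auto w = wanted.find(type);
    if (w == wanted.end())
      continue;  // no constraint given at this level
    auto a = actual.find(type);
    return a != actual.end() && a->second == w->second;
  }
  return false;  // no usable constraint was given
}
```

With this rule, rack=wrongrack host=correcthost matches, because the host constraint is checked first and settles the question.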
| | |
Reweight an item to 0 before removing it, so that the parent weights are
adjusted accordingly.
Signed-off-by: Sage Weil <sage@newdream.net>
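A toy sketch of why reweighting to 0 first keeps parent weights correct, assuming each bucket's weight is the sum of everything below it. Bucket, add_item, reweight_item and remove_item are hypothetical stand-ins, not the CrushWrapper methods.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy bucket: `weight` is the sum of all item weights at or below it.
struct Bucket {
  std::string name;
  Bucket* parent;                    // nullptr for the root
  std::map<std::string, int> items;  // item name -> weight
  int weight;
};

// Change an item's weight and propagate the difference to every ancestor.
inline void reweight_item(Bucket& b, const std::string& item, int new_weight) {
  const int diff = new_weight - b.items.at(item);
  b.items[item] = new_weight;
  for (Bucket* p = &b; p != nullptr; p = p->parent)
    p->weight += diff;
}

inline void add_item(Bucket& b, const std::string& item, int weight) {
  b.items[item] = 0;
  reweight_item(b, item, weight);
}

// Drop the weight to 0 first so ancestors are adjusted, then unlink the item.
inline void remove_item(Bucket& b, const std::string& item) {
  reweight_item(b, item, 0);
  b.items.erase(item);
}
```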
| | | |
Reviewed-by: Greg Farnum <greg@inktank.com>
| | | |
Make the flow clearer for the three cases (exists, about to exist, new).
Signed-off-by: Sage Weil <sage@newdream.net>
| | | |
It's easier to manage/rev/grok these inline.
Signed-off-by: Sage Weil <sage@newdream.net>
| | | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
All users pass around bufferlists and avoid encoding these structures
inline, but the dencoder tests are picky. Disable that for these types so
that we can add new fields without noise.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
This exercises all the new per-osd uuid code.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
E.g., ceph-osd --mkfs --osd-uuid <uuid> -i 123 ...
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Mkfs currently always generates a new uuid. Allow the caller to feed one
in.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Make the osd create command idempotent by providing a uuid. If you call it
multiple times with the same (or some other existing) uuid you'll get back
the osd id that is already using it.
Drop support for 'osd create <id>', which was mostly useless and
non-idempotent anyway.
Signed-off-by: Sage Weil <sage@newdream.net>
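A toy sketch of the idempotent create described above, assuming a registry that maps osd ids to uuid strings. OsdRegistry and its create() method are hypothetical and do not reflect the monitor's actual data structures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy registry (hypothetical): osd id -> uuid.
struct OsdRegistry {
  std::map<int, std::string> uuid_by_id;

  // Return the id already using `uuid`, or allocate the lowest free id.
  int create(const std::string& uuid) {
    for (const auto& kv : uuid_by_id)  // linear search by uuid
      if (kv.second == uuid)
        return kv.first;
    int id = 0;
    while (uuid_by_id.count(id))
      ++id;
    uuid_by_id[id] = uuid;
    return id;
  }
};
```

Repeating a create with the same uuid returns the same id and allocates nothing new, so a retried provisioning step cannot leak ids.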
| | |
We may want to make this more strict, so that if it is defined it has to
match the map, and only fill it in when the map's uuid is still zeroed
(for legacy clusters)...
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Rev the extended section of the map to store it. Dump it when the osd
exists. Zero it out if an osd is destroyed. Provide some accessors to
identify an osd given a uuid (linear search).
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
It's overkill to do the dir fsync on each file, but not worth making
efficient.
Signed-off-by: Sage Weil <sage@newdream.net>
| | |
No spaces here, apparently!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
| | |
This looks in /proc to count threads. Kludgey and no longer needed.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
| | |
We were verifying that there was only 1 thread (presumably the main() thread) when
we call daemonize. However, with the new logging code, we stop a thread
right before the check, and /proc apparently updates asynchronously such
that our attempt to count running threads gives us a bad answer.
Just remove this kludgey check; we'll have to catch this class of bugs
the hard way.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
| | |
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
| | |
This fixes issue #2381.
The method interface was different than the one needed in order
to override the one in RGWRados.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
| | |
Signed-off-by: Sage Weil <sage@newdream.net>