delta/ceph.git - github.com: ceph/ceph.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	librbd: add explicit management of caches (historic/rbd-multi-cache)	Sage Weil	2012-05-05	2	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow librbd users to create in-memory cache pools, and open images using those caches. This lets you control the total amount of memory consumed for some number of open images. It also lets you individually control the writeback behavior for individual images (e.g., have some write-thru, some write-back). This doesn't let you specify max_dirty limits on a per-image basis, tho, unless you give that image its own cache. Note that doing rbd_open on an image when 'rbd cache' is true is equivalent to creating a separate cache for that image using the 'rbd cache ' tunables. This API is meant to be used in lieu of the 'rbd cache' options. Signed-off-by: Sage Weil <sage@newdream.net>
*	librbd: move cache into separate CacheCtx	Sage Weil	2012-05-05	1	-56/+84
\| \| \| \| \| \| \|	Put the ObjectCacher and its lock into a separate CacheCtx object. This will eventually let us share it between images. Signed-off-by: Sage Weil <sage@newdream.net>
*	objectcacher: specify the WritebackHandler for each ObjectSet	Sage Weil	2012-05-05	5	-34/+33
\| \| \| \| \| \| \| \|	This will allow us to share a single cache between different users. Also use a pointer rather than a reference. Signed-off-by: Sage Weil <sage@newdream.net>
*	objectcacher: make cache sizes explicit	Sage Weil	2012-05-05	5	-21/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make ObjectCacher users specify the cache size for each ObjectCacher instance. This avoids the confusing config namespace for the object cache (client_oc_*), and also will make it possible to eventually have cache sizes that vary between (say) RBD images. - drop unused client_oc_max_sync_write - add rbd_cache_max_size, max_dirty, target_dirty config values (these are the defaults for each image) We probably want to add librbd calls to specify the cache size on a per-image basis? Alternatively, we should make it possible to share a cache pool between multiple images in some explicit way. Signed-off-by: Sage Weil <sage@newdream.net>
*	objectcacher: delete unused onfinish from flush_set	Sage Weil	2012-05-05	1	-0/+4
\| \| \| \| \| \| \|	Once upon a time the caller would do this, but none of those have survived, and this makes more sense. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	objectcacher: explicit write-thru mode	Sage Weil	2012-05-05	1	-16/+29
\| \| \| \| \| \| \| \|	If the max_dirty config is 0, switch to write-thru mode, which will explicitly flush and wait on the range we just dirtied. Closes: #2335 Signed-off-by: Sage Weil <sage@newdream.net>
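The write-thru switch described above can be sketched with a toy, single-threaded model. `TinyCache` and its members are hypothetical names used only for illustration, and the write-back branch (trimming down to the limit) is an assumption of this sketch, not a description of the real ObjectCacher flusher.

```cpp
#include <cstdint>

// Hypothetical toy cache: tracks byte counts only, no real buffers or I/O.
struct TinyCache {
  uint64_t max_dirty;    // 0 selects write-thru mode
  uint64_t dirty = 0;    // bytes dirtied but not yet flushed
  uint64_t flushed = 0;  // bytes handed to the backing store

  explicit TinyCache(uint64_t max_dirty_bytes) : max_dirty(max_dirty_bytes) {}

  // Flush up to `len` dirty bytes (synchronous in this model).
  void flush(uint64_t len) {
    uint64_t n = len < dirty ? len : dirty;
    dirty -= n;
    flushed += n;
  }

  // Dirty `len` bytes, then apply the configured policy.
  void write(uint64_t len) {
    dirty += len;
    if (max_dirty == 0) {
      flush(len);                // write-thru: flush exactly what we just dirtied
    } else if (dirty > max_dirty) {
      flush(dirty - max_dirty);  // write-back (assumed): trim down to the limit
    }
  }
};
```

With `max_dirty` set to 0, every `write()` leaves `dirty` at 0; with a nonzero limit, data stays dirty until the limit is exceeded.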
*	common: add C_Cond	Sage Weil	2012-05-05	1	-5/+34
\| \| \| \| \| \|	Similar to C_SafeCond, but assume finisher already holds the relevant lock. Signed-off-by: Sage Weil <sage@newdream.net>
*	objectcacher: use helper to get starting point in buffer map	Sage Weil	2012-05-05	2	-58/+27
\| \| \| \| \| \| \|	A common pattern is to search for the first buffer intersecting or following an object offset. Use a helper for that. Signed-off-by: Sage Weil <sage@newdream.net>
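The search pattern mentioned above (first buffer intersecting or following an offset) can be sketched over a `std::map` keyed by start offset. `Buffer`, `BufferMap`, and `first_at_or_after` are hypothetical names for illustration; the real helper in ObjectCacher may differ in name and shape.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical stand-in for a cached buffer covering [start, start + length).
struct Buffer {
  uint64_t start;
  uint64_t length;
  uint64_t end() const { return start + length; }
};

// Buffers keyed by their starting offset within the object.
using BufferMap = std::map<uint64_t, Buffer>;

// Return the first buffer that intersects or follows `offset`,
// or m.end() if there is none.
inline BufferMap::const_iterator first_at_or_after(const BufferMap& m,
                                                   uint64_t offset) {
  auto p = m.lower_bound(offset);  // first buffer starting at or after offset
  if (p != m.begin() && (p == m.end() || p->first > offset)) {
    auto prev = std::prev(p);      // last buffer starting before offset
    if (prev->second.end() > offset)
      return prev;                 // it still overlaps offset
  }
  return p;
}
```

The step back to the previous entry is the part callers tend to repeat by hand, which is why it is worth factoring out.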
*	objectcacher: flush range, set	Sage Weil	2012-05-05	2	-9/+79
\| \| \| \| \| \| \|	Add ability to flush a range of an object, or a vector of ObjectExtents. Flush any buffers that intersect the specified range, or the entire object if len==0. Signed-off-by: Sage Weil <sage@newdream.net>
*	objectcacher: wait directly from writex()	Sage Weil	2012-05-04	4	-14/+17
\| \| \| \| \| \| \|	This gives us access to the original ObjectExtent (useful later), and simplifies the callers. Signed-off-by: Sage Weil <sage@newdream.net>
*	objectcacher: don't wait for write waiters; wait after dirtying	Sage Weil	2012-05-04	4	-20/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We do three things here: - Wait for the dirty limit to drop _after_ writing into the cache. This means that an active thread can always provide its dirty data to the cache for potential writing without waiting (a small win). It's also helpful later... (see below, and next commit) - Don't wait for other waiters. If another thread is dirtying 1MB and is waiting for it, don't wait for them too. This prevents two threads writing 1MB at a time with a limit of 1MB from serializing: both can dirty their 1MB and initiate a flush, and once 1/2 of that has flushed one of them will be allowed to proceed. - Update the flusher to add the dirty_waiting bytes to the amount to write so that the OPs will indeed be parallel. Signed-off-by: Sage Weil <sage@newdream.net>
*	crush: update_item() should pass an error back to the caller	Sage Weil	2012-05-04	1	-3/+3
\| \| \| \| \| \| \| \|	If you give it a nonsensical loc, it will fail check_item_loc() (false) and then error out on insert_item(). Reported-by: Sam Just <sam.just@inktank.com> Signed-off-by: Sage Weil <sage@newdream.net>
*	crush: improve docs/comments for check_item_loc and insert_item semantics	Sage Weil	2012-05-04	1	-1/+27
\| \| \| \| \| \| \|	We don't adjust the internal hierarchy structure (currently). This is a bit confusing, so describe the semantics in some detail. Signed-off-by: Sage Weil <sage@newdream.net>
*	crush: comment and clean up checks for check_item_loc and insert_item	Sage Weil	2012-05-04	1	-12/+13
\| \| \| \| \| \| \| \|	- drop useless cur for check_item_loc - comment the checks we're doing so the code is understandable - use name_exists instead of broken get_item_id != 0 check Signed-off-by: Sage Weil <sage@newdream.net>
*	Merge branch 'wip-crush-update'	Sage Weil	2012-05-03	12	-21/+429
\|\ \| \| \| \| \| \|	Reviewed-by: Greg Farnum <greg@inktank.com>
\| *	crushtool: another simple test for update	Sage Weil	2012-05-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	If the weight doesn't change it should be a no-op. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	crush: document return values	Sage Weil	2012-05-03	1	-6/+3
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	crush: compare fixed-point weights in update_item	Sage Weil	2012-05-03	2	-10/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is less ugly than converting the quantized value back to a float and comparing that. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
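A minimal sketch of comparing in the quantized domain, assuming weights are stored as 16.16 fixed-point integers (so 1.0 is 0x10000) and that conversion truncates. `kWeightOne`, `to_fixed`, and `weight_changed` are hypothetical names, not the CrushWrapper API.

```cpp
#include <cstdint>

// Assumed representation: 16.16 fixed point, so 1.0 == 0x10000.
constexpr uint32_t kWeightOne = 0x10000;

// Quantize a floating-point weight (truncating, by assumption).
inline uint32_t to_fixed(float w) {
  return static_cast<uint32_t>(w * kWeightOne);
}

// Compare in the fixed-point domain: quantize the request once and
// compare integers, rather than converting the stored value back to float.
inline bool weight_changed(uint32_t current_fixed, float requested) {
  return current_fixed != to_fixed(requested);
}
```

Converting the stored value back to a float would not round-trip for weights such as 0.1, so a float comparison could report a change on every update even when the requested weight is the same.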
\| *	crush: clean up check_item_loc() comments	Sage Weil	2012-05-03	2	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Thanks Greg! Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| *	mon: drop 'osd crush add ...'	Sage Weil	2012-05-02	1	-47/+0
\| \| \| \| \| \| \| \| \| \| \| \|	'osd crush set ...' is better, use that instead. Signed-off-by: Sage Weil <sage@newdream.net>
\| *	vstart.sh: use 'osd crush set ...'	Sage Weil	2012-05-02	1	-1/+1
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| *	mon: 'osd crush set ...' do an add or update	Sage Weil	2012-05-02	1	-0/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This operation will add/update/move the item to the specified location. It is idempotent and much more useful than 'osd crush add ...'. Signed-off-by: Sage Weil <sage@newdream.net>
\| *	crushtool: extend cli test to include --remove-item and --update-item	Sage Weil	2012-05-02	7	-1/+249
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| *	crushtool: add --update-item command	Sage Weil	2012-05-02	1	-5/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to --add-item, except it will move, rename, or reweight the item if it is already present in the map. Signed-off-by: Sage Weil <sage@newdream.net>
\| *	crush: do some docs	Sage Weil	2012-05-02	2	-8/+41
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| *	crush: implement update_item()	Sage Weil	2012-05-02	2	-1/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is similar to insert_item(), except it will succeed if the item is already there, and will move an item to the specified location if it is not. It returns 0 for no change, 1 if a change was made. It also makes sure the weight and name match. Signed-off-by: Sage Weil <sage@newdream.net>
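The return convention described above (0 for no change, 1 if something changed) is what makes repeated calls safe. A toy sketch, with `ToyMap` and `Item` as hypothetical types and a single `host` string standing in for a full CRUSH location:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical flattened item: a real CRUSH map is a bucket hierarchy.
struct Item {
  std::string name;
  std::string host;
  uint32_t weight;  // fixed-point weight
};

struct ToyMap {
  std::map<int, Item> items;

  // Add, move, rename, or reweight an item.
  // Returns 0 if the map already matched, 1 if a change was made.
  int update_item(int id, const std::string& name,
                  const std::string& host, uint32_t weight) {
    auto it = items.find(id);
    if (it == items.end()) {
      items[id] = Item{name, host, weight};  // not present yet: insert
      return 1;
    }
    Item& cur = it->second;
    if (cur.name == name && cur.host == host && cur.weight == weight)
      return 0;                              // already as requested: no-op
    cur = Item{name, host, weight};          // move / rename / reweight
    return 1;
  }
};
```

Calling it twice with the same arguments returns 1 and then 0, which matches the idempotent behavior the log attributes to 'osd crush set ...'.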
\| *	crush: add check_item_loc	Sage Weil	2012-05-02	2	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The check_item_loc() method will take an item and position and tell you if it matches the item's current location. The matching is identical to that used for insert_item, in that a specific location constraint match means success, even if a less specific one does not match (e.g., rack=wrongrack, host=correcthost will return true). Signed-off-by: Sage Weil <sage@newdream.net>
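The "most specific constraint wins" rule can be sketched as follows. `ToyHierarchy` and `Location` are hypothetical, and the sketch assumes the bucket types are listed from most to least specific and that only direct children of a bucket are tracked.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Location constraints: bucket type -> bucket name, e.g. {"host": "host1"}.
using Location = std::map<std::string, std::string>;

struct ToyHierarchy {
  // Bucket types ordered from most specific to least specific (assumed).
  std::vector<std::string> types;
  // Bucket name -> ids of the items directly inside it.
  std::map<std::string, std::set<int>> children;

  // Only the most specific type named in `loc` is consulted; less specific
  // constraints are ignored even if they are wrong.
  bool check_item_loc(int item, const Location& loc) const {
    for (const auto& type : types) {
      auto want = loc.find(type);
      if (want == loc.end())
        continue;  // this level was not specified
      auto bucket = children.find(want->second);
      return bucket != children.end() && bucket->second.count(item) > 0;
    }
    return false;  // no usable constraint given
  }
};
```

With item 0 inside `host1`, the location `{rack=wrongrack, host=host1}` matches because the host constraint decides the result before the rack is ever examined.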
\| *	crush: fix weights when removing items	Sage Weil	2012-05-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reweight an item to 0 before removing it, so that the parent weights are adjusted accordingly. Signed-off-by: Sage Weil <sage@newdream.net>
* \|	Merge branch 'wip-osd-uuid'	Sage Weil	2012-05-03	14	-68/+179
\|\ \ \| \| \| \| \| \| \| \| \|	Reviewed-by: Greg Farnum <greg@inktank.com>
\| * \|	mon: simplify 'osd create <uuid>' command	Sage Weil	2012-05-03	1	-28/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the flow clearer for the three cases (exists, about to exist, new). Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	osd: drop unused CEPH_OSDMAPVERSION #defines	Sage Weil	2012-05-03	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's easier to manage/rev/grok these inline. Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	ceph-object-corpus: a few instances of the newly encoded types	Sage Weil	2012-05-02	1	-0/+0
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	ceph-dencoder: ignore trailing goop after OSDMap and OSDMap::Incremental	Sage Weil	2012-05-02	2	-4/+9
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All users pass around bufferlists and avoid encoding these structures inline, but the dencoder tests are picky. Disable that for these types so that we can add new fields without noise. Signed-off-by: Sage Weil <sage@newdream.net>
\| *	vstart.sh: explicitly specify uuids during startup	Sage Weil	2012-05-01	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This exercises all the new per-osd uuid code. Signed-off-by: Sage Weil <sage@newdream.net>
\| *	osd: --get-{osd,journal}-uuid synonyms for --get-{osd,journal}-fsid	Sage Weil	2012-05-01	1	-2/+2
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| *	osd: allow uuid to be fed to mkfs with 'osd uuid' setting	Sage Weil	2012-05-01	2	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	E.g., ceph-osd --mkfs --osd-uuid <uuid> -i 123 ... Signed-off-by: Sage Weil <sage@newdream.net>
\| *	filestore: allow fsid to be fed in for mkfs	Sage Weil	2012-05-01	3	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mkfs currently always generates a new uuid. Allow the caller to feed one in. Signed-off-by: Sage Weil <sage@newdream.net>
\| *	mon: 'osd create <uuid>'	Sage Weil	2012-05-01	1	-21/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the osd create command idempotent by providing a uuid. If you call it multiple times with the same (or some other existing) uuid you'll get back the osd id that is already using it. Drop support for 'osd create <id>', which was mostly useless and non-idempotent anyway. Signed-off-by: Sage Weil <sage@newdream.net>
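A minimal sketch of why keying creation on a uuid makes it idempotent. `ToyOsdRegistry` is a hypothetical stand-in for the monitor's state; it uses a map lookup for brevity, whereas the log notes that the OSDMap accessors use a linear search.

```cpp
#include <map>
#include <string>

// Hypothetical registry mapping uuids to allocated osd ids.
struct ToyOsdRegistry {
  std::map<std::string, int> id_by_uuid;
  int next_id = 0;

  // Return the id already bound to `uuid`, or allocate a new one.
  // Repeating the call with the same uuid never allocates twice.
  int create(const std::string& uuid) {
    auto it = id_by_uuid.find(uuid);
    if (it != id_by_uuid.end())
      return it->second;  // uuid already known: hand back the same id
    int id = next_id++;
    id_by_uuid[uuid] = id;
    return id;
  }
};
```

A caller that crashes and retries with the same uuid gets the same id back instead of leaking a second one.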
\| *	mon: fill in osd uuid in map on boot	Sage Weil	2012-05-01	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We may want to make this more strict, so that if it is defined it has to match the map, and only fill it in when the map's uuid is still zeroed (for legacy clusters)... Signed-off-by: Sage Weil <sage@newdream.net>
\| *	osdmap: store a uuid for each osd	Sage Weil	2012-05-01	2	-2/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rev the extended section of the map to store it. Dump it when the osd exists. Zero it out if an osd is destroyed. Provide some accessors to identify an osd given a uuid (linear search). Signed-off-by: Sage Weil <sage@newdream.net>
\| *	osd: make output less ugly	Sage Weil	2012-05-01	1	-5/+5
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| *	osd: create a 'ready' file on mkfs completion	Sage Weil	2012-05-01	1	-6/+15
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
\| *	osd: use fsync+rename when writing meta files (during mkfs)	Sage Weil	2012-05-01	1	-6/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's overkill to do the dir fsync on each file, but not worth making efficient. Signed-off-by: Sage Weil <sage@newdream.net>
* \|	Makefile: fix $shell_scripts substitution	Sage Weil	2012-05-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	No spaces here, apparently! Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* \|	thread: remove get_num_threads() static	Sage Weil	2012-05-03	2	-33/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This looks in /proc to count threads. Kludgey and no longer needed. Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Reviewed-by: Greg Farnum <greg@inktank.com>
* \|	global_init: do not count threads before daemonize()	Sage Weil	2012-05-03	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were verifying that there was only 1 thread (presumably main()) when we call daemonize. However, with the new logging code, we stop a thread right before the check, and /proc apparently updates asynchronously such that our attempt to count running threads gives us a bad answer. Just remove this kludgey check; we'll have to catch this class of bugs the hard way. Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Reviewed-by: Greg Farnum <greg@inktank.com>
* \|	OpRequest: only show a small set of the oldest messages, instead of all.	Joao Eduardo Luis	2012-05-03	2	-5/+38
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
* \|	rgw: update cache interface for put_obj_meta	Yehuda Sadeh	2012-05-03	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes issue #2381. The method interface was different than the one needed in order to override the one in RGWRados. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
* \|	doc: fix some underscores	Sage Weil	2012-05-03	1	-2/+2
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
* \|	Merge branch 'wip-doc-rebase-2'	Sage Weil	2012-05-03	69	-987/+29566
\|\ \