Commit message (Author, Age, Files, Lines)
* mon/OSDMonitor: 'osd tier {set,remove}-overlay <pool> [tierpool]' [wip-tier] (Sage Weil, 2013-08-27, 3 files, -0/+83)
|   Also prevent 'osd tier remove ...' if the tierpool is the current overlay.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* qa/workunits/cephtool/test.sh: test osd tier CLI (Sage Weil, 2013-08-27, 1 file, -0/+16)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* mon/OSDMonitor: 'osd tier cache-mode <pool> <mode>' (Sage Weil, 2013-08-27, 3 files, -0/+46)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* mon/OSDMonitor: 'osd pool tier <add|remove> <pool> <tierpool>' (Sage Weil, 2013-08-27, 2 files, -1/+90)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd/OSDMonitor: avoid polluting pending_inc on error for 'osd pool set ...' (Sage Weil, 2013-08-27, 1 file, -2/+15)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd_types: add pg_pool_t cache-related fields (Sage Weil, 2013-08-27, 4 files, -8/+72)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* doc/dev/cache-pool: document cache pool management interface (Sage Weil, 2013-08-27, 1 file, -0/+70)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* add CEPH_FEATURE_OSD_CACHEPOOL (Sage Weil, 2013-08-27, 1 file, -0/+2)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd/ReplicatedPG: do not requeue if not primary (Sage Weil, 2013-08-27, 1 file, -5/+15)
|   This saves us a bit of work, since we will discard the op anyway if we aren't primary (or even if we become primary again before we get to it).
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd: COPY_GET operation (Sage Weil, 2013-08-27, 8 files, -0/+249)
|   Add new rados operation to copy all user-visible content for an object in a simple, safe way. Use a new object_copy_cursor_t to keep track of our position.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd/ReplicatedPG: factor {execute,reply}_ctx() out of do_op() (Sage Weil, 2013-08-27, 2 files, -26/+55)
|   Separate the processing of an OpContext from the preamble and allocation, so that we can delay the execution for some ops (like the COPYFROM operation we're about to add).
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd: feed OSDMaps to the Objecter (Sage Weil, 2013-08-26, 1 file, -0/+7)
|   Feed every map message we see (that isn't discarded for some other reason) to the Objecter. It has the same continuity requirements that the OSD has, so it should be satisfied with what we get. It can also request maps via our MonClient.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd: add an Objecter instance (Sage Weil, 2013-08-26, 5 files, -4/+91)
|   It gets its own lock, timer, and osdmap.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd: discriminate based on connection messenger, not peer type (Sage Weil, 2013-08-26, 1 file, -6/+3)
|   Replace the ->get_source().is_osd() checks with a check for whether the connection came in on the cluster_messenger, so that we do not confuse ourselves when we get legit requests from other OSDs on our public interface.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* ceph-osd: rename msgr vars (Sage Weil, 2013-08-26, 2 files, -71/+79)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd: add a separate messenger for the Objecter (Sage Weil, 2013-08-26, 3 files, -0/+14)
|   We will give the OSD's Objecter its own messenger so that it does not interfere with the OSD when it marks things up or down.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd/ReplicatedPG: add whitespace (Sage Weil, 2013-08-26, 1 file, -0/+10)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd: less whitespace (Sage Weil, 2013-08-26, 1 file, -8/+2)
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osdc/Objecter: allow ops to be canceled (Sage Weil, 2013-08-26, 2 files, -0/+29)
|   This is useful in general, and specifically will be useful for the rados COPY operation.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osdc/Objecter: only request map on startup if epoch == 0 (Sage Weil, 2013-08-26, 1 file, -1/+2)
|   Normal clients have no map and need one to get started. If we are the OSD, we will already have one and will get fed maps as they come in.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd, objecter: clean up assert_ver() (Sage Weil, 2013-08-26, 4 files, -3/+10)
|   Create a separate union in the args and clean up the code a bit so that this doesn't reuse the (unrelated) watch helpers. No change in protocol.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* osd/ReplicatedPG: drop src_obc.clear() calls (Sage Weil, 2013-08-26, 1 file, -6/+0)
|   These are all about to go out of scope; no need to clear them explicitly.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* os/ObjectStore: add bufferlist variant of setattrs (Sage Weil, 2013-08-26, 1 file, -0/+15)
|   And hopefully we can kill the bufferptr ones someday!
|   Signed-off-by: Sage Weil <sage@inktank.com>
* mon/DataHealthService: preserve compat of data stats dump (Sage Weil, 2013-08-26, 1 file, -2/+1)
|   See 96621bdb004e539a0186fb592f44d51cf49f1c31.
|   Signed-off-by: Sage Weil <sage@inktank.com>
* Merge pull request #526 from ceph/wip-5909 (Sage Weil, 2013-08-26, 9 files, -12/+207)
|\
| |   mon: Early warning system for monitor stores growing over predefined threshold
| |   Reviewed-by: Sage Weil <sage@inktank.com>
| * mon: DataHealthService: monitor backing store's size and report it (Joao Eduardo Luis, 2013-08-24, 4 files, -3/+89)
| |   If the store's size grows beyond what we believe to be reasonable, we must let the user know that something fishy may be going on.
| |   This is intended to act as an early warning system for monitors suffering from leveldb compaction issues. However, if the monitor's store is just growing a lot due to normal cluster behaviour, we made sure that the warning threshold is adjustable by tuning 'mon_leveldb_size_warn' (defaulting to 40GB).
| |   Fixes: #5909
| |   Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
| * mon: mon_types: DataStats: add 'dump(Formatter*)' method (Joao Eduardo Luis, 2013-08-24, 2 files, -8/+15)
| |   ... and use it on DataHealthService.cc, instead of building our own version of the classes' formatted output.
| |   Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
| * mon: MonitorDBStore: rely on backing store to provide estimated store size (Joao Eduardo Luis, 2013-08-24, 1 file, -0/+4)
| |   Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
| * test: ceph_test_store_tool: output estimated store size on 'get-size' (Joao Eduardo Luis, 2013-08-24, 1 file, -0/+14)
| |   Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
| * os: KeyValueDB: expose interface to obtain estimated store size (Joao Eduardo Luis, 2013-08-23, 3 files, -0/+84)
| |   On LevelDBStore, rather than using leveldb's GetApproximateSizes() function, we assess the store's raw size from the contents of the store dir (this means .sst's, .log's, etc).
| |   The reason behind this approach is that GetApproximateSizes() would expect us to provide a range of keys for which to obtain an approximate size; on the other hand, what we really want is to obtain the size of the store -- not the size of the data (besides, with the compaction issues we've been seeing, we wonder how reliable such approximation would be).
| |   Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
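The size estimate described in the commits above can be sketched roughly as follows. This is only an illustration in Python, not the LevelDBStore code: the function names `estimate_store_size` and `exceeds_warn_threshold` are hypothetical, and the strict "greater than" comparison against the 40GB default of 'mon_leveldb_size_warn' is an assumption.

```python
import os

# Default warning threshold of 40GB, taken from the commit message for
# 'mon_leveldb_size_warn'. Treating it as 40 * 2**30 bytes is an assumption.
DEFAULT_WARN_BYTES = 40 * 1024 ** 3


def estimate_store_size(store_dir):
    """Sum the sizes of the regular files directly inside store_dir.

    This mirrors the idea of measuring the raw on-disk size (.sst, .log, ...)
    instead of asking the key/value store for a per-key-range estimate.
    """
    total = 0
    for entry in os.scandir(store_dir):
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
    return total


def exceeds_warn_threshold(size_bytes, warn_bytes=DEFAULT_WARN_BYTES):
    """Return True when the store has grown beyond the warning threshold.

    Whether the real check is strict (>) or inclusive (>=) is an assumption.
    """
    return size_bytes > warn_bytes
```

Measuring the directory sidesteps the need to supply a key range, which is the limitation of GetApproximateSizes() that the commit message points out.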
| * mon: MonitorDBStore: output to derr instead of std::cout (Joao Eduardo Luis, 2013-08-16, 1 file, -2/+2)
| |   Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
* | Merge pull request #540 from ceph/wip-doc-update (Sage Weil, 2013-08-26, 1 file, -1/+27)
|\ \
| | |   List packages needed for RPM-based distros
| * | fix nss lib name (Alfredo Deza, 2013-08-26, 1 file, -1/+1)
| | |   Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
| * | update the README with required RPM packages (Alfredo Deza, 2013-08-26, 1 file, -1/+27)
| | |   Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
* | | WBThrottle: use fdatasync instead of fsync (Samuel Just, 2013-08-26, 1 file, -1/+1)
| | |   Backport: dumpling
| | |   Signed-off-by: Samuel Just <sam.just@inktank.com>
| | |   Reviewed-by: Sage Weil <sage@inktank.com>
* | | FileStore: add config option to disable the wbthrottle (Samuel Just, 2013-08-26, 2 files, -1/+3)
|/ /
| |   Backport: dumpling
| |   Signed-off-by: Samuel Just <sam.just@inktank.com>
| |   Reviewed-by: Sage Weil <sage@inktank.com>
* | Merge branch 'sleinen' (Josh Durgin, 2013-08-25, 1 file, -1/+1)
|\ \
| | |   Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
| * | Improve warning message when there are unfound objects, but probing (Simon Leinen, 2013-08-25, 1 file, -1/+1)
| | |   hasn't finished yet.
| | |   Signed-off-by: Simon Leinen <simon.leinen@switch.ch>
* | | Merge remote-tracking branch 'gh/next' (Sage Weil, 2013-08-24, 9 files, -11/+41)
|\ \ \
| * \ \ Merge pull request #531 from dmick/wip-6099 (Sage Weil, 2013-08-23, 1 file, -0/+13)
| |\ \ \
| | | | |   ceph_rest_api.py: create own default for log_file
| | | | |   Reviewed-by: Sage Weil <sage@inktank.com>
| | * | | ceph_rest_api.py: create own default for log_file (Dan Mick, 2013-08-23, 1 file, -0/+13)
| | | | |   common/config thinks the default log_file for non-daemons should be "". Override that so that the default is /var/log/ceph/{cluster}-{name}.{pid}.log since ceph-rest-api is more of a daemon than a client.
| | | | |   Fixes: #6099
| | | | |   Backport: dumpling
| | | | |   Signed-off-by: Dan Mick <dan.mick@inktank.com>
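The override described in the log_file commit above might look something like this sketch. It is not the code from ceph_rest_api.py: the function name `default_log_file` and its parameters are hypothetical, and only the path template comes from the commit message.

```python
import os

# Path template quoted in the commit message.
LOG_FILE_TEMPLATE = "/var/log/ceph/{cluster}-{name}.{pid}.log"


def default_log_file(cluster, name, configured=""):
    """Pick the log file for a daemon-like process.

    An explicitly configured value wins. An empty string (the default that
    common/config applies to non-daemons, per the commit message) is
    replaced by a per-process path built from the template.
    """
    if configured:
        return configured
    return LOG_FILE_TEMPLATE.format(cluster=cluster, name=name, pid=os.getpid())
```

For example, `default_log_file("ceph", "client.restapi")` yields a path under /var/log/ceph ending in the current process id followed by `.log`; the cluster and entity names used here are illustrative only.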
| * | | | Merge pull request #535 from ceph/wip-readdir-r-sucks (Yehuda Sadeh, 2013-08-23, 4 files, -8/+11)
| |\ \ \ \
| | | | | |   Fix readdir_r invocation
| | | | | |   Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
| | * | | | os: make readdir_r buffers larger (Sage Weil, 2013-08-23, 2 files, -4/+5)
| | | | | |   PATH_MAX isn't quite big enough.
| | | | | |   Backport: dumpling, cuttlefish, bobtail
| | | | | |   Signed-off-by: Sage Weil <sage@inktank.com>
| | * | | | os: fix readdir_r buffer size (Sage Weil, 2013-08-23, 2 files, -4/+6)
| |/ / / /
| | | | |   The buffer needs to be big or else we'll walk all over the stack.
| | | | |   Backport: dumpling, cuttlefish, bobtail
| | | | |   Signed-off-by: Sage Weil <sage@inktank.com>
| * | | | mon/Paxos: fix another uncommitted value corner case (Sage Weil, 2013-08-23, 1 file, -1/+10)
| | | | |   It is possible that we begin the paxos recovery with an uncommitted value for, say, commit 100. During last/collect we discover 100 has been committed already. But also, another node provides an uncommitted value for 101 with the same pn. Currently, we refuse to learn it, because the pn is not strictly greater than our current uncommitted pn... even though it is the next last_committed+1 value that we need.
| | | | |   There are two possible fixes here:
| | | | |    - make this a >= as we can accept newer values from the same pn.
| | | | |    - discard our uncommitted value metadata when we commit the value.
| | | | |   Let's do both!
| | | | |   Fixes: #6090
| | | | |   Signed-off-by: Sage Weil <sage@inktank.com>
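The corner case in the Paxos commit above can be modelled with a small sketch. This is a simplified illustration, not the mon/Paxos code: the class `RecoveryState` and its method names are hypothetical, and the real recovery logic tracks considerably more state.

```python
class RecoveryState:
    """Toy model of the uncommitted-value bookkeeping during Paxos recovery."""

    def __init__(self, last_committed, uncommitted_v=None,
                 uncommitted_pn=0, uncommitted_value=None):
        self.last_committed = last_committed
        self.uncommitted_v = uncommitted_v
        self.uncommitted_pn = uncommitted_pn
        self.uncommitted_value = uncommitted_value

    def commit(self, version):
        """Record a commit and drop stale uncommitted metadata (second fix)."""
        self.last_committed = version
        if self.uncommitted_v is not None and self.uncommitted_v <= version:
            self.uncommitted_v = None
            self.uncommitted_pn = 0
            self.uncommitted_value = None

    def offer_uncommitted(self, version, pn, value):
        """Learn a peer's uncommitted value if it is the one we need next.

        The pn check is >= rather than > (first fix), so a newer value
        proposed under the same pn is accepted.
        """
        if version != self.last_committed + 1:
            return False
        if pn < self.uncommitted_pn:
            return False
        self.uncommitted_v = version
        self.uncommitted_pn = pn
        self.uncommitted_value = value
        return True
```

Replaying the scenario from the message: start with `RecoveryState(99, 100, 500, b"a")`, call `commit(100)`, and then `offer_uncommitted(101, 500, b"b")` is accepted, because either fix alone is enough to let the value for 101 through.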
| * | | | rgw: bucket meta remove don't overwrite entry point first (Yehuda Sadeh, 2013-08-23, 1 file, -1/+6)
| | | | |   Fixes: #6056
| | | | |   When removing a bucket metadata entry we first unlink the bucket and then we remove the bucket entrypoint object. Originally when unlinking the bucket we first overwrote the bucket entrypoint entry marking it as 'unlinked'. However, this is not really needed as we're just about to remove it. The original version triggered a bug, as we needed to propagate the new header version first (which we didn't do, so the subsequent bucket removal failed).
| | | | |   Reviewed-by: Greg Farnum <greg@inktank.com>
| | | | |   Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
| * | | | ceph-disk: specify the filetype when mounting (Alfredo Deza, 2013-08-23, 1 file, -0/+1)
| | | | |   Signed-off-by: Alfredo Deza <alfredo.deza@inktank.com>
| | | | |   Reviewed-by: Sage Weil <sage@inktank.com>
| * | | | Merge pull request #532 from dmick/next (Sage Weil, 2013-08-22, 1 file, -1/+0)
| |\ \ \ \
| | | | | |   PGMonitor: pg dump_stuck should respect --format (plain works fine)
| | | | | |   Reviewed-by: Sage Weil <sage@inktank.com>
| | * | | | PGMonitor: pg dump_stuck should respect --format (plain works fine) (Dan Mick, 2013-08-22, 1 file, -1/+0)
| | | | | |   Signed-off-by: Dan Mick <dan.mick@inktank.com>
| * | | | | QA: Compile fsstress if missing on machine. (Sandon Van Ness, 2013-08-22, 1 file, -0/+15)
| |/ / / /
| | | | |   Some distros lack ltp-kernel packages and all we need is fsstress. This just modifies the shell script to download/compile fsstress from source and copy it to the right location if it doesn't currently exist where it is expected. It is a very small/quick compile and currently only SLES and debian do not have it already.
| | | | |   Reviewed-by: Sage Weil <sage@inktank.com>
| | | | |   Signed-off-by: Sandon Van Ness <sandon@inktank.com>