summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorJoao Eduardo Luis <joao.luis@inktank.com>2012-12-20 18:25:14 +0000
committerJoao Eduardo Luis <joao.luis@inktank.com>2012-12-20 19:19:27 +0000
commitc72c0bb1e301d181ac40fda7a727dfe7446e2b89 (patch)
treed73f2c36e29b9808f9f9c55d7a669f11113e26a7
parent1c5a9de5c8367343d31099c97b3bb4dec7a26283 (diff)
downloadceph-c72c0bb1e301d181ac40fda7a727dfe7446e2b89.tar.gz
doc: add-or-rm-mons.rst: Add 'Changing Monitor's IPs' section
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
-rw-r--r--doc/rados/operations/add-or-rm-mons.rst188
1 files changed, 187 insertions, 1 deletions
diff --git a/doc/rados/operations/add-or-rm-mons.rst b/doc/rados/operations/add-or-rm-mons.rst
index d2c7d24cf3e..b3952159979 100644
--- a/doc/rados/operations/add-or-rm-mons.rst
+++ b/doc/rados/operations/add-or-rm-mons.rst
@@ -54,7 +54,7 @@ SSH keys`_, `install Ruby`_ and `install the Chef client`_ on your host. See
.. _Installing Chef: ../../install/chef
.. _install Ruby: ../../install/chef#installruby
-.. _adding-mon:
+.. _Adding a Monitor (Manual):
Adding a Monitor (Manual)
-------------------------
@@ -131,6 +131,8 @@ PAXOS to establish consensus about the master cluster map. You must have
a sufficient number of monitors to establish a quorum for consensus about
the cluster map.
+.. _Removing a Monitor (Manual):
+
Removing a Monitor (Manual)
---------------------------
@@ -200,3 +202,187 @@ a cluster that has placement groups that are persistently not ``active + clean``
ceph-mon -i {mon-id} --inject-monmap {map-path}
#for example
ceph-mon -i a --inject-monmap /etc/surviving_map
+
+
+.. _Changing a Monitor's IP address:
+
+Changing a Monitor's IP address
+===============================
+
+.. important:: Existing monitors are not supposed to change their IPs.
+
+Ceph has strict requirements when it comes to the monitors, given that these
+are a critical component and need to maintain a quorum for the whole system
+to properly work. For a quorum to be established, the monitors need to find
+each other.
+
+A common misconception is that the monitors will infer the location of the other
+monitors in the cluster by reading ``ceph.conf``, but this couldn't be farthest
+from the truth. Clients and the other daemons will in fact use ``ceph.conf``
+to discover the monitors, but not the monitors between themselves. If you
+refer to `Adding a Monitor (Manual)`_ you will see that you need to obtain the
+current cluster monmap when creating a new monitor, as it is one of the required
+arguments of ``ceph-mon -i {mon-id} --mkfs``.
+
+In the following sections we will explain a bit about the monitor's
+consistency requirements, and draw a picture on why it's a bad idea to change
+the monitor's IP addresses. Finally, we will present a couple of safe ways to
+change a monitor's IP address in case of dire need.
+
+
+Consistency Requirements
+------------------------
+
+The monitors will always resort to the monmap stashed on their store
+when finding the other monitors in the cluster. This approach avoids silly
+mistakes that would be likely to happen if we were using ``ceph.conf``
+directly, such as typos when specifying a monitor address or port that could
+break the cluster. Above all, it confers the monitors a strict guarantee that
+the contents of their monmap are the ones that the cluster agrees on, and this
+consensus is crucial on Ceph, specially considering that the monitors are
+responsible for sharing the monmap with clients and other components in the
+cluster.
+
+As with any other updates on the monitor, changes to the monmap always run
+through a distributed consensus algorithm called Paxos. Each update, such
+as adding or removing a monitor, is agreed among all the monitors in the quorum,
+ensuring that all end up with the same version. Furthermore, updates can be
+seen as incremental, meaning that the monitors will hold the latest agreed upon
+version and a reasonable set of previous versions, allowing a new monitor (or a
+monitor that was down for a considerable amount of time) to catch up with the
+current cluster state. This is how serious map consistency is considered on Ceph.
+
+Now for the sake of the argument, say that we were reading the other monitors'
+location (i.e., ip:port) from ``ceph.conf``. Not only would we have to
+accordingly change the configuration file on all the nodes each time we wanted
+to add a new monitor, to reflect the new monitor, but we would be in a world of
+pain if we happened to make a mistake that could be as simple as misspelling the
+monitor's name or port on one of the nodes. Given that the monitors' strict
+requirements for allowing other monitors in the cluster, if one of the existing
+monitors were to be unable to correctly identify another monitor, we could very
+well end up without quorum; worse, we could end up with some anomalous state in
+which most monitors recognize one other monitor as being in the cluster, while
+others don't. The last thing one wants in a cluster is allowing room for chaos
+on a critical component such as the monitors.
+
+
+Changing a Monitor's IP address (The Right Way)
+-----------------------------------------------
+
+We have established that changing the IPs on the ``ceph.conf`` is not an option
+from the monitor's point-of-view. It's still required, of course, but only so
+that clients and the other daemons know where to contact the monitors. From the
+monitor's perspective however, not so much.
+
+The right way of changing a monitor's IP address comes in the form of adding a
+new monitor with the new IP address (as described in `Adding a Monitor (Manual)`_),
+prior to removing the monitor we want to get rid of. This way will ensure that
+not only the remaining monitors know where to contact the new monitor, but it will
+also maintain availability.
+
+For instance, lets assume there are three monitors in place, such as ::
+
+ [mon.a]
+ host = host01
+ addr = 10.0.0.1:6789
+ [mon.b]
+ host = host02
+ addr = 10.0.0.2:6789
+ [mon.c]
+ host = host03
+ addr = 10.0.0.3:6789
+
+Now say that we want to change ``mon.c`` to ``host04``, which has the IP address
+``10.0.0.4``. We would follow the steps on `Adding a Monitor (Manual)`_, adding a
+new monitor ``mon.d``, and then would remove ``mon.c`` as described on
+`Removing a Monitor (Manual)`_. One should be aware that ``mon.d`` must be
+running before removing ``mon.c``, or quorum will be broken. Moving all three
+monitors would thus require repeating this process as many times as needed.
+
+
+Changing a Monitor's IP address (The Messy Way)
+-----------------------------------------------
+
+There may come a time when the monitors must be moved to a different network,
+a different part of the datacenter or a different datacenter altogether. While
+it is possible to do it, the process becomes a bit more hazardous.
+
+In such a case, the solution is to generate a new monmap with updated IP
+addresses for all the monitors in the cluster, and inject the new map on each
+individual monitor. This is not the most user-friendly approach, but we do not
+expect this to be something that needs to be done every other week. As it is
+clearly stated on the top of this section, monitors are not supposed to change
+IP addresses.
+
+Take the previous monitor configuration and say that we want to move all the
+monitors from the ``10.0.0.x`` range to ``10.1.0.x``, and that these networks
+are unable to communicate. Here is what we should do in such a remote
+scenario:
+
+
+#. Retrieve the monitor map, where ``{tmp}`` is the path to
+ the retrieved monitor map, and ``{filename}`` is the name of the file
+ containing the retrieved monitor monitor map. ::
+
+ ceph mon getmap -o {tmp}/{filename}
+
+#. These should be the contents of the monmap. ::
+
+ $ monmaptool --print {tmp}/{filename}
+
+ monmaptool: monmap file {tmp}/{filename}
+ epoch 1
+ fsid 224e376d-c5fe-4504-96bb-ea6332a19e61
+ last_changed 2012-12-17 02:46:41.591248
+ created 2012-12-17 02:46:41.591248
+ 0: 10.0.0.1:6789/0 mon.a
+ 1: 10.0.0.2:6789/0 mon.b
+ 2: 10.0.0.3:6789/0 mon.c
+
+#. Remove the existing monitors. ::
+
+ $ monmaptool --rm a --rm b --rm c {tmp}/{filename}
+
+ monmaptool: monmap file {tmp}/{filename}
+ monmaptool: removing a
+ monmaptool: removing b
+ monmaptool: removing c
+ monmaptool: writing epoch 1 to {tmp}/{filename} (0 monitors)
+
+#. Add the new monitor locations. ::
+
+ $ monmaptool --add a 10.1.0.1:6789 --add b 10.1.0.2:6789 --add c 10.1.0.3:6789 {tmp}/{filename}
+
+ monmaptool: monmap file {tmp}/{filename}
+ monmaptool: writing epoch 1 to {tmp}/{filename} (3 monitors)
+
+#. Check new contents. ::
+
+ $ monmaptool --print {tmp}/{filename}
+
+ monmaptool: monmap file {tmp}/{filename}
+ epoch 1
+ fsid 224e376d-c5fe-4504-96bb-ea6332a19e61
+ last_changed 2012-12-17 02:46:41.591248
+ created 2012-12-17 02:46:41.591248
+ 0: 10.1.0.1:6789/0 mon.a
+ 1: 10.1.0.2:6789/0 mon.b
+ 2: 10.1.0.3:6789/0 mon.c
+
+At this point, it is assumed that either the servers have physically moved to
+the new location, or that the monitor's stores were. That said, the next step
+is to make the changed monmap available to the monitors and inject it to each
+individual monitor.
+
+#. First, make sure to stop all your monitors. Injection must be done while
+ the daemon is not running.
+
+#. Inject the monmap. ::
+
+ ceph-mon -i {mon-id} --inject-monmap {tmp}/{filename}
+
+#. Restart the monitors.
+
+After this step, your monitors should be able to live forever happy in their
+new homes.
+