author     Bruce Momjian <bruce@momjian.us>    2006-11-17 16:38:44 +0000
committer  Bruce Momjian <bruce@momjian.us>    2006-11-17 16:38:44 +0000
commit     e1693e514c197b6ddd27b59678f78b9385ac0539 (patch)
tree       556b481a9bd656ed1e9d066b7f50f5a31dc1bfb0 /doc/src/sgml/high-availability.sgml
parent     cc9698254c9cff7c36ed1e7ea78dad51d87616b1 (diff)
download   postgresql-e1693e514c197b6ddd27b59678f78b9385ac0539.tar.gz
Rename replication section "High Availability and Load Balancing".
Diffstat (limited to 'doc/src/sgml/high-availability.sgml')
-rw-r--r--  doc/src/sgml/high-availability.sgml  249
1 files changed, 249 insertions, 0 deletions
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
new file mode 100644
index 0000000000..56733e38e8
--- /dev/null
+++ b/doc/src/sgml/high-availability.sgml
@@ -0,0 +1,249 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.1 2006/11/17 16:38:44 momjian Exp $ -->
+
+<chapter id="high-availability">
+ <title>High Availability and Load Balancing</title>
+
+ <indexterm><primary>high availability</></>
+ <indexterm><primary>failover</></>
+ <indexterm><primary>replication</></>
+ <indexterm><primary>load balancing</></>
+ <indexterm><primary>clustering</></>
+ <indexterm><primary>data partitioning</></>
+
+ <para>
+ Database servers can work together to allow a second server to
+ take over quickly if the primary server fails (high
+ availability), or to allow several computers to serve the same
+ data (load balancing). Ideally, database servers could work
+ together seamlessly. Web servers serving static web pages can
+ be combined quite easily by merely load-balancing web requests
+ to multiple machines. In fact, read-only database servers can
+ be combined relatively easily too. Unfortunately, most database
+ servers have a read/write mix of requests, and read/write servers
+ are much harder to combine. This is because, although read-only
+ data needs to be placed on each server only once, a write to any
+ server has to be propagated to all servers so that future read
+ requests to those servers return consistent results.
+ </para>
+
+ <para>
+ This synchronization problem is the fundamental difficulty for
+ servers working together. Because there is no single solution
+ that eliminates the impact of the sync problem for all use cases,
+ there are multiple solutions. Each solution addresses this
+ problem in a different way, and minimizes its impact for a specific
+ workload.
+ </para>
+
+ <para>
+ Some solutions deal with synchronization by allowing only one
+ server to modify the data. Servers that can modify data are
+ called read/write or "master" servers. Servers that can reply
+ to read-only queries are called "slave" servers. Servers that
+ cannot be accessed until they are changed to master servers are
+ called "standby" servers.
+ </para>
+
+ <para>
+ Some failover and load balancing solutions are synchronous, meaning that
+ a data-modifying transaction is not considered committed until all
+ servers have committed the transaction. This guarantees that a failover
+ will not lose any data and that all load-balanced servers will return
+ consistent results with no propagation delay. Asynchronous updating has
+ a small delay between the time of commit and its propagation to the
+ other servers, opening the possibility that some transactions might be
+ lost in the switch to a backup server, and that load balanced servers
+ might return slightly stale results. Asynchronous communication is used
+ when synchronous would be too slow.
+ </para>
+
+ <para>
+ Solutions can also be categorized by their granularity. Some solutions
+ can deal only with an entire database server, while others allow control
+ at the per-table or per-database level.
+ </para>
+
+ <para>
+ Performance must be considered in any failover or load balancing
+ choice. There is usually a tradeoff between functionality and
+ performance. For example, a full synchronous solution over a slow
+ network might cut performance by more than half, while an asynchronous
+ one might have a minimal performance impact.
+ </para>
+
+ <para>
+ The remainder of this section outlines various failover, replication,
+ and load balancing solutions.
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>Shared Disk Failover</term>
+ <listitem>
+
+ <para>
+ Shared disk failover avoids synchronization overhead by having only one
+ copy of the database. It uses a single disk array that is shared by
+ multiple servers. If the main database server fails, the standby server
+ is able to mount and start the database as though it were recovering from
+ a database crash. This allows rapid failover with no data loss.
+ </para>
+
+ <para>
+ Shared hardware functionality is common in network storage
+ devices. Using a network file system is also possible, though
+ care must be taken that the file system has full POSIX behavior.
+ One significant limitation of this method is that if the shared
+ disk array fails or becomes corrupt, the primary and standby
+ servers are both nonfunctional. Another issue is that the
+ standby server should never access the shared storage while
+ the primary server is running.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Warm Standby Using Point-In-Time Recovery</term>
+ <listitem>
+
+ <para>
+ A warm standby server (see <xref linkend="warm-standby">) can
+ be kept current by reading a stream of write-ahead log (WAL)
+ records. If the main server fails, the warm standby contains
+ almost all of the data of the main server, and can be quickly
+ made the new master database server. This is asynchronous and
+ can only be done for the entire database server.
+ </para>
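+
+ <para>
+ As an illustrative sketch only, WAL files might be shipped to the
+ standby with <varname>archive_command</> on the primary and replayed
+ with <varname>restore_command</> on the standby; the paths below are
+ placeholders, and a production warm standby needs a
+ <varname>restore_command</> that waits for the next WAL file to
+ arrive rather than failing immediately:
+ </para>
+
+ <programlisting>
+ # postgresql.conf on the primary: copy each completed WAL segment
+ # to storage the standby can read (the path is a placeholder)
+ archive_command = 'cp %p /mnt/server/wal_archive/%f'
+
+ # recovery.conf on the standby: fetch and replay archived segments;
+ # a real setup replaces plain cp with a script that waits for the
+ # next segment to appear
+ restore_command = 'cp /mnt/server/wal_archive/%f %p'
+ </programlisting>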
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Master/Slave Replication</term>
+ <listitem>
+
+ <para>
+ A master/slave replication setup sends all data modification
+ queries to the master server. The master server asynchronously
+ sends data changes to the slave server. The slave can answer
+ read-only queries while the master server is running. The
+ slave server is ideal for data warehouse queries.
+ </para>
+
+ <para>
+ Slony-I is an example of this type of replication, with per-table
+ granularity, and support for multiple slaves. Because it
+ updates the slave server asynchronously (in batches), data
+ loss is possible during failover.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Query Broadcasting</term>
+ <listitem>
+
+ <para>
+ In query broadcasting, a program intercepts every SQL query
+ and sends it to all servers. Each server operates independently.
+ Read-only queries can be sent to a single server because there
+ is no need for all servers to process them.
+ </para>
+
+ <para>
+ One limitation of this solution is that functions like
+ <function>random()</>, <function>CURRENT_TIMESTAMP</>, and
+ sequences can have different values on different servers. This
+ is because each server operates independently, and because SQL
+ queries are broadcast (and not actual modified rows). If this
+ is unacceptable, applications must query such values from a
+ single server and then use those values in write queries.
+ Also, care must be taken that all transactions either commit
+ or abort on all servers, perhaps using two-phase commit (<xref
+ linkend="sql-prepare-transaction"
+ endterm="sql-prepare-transaction-title"> and <xref
+ linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">.
+ Pgpool is an example of this type of replication.
+ </para>
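+
+ <para>
+ As a hypothetical sketch (the sequence and table names are only
+ for illustration), an application can fetch such a value once from
+ a single designated server and then embed the result in the
+ statement that is broadcast to all servers:
+ </para>
+
+ <programlisting>
+ -- Run on one designated server only:
+ SELECT nextval('order_id_seq');
+ -- Suppose this returns 1234.
+
+ -- Broadcast to every server, using the fetched value instead of
+ -- calling nextval() or CURRENT_TIMESTAMP on each server:
+ INSERT INTO orders (id, created_at)
+     VALUES (1234, '2006-11-17 16:38:44+00');
+ </programlisting>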
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Multi-Master Replication Using Clustering</term>
+ <listitem>
+
+ <para>
+ In clustering, each server can accept write requests, and
+ modified data is transmitted from the original server to every
+ other server before each transaction commits. Heavy write
+ activity can cause excessive locking, leading to poor performance.
+ In fact, write performance is often worse than that of a single
+ server. Read requests can be sent to any server. Clustering
+ is best for mostly read workloads, though its big advantage
+ is that any server can accept write requests, so there is
+ no need to partition workloads between master and slave servers.
+ Because modified data, rather than queries, is sent from one
+ server to another, non-deterministic functions like
+ <function>random()</> are not a problem.
+ </para>
+
+ <para>
+ Clustering is implemented by <productname>Oracle</> in their
+ <productname><acronym>RAC</></> product. <productname>PostgreSQL</>
+ does not offer this type of load balancing, though
+ <productname>PostgreSQL</> two-phase commit (<xref
+ linkend="sql-prepare-transaction"
+ endterm="sql-prepare-transaction-title"> and <xref
+ linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">)
+ can be used to implement this in application code or middleware.
+ </para>
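+
+ <para>
+ A minimal sketch of how application code or middleware might use
+ two-phase commit to keep two servers consistent is shown below;
+ the table name and transaction identifier are placeholders:
+ </para>
+
+ <programlisting>
+ -- Run the same modification inside a transaction on every server,
+ -- then prepare it under an agreed-upon identifier:
+ BEGIN;
+ UPDATE accounts SET balance = balance - 100.00 WHERE id = 1;
+ PREPARE TRANSACTION 'txn_42';
+
+ -- Only after every server has prepared successfully:
+ COMMIT PREPARED 'txn_42';
+
+ -- If any server fails to prepare, abort on the others instead:
+ -- ROLLBACK PREPARED 'txn_42';
+ </programlisting>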
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Data Partitioning</term>
+ <listitem>
+
+ <para>
+ Data partitioning splits tables into data sets. Each set can
+ be modified by only one server. For example, data can be
+ partitioned by offices, e.g. London and Paris, with a server
+ in each office. If queries combining London and Paris data
+ are necessary, an application can query both servers, or
+ master/slave replication can be used to keep a read-only copy
+ of the other office's data on each server.
+ </para>
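+
+ <para>
+ As a hypothetical illustration, each office's server might own a
+ table holding only its own rows; the schema below is only a sketch:
+ </para>
+
+ <programlisting>
+ -- On the London server, which owns only London data:
+ CREATE TABLE customers_london (
+     id     integer PRIMARY KEY,
+     name   text NOT NULL,
+     office text NOT NULL CHECK (office = 'London')
+ );
+
+ -- The Paris server holds an analogous customers_paris table; a
+ -- read-only copy of the other office's table can be maintained
+ -- with master/slave replication for combined reporting.
+ </programlisting>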
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Clustering For Parallel Query Execution</term>
+ <listitem>
+
+ <para>
+ This allows multiple servers to work concurrently on a single
+ query. One possible approach is to split the data among the
+ servers, have each server execute its part of the query, and
+ send the partial results to a central server to be combined
+ and returned to the user. There currently is no
+ <productname>PostgreSQL</> open source solution for this.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Commercial Solutions</term>
+ <listitem>
+
+ <para>
+ Because <productname>PostgreSQL</> is open source and easily
+ extended, a number of companies have taken <productname>PostgreSQL</>
+ and created commercial closed-source solutions with unique
+ failover, replication, and load balancing capabilities.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+</chapter>