diff options
| author | Bruce Momjian <bruce@momjian.us> | 2006-11-17 16:38:44 +0000 |
|---|---|---|
| committer | Bruce Momjian <bruce@momjian.us> | 2006-11-17 16:38:44 +0000 |
| commit | e1693e514c197b6ddd27b59678f78b9385ac0539 (patch) | |
| tree | 556b481a9bd656ed1e9d066b7f50f5a31dc1bfb0 /doc/src/sgml/high-availability.sgml | |
| parent | cc9698254c9cff7c36ed1e7ea78dad51d87616b1 (diff) | |
| download | postgresql-e1693e514c197b6ddd27b59678f78b9385ac0539.tar.gz | |
Rename replication section "High Availability and Load Balancing".
Diffstat (limited to 'doc/src/sgml/high-availability.sgml')
| -rw-r--r-- | doc/src/sgml/high-availability.sgml | 249 |
1 files changed, 249 insertions, 0 deletions
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml new file mode 100644 index 0000000000..56733e38e8 --- /dev/null +++ b/doc/src/sgml/high-availability.sgml @@ -0,0 +1,249 @@ +<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.1 2006/11/17 16:38:44 momjian Exp $ --> + +<chapter id="high-availability"> + <title>High Availability and Load Balancing</title> + + <indexterm><primary>high availability</></> + <indexterm><primary>failover</></> + <indexterm><primary>replication</></> + <indexterm><primary>load balancing</></> + <indexterm><primary>clustering</></> + <indexterm><primary>data partitioning</></> + + <para> + Database servers can work together to allow a second server to + quickly take over quickly if the primary server fails (high + availability), or to allow several computers to serve the same + data (load balancing). Ideally, database servers could work + together seamlessly. Web servers serving static web pages can + be combined quite easily by merely load-balancing web requests + to multiple machines. In fact, read-only database servers can + be combined relatively easily too. Unfortunately, most database + servers have a read/write mix of requests, and read/write servers + are much harder to combine. This is because though read-only + data needs to be placed on each server only once, a write to any + server has to be propagated to all servers so that future read + requests to those servers return consistent results. + </para> + + <para> + This synchronization problem is the fundamental difficulty for + servers working together. Because there is no single solution + that eliminates the impact of the sync problem for all use cases, + there are multiple solutions. Each solution addresses this + problem in a different way, and minimizes its impact for a specific + workload. + </para> + + <para> + Some solutions deal with synchronization by allowing only one + server to modify the data. Servers that can modify data are + called read/write or "master" servers. Servers that can reply + to read-only queries are called "slave" servers. Servers that + cannot be accessed until they are changed to master servers are + called "standby" servers. + </para> + + <para> + Some failover and load balancing solutions are synchronous, meaning that + a data-modifying transaction is not considered committed until all + servers have committed the transaction. This guarantees that a failover + will not lose any data and that all load-balanced servers will return + consistent results with no propagation delay. Asynchronous updating has + a small delay between the time of commit and its propagation to the + other servers, opening the possibility that some transactions might be + lost in the switch to a backup server, and that load balanced servers + might return slightly stale results. Asynchronous communication is used + when synchronous would be too slow. + </para> + + <para> + Solutions can also be categorized by their granularity. Some solutions + can deal only with an entire database server, while others allow control + at the per-table or per-database level. + </para> + + <para> + Performance must be considered in any failover or load balancing + choice. There is usually a tradeoff between functionality and + performance. For example, a full synchronous solution over a slow + network might cut performance by more than half, while an asynchronous + one might have a minimal performance impact. + </para> + + <para> + The remainder of this section outlines various failover, replication, + and load balancing solutions. + </para> + + <variablelist> + + <varlistentry> + <term>Shared Disk Failover</term> + <listitem> + + <para> + Shared disk failover avoids synchronization overhead by having only one + copy of the database. It uses a single disk array that is shared by + multiple servers. If the main database server fails, the standby server + is able to mount and start the database as though it was recovering from + a database crash. This allows rapid failover with no data loss. + </para> + + <para> + Shared hardware functionality is common in network storage + devices. Using a network file system is also possible, though + care must be taken that the file system has full POSIX behavior. + One significant limitation of this method is that if the shared + disk array fails or becomes corrupt, the primary and standby + servers are both nonfunctional. Another issue is that the + standby server should never access the shared storage while + the primary server is running. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Warm Standby Using Point-In-Time Recovery</term> + <listitem> + + <para> + A warm standby server (see <xref linkend="warm-standby">) can + be kept current by reading a stream of write-ahead log (WAL) + records. If the main server fails, the warm standby contains + almost all of the data of the main server, and can be quickly + made the new master database server. This is asynchronous and + can only be done for the entire database server. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Master/Slave Replication</term> + <listitem> + + <para> + A master/slave replication setup sends all data modification + queries to the master server. The master server asynchronously + sends data changes to the slave server. The slave can answer + read-only queries while the master server is running. The + slave server is ideal for data warehouse queries. + </para> + + <para> + Slony-I is an example of this type of replication, with per-table + granularity, and support for multiple slaves. Because it + updates the slave server asynchronously (in batches), there is + possible data loss during fail over. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Query Broadcasting</term> + <listitem> + + <para> + In query broadcasting, a program intercepts every SQL query + and sends it to all servers. Each server operates independently. + Read-only queries can be sent to a single server because there + is no need for all servers to process it. + </para> + + <para> + One limitation of this solution is that functions like + <function>random()</>, <function>CURRENT_TIMESTAMP</>, and + sequences can have different values on different servers. This + is because each server operates independently, and because SQL + queries are broadcast (and not actual modified rows). If this + is unacceptable, applications must query such values from a + single server and then use those values in write queries. + Also, care must be taken that all transactions either commit + or abort on all servers, perhaps using two-phase commit (<xref + linkend="sql-prepare-transaction" + endterm="sql-prepare-transaction-title"> and <xref + linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">. + Pgpool is an example of this type of replication. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Multi-Master Replication Using Clustering</term> + <listitem> + + <para> + In clustering, each server can accept write requests, and + modified data is transmitted from the original server to every + other server before each transaction commits. Heavy write + activity can cause excessive locking, leading to poor performance. + In fact, write performance is often worse than that of a single + server. Read requests can be sent to any server. Clustering + is best for mostly read workloads, though its big advantage + is that any server can accept write requests — there is + no need to partition workloads between master and slave servers, + and because the changes are sent from one server to another, + there is not a problem with non-deterministic functions like + <function>random()</>. + </para> + + <para> + Clustering is implemented by <productname>Oracle</> in their + <productname><acronym>RAC</></> product. <productname>PostgreSQL</> + does not offer this type of load balancing, though + <productname>PostgreSQL</> two-phase commit (<xref + linkend="sql-prepare-transaction" + endterm="sql-prepare-transaction-title"> and <xref + linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">) + can be used to implement this in application code or middleware. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Data Partitioning</term> + <listitem> + + <para> + Data partitioning splits tables into data sets. Each set can + be modified by only one server. For example, data can be + partitioned by offices, e.g. London and Paris, with a server + in each office. If queries combining London and Paris data + are necessary, an application can query both servers, or + master/slave replication can be used to keep a read-only copy + of the other office's data on each server. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Clustering For Parallel Query Execution</term> + <listitem> + + <para> + This allows multiple servers to work concurrently on a single + query. One possible way this could work is for the data to be + split among servers and for each server to execute its part of + the query and results sent to a central server to be combined + and returned to the user. There currently is no + <productname>PostgreSQL</> open source solution for this. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term>Commercial Solutions</term> + <listitem> + + <para> + Because <productname>PostgreSQL</> is open source and easily + extended, a number of companies have taken <productname>PostgreSQL</> + and created commercial closed-source solutions with unique + failover, replication, and load balancing capabilities. + </para> + </listitem> + </varlistentry> + + </variablelist> + +</chapter> |
