Diffstat (limited to 'doc/src/sgml/wal.sgml')
 doc/src/sgml/wal.sgml | 60 ++++++++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 76f1fdcf3b..2c5ce01112 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.60 2009/11/28 16:21:31 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.61 2010/02/03 17:25:06 momjian Exp $ -->
<chapter id="wal">
<title>Reliability and the Write-Ahead Log</title>
@@ -42,9 +42,9 @@
<para>
Next, there might be a cache in the disk drive controller; this is
particularly common on <acronym>RAID</> controller cards. Some of
- these caches are <firstterm>write-through</>, meaning writes are passed
- along to the drive as soon as they arrive. Others are
- <firstterm>write-back</>, meaning data is passed on to the drive at
+ these caches are <firstterm>write-through</>, meaning writes are sent
+ to the drive as soon as they arrive. Others are
+ <firstterm>write-back</>, meaning data is sent to the drive at
some later time. Such caches can be a reliability hazard because the
memory in the disk controller cache is volatile, and will lose its
contents in a power failure. Better controller cards have
@@ -61,7 +61,7 @@
particularly likely to have write-back caches that will not survive a
power failure. To check write caching on <productname>Linux</> use
<command>hdparm -I</>; it is enabled if there is a <literal>*</> next
- to <literal>Write cache</>. <command>hdparm -W</> to turn off
+ to <literal>Write cache</>; use <command>hdparm -W</> to turn off
write caching. On <productname>FreeBSD</> use
<application>atacontrol</>. (For SCSI disks use <ulink
url="http://sg.torque.net/sg/sdparm.html"><application>sdparm</></ulink>
@@ -79,10 +79,10 @@
</para>
<para>
- When the operating system sends a write request to the disk hardware,
+ When the operating system sends a write request to the storage hardware,
there is little it can do to make sure the data has arrived at a truly
non-volatile storage area. Rather, it is the
- administrator's responsibility to be sure that all storage components
+ administrator's responsibility to make certain that all storage components
ensure data integrity. Avoid disk controllers that have non-battery-backed
write caches. At the drive level, disable write-back caching if the
drive cannot guarantee the data will be written before shutdown.
@@ -100,11 +100,11 @@
to power loss at any time, meaning some of the 512-byte sectors were
written, and others were not. To guard against such failures,
<productname>PostgreSQL</> periodically writes full page images to
- permanent storage <emphasis>before</> modifying the actual page on
+ permanent WAL storage <emphasis>before</> modifying the actual page on
disk. By doing this, during crash recovery <productname>PostgreSQL</> can
restore partially-written pages. If you have a battery-backed disk
controller or file-system software that prevents partial page writes
- (e.g., ReiserFS 4), you can turn off this page imaging by using the
+ (e.g., ZFS), you can turn off this page imaging by turning off the
<xref linkend="guc-full-page-writes"> parameter.
</para>
</sect1>
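
As a sketch of the configuration change described above, disabling page
imaging is a single postgresql.conf setting; this is only safe when the
storage stack (e.g., a battery-backed controller or ZFS) already prevents
partial page writes:

    # postgresql.conf -- only safe if partial (torn) page writes
    # cannot occur:
    full_page_writes = off
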
@@ -140,12 +140,12 @@
<tip>
<para>
Because <acronym>WAL</acronym> restores database file
- contents after a crash, journaled filesystems are not necessary for
+ contents after a crash, journaled file systems are not necessary for
reliable storage of the data files or WAL files. In fact, journaling
overhead can reduce performance, especially if journaling
causes file system <emphasis>data</emphasis> to be flushed
to disk. Fortunately, data flushing during journaling can
- often be disabled with a filesystem mount option, e.g.
+ often be disabled with a file system mount option, e.g.
<literal>data=writeback</> on a Linux ext3 file system.
Journaled file systems do improve boot speed after a crash.
</para>
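
As a hypothetical example of the mount option mentioned above, an
/etc/fstab entry for an ext3 file system holding a database volume
(device and mount point are placeholders):

    # Journal metadata only; do not flush file data during journaling.
    /dev/sdb1  /var/lib/pgsql  ext3  noatime,data=writeback  0 2
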
@@ -308,7 +308,7 @@
committing at about the same time. Setting <varname>commit_delay</varname>
can only help when there are many concurrently committing transactions,
and it is difficult to tune it to a value that actually helps rather
- than hurting throughput.
+ than hurts throughput.
</para>
</sect1>
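
If you do experiment with it, both settings live in postgresql.conf; the
values below are illustrative placeholders, not recommendations:

    # postgresql.conf
    commit_delay = 10     # microseconds to wait before flushing WAL; 0 disables
    commit_siblings = 5   # minimum concurrent open transactions for the delay
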
@@ -326,7 +326,7 @@
<para>
<firstterm>Checkpoints</firstterm><indexterm><primary>checkpoint</></>
are points in the sequence of transactions at which it is guaranteed
- that the data files have been updated with all information written before
+ that the heap and index data files have been updated with all information written before
the checkpoint. At checkpoint time, all dirty data pages are flushed to
disk and a special checkpoint record is written to the log file.
(The changes were previously flushed to the <acronym>WAL</acronym> files.)
@@ -349,18 +349,18 @@
</para>
<para>
- The server's background writer process will automatically perform
+ The server's background writer process automatically performs
a checkpoint every so often. A checkpoint is created every <xref
linkend="guc-checkpoint-segments"> log segments, or every <xref
linkend="guc-checkpoint-timeout"> seconds, whichever comes first.
- The default settings are 3 segments and 300 seconds respectively.
+ The default settings are 3 segments and 300 seconds (5 minutes), respectively.
It is also possible to force a checkpoint by using the SQL command
<command>CHECKPOINT</command>.
</para>
<para>
Reducing <varname>checkpoint_segments</varname> and/or
- <varname>checkpoint_timeout</varname> causes checkpoints to be done
+ <varname>checkpoint_timeout</varname> causes checkpoints to occur
more often. This allows faster after-crash recovery (since less work
will need to be redone). However, one must balance this against the
increased cost of flushing dirty data pages more often. If
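
A sketch of the settings discussed above, using the stated defaults, plus
the manual command:

    # postgresql.conf
    checkpoint_segments = 3     # checkpoint every 3 log segments...
    checkpoint_timeout = 5min   # ...or every 5 minutes, whichever comes first

    # Force an immediate checkpoint from a shell:
    psql -c "CHECKPOINT;"
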
@@ -469,7 +469,7 @@
server processes to add their commit records to the log so as to have all
of them flushed with a single log sync. No sleep will occur if
<xref linkend="guc-fsync">
- is not enabled, nor if fewer than <xref linkend="guc-commit-siblings">
+ is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
other sessions are currently in active transactions; this avoids
sleeping when it's unlikely that any other session will commit soon.
Note that on most platforms, the resolution of a sleep request is
@@ -483,7 +483,7 @@
The <xref linkend="guc-wal-sync-method"> parameter determines how
<productname>PostgreSQL</productname> will ask the kernel to force
<acronym>WAL</acronym> updates out to disk.
- All the options should be the same as far as reliability goes,
+ All the options should be the same in terms of reliability,
but it's quite platform-specific which one will be the fastest.
Note that this parameter is irrelevant if <varname>fsync</varname>
has been turned off.
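
For example, a hedged postgresql.conf sketch; which options are available
and which is fastest are platform-specific, and fdatasync is merely one
common choice:

    # postgresql.conf -- benchmark on your own platform before changing:
    wal_sync_method = fdatasync
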
@@ -521,26 +521,26 @@
<filename>access/xlog.h</filename>; the record content is dependent
on the type of event that is being logged. Segment files are given
ever-increasing numbers as names, starting at
- <filename>000000010000000000000000</filename>. The numbers do not wrap, at
- present, but it should take a very very long time to exhaust the
+ <filename>000000010000000000000000</filename>. The numbers do not wrap,
+ but it will take a very, very long time to exhaust the
available stock of numbers.
</para>
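
Each 24-hex-digit segment name encodes, in order, the timeline ID, the
log file number, and the segment number. A quick look (file names shown
are purely illustrative):

    ls "$PGDATA/pg_xlog"
    # 000000010000000000000012  000000010000000000000013  ...
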
<para>
- It is of advantage if the log is located on another disk than the
- main database files. This can be achieved by moving the directory
- <filename>pg_xlog</filename> to another location (while the server
+ It is advantageous if the log is located on a different disk from the
+ main database files. This can be achieved by moving the
+ <filename>pg_xlog</filename> directory to another location (while the server
is shut down, of course) and creating a symbolic link from the
original location in the main data directory to the new location.
</para>
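
A minimal sketch of that relocation procedure (paths are placeholders;
run as the database owner):

    pg_ctl -D /var/lib/pgsql/data stop
    mv /var/lib/pgsql/data/pg_xlog /disk2/pg_xlog
    ln -s /disk2/pg_xlog /var/lib/pgsql/data/pg_xlog
    pg_ctl -D /var/lib/pgsql/data start
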
<para>
- The aim of <acronym>WAL</acronym>, to ensure that the log is
- written before database records are altered, can be subverted by
+ The aim of <acronym>WAL</acronym> is to ensure that the log is
+ written before database records are altered, but this can be subverted by
disk drives<indexterm><primary>disk drive</></> that falsely report a
successful write to the kernel,
when in fact they have only cached the data and not yet stored it
- on the disk. A power failure in such a situation might still lead to
+ on the disk. A power failure in such a situation might lead to
irrecoverable data corruption. Administrators should try to ensure
that disks holding <productname>PostgreSQL</productname>'s
<acronym>WAL</acronym> log files do not make such false reports.
@@ -549,8 +549,8 @@
<para>
After a checkpoint has been made and the log flushed, the
checkpoint's position is saved in the file
- <filename>pg_control</filename>. Therefore, when recovery is to be
- done, the server first reads <filename>pg_control</filename> and
+ <filename>pg_control</filename>. Therefore, at the start of recovery,
+ the server first reads <filename>pg_control</filename> and
then the checkpoint record; then it performs the REDO operation by
scanning forward from the log position indicated in the checkpoint
record. Because the entire content of data pages is saved in the
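
The saved checkpoint position can be inspected with the pg_controldata
utility (the location value below is illustrative, and the output is
abbreviated):

    pg_controldata /var/lib/pgsql/data
    # ...
    # Latest checkpoint location:           0/1D80F20
    # ...
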
@@ -562,12 +562,12 @@
<para>
To deal with the case where <filename>pg_control</filename> is
- corrupted, we should support the possibility of scanning existing log
+ corrupt, we should support the possibility of scanning existing log
segments in reverse order &mdash; newest to oldest &mdash; in order to find the
latest checkpoint. This has not been implemented yet.
<filename>pg_control</filename> is small enough (less than one disk page)
that it is not subject to partial-write problems, and as of this writing
- there have been no reports of database failures due solely to inability
+ there have been no reports of database failures due solely to the inability
to read <filename>pg_control</filename> itself. So while it is
theoretically a weak spot, <filename>pg_control</filename> does not
seem to be a problem in practice.