author mattip <matti.picus@gmail.com> 2019-06-26 00:45:41 +0300
committer mattip <matti.picus@gmail.com> 2019-06-26 01:13:48 +0300
commit efa35e738027dc833c0d02c8b15f41c9cf547749
tree fd2611d808bd0a76d70dc2648eb1615fc4faefc4 /doc
parent 8bb4645fe56c6fc107ca5c5bef7a05802112cfdf
ENH: use SeedSequence to generate entropy for seeding
Diffstat (limited to 'doc')
-rw-r--r-- doc/source/reference/random/bit_generators/bitgenerators.rst | 11
-rw-r--r-- doc/source/reference/random/bit_generators/index.rst | 35
-rw-r--r-- doc/source/reference/random/bit_generators/mt19937.rst | 5
-rw-r--r-- doc/source/reference/random/bit_generators/pcg64.rst | 1
-rw-r--r-- doc/source/reference/random/bit_generators/philox.rst | 5
-rw-r--r-- doc/source/reference/random/index.rst | 67
6 files changed, 84 insertions, 40 deletions
diff --git a/doc/source/reference/random/bit_generators/bitgenerators.rst b/doc/source/reference/random/bit_generators/bitgenerators.rst
new file mode 100644
index 000000000..1474f7dac
--- /dev/null
+++ b/doc/source/reference/random/bit_generators/bitgenerators.rst
@@ -0,0 +1,11 @@
+:orphan:
+
+BitGenerator
+------------
+
+.. currentmodule:: numpy.random.bit_generator
+
+.. autosummary::
+ :toctree: generated/
+
+ BitGenerator
diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst
index c5a2b4466..7f88231bd 100644
--- a/doc/source/reference/random/bit_generators/index.rst
+++ b/doc/source/reference/random/bit_generators/index.rst
@@ -1,10 +1,10 @@
.. _bit_generator:
+.. currentmodule:: numpy.random
+
Bit Generators
--------------
-.. currentmodule:: numpy.random
-
The random values produced by :class:`~Generator`
originate in a BitGenerator. The BitGenerators do not directly provide
random numbers and only contain methods used for seeding, getting or
@@ -12,13 +12,38 @@ setting the state, jumping or advancing the state, and for accessing
low-level wrappers for consumption by code that can efficiently
access the functions provided, e.g., `numba <https://numba.pydata.org>`_.
-Stable RNGs
-===========
-
.. toctree::
:maxdepth: 1
+ BitGenerator <bitgenerators>
MT19937 <mt19937>
PCG64 <pcg64>
Philox <philox>
+Seeding and Entropy
+-------------------
+
+A BitGenerator provides a stream of random values. In order to generate
+reproducible streams, BitGenerators support setting their initial state via a
+seed. But how best to seed the BitGenerator? On first impulse one would like to
+do something like ``[bg(i) for i in range(12)]`` to obtain 12 non-correlated,
+independent BitGenerators. However, using a highly correlated set of seeds could
+generate BitGenerators that are correlated or overlap within a few samples.
+
+NumPy uses a `SeedSequence` class to mix the seed in a reproducible way that
+introduces the necessary entropy to produce independent and largely
+non-overlapping streams. Small seeds may still be unable to reach all possible
+initialization states, which can cause biases among an ensemble of small-seed
+runs. For many cases, that doesn't matter. If you just want to hold things in
+place while you debug something, biases aren't a concern. For actual
+simulations whose results you care about, let ``SeedSequence(None)`` do its
+thing and then log/print the `SeedSequence.entropy` for repeatable
+`BitGenerator` streams.
+
+.. autosummary::
+ :toctree: generated/
+
+ bit_generator.ISeedSequence
+ bit_generator.ISpawnableSeedSequence
+ SeedSequence
+ bit_generator.SeedlessSeedSequence
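
Reviewer's sketch (not part of the patch): the workflow that the new "Seeding and Entropy" text recommends. It is written against the API as released in NumPy 1.17 and later, which is an assumption here because this commit predates that release. It lets ``SeedSequence`` gather OS entropy, records ``entropy`` for later replay, and uses ``spawn`` rather than consecutive integer seeds to derive 12 streams. The variable names are illustrative only.

```python
from numpy.random import Generator, PCG64, SeedSequence

# No argument: SeedSequence gathers fresh entropy from the OS.
root = SeedSequence()
logged_entropy = root.entropy  # log/print this to make the run repeatable

# Derive 12 child sequences instead of seeding with range(12) directly.
streams = [Generator(PCG64(child)) for child in root.spawn(12)]
first_draws = [g.random() for g in streams]

# Replaying from the logged entropy rebuilds the same 12 streams.
replayed = [
    Generator(PCG64(child))
    for child in SeedSequence(logged_entropy).spawn(12)
]
replayed_draws = [g.random() for g in replayed]
```

The replayed draws match the first run only because the same entropy value and the same number of ``spawn`` calls are used; changing either gives different streams.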
diff --git a/doc/source/reference/random/bit_generators/mt19937.rst b/doc/source/reference/random/bit_generators/mt19937.rst
index f5843ccf0..25ba1d7b5 100644
--- a/doc/source/reference/random/bit_generators/mt19937.rst
+++ b/doc/source/reference/random/bit_generators/mt19937.rst
@@ -8,13 +8,12 @@ Mersenne Twister (MT19937)
.. autoclass:: MT19937
:exclude-members:
-Seeding and State
-=================
+State
+=====
.. autosummary::
:toctree: generated/
- ~MT19937.seed
~MT19937.state
Parallel generation
diff --git a/doc/source/reference/random/bit_generators/pcg64.rst b/doc/source/reference/random/bit_generators/pcg64.rst
index fa719cea4..cbc654f05 100644
--- a/doc/source/reference/random/bit_generators/pcg64.rst
+++ b/doc/source/reference/random/bit_generators/pcg64.rst
@@ -14,7 +14,6 @@ Seeding and State
.. autosummary::
:toctree: generated/
- ~PCG64.seed
~PCG64.state
Parallel generation
diff --git a/doc/source/reference/random/bit_generators/philox.rst b/doc/source/reference/random/bit_generators/philox.rst
index 7ef451d4b..5e581e094 100644
--- a/doc/source/reference/random/bit_generators/philox.rst
+++ b/doc/source/reference/random/bit_generators/philox.rst
@@ -8,13 +8,12 @@ Philox Counter-based RNG
.. autoclass:: Philox
:exclude-members:
-Seeding and State
-=================
+State
+=====
.. autosummary::
:toctree: generated/
- ~Philox.seed
~Philox.state
Parallel generation
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst
index a35ba4aaf..109302b7a 100644
--- a/doc/source/reference/random/index.rst
+++ b/doc/source/reference/random/index.rst
@@ -9,6 +9,10 @@ Numpy's random number routines produce pseudo random numbers using
combinations of a `BitGenerator` to create sequences and a `Generator`
to use those sequences to sample from different statistical distributions:
+* SeedSequence: Objects that provide entropy for the initial state of a
+ BitGenerator. A good SeedSequence will provide initializations across the
+ entire range of possible states for the BitGenerator; otherwise, biases may
+ creep into the generated bit streams.
* BitGenerators: Objects that generate random numbers. These are typically
unsigned integer words filled with sequences of either 32 or 64 random bits.
* Generators: Objects that transform sequences of random bits from a
@@ -52,28 +56,37 @@ the random values are generated by `~PCG64`. The
rg.standard_normal()
rg.bit_generator
-
-Seeds can be passed to any of the BitGenerators. Here `mt19937.MT19937` is used
-and is the wrapped with a `~.Generator`.
-
+Seeds can be passed to any of the BitGenerators. The provided value is mixed
+via `~.SeedSequence` to spread a possible sequence of seeds across a wider
+range of initialization states for the BitGenerator. Here `~.PCG64` is used and
+is wrapped with a `~.Generator`.
.. code-block:: python
- from numpy.random import Generator, MT19937
- rg = Generator(MT19937(12345))
+ from numpy.random import Generator, PCG64
+ rg = Generator(PCG64(12345))
rg.standard_normal()
-
Introduction
------------
RandomGen takes a different approach to producing random numbers from the
-`RandomState` object. Random number generation is separated into two
-components, a bit generator and a random generator.
+`RandomState` object. Random number generation is separated into three
+components, a seed sequence, a bit generator and a random generator.
-The bit generator has a limited set of responsibilities. It manages state
+The `BitGenerator` has a limited set of responsibilities. It manages state
and provides functions to produce random doubles and random unsigned 32- and
-64-bit values. The bit generator also handles all seeding which varies with
-different bit generators.
+64-bit values.
+
+The `SeedSequence` takes a seed and provides the initial state for the
+`BitGenerator`. Since consecutive seeds can cause bad effects when comparing
+`BitGenerator` streams, the `SeedSequence` uses current best-practice methods
+to spread the initial state out. However, small seeds may still be unable to
+reach all possible initialization states, which can cause biases among an
+ensemble of small-seed runs. For many cases, that doesn't matter. If you just
+want to hold things in place while you debug something, biases aren't a
+concern. For actual simulations whose results you care about, let
+``SeedSequence(None)`` do its thing and then log/print the
+`SeedSequence.entropy` for repeatable `BitGenerator` streams.
The `random generator <Generator>` takes the
bit generator-provided stream and transforms them into more useful
@@ -86,15 +99,15 @@ The `Generator` is the user-facing object that is nearly identical to
the sole argument. Note that the BitGenerator must be instantiated.
.. code-block:: python
- from numpy.random import Generator, MT19937
- rg = Generator(MT19937())
+ from numpy.random import Generator, PCG64
+ rg = Generator(PCG64())
rg.random()
Seed information is directly passed to the bit generator.
.. code-block:: python
- rg = Generator(MT19937(12345))
+ rg = Generator(PCG64(12345))
rg.random()
What's New or Different
@@ -150,9 +163,14 @@ Supported BitGenerators
-----------------------
The included BitGenerators are:
-* MT19937 - The standard Python BitGenerator. Produces identical results to
- Python using the same seed/state. Adds a `~mt19937.MT19937.jumped` function
- that returns a new generator with state as-if ``2**128`` draws have been made.
+* MT19937 - The standard Python BitGenerator. Adds a `~mt19937.MT19937.jumped`
+ function that returns a new generator with state as-if ``2**128`` draws have
+ been made.
+* PCG-64 - Fast generator that supports many parallel streams and
+ can be advanced by an arbitrary amount. See the documentation for
+ :meth:`~.PCG64.advance`. PCG-64 has a period of
+ :math:`2^{128}`. See the `PCG author's page`_ for more details about
+ this class of PRNG.
* Xoshiro256** and Xoshiro512** - The most recently introduced XOR,
shift, and rotate generator. Supports ``jumped`` and so can be used in
parallel applications. See the documentation for
@@ -163,21 +181,14 @@ The included BitGenerators are:
.. _`PCG author's page`: http://www.pcg-random.org/
.. _`Random123`: https://www.deshawresearch.com/resources_random123.html
-Generator
----------
+Concepts
+--------
.. toctree::
:maxdepth: 1
generator
legacy mtrand <legacy>
-
-BitGenerators
--------------
-
-.. toctree::
- :maxdepth: 1
-
- BitGenerators <bit_generators/index>
+ BitGenerators, SeedSequences <bit_generators/index>
Features
--------
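
Reviewer's sketch (not part of the patch): how an integer seed flows through the three components described in the ``index.rst`` changes above. It assumes the behaviour of released NumPy (1.17 and later), where a BitGenerator wraps a plain integer seed in a ``SeedSequence`` before initializing its state. Under that assumption, the two generators below should produce the same stream.

```python
from numpy.random import Generator, PCG64, SeedSequence

# An integer seed passed straight to the BitGenerator ...
rg_int = Generator(PCG64(12345))
# ... and the same integer wrapped in an explicit SeedSequence.
rg_seq = Generator(PCG64(SeedSequence(12345)))

a = rg_int.standard_normal(5)
b = rg_seq.standard_normal(5)
same_stream = bool((a == b).all())

# A neighbouring seed is mixed to a different initial state,
# so its draws differ from those of seed 12345.
c = Generator(PCG64(12346)).standard_normal(5)
differs = bool((a != c).any())
```

Passing the ``SeedSequence`` explicitly is mainly useful when its ``entropy`` needs to be logged or when child sequences are spawned from it.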