author | mattip <matti.picus@gmail.com> | 2019-06-26 00:45:41 +0300 |
---|---|---|
committer | mattip <matti.picus@gmail.com> | 2019-06-26 01:13:48 +0300 |
commit | efa35e738027dc833c0d02c8b15f41c9cf547749 (patch) | |
tree | fd2611d808bd0a76d70dc2648eb1615fc4faefc4 /doc | |
parent | 8bb4645fe56c6fc107ca5c5bef7a05802112cfdf (diff) | |
download | numpy-efa35e738027dc833c0d02c8b15f41c9cf547749.tar.gz |
ENH: use SeedSequence to generate entropy for seeding
Diffstat (limited to 'doc')
6 files changed, 84 insertions, 40 deletions
diff --git a/doc/source/reference/random/bit_generators/bitgenerators.rst b/doc/source/reference/random/bit_generators/bitgenerators.rst
new file mode 100644
index 000000000..1474f7dac
--- /dev/null
+++ b/doc/source/reference/random/bit_generators/bitgenerators.rst
@@ -0,0 +1,11 @@
+:orphan:
+
+BitGenerator
+------------
+
+.. currentmodule:: numpy.random.bit_generator
+
+.. autosummary::
+    :toctree: generated/
+
+    BitGenerator
diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst
index c5a2b4466..7f88231bd 100644
--- a/doc/source/reference/random/bit_generators/index.rst
+++ b/doc/source/reference/random/bit_generators/index.rst
@@ -1,10 +1,10 @@
 .. _bit_generator:
 
+.. currentmodule:: numpy.random
+
 Bit Generators
 --------------
 
-.. currentmodule:: numpy.random
-
 The random values produced by :class:`~Generator`
 orignate in a BitGenerator. The BitGenerators do not directly provide
 random numbers and only contains methods used for seeding, getting or
@@ -12,13 +12,38 @@ setting the state, jumping or advancing the state, and for accessing
 low-level wrappers for consumption by code that can efficiently
 access the functions provided, e.g., `numba <https://numba.pydata.org>`_.
 
-Stable RNGs
-===========
-
 .. toctree::
    :maxdepth: 1
 
+   BitGenerator <bitgenerators>
    MT19937 <mt19937>
    PCG64 <pcg64>
    Philox <philox>
 
+Seeding and Entropy
+-------------------
+
+A BitGenerator provides a stream of random values. In order to generate
+reproducableis streams, BitGenerators support setting their initial state via a
+seed. But how best to seed the BitGenerator? On first impulse one would like to
+do something like ``[bg(i) for i in range(12)]`` to obtain 12 non-correlated,
+independent BitGenerators. However using a highly correlated set of seeds could
+generate BitGenerators that are correlated or overlap within a few samples.
+
+NumPy uses a `SeedSequence` class to mix the seed in a reproducible way that
+introduces the necessary entropy to produce independent and largely non-
+overlapping streams. Small seeds may still be unable to reach all possible
+initialization states, which can cause biases among an ensemble of small-seed
+runs. For many cases, that doesn't matter. If you just want to hold things in
+place while you debug something, biases aren't a concern. For actual
+simulations whose results you care about, let ``SeedSequence(None)`` do its
+thing and then log/print the `SeedSequence.entropy` for repeatable
+`BitGenerator` streams.
+
+.. autosummary::
+    :toctree: generated/
+
+    bit_generator.ISeedSequence
+    bit_generator.ISpawnableSeedSequence
+    SeedSequence
+    bit_generator.SeedlessSeedSequence
diff --git a/doc/source/reference/random/bit_generators/mt19937.rst b/doc/source/reference/random/bit_generators/mt19937.rst
index f5843ccf0..25ba1d7b5 100644
--- a/doc/source/reference/random/bit_generators/mt19937.rst
+++ b/doc/source/reference/random/bit_generators/mt19937.rst
@@ -8,13 +8,12 @@ Mersenne Twister (MT19937)
 .. autoclass:: MT19937
    :exclude-members:
 
-Seeding and State
-=================
+State
+=====
 
 .. autosummary::
    :toctree: generated/
 
-   ~MT19937.seed
    ~MT19937.state
 
 Parallel generation
diff --git a/doc/source/reference/random/bit_generators/pcg64.rst b/doc/source/reference/random/bit_generators/pcg64.rst
index fa719cea4..cbc654f05 100644
--- a/doc/source/reference/random/bit_generators/pcg64.rst
+++ b/doc/source/reference/random/bit_generators/pcg64.rst
@@ -14,7 +14,6 @@ Seeding and State
 .. autosummary::
    :toctree: generated/
 
-   ~PCG64.seed
    ~PCG64.state
 
 Parallel generation
diff --git a/doc/source/reference/random/bit_generators/philox.rst b/doc/source/reference/random/bit_generators/philox.rst
index 7ef451d4b..5e581e094 100644
--- a/doc/source/reference/random/bit_generators/philox.rst
+++ b/doc/source/reference/random/bit_generators/philox.rst
@@ -8,13 +8,12 @@ Philox Counter-based RNG
 .. autoclass:: Philox
    :exclude-members:
 
-Seeding and State
-=================
+State
+=====
 
 .. autosummary::
    :toctree: generated/
 
-   ~Philox.seed
    ~Philox.state
 
 Parallel generation
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst
index a35ba4aaf..109302b7a 100644
--- a/doc/source/reference/random/index.rst
+++ b/doc/source/reference/random/index.rst
@@ -9,6 +9,10 @@ Numpy's random number routines produce pseudo random numbers using
 combinations of a `BitGenerator` to create sequences and a `Generator`
 to use those sequences to sample from different statistical distributions:
 
+* SeedSequence: Objects that provide entropy for the initial state of a
+  BitGenerator. A good SeedSequence will provide initializations across the
+  entire range of possible states for the BitGenerator, otherwise biases may
+  creep into the generated bit streams.
 * BitGenerators: Objects that generate random numbers. These are typically
   unsigned integer words filled with sequences of either 32 or 64 random bits.
 * Generators: Objects that transform sequences of random bits from a
@@ -52,28 +56,37 @@ the random values are generated by `~PCG64`. The
   rg.standard_normal()
   rg.bit_generator
 
-
-Seeds can be passed to any of the BitGenerators. Here `mt19937.MT19937` is used
-and is the wrapped with a `~.Generator`.
-
+Seeds can be passed to any of the BitGenerators. The provided value is mixed
+via `~.SeedSequence` to spread a possible sequence of seeds across a wider
+range of initialization states for the BitGenerator. Here `~.PCG64` is used and
+is wrapped with a `~.Generator`.
 
 .. code-block:: python
 
-  from numpy.random import Generator, MT19937
-  rg = Generator(MT19937(12345))
+  from numpy.random import Generator, PCG64
+  rg = Generator(PCG64(12345))
   rg.standard_normal()
-
 
 Introduction
 ------------
 RandomGen takes a different approach to producing random numbers from the
-`RandomState` object. Random number generation is separated into two
-components, a bit generator and a random generator.
+`RandomState` object. Random number generation is separated into three
+components, a seed sequence, a bit generator and a random generator.
 
-The bit generator has a limited set of responsibilities. It manages state
+The `BitGenerator` has a limited set of responsibilities. It manages state
 and provides functions to produce random doubles and random unsigned 32- and
-64-bit values. The bit generator also handles all seeding which varies with
-different bit generators.
+64-bit values.
+
+The `SeedSequence` takes a seed and provides the initial state for the
+`BitGenerator`. Since consecutive seeds can cause bad effects when comparing
+`BitGenerator` streams, the `SeedSequence` uses current best-practice methods
+to spread the initial state out. However small seeds may still be unable to
+reach all possible initialization states, which can cause biases among an
+ensemble of small-seed runs. For many cases, that doesn't matter. If you just
+want to hold things in place while you debug something, biases aren't a
+concern. For actual simulations whose results you care about, let
+``SeedSequence(None)`` do its thing and then log/print the
+`SeedSequence.entropy` for repeatable `BitGenerator` streams.
 
 The `random generator <Generator>` takes the
 bit generator-provided stream and transforms them into more useful
@@ -86,15 +99,15 @@ The `Generator` is the user-facing object that is nearly identical to
 the sole argument. Note that the BitGenerator must be instantiated.
 .. code-block:: python
 
-  from numpy.random import Generator, MT19937
-  rg = Generator(MT19937())
+  from numpy.random import Generator, PCG64
+  rg = Generator(PCG64())
   rg.random()
 
 Seed information is directly passed to the bit generator.
 
 .. code-block:: python
 
-  rg = Generator(MT19937(12345))
+  rg = Generator(PCG64(12345))
   rg.random()
 
 What's New or Different
@@ -150,9 +163,14 @@ Supported BitGenerators
 -----------------------
 The included BitGenerators are:
 
-* MT19937 - The standard Python BitGenerator. Produces identical results to
-  Python using the same seed/state. Adds a `~mt19937.MT19937.jumped` function
-  that returns a new generator with state as-if ``2**128`` draws have been made.
+* MT19937 - The standard Python BitGenerator. Adds a `~mt19937.MT19937.jumped`
+  function that returns a new generator with state as-if ``2**128`` draws have
+  been made.
+* PCG-64 - Fast generator that support many parallel streams and
+  can be advanced by an arbitrary amount. See the documentation for
+  :meth:`~.PCG64.advance`. PCG-64 has a period of
+  :math:`2^{128}`. See the `PCG author's page`_ for more details about
+  this class of PRNG.
 * Xorshiro256** and Xorshiro512** - The most recently introduced XOR,
   shift, and rotate generator. Supports ``jumped`` and so can be used in
   parallel applications. See the documentation for
@@ -163,21 +181,14 @@ The included BitGenerators are:
 .. _`PCG author's page`: http://www.pcg-random.org/
 .. _`Random123`: https://www.deshawresearch.com/resources_random123.html
 
-Generator
----------
+Concepts
+--------
 .. toctree::
    :maxdepth: 1
 
    generator
    legacy mtrand <legacy>
-
-BitGenerators
--------------
-
-.. toctree::
-   :maxdepth: 1
-
-   BitGenerators <bit_generators/index>
+   BitGenerators, SeedSequences <bit_generators/index>
 
 Features
 --------
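The seeding workflow that the documentation added in this commit describes (let `SeedSequence` gather entropy, log it, and rebuild from it later) can be sketched roughly as below. This is an illustration only: it assumes the `SeedSequence`, `PCG64` and `Generator` API as it appears in released NumPy 1.17 and later, which may differ in detail from the tree at this commit, and the variable names (`logged_entropy`, `streams`) are made up for the example. The `spawn` call is shown as one alternative to seeding twelve generators from `range(12)`; the diff itself does not mention it.

```python
from numpy.random import Generator, PCG64, SeedSequence

# Let SeedSequence gather fresh OS entropy, then record it for later reruns.
ss = SeedSequence()
logged_entropy = ss.entropy  # a Python int; log or print this value

# Rebuilding a SeedSequence from the logged entropy reproduces the stream.
rg_a = Generator(PCG64(ss))
rg_b = Generator(PCG64(SeedSequence(logged_entropy)))
first_a = rg_a.random(3)
first_b = rg_b.random(3)

# Instead of seeding 12 generators with 0..11, derive 12 child sequences
# from one parent and build one generator per child.
children = SeedSequence(logged_entropy).spawn(12)
streams = [Generator(PCG64(child)) for child in children]
```

Because only the integer in `logged_entropy` needs to be stored, a run can be repeated without pickling any generator state.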