Diffstat (limited to 'doc/source')
24 files changed, 308 insertions, 398 deletions
diff --git a/doc/source/dev/development_environment.rst b/doc/source/dev/development_environment.rst index 445ce3204..bc491b711 100644 --- a/doc/source/dev/development_environment.rst +++ b/doc/source/dev/development_environment.rst @@ -147,9 +147,9 @@ That also takes extra arguments, like ``--pdb`` which drops you into the Python debugger when a test fails or an exception is raised. Running tests with `tox`_ is also supported. For example, to build NumPy and -run the test suite with Python 3.4, use:: +run the test suite with Python 3.7, use:: - $ tox -e py34 + $ tox -e py37 For more extensive information, see :ref:`testing-guidelines` diff --git a/doc/source/reference/arrays.dtypes.rst b/doc/source/reference/arrays.dtypes.rst index b55feb247..ab743a8ee 100644 --- a/doc/source/reference/arrays.dtypes.rst +++ b/doc/source/reference/arrays.dtypes.rst @@ -538,6 +538,7 @@ Attributes providing additional information: dtype.isnative dtype.descr dtype.alignment + dtype.base Methods diff --git a/doc/source/reference/c-api.types-and-structures.rst b/doc/source/reference/c-api.types-and-structures.rst index b72d9f902..a716b5a06 100644 --- a/doc/source/reference/c-api.types-and-structures.rst +++ b/doc/source/reference/c-api.types-and-structures.rst @@ -57,8 +57,8 @@ types are place holders that allow the array scalars to fit into a hierarchy of actual Python types. -PyArray_Type ------------- +PyArray_Type and PyArrayObject +------------------------------ .. c:var:: PyArray_Type @@ -74,7 +74,7 @@ PyArray_Type subclasses) will have this structure. For future compatibility, these structure members should normally be accessed using the provided macros. If you need a shorter name, then you can make use - of :c:type:`NPY_AO` which is defined to be equivalent to + of :c:type:`NPY_AO` (deprecated) which is defined to be equivalent to :c:type:`PyArrayObject`. .. code-block:: c @@ -91,7 +91,7 @@ PyArray_Type PyObject *weakreflist; } PyArrayObject; -.. 
c:macro: PyArrayObject.PyObject_HEAD +.. c:macro:: PyArrayObject.PyObject_HEAD This is needed by all Python objects. It consists of (at least) a reference count member ( ``ob_refcnt`` ) and a pointer to the @@ -130,14 +130,16 @@ PyArray_Type .. c:member:: PyObject *PyArrayObject.base This member is used to hold a pointer to another Python object that - is related to this array. There are two use cases: 1) If this array - does not own its own memory, then base points to the Python object - that owns it (perhaps another array object), 2) If this array has - the (deprecated) :c:data:`NPY_ARRAY_UPDATEIFCOPY` or - :c:data:NPY_ARRAY_WRITEBACKIFCOPY`: flag set, then this array is - a working copy of a "misbehaved" array. When - ``PyArray_ResolveWritebackIfCopy`` is called, the array pointed to by base - will be updated with the contents of this array. + is related to this array. There are two use cases: + + - If this array does not own its own memory, then base points to the + Python object that owns it (perhaps another array object) + - If this array has the (deprecated) :c:data:`NPY_ARRAY_UPDATEIFCOPY` or + :c:data:`NPY_ARRAY_WRITEBACKIFCOPY` flag set, then this array is a working + copy of a "misbehaved" array. + + When ``PyArray_ResolveWritebackIfCopy`` is called, the array pointed to + by base will be updated with the contents of this array. .. c:member:: PyArray_Descr *PyArrayObject.descr @@ -163,8 +165,8 @@ PyArray_Type weakref module). -PyArrayDescr_Type ------------------ +PyArrayDescr_Type and PyArray_Descr +----------------------------------- .. c:var:: PyArrayDescr_Type @@ -253,11 +255,13 @@ PyArrayDescr_Type .. c:var:: NPY_ITEM_REFCOUNT - .. c:var:: NPY_ITEM_HASOBJECT - Indicates that items of this data-type must be reference counted (using :c:func:`Py_INCREF` and :c:func:`Py_DECREF` ). + .. c:var:: NPY_ITEM_HASOBJECT + + Same as :c:data:`NPY_ITEM_REFCOUNT`. + .. 
c:var:: NPY_LIST_PICKLE Indicates arrays of this data-type must be converted to a list @@ -676,25 +680,28 @@ PyArrayDescr_Type The :c:data:`PyArray_Type` typeobject implements many of the features of -Python objects including the tp_as_number, tp_as_sequence, -tp_as_mapping, and tp_as_buffer interfaces. The rich comparison -(tp_richcompare) is also used along with new-style attribute lookup -for methods (tp_methods) and properties (tp_getset). The -:c:data:`PyArray_Type` can also be sub-typed. +:c:type:`Python objects <PyTypeObject>` including the :c:member:`tp_as_number +<PyTypeObject.tp_as_number>`, :c:member:`tp_as_sequence +<PyTypeObject.tp_as_sequence>`, :c:member:`tp_as_mapping +<PyTypeObject.tp_as_mapping>`, and :c:member:`tp_as_buffer +<PyTypeObject.tp_as_buffer>` interfaces. The :c:type:`rich comparison +<richcmpfunc>`) is also used along with new-style attribute lookup for +member (:c:member:`tp_members <PyTypeObject.tp_members>`) and properties +(:c:member:`tp_getset <PyTypeObject.tp_getset>`). +The :c:data:`PyArray_Type` can also be sub-typed. .. tip:: - The tp_as_number methods use a generic approach to call whatever - function has been registered for handling the operation. The - function PyNumeric_SetOps(..) can be used to register functions to - handle particular mathematical operations (for all arrays). When - the umath module is imported, it sets the numeric operations for - all arrays to the corresponding ufuncs. The tp_str and tp_repr - methods can also be altered using PyString_SetStringFunction(...). + The ``tp_as_number`` methods use a generic approach to call whatever + function has been registered for handling the operation. When the + ``_multiarray_umath module`` is imported, it sets the numeric operations + for all arrays to the corresponding ufuncs. This choice can be changed with + :c:func:`PyUFunc_ReplaceLoopBySignature` The ``tp_str`` and ``tp_repr`` + methods can also be altered using :c:func:`PyArray_SetStringFunction`. 
-PyUFunc_Type ------------- +PyUFunc_Type and PyUFuncObject +------------------------------ .. c:var:: PyUFunc_Type @@ -786,8 +793,8 @@ PyUFunc_Type the identity for this operation. It is only used for a reduce-like call on an empty array. - .. c:member:: void PyUFuncObject.functions(char** args, npy_intp* dims, - npy_intp* steps, void* extradata) + .. c:member:: void PyUFuncObject.functions( \ + char** args, npy_intp* dims, npy_intp* steps, void* extradata) An array of function pointers --- one for each data type supported by the ufunc. This is the vector loop that is called @@ -932,8 +939,8 @@ PyUFunc_Type - :c:data:`UFUNC_CORE_DIM_SIZE_INFERRED` if the dim size will be determined from the operands and not from a :ref:`frozen <frozen>` signature -PyArrayIter_Type ----------------- +PyArrayIter_Type and PyArrayIterObject +-------------------------------------- .. c:var:: PyArrayIter_Type @@ -1042,8 +1049,8 @@ with it through the use of the macros :c:func:`PyArray_ITER_NEXT` (it), :c:type:`PyArrayIterObject *`. -PyArrayMultiIter_Type ---------------------- +PyArrayMultiIter_Type and PyArrayMultiIterObject +------------------------------------------------ .. c:var:: PyArrayMultiIter_Type @@ -1104,8 +1111,8 @@ PyArrayMultiIter_Type arrays to be broadcast together. On return, the iterators are adjusted for broadcasting. -PyArrayNeighborhoodIter_Type ----------------------------- +PyArrayNeighborhoodIter_Type and PyArrayNeighborhoodIterObject +-------------------------------------------------------------- .. c:var:: PyArrayNeighborhoodIter_Type @@ -1118,8 +1125,33 @@ PyArrayNeighborhoodIter_Type :c:data:`PyArrayNeighborhoodIter_Type` is the :c:type:`PyArrayNeighborhoodIterObject`. -PyArrayFlags_Type ------------------ + .. 
code-block:: c + + typedef struct { + PyObject_HEAD + int nd_m1; + npy_intp index, size; + npy_intp coordinates[NPY_MAXDIMS] + npy_intp dims_m1[NPY_MAXDIMS]; + npy_intp strides[NPY_MAXDIMS]; + npy_intp backstrides[NPY_MAXDIMS]; + npy_intp factors[NPY_MAXDIMS]; + PyArrayObject *ao; + char *dataptr; + npy_bool contiguous; + npy_intp bounds[NPY_MAXDIMS][2]; + npy_intp limits[NPY_MAXDIMS][2]; + npy_intp limits_sizes[NPY_MAXDIMS]; + npy_iter_get_dataptr_t translate; + npy_intp nd; + npy_intp dimensions[NPY_MAXDIMS]; + PyArrayIterObject* _internal_iter; + char* constant; + int mode; + } PyArrayNeighborhoodIterObject; + +PyArrayFlags_Type and PyArrayFlagsObject +---------------------------------------- .. c:var:: PyArrayFlags_Type @@ -1129,6 +1161,16 @@ PyArrayFlags_Type attributes or by accessing them as if the object were a dictionary with the flag names as entries. +.. c:type:: PyArrayFlagsObject + + .. code-block:: c + + typedef struct PyArrayFlagsObject { + PyObject_HEAD + PyObject *arr; + int flags; + } PyArrayFlagsObject; + ScalarArrayTypes ---------------- diff --git a/doc/source/reference/random/bit_generators/bitgenerators.rst b/doc/source/reference/random/bit_generators/bitgenerators.rst new file mode 100644 index 000000000..1474f7dac --- /dev/null +++ b/doc/source/reference/random/bit_generators/bitgenerators.rst @@ -0,0 +1,11 @@ +:orphan: + +BitGenerator +------------ + +.. currentmodule:: numpy.random.bit_generator + +.. autosummary:: + :toctree: generated/ + + BitGenerator diff --git a/doc/source/reference/random/bit_generators/dsfmt.rst b/doc/source/reference/random/bit_generators/dsfmt.rst deleted file mode 100644 index e7c6dbb31..000000000 --- a/doc/source/reference/random/bit_generators/dsfmt.rst +++ /dev/null @@ -1,36 +0,0 @@ -Double SIMD Mersenne Twister (dSFMT) ------------------------------------- - -.. module:: numpy.random.dsfmt - -.. currentmodule:: numpy.random.dsfmt - - -.. 
autoclass:: DSFMT - :exclude-members: - -Seeding and State -================= - -.. autosummary:: - :toctree: generated/ - - ~DSFMT.seed - ~DSFMT.state - -Parallel generation -=================== -.. autosummary:: - :toctree: generated/ - - ~DSFMT.jumped - -Extending -========= -.. autosummary:: - :toctree: generated/ - - ~DSFMT.cffi - ~DSFMT.ctypes - - diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst index 3a9294bfb..4540f60d9 100644 --- a/doc/source/reference/random/bit_generators/index.rst +++ b/doc/source/reference/random/bit_generators/index.rst @@ -1,10 +1,10 @@ .. _bit_generator: +.. currentmodule:: numpy.random + Bit Generators -------------- -.. currentmodule:: numpy.random - The random values produced by :class:`~Generator` orignate in a BitGenerator. The BitGenerators do not directly provide random numbers and only contains methods used for seeding, getting or @@ -12,18 +12,60 @@ setting the state, jumping or advancing the state, and for accessing low-level wrappers for consumption by code that can efficiently access the functions provided, e.g., `numba <https://numba.pydata.org>`_. -Stable RNGs -=========== +Supported BitGenerators +======================= + +The included BitGenerators are: + +* MT19937 - The standard Python BitGenerator. Adds a `~mt19937.MT19937.jumped` + function that returns a new generator with state as-if ``2**128`` draws have + been made. +* PCG-64 - Fast generator that support many parallel streams and + can be advanced by an arbitrary amount. See the documentation for + :meth:`~.PCG64.advance`. PCG-64 has a period of + :math:`2^{128}`. See the `PCG author's page`_ for more details about + this class of PRNG. +* Philox - a counter-based generator capable of being advanced an + arbitrary number of steps or generating independent streams. See the + `Random123`_ page for more details about this class of bit generators. + +.. 
_`PCG author's page`: http://www.pcg-random.org/ +.. _`Random123`: https://www.deshawresearch.com/resources_random123.html + .. toctree:: :maxdepth: 1 - DSFMT <dsfmt> + BitGenerator <bitgenerators> MT19937 <mt19937> - PCG32 <pcg32> PCG64 <pcg64> Philox <philox> - ThreeFry <threefry> - Xoshiro256** <xoshiro256> - Xoshiro512** <xoshiro512> + SFC64 <sfc64> + +Seeding and Entropy +------------------- + +A BitGenerator provides a stream of random values. In order to generate +reproducableis streams, BitGenerators support setting their initial state via a +seed. But how best to seed the BitGenerator? On first impulse one would like to +do something like ``[bg(i) for i in range(12)]`` to obtain 12 non-correlated, +independent BitGenerators. However using a highly correlated set of seeds could +generate BitGenerators that are correlated or overlap within a few samples. + +NumPy uses a `SeedSequence` class to mix the seed in a reproducible way that +introduces the necessary entropy to produce independent and largely non- +overlapping streams. Small seeds are unable to fill the complete range of +initializaiton states, and lead to biases among an ensemble of small-seed +runs. For many cases, that doesn't matter. If you just want to hold things in +place while you debug something, biases aren't a concern. For actual +simulations whose results you care about, let ``SeedSequence(None)`` do its +thing and then log/print the `SeedSequence.entropy` for repeatable +`BitGenerator` streams. + +.. autosummary:: + :toctree: generated/ + bit_generator.ISeedSequence + bit_generator.ISpawnableSeedSequence + SeedSequence + bit_generator.SeedlessSeedSequence diff --git a/doc/source/reference/random/bit_generators/mt19937.rst b/doc/source/reference/random/bit_generators/mt19937.rst index f5843ccf0..25ba1d7b5 100644 --- a/doc/source/reference/random/bit_generators/mt19937.rst +++ b/doc/source/reference/random/bit_generators/mt19937.rst @@ -8,13 +8,12 @@ Mersenne Twister (MT19937) .. 
autoclass:: MT19937 :exclude-members: -Seeding and State -================= +State +===== .. autosummary:: :toctree: generated/ - ~MT19937.seed ~MT19937.state Parallel generation diff --git a/doc/source/reference/random/bit_generators/pcg32.rst b/doc/source/reference/random/bit_generators/pcg32.rst deleted file mode 100644 index faaccaf9b..000000000 --- a/doc/source/reference/random/bit_generators/pcg32.rst +++ /dev/null @@ -1,34 +0,0 @@ -Parallel Congruent Generator (32-bit, PCG32) --------------------------------------------- - -.. module:: numpy.random.pcg32 - -.. currentmodule:: numpy.random.pcg32 - -.. autoclass:: PCG32 - :exclude-members: - -Seeding and State -================= - -.. autosummary:: - :toctree: generated/ - - ~PCG32.seed - ~PCG32.state - -Parallel generation -=================== -.. autosummary:: - :toctree: generated/ - - ~PCG32.advance - ~PCG32.jumped - -Extending -========= -.. autosummary:: - :toctree: generated/ - - ~PCG32.cffi - ~PCG32.ctypes diff --git a/doc/source/reference/random/bit_generators/pcg64.rst b/doc/source/reference/random/bit_generators/pcg64.rst index fa719cea4..7aef1e0dd 100644 --- a/doc/source/reference/random/bit_generators/pcg64.rst +++ b/doc/source/reference/random/bit_generators/pcg64.rst @@ -8,13 +8,12 @@ Parallel Congruent Generator (64-bit, PCG64) .. autoclass:: PCG64 :exclude-members: -Seeding and State -================= +State +===== .. autosummary:: :toctree: generated/ - ~PCG64.seed ~PCG64.state Parallel generation diff --git a/doc/source/reference/random/bit_generators/philox.rst b/doc/source/reference/random/bit_generators/philox.rst index 7ef451d4b..5e581e094 100644 --- a/doc/source/reference/random/bit_generators/philox.rst +++ b/doc/source/reference/random/bit_generators/philox.rst @@ -8,13 +8,12 @@ Philox Counter-based RNG .. autoclass:: Philox :exclude-members: -Seeding and State -================= +State +===== .. 
autosummary:: :toctree: generated/ - ~Philox.seed ~Philox.state Parallel generation diff --git a/doc/source/reference/random/bit_generators/sfc64.rst b/doc/source/reference/random/bit_generators/sfc64.rst new file mode 100644 index 000000000..dc03820ae --- /dev/null +++ b/doc/source/reference/random/bit_generators/sfc64.rst @@ -0,0 +1,28 @@ +SFC64 Small Fast Chaotic PRNG +----------------------------- + +.. module:: numpy.random.sfc64 + +.. currentmodule:: numpy.random.sfc64 + +.. autoclass:: SFC64 + :exclude-members: + +State +===== + +.. autosummary:: + :toctree: generated/ + + ~SFC64.state + +Extending +========= +.. autosummary:: + :toctree: generated/ + + ~SFC64.cffi + ~SFC64.ctypes + + + diff --git a/doc/source/reference/random/bit_generators/threefry.rst b/doc/source/reference/random/bit_generators/threefry.rst deleted file mode 100644 index 951108d72..000000000 --- a/doc/source/reference/random/bit_generators/threefry.rst +++ /dev/null @@ -1,36 +0,0 @@ -ThreeFry Counter-based RNG --------------------------- - -.. module:: numpy.random.threefry - -.. currentmodule:: numpy.random.threefry - -.. autoclass:: ThreeFry - :exclude-members: - -Seeding and State -================= - -.. autosummary:: - :toctree: generated/ - - ~ThreeFry.seed - ~ThreeFry.state - -Parallel generation -=================== -.. autosummary:: - :toctree: generated/ - - ~ThreeFry.advance - ~ThreeFry.jumped - -Extending -========= -.. autosummary:: - :toctree: generated/ - - ~ThreeFry.cffi - ~ThreeFry.ctypes - - diff --git a/doc/source/reference/random/bit_generators/xoshiro256.rst b/doc/source/reference/random/bit_generators/xoshiro256.rst deleted file mode 100644 index fedc61b33..000000000 --- a/doc/source/reference/random/bit_generators/xoshiro256.rst +++ /dev/null @@ -1,35 +0,0 @@ -Xoshiro256** ------------- - -.. module:: numpy.random.xoshiro256 - -.. currentmodule:: numpy.random.xoshiro256 - -.. autoclass:: Xoshiro256 - :exclude-members: - -Seeding and State -================= - -.. 
autosummary:: - :toctree: generated/ - - ~Xoshiro256.seed - ~Xoshiro256.state - -Parallel generation -=================== -.. autosummary:: - :toctree: generated/ - - ~Xoshiro256.jumped - -Extending -========= -.. autosummary:: - :toctree: generated/ - - ~Xoshiro256.cffi - ~Xoshiro256.ctypes - - diff --git a/doc/source/reference/random/bit_generators/xoshiro512.rst b/doc/source/reference/random/bit_generators/xoshiro512.rst deleted file mode 100644 index e39346cd6..000000000 --- a/doc/source/reference/random/bit_generators/xoshiro512.rst +++ /dev/null @@ -1,35 +0,0 @@ -Xoshiro512** ------------- - -.. module:: numpy.random.xoshiro512 - -.. currentmodule:: numpy.random.xoshiro512 - -.. autoclass:: Xoshiro512 - :exclude-members: - -Seeding and State -================= - -.. autosummary:: - :toctree: generated/ - - ~Xoshiro512.seed - ~Xoshiro512.state - -Parallel generation -=================== -.. autosummary:: - :toctree: generated/ - - ~Xoshiro512.jumped - -Extending -========= -.. autosummary:: - :toctree: generated/ - - ~Xoshiro512.cffi - ~Xoshiro512.ctypes - - diff --git a/doc/source/reference/random/extending.rst b/doc/source/reference/random/extending.rst index 28db4021c..22f9cb7e4 100644 --- a/doc/source/reference/random/extending.rst +++ b/doc/source/reference/random/extending.rst @@ -18,11 +18,11 @@ provided by ``ctypes.next_double``. .. code-block:: python - from numpy.random import Xoshiro256 + from numpy.random import PCG64 import numpy as np import numba as nb - x = Xoshiro256() + x = PCG64() f = x.ctypes.next_double s = x.ctypes.state state_addr = x.ctypes.state_address @@ -50,7 +50,7 @@ provided by ``ctypes.next_double``. 
# Must use state address not state with numba normalsj(1, state_addr) %timeit normalsj(1000000, state_addr) - print('1,000,000 Box-Muller (numba/Xoshiro256) randoms') + print('1,000,000 Box-Muller (numba/PCG64) randoms') %timeit np.random.standard_normal(1000000) print('1,000,000 Box-Muller (NumPy) randoms') @@ -66,7 +66,7 @@ Cython ====== Cython can be used to unpack the ``PyCapsule`` provided by a BitGenerator. -This example uses `~xoshiro256.Xoshiro256` and +This example uses `~pcg64.PCG64` and ``random_gauss_zig``, the Ziggurat-based generator for normals, to fill an array. The usual caveats for writing high-performance code using Cython -- removing bounds checks and wrap around, providing array alignment information @@ -80,7 +80,7 @@ removing bounds checks and wrap around, providing array alignment information from cpython.pycapsule cimport PyCapsule_IsValid, PyCapsule_GetPointer from numpy.random.common cimport * from numpy.random.distributions cimport random_gauss_zig - from numpy.random import Xoshiro256 + from numpy.random import PCG64 @cython.boundscheck(False) @@ -91,7 +91,7 @@ removing bounds checks and wrap around, providing array alignment information cdef const char *capsule_name = "BitGenerator" cdef double[::1] random_values - x = Xoshiro256() + x = PCG64() capsule = x.capsule if not PyCapsule_IsValid(capsule, capsule_name): raise ValueError("Invalid pointer to anon_func_state") @@ -117,7 +117,7 @@ RNG structure. 
cdef const char *capsule_name = "BitGenerator" cdef double[::1] random_values - x = Xoshiro256() + x = PCG64() capsule = x.capsule # Optional check that the capsule if from a BitGenerator if not PyCapsule_IsValid(capsule, capsule_name): diff --git a/doc/source/reference/random/generator.rst b/doc/source/reference/random/generator.rst index 8b086e901..22bce2e6c 100644 --- a/doc/source/reference/random/generator.rst +++ b/doc/source/reference/random/generator.rst @@ -8,7 +8,7 @@ a wide range of distributions, and served as a replacement for the two is that ``Generator`` relies on an additional BitGenerator to manage state and generate the random bits, which are then transformed into random values from useful distributions. The default BitGenerator used by -``Generator`` is :class:`~xoshiro256.Xoshiro256`. The BitGenerator +``Generator`` is `~PCG64`. The BitGenerator can be changed by passing an instantized BitGenerator to ``Generator``. diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst index 3159f0e1c..f32853e7c 100644 --- a/doc/source/reference/random/index.rst +++ b/doc/source/reference/random/index.rst @@ -9,6 +9,10 @@ Numpy's random number routines produce pseudo random numbers using combinations of a `BitGenerator` to create sequences and a `Generator` to use those sequences to sample from different statistical distributions: +* SeedSequence: Objects that provide entropy for the initial state of a + BitGenerator. A good SeedSequence will provide initializations across the + entire range of possible states for the BitGenerator, otherwise biases may + creep into the generated bit streams. * BitGenerators: Objects that generate random numbers. These are typically unsigned integer words filled with sequences of either 32 or 64 random bits. 
* Generators: Objects that transform sequences of random bits from a @@ -30,8 +34,8 @@ instance's methods are imported into the numpy.random namespace, see Quick Start ----------- -By default, `Generator` uses normals provided by `xoshiro256.Xoshiro256` -which will be faster than the legacy methods in `RandomState` +By default, `Generator` uses normals provided by `PCG64` which will be +statistically more reliable than the legacy methods in `RandomState` .. code-block:: python @@ -40,7 +44,7 @@ which will be faster than the legacy methods in `RandomState` random.standard_normal() `Generator` can be used as a direct replacement for `~RandomState`, although -the random values are generated by `~xoshiro256.Xoshiro256`. The +the random values are generated by `~PCG64`. The `Generator` holds an instance of a BitGenerator. It is accessible as ``gen.bit_generator``. @@ -52,28 +56,37 @@ the random values are generated by `~xoshiro256.Xoshiro256`. The rg.standard_normal() rg.bit_generator - -Seeds can be passed to any of the BitGenerators. Here `mt19937.MT19937` is used -and is the wrapped with a `~.Generator`. - +Seeds can be passed to any of the BitGenerators. The provided value is mixed +via `~.SeedSequence` to spread a possible sequence of seeds across a wider +range of initialization states for the BitGenerator. Here `~.PCG64` is used and +is wrapped with a `~.Generator`. .. code-block:: python - from numpy.random import Generator, MT19937 - rg = Generator(MT19937(12345)) + from numpy.random import Generator, PCG64 + rg = Generator(PCG64(12345)) rg.standard_normal() - Introduction ------------ RandomGen takes a different approach to producing random numbers from the -`RandomState` object. Random number generation is separated into two -components, a bit generator and a random generator. +`RandomState` object. Random number generation is separated into three +components, a seed sequence, a bit generator and a random generator. 
-The bit generator has a limited set of responsibilities. It manages state +The `BitGenerator` has a limited set of responsibilities. It manages state and provides functions to produce random doubles and random unsigned 32- and -64-bit values. The bit generator also handles all seeding which varies with -different bit generators. +64-bit values. + +The `SeedSequence` takes a seed and provides the initial state for the +`BitGenerator`. Since consecutive seeds can cause bad effects when comparing +`BitGenerator` streams, the `SeedSequence` uses current best-practice methods +to spread the initial state out. However small seeds may still be unable to +reach all possible initialization states, which can cause biases among an +ensemble of small-seed runs. For many cases, that doesn't matter. If you just +want to hold things in place while you debug something, biases aren't a +concern. For actual simulations whose results you care about, let +``SeedSequence(None)`` do its thing and then log/print the +`SeedSequence.entropy` for repeatable `BitGenerator` streams. The `random generator <Generator>` takes the bit generator-provided stream and transforms them into more useful @@ -86,15 +99,15 @@ The `Generator` is the user-facing object that is nearly identical to the sole argument. Note that the BitGenerator must be instantiated. .. code-block:: python - from numpy.random import Generator, MT19937 - rg = Generator(MT19937()) + from numpy.random import Generator, PCG64 + rg = Generator(PCG64()) rg.random() Seed information is directly passed to the bit generator. .. code-block:: python - rg = Generator(MT19937(12345)) + rg = Generator(PCG64(12345)) rg.random() What's New or Different @@ -120,8 +133,8 @@ What's New or Different source of randomness that is used in cryptographic applications (e.g., ``/dev/urandom`` on Unix). 
* All BitGenerators can produce doubles, uint64s and uint32s via CTypes - (`~xoshiro256.Xoshiro256.ctypes`) and CFFI - (:meth:`~xoshiro256.Xoshiro256.cffi`). This allows the bit generators to + (`~PCG64.ctypes`) and CFFI + (:meth:`~PCG64.cffi`). This allows the bit generators to be used in numba. * The bit generators can be used in downstream projects via :ref:`Cython <randomgen_cython>`. @@ -146,47 +159,14 @@ one of two ways: * :ref:`independent-streams` * :ref:`jump-and-advance` -Supported BitGenerators ------------------------ -The included BitGenerators are: - -* MT19937 - The standard Python BitGenerator. Produces identical results to - Python using the same seed/state. Adds a `~mt19937.MT19937.jumped` function - that returns a new generator with state as-if ``2**128`` draws have been made. -* dSFMT - SSE2 enabled versions of the MT19937 generator. Theoretically - the same, but with a different state and so it is not possible to produce a - sequence identical to MT19937. Supports ``jumped`` and so can - be used in parallel applications. See the `dSFMT authors' page`_. -* Xorshiro256** and Xorshiro512** - The most recently introduced XOR, - shift, and rotate generator. Supports ``jumped`` and so can be used in - parallel applications. See the documentation for - `~xoshiro256.Xoshirt256.jumped` for details. More information about these bit - generators is available at the `xorshift, xoroshiro and xoshiro authors' - page`_. -* ThreeFry and Philox - counter-based generators capable of being advanced an - arbitrary number of steps or generating independent streams. See the - `Random123`_ page for more details about this class of bit generators. - -.. _`dSFMT authors' page`: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/ -.. _`PCG author's page`: http://www.pcg-random.org/ -.. _`xorshift, xoroshiro and xoshiro authors' page`: http://xoroshiro.di.unimi.it/ -.. 
_`Random123`: https://www.deshawresearch.com/resources_random123.html - -Generator ---------- +Concepts +-------- .. toctree:: :maxdepth: 1 generator legacy mtrand <legacy> - -BitGenerators -------------- - -.. toctree:: - :maxdepth: 1 - - BitGenerators <bit_generators/index> + BitGenerators, SeedSequences <bit_generators/index> Features -------- diff --git a/doc/source/reference/random/multithreading.rst b/doc/source/reference/random/multithreading.rst index 7ce90af99..849d64d4e 100644 --- a/doc/source/reference/random/multithreading.rst +++ b/doc/source/reference/random/multithreading.rst @@ -10,21 +10,21 @@ these requirements. This example makes use of Python 3 :mod:`concurrent.futures` to fill an array using multiple threads. Threads are long-lived so that repeated calls do not require any additional overheads from thread creation. The underlying -BitGenerator is `Xoshiro256` which is fast, has a long period and supports -using `Xoshiro256.jumped` to return a new generator while advancing the +BitGenerator is `PCG64` which is fast, has a long period and supports +using `PCG64.jumped` to return a new generator while advancing the state. The random numbers generated are reproducible in the sense that the same seed will produce the same outputs. .. code-block:: ipython - from numpy.random import Generator, Xoshiro256 + from numpy.random import Generator, PCG64 import multiprocessing import concurrent.futures import numpy as np class MultithreadedRNG(object): def __init__(self, n, seed=None, threads=None): - rg = Xoshiro256(seed) + rg = PCG64(seed) if threads is None: threads = multiprocessing.cpu_count() self.threads = threads @@ -89,7 +89,7 @@ The single threaded call directly uses the BitGenerator. .. code-block:: ipython In [5]: values = np.empty(10000000) - ...: rg = Generator(Xoshiro256()) + ...: rg = Generator(PCG64()) ...: %timeit rg.standard_normal(out=values) 99.6 ms ± 222 µs per loop (mean ± std. dev. 
of 7 runs, 10 loops each) @@ -100,7 +100,7 @@ that does not use an existing array due to array creation overhead. .. code-block:: ipython - In [6]: rg = Generator(Xoshiro256()) + In [6]: rg = Generator(PCG64()) ...: %timeit rg.standard_normal(10000000) 125 ms ± 309 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) diff --git a/doc/source/reference/random/new-or-different.rst b/doc/source/reference/random/new-or-different.rst index a6de9c8dc..11638824a 100644 --- a/doc/source/reference/random/new-or-different.rst +++ b/doc/source/reference/random/new-or-different.rst @@ -58,8 +58,7 @@ And in more detail: This replaces both ``randint`` and the deprecated ``random_integers``. * The Box-Muller used to produce NumPy's normals is no longer available. * All bit generators can produce doubles, uint64s and - uint32s via CTypes (`~.xoshiro256.Xoshiro256. - ctypes`) and CFFI (`~.xoshiro256.Xoshiro256.cffi`). + uint32s via CTypes (`~PCG64.ctypes`) and CFFI (`~PCG64.cffi`). This allows these bit generators to be used in numba. * The bit generators can be used in downstream projects via Cython. @@ -67,9 +66,9 @@ And in more detail: .. ipython:: python - from numpy.random import Generator, Xoshiro256 + from numpy.random import Generator, PCG64 import numpy.random - rg = Generator(Xoshiro256()) + rg = Generator(PCG64()) %timeit rg.standard_normal(100000) %timeit numpy.random.standard_normal(100000) @@ -94,9 +93,8 @@ And in more detail: .. ipython:: python - rg.bit_generator.seed(0) + rg = Generator(PCG64(0)) rg.random(3, dtype='d') - rg.bit_generator.seed(0) rg.random(3, dtype='f') * Optional ``out`` argument that allows existing arrays to be filled for diff --git a/doc/source/reference/random/parallel.rst b/doc/source/reference/random/parallel.rst index ffbaea62b..36e173ef2 100644 --- a/doc/source/reference/random/parallel.rst +++ b/doc/source/reference/random/parallel.rst @@ -64,8 +64,6 @@ are listed below. 
 +-----------------+-------------------------+-------------------------+-------------------------+
 | BitGenerator    | Period                  | Jump Size               | Bits                    |
 +=================+=========================+=========================+=========================+
-| DSFMT           | :math:`2^{19937}`       | :math:`2^{128}`         | 53                      |
-+-----------------+-------------------------+-------------------------+-------------------------+
 | MT19937         | :math:`2^{19937}`       | :math:`2^{128}`         | 32                      |
 +-----------------+-------------------------+-------------------------+-------------------------+
 | PCG64           | :math:`2^{128}`         | :math:`2^{64}`          | 64                      |
@@ -74,10 +72,6 @@ are listed below.
 +-----------------+-------------------------+-------------------------+-------------------------+
 | ThreeFry        | :math:`2^{256}`         | :math:`2^{128}`         | 64                      |
 +-----------------+-------------------------+-------------------------+-------------------------+
-| Xoshiro256**    | :math:`2^{256}`         | :math:`2^{128}`         | 64                      |
-+-----------------+-------------------------+-------------------------+-------------------------+
-| Xoshiro512**    | :math:`2^{512}`         | :math:`2^{256}`         | 64                      |
-+-----------------+-------------------------+-------------------------+-------------------------+
 
 ``jumped`` can be used to produce long blocks which should be long enough to not
 overlap.
@@ -85,13 +79,13 @@ overlap.
 
 ..
code-block:: python from numpy.random.entropy import random_entropy - from numpy.random import Xoshiro256 + from numpy.random import PCG64 entropy = random_entropy(2).astype(np.uint64) # 64-bit number as a seed seed = entropy[0] * 2**32 + entropy[1] blocked_rng = [] - rng = Xoshiro256(seed) + rng = PCG64(seed) for i in range(10): blocked_rng.append(rng.jumped(i)) diff --git a/doc/source/reference/random/performance.py b/doc/source/reference/random/performance.py index 54165226e..ed8745078 100644 --- a/doc/source/reference/random/performance.py +++ b/doc/source/reference/random/performance.py @@ -4,10 +4,9 @@ from timeit import repeat import pandas as pd import numpy as np -from numpy.random import MT19937, DSFMT, ThreeFry, PCG64, Philox, \ - Xoshiro256, Xoshiro512 +from numpy.random import MT19937, PCG64, Philox, SFC64 -PRNGS = [DSFMT, MT19937, PCG64, Philox, ThreeFry, Xoshiro256, Xoshiro512] +PRNGS = [MT19937, PCG64, Philox, SFC64] funcs = OrderedDict() integers = 'integers(0, 2**{bits},size=1000000, dtype="uint{bits}")' diff --git a/doc/source/reference/random/performance.rst b/doc/source/reference/random/performance.rst index 07867ee07..3e5c20e3a 100644 --- a/doc/source/reference/random/performance.rst +++ b/doc/source/reference/random/performance.rst @@ -7,13 +7,7 @@ Performance Recommendation ************** -The recommended generator for single use is :class:`~.xoshiro256.Xoshiro256`. -The recommended generator for use in large-scale parallel applications is -:class:`~.xoshiro512.Xoshiro512` where the `jumped` method is used to advance -the state. For very large scale applications -- requiring 1,000+ independent -streams -- is the best choice. For very large scale applications -- requiring -1,000+ independent streams, :class:`~pcg64.PCG64` or :class:`~.philox.Philox` -are the best choices. +The recommended generator for single use is :class:`~PCG64`. Timings ******* @@ -23,8 +17,7 @@ specific distribution. 
The original :class:`~mt19937.MT19937` generator is much slower since it requires 2 32-bit values to equal the output of the faster generators. -Integer performance has a similar ordering although `dSFMT` is slower since -it generates 53-bit floating point values rather than integer values. +Integer performance has a similar ordering. The pattern is similar for other, more complex generators. The normal performance of the legacy :class:`~mtrand.RandomState` generator is much @@ -36,18 +29,18 @@ The column labeled MT19973 is used the same 32-bit generator as :class:`~generator.Generator`. .. csv-table:: - :header: ,Xoshiro256**,Xoshiro512**,DSFMT,PCG64,MT19937,Philox,RandomState,ThreeFry - :widths: 14,14,14,14,14,14,14,14,14 + :header: ,PCG64,MT19937,Philox,RandomState + :widths: 14,14,14,14,14 - 32-bit Unsigned Ints,2.6,2.9,3.5,3.2,3.3,4.8,3.2,7.6 - 64-bit Unsigned Ints,3.3,4.3,5.7,4.8,5.7,6.9,5.7,12.8 - Uniforms,3.4,4.0,3.2,5.0,7.3,8.0,7.3,12.8 - Normals,7.9,9.0,11.8,11.3,13.0,13.7,34.4,18.1 - Exponentials,4.7,5.2,7.4,6.7,7.9,8.6,40.3,14.7 - Gammas,29.1,27.5,28.5,30.6,34.2,35.1,58.1,47.6 - Binomials,22.7,23.1,21.1,25.7,27.7,28.4,25.9,32.1 - Laplaces,38.5,38.1,36.9,41.1,44.5,45.4,46.9,50.2 - Poissons,46.9,50.9,46.4,58.1,68.4,70.2,86.0,88.2 + 32-bit Unsigned Ints,3.2,3.3,4.8,3.2 + 64-bit Unsigned Ints,4.8,5.7,6.9,5.7 + Uniforms,5.0,7.3,8.0,7.3 + Normals,11.3,13.0,13.7,34.4 + Exponentials,6.7,7.9,8.6,40.3 + Gammas,30.6,34.2,35.1,58.1 + Binomials,25.7,27.7,28.4,25.9 + Laplaces,41.1,44.5,45.4,46.9 + Poissons,58.1,68.4,70.2,86.0 The next table presents the performance in percentage relative to values @@ -55,19 +48,19 @@ generated by the legagy generator, `RandomState(MT19937())`. The overall performance was computed using a geometric mean. .. 
csv-table::
-    :header: ,Xoshiro256**,Xoshiro256**,DSFMT,PCG64,MT19937,Philox,ThreeFry
-    :widths: 14,14,14,14,14,14,14,14
-
-    32-bit Unsigned Ints,124,113,93,100,99,67,43
-    64-bit Unsigned Ints,174,133,100,118,100,83,44
-    Uniforms,212,181,229,147,100,91,57
-    Normals,438,382,291,304,264,252,190
-    Exponentials,851,770,547,601,512,467,275
-    Gammas,200,212,204,190,170,166,122
-    Binomials,114,112,123,101,93,91,81
-    Laplaces,122,123,127,114,105,103,93
-    Poissons,183,169,185,148,126,123,98
-    Overall,212,194,180,167,145,131,93
+    :header: ,PCG64,MT19937,Philox
+    :widths: 14,14,14,14
+
+    32-bit Unsigned Ints,100,99,67
+    64-bit Unsigned Ints,118,100,83
+    Uniforms,147,100,91
+    Normals,304,264,252
+    Exponentials,601,512,467
+    Gammas,190,170,166
+    Binomials,101,93,91
+    Laplaces,114,105,103
+    Poissons,148,126,123
+    Overall,167,145,131
 
 .. note::
 
@@ -88,16 +81,16 @@ across tables.
 64-bit Linux
 ~~~~~~~~~~~~
 
-=================== ======= ========= ======= ======== ========== ============
-Distribution        DSFMT   MT19937   PCG64   Philox   ThreeFry   Xoshiro256
-=================== ======= ========= ======= ======== ========== ============
-32-bit Unsigned Int 99.3    100       113.9   72.1     48.3       117.1
-64-bit Unsigned Int 105.7   100       143.3   89.7     48.1       161.7
-Uniform             222.1   100       181.5   90.8     59.9       204.7
-Exponential         110.8   100       145.5   92.5     55.0       177.1
-Normal              113.2   100       121.4   98.3     71.9       162.0
-**Overall**         123.9   100       139.3   88.2     56.0       161.9
-=================== ======= ========= ======= ======== ========== ============
+=================== ========= ======= ========
+Distribution        MT19937   PCG64   Philox
+=================== ========= ======= ========
+32-bit Unsigned Int 100       113.9   72.1
+64-bit Unsigned Int 100       143.3   89.7
+Uniform             100       181.5   90.8
+Exponential         100       145.5   92.5
+Normal              100       121.4   98.3
+**Overall**         100       139.3   88.2
+=================== ========= ======= ========
 
 
 64-bit Windows
@@ -105,39 +98,38 @@ Normal 113.2 100 121.4 98.3 71.9 1
 
 The performance on 64-bit Linux and 64-bit Windows is broadly similar.
 
-=================== ======= ========= ======= ======== ========== ============
-Distribution        DSFMT   MT19937   PCG64   Philox   ThreeFry   Xoshiro256
-=================== ======= ========= ======= ======== ========== ============
-32-bit Unsigned Int 122.8   100       134.9   44.1     72.3       133.1
-64-bit Unsigned Int 130.4   100       162.7   41.0     77.7       142.3
-Uniform             273.2   100       200.0   44.8     84.6       175.8
-Exponential         135.0   100       167.8   47.4     84.5       166.9
-Normal              115.3   100       135.6   60.3     93.6       169.6
-**Overall**         146.7   100       158.4   47.1     82.2       156.5
-=================== ======= ========= ======= ======== ========== ============
+=================== ========= ======= ========
+Distribution        MT19937   PCG64   Philox
+=================== ========= ======= ========
+32-bit Unsigned Int 100       134.9   44.1
+64-bit Unsigned Int 100       162.7   41.0
+Uniform             100       200.0   44.8
+Exponential         100       167.8   47.4
+Normal              100       135.6   60.3
+**Overall**         100       158.4   47.1
+=================== ========= ======= ========
 
 
 32-bit Windows
 ~~~~~~~~~~~~~~
 
 The performance of 64-bit generators on 32-bit Windows is much lower than on 64-bit
-operating systems due to register width. DSFMT uses SSE2 when available, and so is less
-affected by the size of the operating system's register. MT19937, the generator that has been
-in NumPy since 2005, operates on 32-bit integers and so is close to DSFMT.
-
-=================== ======= ========= ======= ======== ========== ============
-Distribution        DSFMT   MT19937   PCG64   Philox   ThreeFry   Xoshiro256
-=================== ======= ========= ======= ======== ========== ============
-32-bit Unsigned Int 110.9   100       30.6    28.1     29.2       74.4
-64-bit Unsigned Int 104.7   100       24.2    23.7     22.7       72.7
-Uniform             247.0   100       26.7    28.4     27.8       78.8
-Exponential         110.1   100       32.1    32.6     30.5       89.6
-Normal              107.2   100       36.3    37.5     35.2       93.0
-**Overall**         127.6   100       29.7    29.7     28.8       81.3
-=================== ======= ========= ======= ======== ========== ============
+operating systems due to register width.
MT19937, the generator that has been +in NumPy since 2005, operates on 32-bit integers. + +=================== ========= ======= ======== +Distribution MT19937 PCG64 Philox +=================== ========= ======= ======== +32-bit Unsigned Int 100 30.6 28.1 +64-bit Unsigned Int 100 24.2 23.7 +Uniform 100 26.7 28.4 +Exponential 100 32.1 32.6 +Normal 100 36.3 37.5 +**Overall** 100 29.7 29.7 +=================== ========= ======= ======== .. note:: - Linux timings used Ubuntu 18.04 and GCC 7.4. Windows timings were made on Windows 10 - using Microsoft C/C++ Optimizing Compiler Version 19 (Visual Studio 2015). All timings - were produced on a i5-3570 processor. + Linux timings used Ubuntu 18.04 and GCC 7.4. Windows timings were made on + Windows 10 using Microsoft C/C++ Optimizing Compiler Version 19 (Visual + Studio 2015). All timings were produced on a i5-3570 processor. diff --git a/doc/source/reference/routines.char.rst b/doc/source/reference/routines.char.rst index 3f4efdfc5..513f975e7 100644 --- a/doc/source/reference/routines.char.rst +++ b/doc/source/reference/routines.char.rst @@ -1,11 +1,13 @@ String operations ***************** -.. currentmodule:: numpy.core.defchararray +.. currentmodule:: numpy.char -This module provides a set of vectorized string operations for arrays -of type `numpy.string_` or `numpy.unicode_`. All of them are based on -the string methods in the Python standard library. +.. module:: numpy.char + +The `numpy.char` module provides a set of vectorized string +operations for arrays of type `numpy.string_` or `numpy.unicode_`. +All of them are based on the string methods in the Python standard library. 
String operations ----------------- diff --git a/doc/source/reference/ufuncs.rst b/doc/source/reference/ufuncs.rst index c71c8c9a7..d00e88b34 100644 --- a/doc/source/reference/ufuncs.rst +++ b/doc/source/reference/ufuncs.rst @@ -118,7 +118,7 @@ all output arrays will be passed to the :obj:`~class.__array_prepare__` and the highest :obj:`~class.__array_priority__` of any other input to the universal function. The default :obj:`~class.__array_priority__` of the ndarray is 0.0, and the default :obj:`~class.__array_priority__` of a subtype -is 1.0. Matrices have :obj:`~class.__array_priority__` equal to 10.0. +is 0.0. Matrices have :obj:`~class.__array_priority__` equal to 10.0. All ufuncs can also take output arguments. If necessary, output will be cast to the data-type(s) of the provided output array(s). If a class |