author | Warren Weckesser <warren.weckesser@gmail.com> | 2019-06-10 23:00:39 -0400 |
---|---|---|
committer | Warren Weckesser <warren.weckesser@gmail.com> | 2019-06-14 16:14:31 -0400 |
commit | b2d2b677c26c8da7052bbc653b20ad9717f078fe (patch) | |
tree | efbeb7af0d5ae96d7ee61cafc8625c937dad5d2b /numpy/core/fromnumeric.py | |
parent | e3eb3986dd87e700a694d6b4151c96ef92dfabe0 (diff) | |
download | numpy-b2d2b677c26c8da7052bbc653b20ad9717f078fe.tar.gz |
MAINT: random: Rewrite the hypergeometric distribution.
Summary of the changes:
* Move the functions random_hypergeometric_hyp, random_hypergeometric_hrua
  and random_hypergeometric from distributions.c to legacy-distributions.c.
  These are now the legacy implementation of hypergeometric.
* Add the files logfactorial.c and logfactorial.h,
  containing the function logfactorial(int64_t k).
* Add the files random_hypergeometric.c and random_hypergeometric.h,
  containing the function random_hypergeometric (the new implementation
  of the hypergeometric distribution). See more details below.
* Fix two tests in numpy/random/tests/test_generator_mt19937.py that
  used values returned by the hypergeometric distribution. The
  new implementation changes the stream, so those tests needed to
  be updated.
* Remove another test obviated by an added constraint on the arguments
  of hypergeometric.
Details of the rewrite:
If you carefully step through the old function rk_hypergeometric_hyp(),
you'll see that its end result is essentially the same as that of the new
function hypergeometric_sample(), but the new function gets there using
only integer arithmetic. The floating-point calculations in the old code
caused problems when the arguments were extremely large (explained in more
detail in the unmerged pull request https://github.com/numpy/numpy/pull/9834).
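To make the "integers only" idea concrete, here is a minimal Python sketch
of drawing a hypergeometric variate by simulating the urn directly. It is an
illustration only, not the C code of hypergeometric_sample(); the function
name hypergeometric_urn_sample and its argument checks are assumptions made
for this example.

```python
import random


def hypergeometric_urn_sample(ngood, nbad, nsample, rng=random):
    """Count the 'good' items among nsample draws without replacement.

    Hypothetical helper for illustration; it uses only integer arithmetic.
    """
    if ngood < 0 or nbad < 0 or not 0 <= nsample <= ngood + nbad:
        raise ValueError("invalid arguments")
    remaining_total = ngood + nbad
    remaining_good = ngood
    selected_good = 0
    for _ in range(nsample):
        # randrange returns an integer in [0, remaining_total), so the draw
        # is 'good' with probability remaining_good / remaining_total.
        if rng.randrange(remaining_total) < remaining_good:
            remaining_good -= 1
            selected_good += 1
        remaining_total -= 1
    return selected_good
```

Because every comparison is between integers, no rounding can occur however
large the arguments are. The cost is one random integer per item drawn, so a
loop like this is only attractive when the sample is small.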
The new version of hypergeometric_hrua() is a fresh translation of
Stadlober's ratio-of-uniforms algorithm for the hypergeometric
distribution. It fixes a mistake in the old implementation that made the
method less efficient than it could be (see the details in the unmerged
pull request https://github.com/numpy/numpy/pull/11138), and it uses a
faster function for computing log(k!).
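One common way to compute log(k!) quickly is a lookup table for small k
combined with a truncated Stirling series for large k. The Python sketch
below shows that technique under stated assumptions: the table size of 126
and the number of series terms are choices made for this example, and the
code is not a transcription of logfactorial.c.

```python
import math

# Assumed table size for this sketch; small arguments are answered by lookup.
_TABLE_SIZE = 126
_LOGFACT_TABLE = [0.0] * _TABLE_SIZE
for _k in range(2, _TABLE_SIZE):
    _LOGFACT_TABLE[_k] = _LOGFACT_TABLE[_k - 1] + math.log(_k)

_HALF_LOG_2PI = 0.5 * math.log(2.0 * math.pi)


def logfactorial(k):
    """Return log(k!) for a nonnegative integer k."""
    if k < 0:
        raise ValueError("k must be nonnegative")
    if k < _TABLE_SIZE:
        return _LOGFACT_TABLE[k]
    # Truncated Stirling series:
    # log(k!) ~ (k + 1/2) log k - k + (1/2) log(2 pi)
    #           + 1/(12 k) - 1/(360 k^3) + 1/(1260 k^5)
    k2 = float(k) * float(k)
    correction = (1.0 / k) * (1.0 / 12.0 - 1.0 / (360.0 * k2)
                              + 1.0 / (1260.0 * k2 * k2))
    return (k + 0.5) * math.log(k) - k + _HALF_LOG_2PI + correction
```

The result can be checked against math.lgamma(k + 1), which computes the
same quantity through the gamma function.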
The HRUA algorithm suffers from loss of floating-point precision when
the arguments are *extremely* large (see the comments in GitHub issue
11443). To avoid these problems, the arguments `ngood` and `nbad` of
hypergeometric must be less than 10**9. This constraint obviates an
existing regression test that was run on systems with 64-bit long
integers, so that test was removed.
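The constraint amounts to an argument check of roughly the following shape.
This is a hedged Python sketch: the helper name check_hypergeometric_args
and its error messages are hypothetical, and only the bound 10**9 comes from
the text above. The second helper illustrates the general fact that a double
cannot represent every large integer exactly; it is not a reproduction of
the specific HRUA failure discussed in issue 11443.

```python
# Upper bound stated in the commit message (exclusive).
_MAX_ARG = 10**9


def check_hypergeometric_args(ngood, nbad):
    """Hypothetical validator: reject arguments outside [0, 10**9)."""
    if ngood < 0 or nbad < 0:
        raise ValueError("ngood and nbad must be nonnegative")
    if ngood >= _MAX_ARG or nbad >= _MAX_ARG:
        raise ValueError("ngood and nbad must be less than 10**9")


def float_roundtrip_is_exact(n):
    """Return True if the integer n survives conversion to a double."""
    return int(float(n)) == n
```

Values below 10**9 convert to double exactly, while an integer such as
2**53 + 1 does not, which is the kind of rounding that integer-only code
avoids.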