| Commit message | Author | Age | Files | Lines |
ENH: Use AVX for float32 implementation of np.sin & np.cos
|
assert_array_max_ulp returns only an int since it compares the ULP
difference between two float32 numbers.
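As a rough illustration of why a ULP difference between two float32 values is naturally an integer, the sketch below counts the representable float32 values between two inputs. The helper names `float32_ordinal` and `ulp_diff_float32` are hypothetical and this is not NumPy's implementation; NaN inputs are not handled.

```python
import struct


def float32_ordinal(x):
    """Map a float32 value to an integer that preserves float ordering.

    Hypothetical helper for illustration only.
    """
    # Reinterpret the float32 bit pattern as a signed 32-bit integer.
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats have the sign bit set; mirror them below zero so that
    # neighbouring floats always differ by exactly 1 (and -0.0 maps to 0).
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_diff_float32(a, b):
    """Number of float32 steps between a and b, always a plain int."""
    return abs(float32_ordinal(a) - float32_ordinal(b))
```

For example, `ulp_diff_float32(1.0, 1.0 + 2.0**-23)` counts a single step, because `1.0 + 2.0**-23` is the next float32 after 1.0.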
|
Without FMA, the outputs of the AVX2 and AVX512 versions differ. This
change ensures the output across implementations remains exactly the
same.
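To see why fused and unfused multiply-add can disagree, here is a small sketch in double precision (not the float32 AVX kernels of this change). The fused variant is emulated with exact rational arithmetic and a single final rounding; both function names are hypothetical.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Emulate FMA: compute a*b + c exactly, then round once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def unfused_multiply_add(a, b, c):
    """Round after the multiply and again after the add."""
    return a * b + c


# a*b is exactly 1 - 2**-60, which is not representable as a double,
# so the unfused product rounds to 1.0 and the small term is lost.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0
fused = fused_multiply_add(a, b, c)      # keeps the -2**-60 term
unfused = unfused_multiply_add(a, b, c)  # cancels to 0.0
```

The two results differ only in the last rounding step, which is the kind of discrepancy that shows up when one code path has FMA available and another does not.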
|
This commit implements vectorized single-precision sine and cosine using
AVX2 and AVX512. Both sine and cosine are computed using a
polynomial approximation which is accurate for values in
[-PI/4, PI/4]. The original input is reduced to this range using a 3-step
Cody-Waite range reduction. This reduction is only accurate for
values in [-71476.0625f, 71476.0625f] for cosine and [-117435.992f,
117435.992f] for sine. The algorithm identifies elements outside this
range and calls glibc in a scalar loop to compute their output.
The algorithm is a vectorized version of the methods presented
here: https://stackoverflow.com/questions/30463616/payne-hanek-algorithm-implementation-in-c/30465751#30465751
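The reduction step can be sketched in scalar Python as follows. This is an illustration under stated assumptions, not the vectorized float32 code from the commit: it works in double precision, derives its own three-part split of PI/2 (the constants `C1`, `C2`, `C3` are not the ones used in NumPy), and uses `math.sin`/`math.cos` in place of the polynomial approximation. All names are hypothetical.

```python
import math
from fractions import Fraction

# pi/2 from a 50-digit value of pi, so it can be split into three parts.
PI_HALF = Fraction("3.14159265358979323846264338327950288419716939937510") / 2


def _leading_bits(value, scale_bits):
    """Largest multiple of 2**-scale_bits not exceeding value (a Fraction)."""
    return Fraction(math.floor(value * 2**scale_bits), 2**scale_bits)


# C1 and C2 carry few significant bits, so k * C1 and k * C2 are exact
# for moderately sized integers k; C3 holds the remaining tail.
_C1 = _leading_bits(PI_HALF, 28)
_C2 = _leading_bits(PI_HALF - _C1, 56)
C1, C2, C3 = float(_C1), float(_C2), float(PI_HALF - _C1 - _C2)


def reduce_cody_waite(x):
    """Return (k, r) with x ~= k * pi/2 + r and r close to [-pi/4, pi/4]."""
    k = round(x * (2.0 / math.pi))
    # Subtract the three parts in sequence, largest first.
    r = ((x - k * C1) - k * C2) - k * C3
    return k, r


def sin_via_reduction(x):
    """Evaluate sin(x) from the reduced argument and the quadrant k mod 4."""
    k, r = reduce_cody_waite(x)
    quadrant = k % 4
    if quadrant == 0:
        return math.sin(r)
    if quadrant == 1:
        return math.cos(r)
    if quadrant == 2:
        return -math.sin(r)
    return -math.cos(r)
```

The split only stays accurate while `k * C1` is exactly representable, which is why a reduction of this kind has a limited valid input range and needs a fallback for larger arguments.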
Accuracy: maximum ULP error = 1.49
Performance: The speed-up this implementation provides depends on
the values of the input array. It performs best when all the input
values are within the range specified above. Details of the performance
boost are provided below. It performs worst when all the array
elements are outside the range, which leads to a reduction in
performance of about 1-2%.
Three sets of benchmark data are provided, each measured with the
timeit package in Python. Each function is executed 1000 times and
this is repeated 100 times. The standard deviation across runs was
less than 2% of the mean and is hence not included in the data.
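A timing protocol of this shape (many executions per run, many runs, then mean and relative standard deviation) might look like the sketch below. It uses only the standard library; the `benchmark` helper is hypothetical, and a plain list with `math.cos` stands in for the NumPy array and `np.cos` used in the actual measurements.

```python
import math
import statistics
import timeit


def benchmark(func, number=1000, repeat=100):
    """Run `func` `number` times per run, over `repeat` runs.

    Returns the mean time per call and the standard deviation relative
    to that mean.
    """
    totals = timeit.repeat(func, number=number, repeat=repeat)
    per_call = [total / number for total in totals]
    mean = statistics.mean(per_call)
    rel_std = statistics.stdev(per_call) / mean
    return mean, rel_std


# Stand-in workload; small counts keep this demonstration fast.
values = [0.001 * i for i in range(10000)]
mean, rel_std = benchmark(lambda: [math.cos(v) for v in values],
                          number=5, repeat=5)
```

With `number=1000` and `repeat=100` the helper matches the run counts quoted above; a relative standard deviation below 0.02 would correspond to the "less than 2%" criterion.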
(1) Micro-benchmarking:
Array size = 10000, Command = "%timeit np.cos([myarr])"
|---------------+------------+--------+---------+----------+----------|
| Function Name | NumPy 1.16 | AVX2   | AVX512  | AVX2     | AVX512   |
|               |            |        |         | speed up | speed up |
|---------------+------------+--------+---------+----------+----------|
| np.cos        | 1.5174     | 0.1553 | 0.06634 | 9.77     | 22.87    |
| np.sin        | 1.4738     | 0.1517 | 0.06406 | 9.71     | 23.00    |
|---------------+------------+--------+---------+----------+----------|
(2) Package ai.cs provides an API to transform spherical coordinates to
the Cartesian system:
Array size = 10000, Command = "%timeit ai.cs.sp2cart(r,theta,phi)"
|---------------+------------+--------+--------+----------+----------|
| Function Name | NumPy 1.16 | AVX2   | AVX512 | AVX2     | AVX512   |
|               |            |        |        | speed up | speed up |
|---------------+------------+--------+--------+----------+----------|
| ai.cs.sp2cart | 0.6371     | 0.1066 | 0.0605 | 5.97     | 10.53    |
|---------------+------------+--------+--------+----------+----------|
(3) Package photutils provides an API to find the best fit of first and
second harmonic functions to a set of (angle, intensity) pairs:
Array size = 1000, Command = "%timeit fit_first_and_second_harmonics(E, data)"
|--------------------------------+------------+--------+--------+----------+----------|
| Function Name                  | NumPy 1.16 | AVX2   | AVX512 | AVX2     | AVX512   |
|                                |            |        |        | speed up | speed up |
|--------------------------------+------------+--------+--------+----------+----------|
| fit_first_and_second_harmonics | 1.598      | 0.8709 | 0.7761 | 1.83     | 2.05     |
|--------------------------------+------------+--------+--------+----------+----------|
|
(#14070)
* DOC: Add example to clarify "numpy.save" behavior on an already open file
|
MAINT: Clearer error message while padding with stat_length=0
|
Provides a clearer error message if stat_length=0 is the cause of an
exception (mean and median return nan with warnings) as well as tests
covering this behavior.
Note: This shouldn't change the behavior/API except for the content of
the raised ValueError.
|
The original message included "Mismatch: 33.3%". It's not obvious what this
percentage means. This commit changes the text to
"Mismatched elements: 1 / 3 (33.3%)".
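A message in the new style could be produced along these lines. The helper `mismatch_summary` is hypothetical and operates on plain Python sequences; it only mirrors the wording quoted above and is not the code used in NumPy's testing utilities.

```python
def mismatch_summary(actual, desired):
    """Describe how many positions differ between two equal-length sequences.

    Hypothetical helper; assumes non-empty inputs of the same length.
    """
    if len(actual) != len(desired):
        raise ValueError("sequences must have the same length")
    total = len(actual)
    mismatched = sum(1 for a, d in zip(actual, desired) if a != d)
    percent = 100.0 * mismatched / total
    # Report the raw counts first, then the percentage for context.
    return "Mismatched elements: {} / {} ({:.3g}%)".format(
        mismatched, total, percent)
```

Calling `mismatch_summary([1, 2, 3], [1, 2, 4])` yields the text quoted in the commit message, where the counts make the meaning of the percentage explicit.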
|
* DOC: mention take_along_axis in choose
|
BUG: Fix formatting error in exception message
|
This changes the string conversion of an expected
dtype in an error message from e.g.
"<class 'numpy.float64'>" to "float64".
|
This commit fixes a simple formatting error in the
generation of an exception message. The message
is supposed to contain the expected vs. the actual
dtype, but instead contained the expected dtype
twice.
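The class of bug described here is easy to reproduce in miniature. The sketch below uses built-in Python types as stand-ins for dtypes, and both the function names and the message wording are hypothetical rather than taken from NumPy.

```python
def describe_mismatch_buggy(expected, actual):
    """Bug pattern: `expected` is formatted twice, `actual` never appears."""
    return "expected dtype {} but got {}".format(
        expected.__name__, expected.__name__)


def describe_mismatch(expected, actual):
    """Fixed version: each placeholder receives its own argument."""
    # Using __name__ gives "float" rather than "<class 'float'>".
    return "expected dtype {} but got {}".format(
        expected.__name__, actual.__name__)
```

A test that asserts on the full message text, not just on the exception type, is what catches this kind of mistake.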
|
BUG: Fixes StopIteration error from 'np.genfromtxt' for empty file with skip_header > 0
|
ENH: add c-imported modules for freeze analysis in np.random
|
DEP: Deprecate PyArray_FromDimsAndDataAndDescr, PyArray_FromDims
|
* Deprecate and fix tests for alen
|
* DOC: new nan_to_num keywords are from 1.17 onwards
|
IntelPython/intel-compiler-binary-search-with-guess
MAINT: Workaround for Intel compiler bug leading to failing test
|
time when comparing with FP exceptions, such as NAN
|
TST: Clean up of test_pocketfft.py
|
Relaxed test_fft_with_order for float32.
Infinity norm round-off error of FFT is shown in
G.U. Ramos, "Roundoff Error Analysis of the Fast Fourier Transform,"
Mathematics of Computation, vol. 25, no. 116, Oct. 1971, p. 757
to be bounded by sqrt(N)*K*eps.
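A tolerance derived from this bound might be computed as in the sketch below. The function name is hypothetical, and the constant K is not given in the commit message, so it is left as a parameter with a placeholder default of 1.0; `eps` defaults to the float32 machine epsilon.

```python
import math

# Machine epsilon for IEEE 754 single precision (float32).
FLOAT32_EPS = 2.0 ** -23


def fft_roundoff_bound(n, k=1.0, eps=FLOAT32_EPS):
    """Evaluate sqrt(N) * K * eps for a transform of length n.

    k is an assumed placeholder; the source does not state its value.
    """
    return math.sqrt(n) * k * eps
```

Because the bound grows with sqrt(N), a fixed tolerance that passes for short float32 transforms can fail for longer ones, which motivates relaxing the test.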
|
There is no output array for allclose as opposed to isclose, so do not reference one.
|
(gh-14174)
The check tested the wrong function. In principle a dtype could implement only one of the two slots/functions.
Fix #14173.
|
Fancy indexing on read-only subclass makes a read-only copy. This PR avoids using the flags of the original array.
Fixes #14132
* do not use original array flags in fancy indexing
* Update numpy/core/tests/test_indexing.py
Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
|
MAINT: Ensure array_dealloc does not modify refcount of self
|
The increase happens in dealloc and is thus harmless. Ignore it,
since it is in a deprecated code path, so downstream should not
be running into this path anyway (the leak is not a real leak,
so the only reason to avoid it is to not trip downstream
testing for reference leaks).
|
This enables allocating the simple iterator on the stack using a
borrowed reference to the array. Using this approach while
deallocating object arrays (decrefing the elements) avoids changing
the reference count of `self` during dealloc, which leads to problems.
Just INCREF'ing self is not desirable because it increases the global
reference count in debugging mode, which makes reference count
debugging harder.
|
DEP: Remove np.rank which has been deprecated for more than 5 years
|
references #7059
|