path: root/numpy
Commit message [Author, Age, Files, Lines]
* Merge pull request #13368 from r-devulap/sincos-simd [Matti Picus, 2019-08-18, 9 files, -52/+410]
|\
| |     ENH: Use AVX for float32 implementation of np.sin & np.cos
| * BUG: rename avx2_scalef_ps to fma_scalef_ps [Raghuveer Devulapalli, 2019-08-03, 1 file, -1/+1]
| |
| * TEST: improving test coverage for sin/cos for input > 117435.992f [Raghuveer Devulapalli, 2019-08-03, 1 file, -1/+6]
| |
| * MAINT: using an enum to switch between sin/cos [Raghuveer Devulapalli, 2019-08-03, 3 files, -12/+20]
| |
| * BUG: eliminate unused variables warning in cpuid [Raghuveer Devulapalli, 2019-08-03, 1 file, -1/+1]
| |
| * TEST: Rounding max tolerable ulp error to an int [Raghuveer Devulapalli, 2019-08-03, 1 file, -4/+4]
| |
| |     assert_array_max_ulp returns only an int, since it compares the ULP difference between two float32 numbers.
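The point that an ULP difference between two float32 values is an integer can be sketched in plain Python. The helper names `float32_key` and `ulp_diff` are illustrative assumptions, not NumPy's implementation of `assert_array_max_ulp`, and NaN/infinity handling is left out.

```python
import struct


def float32_key(x: float) -> int:
    """Map a float32 value to an integer whose ordering matches float ordering."""
    # Reinterpret the 4 bytes of the float32 encoding as a signed 32-bit int.
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats have the sign bit set; mirror them below zero so that
    # -0.0 and +0.0 share the key 0 and the ordering stays monotonic.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_diff(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b (always an int)."""
    return abs(float32_key(a) - float32_key(b))


if __name__ == "__main__":
    # 1.0 and the next float32 above it are exactly one step apart.
    print(ulp_diff(1.0, 1.0 + 2.0 ** -23))
```

Because the result counts representable values between the two inputs, a fractional tolerance such as 1.49 has to be rounded to an integer before it can be compared against it.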
| * BUG: AVX2 impl of sin/cos requires an FMA [Raghuveer Devulapalli, 2019-08-03, 7 files, -52/+76]
| |
| |     Without an FMA, the outputs of the AVX2 and AVX512 versions differ. This change ensures the output across implementations remains exactly the same.
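Why a fused multiply-add can change the result is easiest to see on a single `a*b + c`. The sketch below is an illustration in double precision using plain Python, not the float32 AVX code from the commit; `mul_add_fused` is a hypothetical helper that emulates an FMA by computing the exact rational result and rounding once.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """a*b is rounded to a double first, then the sum is rounded again."""
    return a * b + c


def mul_add_fused(a: float, b: float, c: float) -> float:
    """Emulate an FMA: evaluate a*b + c exactly, then round a single time."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product 1 - 2**-60 rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

if __name__ == "__main__":
    print(mul_add_unfused(a, b, c))  # the low-order term is lost
    print(mul_add_fused(a, b, c))    # the low-order term survives
```

Two code paths that differ only in whether this step is fused can therefore disagree in the last bits, which is consistent with the commit requiring an FMA on both paths.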
| * BUG: use strides and process strided arrays using AVX [Raghuveer Devulapalli, 2019-08-03, 2 files, -21/+49]
| |
| * TEST: adding tests to validate AVX sin/cos implementation [Raghuveer Devulapalli, 2019-08-03, 2 files, -3/+14]
| |
| * BUG: sin and cos cast float16 to float32 [Raghuveer Devulapalli, 2019-08-03, 1 file, -0/+2]
| |
| * BUG: fixing NAN handling and adding tests for sin/cos [Raghuveer Devulapalli, 2019-08-03, 2 files, -4/+23]
| |
| * ENH: Use AVX for float32 implementation of np.sin & np.cos [Raghuveer Devulapalli, 2019-08-03, 5 files, -5/+266]
| |
| |     This commit implements vectorized single precision sine and cosine using AVX2 and AVX512. Both sine and cosine are computed using a polynomial approximation that is accurate for values in [-PI/4, PI/4]. The original input is reduced to this range using a 3-step Cody-Waite range reduction method. This method is only accurate for values in [-71476.0625f, 71476.0625f] for cosine and [-117435.992f, 117435.992f] for sine. The algorithm identifies elements outside this range and calls glibc in a scalar loop to compute their output.
| |
| |     The algorithm is a vectorized version of the methods presented here: https://stackoverflow.com/questions/30463616/payne-hanek-algorithm-implementation-in-c/30465751#30465751
| |
| |     Accuracy: maximum ULP error = 1.49
| |
| |     Performance: The speed-up this implementation provides depends on the values of the input array. It performs best when all the input values are within the range specified above. Details of the performance boost are provided below. Its worst performance is when all the array elements are outside the range, leading to about a 1-2% reduction in performance.
| |
| |     Three sets of benchmarking data are provided, each of which was benchmarked using the timeit package in Python. Each function is executed 1000 times and this is repeated 100 times. The standard deviation for all the runs was less than 2% of their mean value and hence is not included in the data.
| |
| |     (1) Micro-benchmarking: Array size = 10000, Command = "%timeit np.cos([myarr])"
| |
| |     |---------------+------------+--------+---------+----------+----------|
| |     | Function Name | NumPy 1.16 | AVX2   | AVX512  | AVX2     | AVX512   |
| |     |               |            |        |         | speed up | speed up |
| |     |---------------+------------+--------+---------+----------+----------|
| |     | np.cos        | 1.5174     | 0.1553 | 0.06634 | 9.77     | 22.87    |
| |     | np.sin        | 1.4738     | 0.1517 | 0.06406 | 9.71     | 23.00    |
| |     |---------------+------------+--------+---------+----------+----------|
| |
| |     (2) Package ai.cs provides an API to transform spherical coordinates to a Cartesian system: Array size = 10000, Command = "%timeit ai.cs.sp2cart(r,theta,phi)"
| |
| |     |---------------+------------+--------+--------+----------+----------|
| |     | Function Name | NumPy 1.16 | AVX2   | AVX512 | AVX2     | AVX512   |
| |     |               |            |        |        | speed up | speed up |
| |     |---------------+------------+--------+--------+----------+----------|
| |     | ai.cs.sp2cart | 0.6371     | 0.1066 | 0.0605 | 5.97     | 10.53    |
| |     |---------------+------------+--------+--------+----------+----------|
| |
| |     (3) Package photutils provides an API to find the best fit of first and second harmonic functions to a set of (angle, intensity) pairs: Array size = 1000, Command = "%timeit fit_first_and_second_harmonics(E, data)"
| |
| |     |--------------------------------+------------+--------+--------+----------+----------|
| |     | Function Name                  | NumPy 1.16 | AVX2   | AVX512 | AVX2     | AVX512   |
| |     |                                |            |        |        | speed up | speed up |
| |     |--------------------------------+------------+--------+--------+----------+----------|
| |     | fit_first_and_second_harmonics | 1.598      | 0.8709 | 0.7761 | 1.83     | 2.05     |
| |     |--------------------------------+------------+--------+--------+----------+----------|
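The 3-step Cody-Waite reduction described in this commit message can be sketched in scalar Python. Everything here is an assumption made for illustration: the constants `C1`, `C2`, `C3` are derived on the spot (they are not the ones used in the commit), float32 arithmetic is emulated with `struct`, no FMA is used, and `math.sin`/`math.cos` stand in for the polynomial kernels on the reduced argument. The valid input range of this sketch therefore differs from the limits quoted in the commit.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def chop(x: float, bits: int) -> float:
    """Truncate a positive x to its leading `bits` significant bits."""
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= m < 1
    return math.ldexp(math.floor(m * (1 << bits)), e - bits)


HALF_PI = math.pi / 2
# Split pi/2 into three pieces. C1 and C2 keep only 8 significant bits each,
# so k*C1 and k*C2 are exact in float32 as long as |k| < 2**16.
C1 = chop(HALF_PI, 8)
C2 = chop(HALF_PI - C1, 8)
C3 = f32(HALF_PI - C1 - C2)


def reduce_arg(x: float):
    """Return (k, r) such that x is approximately k*pi/2 + r, |r| near <= pi/4."""
    x = f32(x)
    k = round(f32(x * f32(2 / math.pi)))  # nearest multiple of pi/2
    r = f32(x - f32(k * C1))  # step 1: remove the leading bits of k*pi/2
    r = f32(r - f32(k * C2))  # step 2: remove the middle bits
    r = f32(r - f32(k * C3))  # step 3: remove the trailing bits
    return k, r


def sin_sketch(x: float) -> float:
    """sin(x) via range reduction; math.sin/math.cos replace the polynomials."""
    k, r = reduce_arg(x)
    quadrant = k % 4  # Python's % is non-negative for a positive modulus
    if quadrant == 0:
        value = math.sin(r)
    elif quadrant == 1:
        value = math.cos(r)
    elif quadrant == 2:
        value = -math.sin(r)
    else:
        value = -math.cos(r)
    return f32(value)


if __name__ == "__main__":
    for x in (1.0, 100.25, 1000.0):
        print(x, sin_sketch(x), math.sin(x))
```

Once `k` grows large enough that `k*C1` no longer fits in 24 significant bits, the first subtraction stops being exact and the reduced argument loses accuracy. That is the general reason a Cody-Waite scheme has an input bound beyond which a slower fallback is needed.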
* | DOC: Add example to clarify "numpy.save" behavior on already open file #10445 (#14070) [Omar Merghany, 2019-08-16, 1 file, -1/+12]
| |
| |     * DOC:Add example to clarify "numpy.save" behavior on already unclosed file
* | Merge pull request #14101 from lagru/zero_stat_length [Sebastian Berg, 2019-08-15, 2 files, -1/+31]
|\ \
| | |     MAINT: Clearer error message while padding with stat_length=0
| * | MAINT: Clearer error while padding stat_length=0 [Lars Grueter, 2019-08-09, 2 files, -1/+31]
| | |
| | |     Provides a clearer error message if stat_length=0 is the cause of an exception (mean and median return nan with warnings) as well as tests covering this behavior.
| | |
| | |     Note: This shouldn't change the behavior/API except for the content of the raised ValueError.
* | | ENH: Improve mismatch message of np.testing.assert_array_equal (#14203) [Tim Hoffmann, 2019-08-15, 2 files, -7/+11]
| | |
| | |     The original message included "Mismatch: 33.3%". It's not obvious what this percentage means. This commit changes the text to "Mismatched elements: 1 / 3 (33.3%)".
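The new wording spells out both the count and the total behind the percentage. A minimal sketch of how such a summary line could be built is shown below; `mismatch_summary` is a hypothetical helper written for illustration, not the code in `np.testing`.

```python
def mismatch_summary(actual, desired) -> str:
    """Format a mismatch count in the style quoted in the commit message."""
    if len(actual) != len(desired):
        raise ValueError("sequences must have the same length")
    total = len(actual)
    # Count positions where the two sequences disagree.
    mismatched = sum(1 for a, d in zip(actual, desired) if a != d)
    percent = 100.0 * mismatched / total if total else 0.0
    return f"Mismatched elements: {mismatched} / {total} ({percent:.3g}%)"


if __name__ == "__main__":
    print(mismatch_summary([1, 2, 3], [1, 2, 4]))
```

Showing "1 / 3" next to the percentage removes the ambiguity the commit message describes: the reader no longer has to guess what the percentage is a fraction of.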
* | | DOC: mention `take_along_axis` in `choose` (#14224) [colinsnyder, 2019-08-15, 1 file, -7/+8]
| | |
| | |     * DOC: mention take_along_axis in choose
* | | Merge pull request #14270 from aleju/fix_exception [Charles Harris, 2019-08-14, 1 file, -1/+1]
|\ \ \
| | | |     BUG: Fix formatting error in exception message
| * | | MAINT: Improve error message dtype appearance [aleju, 2019-08-13, 1 file, -1/+1]
| | | |
| | | |     This changes the string conversion of an expected dtype in an error message from e.g. "<class 'numpy.float64'>" to "float64".
| * | | BUG: Fix formatting error in exception message [aleju, 2019-08-13, 1 file, -1/+1]
| | | |
| | | |     This commit fixes a simple formatting error in the generation of an exception message. The message is supposed to contain the expected vs. the actual dtype, but instead contained the expected dtype twice.
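Both commits in this pull request concern how an error message is formatted. The sketch below reproduces the two patterns with built-in Python types standing in for NumPy dtypes; the function names and the message wording are hypothetical, chosen only to illustrate the bug class.

```python
def dtype_error_buggy(expected: type, actual: type) -> str:
    # Bug pattern: `expected` is formatted twice, so `actual` never appears.
    # str(type) also produces the verbose "<class '...'>" form.
    return "expected dtype {}, got {}".format(expected, expected)


def dtype_error_fixed(expected: type, actual: type) -> str:
    # Pass both values, and use the short type name for readability.
    return "expected dtype {}, got {}".format(expected.__name__, actual.__name__)


if __name__ == "__main__":
    print(dtype_error_buggy(float, int))
    print(dtype_error_fixed(float, int))
```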
* | | | Merge pull request #14252 from maxwell-aladago/genfromtext [Charles Harris, 2019-08-14, 2 files, -5/+10]
|\ \ \ \
| | | | |     BUG: Fixes StopIteration error from 'np.genfromtxt' for empty file with skip_header > 0
| * | | | fixes StopIteration error for empty file with skip_header > 0 [Maxwell Aladago, 2019-08-11, 2 files, -5/+10]
| |/ / /
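The failure mode named in the commit above, a `StopIteration` when header lines are skipped on an empty input, is easy to reproduce with any iterator. The following is a sketch of the general pattern in plain Python, with hypothetical helper names; it is not the code from `np.genfromtxt`.

```python
import io
import itertools


def skip_header_naive(lines, skip_header: int):
    """Skip lines with next(); fails if the input runs out early."""
    it = iter(lines)
    for _ in range(skip_header):
        next(it)  # raises StopIteration when fewer lines are available
    return list(it)


def skip_header_safe(lines, skip_header: int):
    """Skip up to skip_header lines; an empty or short input yields []."""
    return list(itertools.islice(iter(lines), skip_header, None))


if __name__ == "__main__":
    empty = io.StringIO("")
    print(skip_header_safe(empty, 2))
    try:
        skip_header_naive(io.StringIO(""), 2)
    except StopIteration:
        print("naive version raised StopIteration")
```

`itertools.islice` stops quietly when the underlying iterator is exhausted, so the caller can treat an empty file as "no data" instead of handling an exception.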
* | | | DOC: Address typos in dispatch docs [Matt McCormick, 2019-08-14, 1 file, -3/+3]
|/ / /
* | | Merge pull request #14141 from KmolYuan/random_freeze_analysis [Charles Harris, 2019-08-08, 1 file, -1/+6]
|\ \ \
| | | |     ENH: add c-imported modules for freeze analysis in np.random
| * | | ENH: add c-imported modules for freeze analysis in np.random [Yuan, 2019-07-28, 1 file, -1/+6]
| | | |
* | | | BUG: Better err message for norm [Eric Larson, 2019-08-08, 2 files, -5/+10]
| | | |
* | | | DOC: update or remove outdated sourceforge links [mattip, 2019-08-08, 2 files, -2/+2]
| | | |
* | | | Merge pull request #14100 from kritisingh1/dep3 [Matti Picus, 2019-08-08, 1 file, -41/+10]
|\ \ \ \
| | | | |     DEP: Deprecate PyArray_FromDimsAndDataAndDescr, PyArray_FromDims
| * | | | Remove comments, decrease reference count [kritisingh1, 2019-07-29, 1 file, -4/+2]
| | | | |
| * | | | DEP: Deprecate PyArray_FromDimsAndDataAndDescr, PyArray_FromDims [kritisingh1, 2019-07-29, 1 file, -41/+12]
| | | | |
* | | | | DEP: Deprecate np.alen (#14181) [Guilherme Leobas, 2019-08-08, 3 files, -9/+21]
| | | | |
| | | | |     * Deprecate and fix tests for alen
* | | | | DOC: new nan_to_num keywords are from 1.17 onwards (#14219) [Géraud Le Falher, 2019-08-08, 1 file, -1/+9]
| | | | |
| | | | |     * DOC: new nan_to_num keywords are from 1.17 onwards
* | | | | Fixed default BitGenerator name [Giuseppe Cuccu, 2019-08-06, 1 file, -3/+3]
| | | | |
* | | | | Merge pull request #14185 from IntelPython/intel-compiler-binary-search-with-guess [Charles Harris, 2019-08-04, 1 file, -4/+13]
|\ \ \ \ \
| | | | | |     MAINT: Workaround for Intel compiler bug leading to failing test
| * | | | | work-around for compiler erroneously incrementing the iterator one extra time when comparing with FP exceptions, such as NAN [Oleksandr Pavlyk, 2019-08-02, 1 file, -4/+13]
* | | | | | Merge pull request #14178 from IntelPython/clean-up-test-pocketfft [Charles Harris, 2019-08-04, 1 file, -60/+64]
|\ \ \ \ \ \
| | | | | | |     TST: Clean up of test_pocketfft.py
| * | | | | | Incremented _tol by a factor of 8. [Oleksandr Pavlyk, 2019-08-01, 1 file, -1/+2]
| | | | | | |
| * | | | | | Replaced np.sqrt(X.size) with np.sqrt(np.log2(X.size)) per PR review [Oleksandr Pavlyk, 2019-08-01, 1 file, -1/+1]
| | | | | | |
| * | | | | | Replaced assert_array_almost_equal with assert_allclose [Oleksandr Pavlyk, 2019-08-01, 1 file, -60/+63]
| | | | | | |
| | | | | | |     Relaxed test_fft_with_order for float32.
| | | | | | |
| | | | | | |     Infinity norm round-off error of FFT is shown in G.U. Ramos, "Roundoff Error Analysis of the Fast Fourier Transform," Mathematics of Computation, vol. 25, no. 116, Oct. 1971, p. 757 to be bounded by sqrt(N)*K*eps.
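The three commits above adjust how a size-dependent test tolerance is computed. The sketch below compares a tolerance that scales with sqrt(N) against one that scales with sqrt(log2(N)). The function names, the default factor of 8, and the use of float32 epsilon are assumptions made for illustration; the actual `_tol` helper in test_pocketfft.py is not reproduced here.

```python
import math

FLOAT32_EPS = 2.0 ** -23  # machine epsilon of IEEE 754 binary32


def tol_sqrt_n(n: int, eps: float = FLOAT32_EPS, factor: float = 8.0) -> float:
    """Hypothetical tolerance of the form factor * sqrt(N) * eps."""
    return factor * math.sqrt(n) * eps


def tol_sqrt_log_n(n: int, eps: float = FLOAT32_EPS, factor: float = 8.0) -> float:
    """Hypothetical tolerance of the form factor * sqrt(log2(N)) * eps."""
    if n < 2:
        return factor * eps  # log2 is zero or undefined; fall back to factor * eps
    return factor * math.sqrt(math.log2(n)) * eps


if __name__ == "__main__":
    for n in (16, 1024, 2 ** 20):
        print(n, tol_sqrt_n(n), tol_sqrt_log_n(n))
```

For any N above 1 the logarithmic form is the smaller of the two, so it makes for a stricter test as the transform length grows.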
* | | | | | | MAINT: Fix the typo [Guanqun Lu, 2019-08-04, 1 file, -1/+1]
| |_|_|_|_|/
|/| | | | |
* | | | | | DOC: Fix hermitian argument docs in svd [hvy, 2019-08-03, 1 file, -6/+6]
| | | | | |
* | | | | | DOC: Fix misleading `allclose` docstring for `equal_nan` (gh-14183) [Antoine Dechaume, 2019-08-02, 1 file, -1/+1]
| | | | | |
| | | | | |     There is no output array for allclose as opposed to isclose, so do not reference one.
* | | | | | BUG: Check for existence of `fromstr` which is used in `fromstr_next_element` (gh-14174) [Zijie (ZJ) Poh, 2019-08-02, 1 file, -1/+1]
| |/ / / /
|/| | | |
| | | | |     The check tested the wrong function. In principle a dtype could implement only one of the two slots/functions.
| | | | |
| | | | |     Fix #14173.
* | | | | BUG: Make advanced indexing result on read-only subclass writeable (#14171) [jeremiedbb, 2019-08-01, 2 files, -1/+14]
|/ / / /
| | | |     Fancy indexing on read-only subclass makes a read-only copy. This PR avoids using the flags of the original array.
| | | |
| | | |     Fixes #14132
| | | |
| | | |     * do not use original array flags in fancy indexing
| | | |     * Update numpy/core/tests/test_indexing.py
| | | |
| | | |     Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
* | | | Merge pull request #13871 from seberg/ugly-refcount-changing [Sebastian Berg, 2019-07-31, 6 files, -37/+49]
|\ \ \ \
| | | | |     MAINT: Ensure array_dealloc does not modify refcount of self
| * | | | TST: Mark test which increases global reference count [Sebastian Berg, 2019-07-26, 1 file, -0/+2]
| | | | |
| | | | |     The increase happens in dealloc and is thus harmless. Ignore it, since it is in a deprecated code path, so downstream should not be running into this path anyway (the leak is not a real leak, so the only reason to avoid it is to not trip downstream testing for reference leaks).
| * | | | BUG: Fix reference count issue in recursive `dtype` lookup error [Sebastian Berg, 2019-07-26, 1 file, -0/+1]
| | | | |
| * | | | MAINT: Remove need for intermediate refcount incs in dealloc [Sebastian Berg, 2019-07-26, 4 files, -37/+46]
| | | | |
| | | | |     This enables allocating the simple iterator on the stack using a borrowed reference to the array. Using this approach during deallocating object arrays (decrefing the elements) avoids changing the reference count of `self` during dealloc, which leads to problems.
| | | | |
| | | | |     Just INCREF'ing self is not desirable because it increases the global reference count in debugging mode so that it makes reference count debugging harder.
* | | | | Merge pull request #14039 from sameshl/remove_depr_rank_func [Sebastian Berg, 2019-07-31, 3 files, -56/+2]
|\ \ \ \ \
| | | | | |     DEP: Remove np.rank which has been deprecated for more than 5 years
| * | | | | DEP: Remove np.rank which has been deprecated for more than 5 years [Samesh, 2019-07-31, 3 files, -56/+2]
| | | | | |
| | | | | |     references #7059