MAINT/BUG: Clip real and imag parts of corrcoef return to [-1, 1].

The non-nan elements of the result of corrcoef should satisfy the inequality abs(x) <= 1 and the non-nan elements of the diagonal should be exactly one. We can't guarantee those results due to roundoff, but clipping the real and imaginary parts to the interval [-1, 1] improves things to a small degree. Closes #7392.
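
The normalize-then-clip step described above can be sketched outside of NumPy's source as follows. This is a minimal illustration, not the library's implementation: `normalize_and_clip` is a hypothetical helper name, and it assumes its input is already a square covariance matrix with a positive diagonal. It also shows why clipping the real and imaginary parts separately does not bound the modulus in the complex case.

```python
import numpy as np


def normalize_and_clip(cov_matrix):
    """Hypothetical helper: scale a covariance matrix to correlations, then clip."""
    c = np.asarray(cov_matrix)
    # Work on a floating point (or complex) copy so in-place division is valid.
    c = c.astype(np.result_type(c.dtype, np.float64))

    # The diagonal of a covariance matrix is real; take its square root.
    stddev = np.sqrt(np.diag(c).real)

    # Divide rows, then columns, by the standard deviations via broadcasting.
    c /= stddev[:, None]
    c /= stddev[None, :]

    # Clip the real part, and the imaginary part for complex input, to [-1, 1].
    np.clip(c.real, -1, 1, out=c.real)
    if np.iscomplexobj(c):
        np.clip(c.imag, -1, 1, out=c.imag)
    return c


# Real case: an off-diagonal value slightly above 1 is pulled back to 1.
real_result = normalize_and_clip([[1.0, 1.0000001], [1.0000001, 1.0]])

# Complex case: both parts end up inside [-1, 1], yet the modulus exceeds 1.
complex_result = normalize_and_clip(
    np.array([[1.0, 2.0 + 2.0j], [2.0 - 2.0j, 1.0]])
)
```

In the complex example the off-diagonal entry is clipped to 1+1j, whose modulus is sqrt(2), so the inequality abs(x) <= 1 still fails. That is the limitation the commit message refers to.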
author: Charles Harris <charlesr.harris@gmail.com> 2016-03-13 13:41:06 -0600
committer: Charles Harris <charlesr.harris@gmail.com> 2016-03-13 20:06:05 -0600
commit: d92ff4cdd1fc3608e39ffbe119ecbb520b678f3e (patch)
tree: 0839e7d776d34ea2b037d3a6783686b91bed6bf9 /numpy/lib/function_base.py
parent: fa107fe361520ceac09131f96a8715473078801e (diff)
download: numpy-d92ff4cdd1fc3608e39ffbe119ecbb520b678f3e.tar.gz
1 files changed, 19 insertions, 5 deletions
diff --git a/numpy/lib/function_base.py b/numpy/lib/function_base.py
index 91034ef37..26e4b0d65 100644
--- a/numpy/lib/function_base.py
+++ b/numpy/lib/function_base.py
@@ -2523,6 +2523,12 @@ def corrcoef(x, y=None, rowvar=1, bias=np._NoValue, ddof=np._NoValue):
 
     Notes
     -----
+    Due to floating point rounding the resulting array may not be Hermitian,
+    the diagonal elements may not be 1, and the elements may not satisfy the
+    inequality abs(a) <= 1. The real and imaginary parts are clipped to the
+    interval [-1,  1] in an attempt to improve on that situation but is not
+    much help in the complex case.
+
     This function accepts but discards arguments `bias` and `ddof`.  This is
     for backwards compatibility with previous versions of this function.  These
     arguments had no effect on the return values of the function and can be
@@ -2536,13 +2542,21 @@ def corrcoef(x, y=None, rowvar=1, bias=np._NoValue, ddof=np._NoValue):
     c = cov(x, y, rowvar)
     try:
         d = diag(c)
-    except ValueError:  # scalar covariance
+    except ValueError:
+        # scalar covariance
         # nan if incorrect value (nan, inf, 0), 1 otherwise
         return c / c
-    d = sqrt(d)
-    # calculate "c / multiply.outer(d, d)" row-wise ... for memory and speed
-    for i in range(0, d.size):
-        c[i,:] /= (d * d[i])
+    stddev = sqrt(d.real)
+    c /= stddev[:, None]
+    c /= stddev[None, :]
+
+    # Clip real and imaginary parts to [-1, 1].  This does not guarantee
+    # abs(a[i,j]) <= 1 for complex arrays, but is the best we can do without
+    # excessive work.
+    np.clip(c.real, -1, 1, out=c.real)
+    if np.iscomplexobj(c):
+        np.clip(c.imag, -1, 1, out=c.imag)
+
     return c