author Mark Wiebe <mwiebe@enthought.com> 2011-06-23 15:38:25 -0500
committer Mark Wiebe <mwiebe@enthought.com> 2011-06-27 10:32:41 -0500
commit 0a858a603798fa65787d71f8e62bf38c6693f36f
tree 5eaaab47f88b0ae4b5884eb728aecc1335bddd41
parent b52abe1a291f61d6bd41d90b7886420d689d07d1
NEP: c-masked-array: Some small improvements and clarifications
-rw-r--r-- doc/neps/c-masked-array.rst 35
1 files changed, 26 insertions, 9 deletions
diff --git a/doc/neps/c-masked-array.rst b/doc/neps/c-masked-array.rst
index acf7ec6d1..34c49586f 100644
--- a/doc/neps/c-masked-array.rst
+++ b/doc/neps/c-masked-array.rst
@@ -62,6 +62,9 @@ data type for NumPy and have it automatically work with missing values
is one of the reasons the masked approach has been chosen over special
signal values.
+Implementing masks as described in this NEP does not preclude also
+creating data types with special "NA" values.
+
**************************
The Mask as Seen in Python
**************************
@@ -99,6 +102,10 @@ values in views will also unmask them in the original array, and if
a mask is added to an array, it will not be possible to ever remove that
mask except to create a new array copying the data but not the mask.
+It is still possible to temporarily treat an array with a mask without
+giving it one, by first creating a view of the array and then adding a
+mask to that view.
+
Working With Masked Values
==========================
@@ -165,16 +172,17 @@ have to be extended to support masked computation. Because this
is a useful feature in general, even outside the context of
a masked array, in addition to working with masked arrays ufuncs
will take an optional 'mask=' parameter which allows the use
-of boolean arrays to choose where a computation should be done. This
-functions similar to a "where" clause on the ufunc.::
+of boolean arrays to choose where a computation should be done.
+This functions similar to a "where" clause on the ufunc.::
np.add(a, b, out=b, mask=(a > threshold))
+A benefit of having this 'mask=' parameter is that it provides a way
+to temporarily treat an object with a mask without ever creating a
+masked array object.
+
If the 'out' parameter isn't specified, use of the 'mask=' parameter
-will produce a array with a mask as the result. A benefit of this
-operation is that it provides a way to temporarily treat an object
-with a mask, without making it a masked array which adds the mask
-permanently.
+will produce a array with a mask as the result.
Reduction operations like 'sum', 'prod', 'min', and 'max' will operate as
if the values weren't there, applying the operation to the unmasked
@@ -191,7 +199,7 @@ Unresolved Design Questions
Scalars will not be modified to have a mask, so this leaves two options
for what value should be returned when retrieving a single masked value.
-Either 'None', or a one-dimensional masked array. The former follows
+Either 'None', or a zero-dimensional masked array. The former follows
the convention of returning an immutable value from such accesses,
while the later preserves type information, so the correct choice
will require some discussion to resolve.
@@ -204,5 +212,14 @@ array. There would also need to be an 'a.mask.ishard' property.
If the hardmask feature is implemented, boolean indexing could
return a hardmasked array instead of a flattened array with the
-arbitrary choice of C-ordering as it currently does.
-
+arbitrary choice of C-ordering as it currently does. While this
+improves the abstraction of the array significantly, it is not
+a compatible change.
+
+There is some consternation about the conventional True/False
+interpretation of the mask, centered around the name "mask". One
+possibility to deal with this is to call it a "validity mask" in
+all documentation, which more clearly indicates that True means
+valid data. If this isn't sufficient, an alternate name for the
+attribute could be found, like "a.validitymask", "a.validmask",
+or "a.validity".
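
Note on the 'mask=' ufunc parameter discussed in the hunk at line 172: the following is a minimal sketch, not the NEP's implementation. It assumes the `where=` keyword that NumPy ufuncs accept today as a stand-in for the proposed 'mask=' parameter, and it uses the existing `np.ma` module to illustrate a reduction that skips masked values. The array contents and the value of `threshold` are made up for illustration.

```python
import numpy as np

a = np.array([1.0, 5.0, 2.0, 8.0])
b = np.array([10.0, 20.0, 30.0, 40.0])
threshold = 3.0  # hypothetical value, chosen only for this example

# Compute a + b only where the boolean array is True; the other
# positions of `out` keep their previous contents.
np.add(a, b, out=b, where=(a > threshold))
print(b)  # [10. 25. 30. 48.]

# Reduction over the unmasked elements only. Note that np.ma uses
# True to mean "masked out", the opposite of the "validity mask"
# reading (True means valid data) discussed in the NEP.
masked = np.ma.masked_array(a, mask=(a <= threshold))
total = masked.sum()
print(total)  # 13.0
```

The opposite True/False conventions in the two calls above are the kind of ambiguity the naming discussion in the last hunk is concerned with.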