Diffstat (limited to 'doc')
-rw-r--r-- | doc/neps/missing-data.rst | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/doc/neps/missing-data.rst b/doc/neps/missing-data.rst
index 52456ac47..7a2c076cb 100644
--- a/doc/neps/missing-data.rst
+++ b/doc/neps/missing-data.rst
@@ -735,6 +735,9 @@ PyArray_ContainsNA(PyArrayObject* obj)
 true if the array has NA support AND there is an NA anywhere in
 the array.
 
+Mask Binary Format
+==================
+
 The format of the mask itself is designed to indicate whether an
 element is masked or not, as well as contain a payload so that multiple
 different NAs with different payloads can be used in the future.
@@ -752,6 +755,11 @@ works as a mask, because it takes on the values 0 for False and 1 for
 True. Additionally, the payload for npy_bool, which is always
 zero, dominates over all the other possible payloads.
 
+Since the design involves giving the mask its own dtype, we can
+distinguish between masking with a single NA value (npy_bool mask),
+and masking with multi-NA (npy_uint8 mask). Initial implementations
+will just support the npy_bool mask.
+
 An idea that was discarded is to allow the combination of masks +
 payloads to be a simple 'min' operation. This can be done by putting the
 payload in bits 0 through 6, so that the payload is (m&0x7f), and using bit 7
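
Reviewer's sketch, not part of the patch: the added paragraph separates a single-NA mask (npy_bool) from a multi-NA mask (npy_uint8), and the trailing context mentions a discarded layout whose payload is (m&0x7f). The C below illustrates only those two points. Every name prefixed with sketch_ is hypothetical, the typedefs are stand-ins so the block compiles without the NumPy headers, and the role of bit 7 is deliberately left out because the hunk is cut off before describing it.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for npy_bool and npy_uint8 (assumption: both are one byte wide). */
typedef unsigned char sketch_bool;
typedef uint8_t sketch_uint8;

/* Hypothetical tag recording which mask dtype an array carries. */
typedef enum {
    SKETCH_MASK_BOOL,   /* single NA value */
    SKETCH_MASK_UINT8   /* multi-NA: room for a payload */
} sketch_mask_kind;

/* Only the uint8 mask can tell different NAs apart. */
static int sketch_supports_multi_na(sketch_mask_kind kind)
{
    return kind == SKETCH_MASK_UINT8;
}

/* A bool mask element takes only the values 0 (False) and 1 (True). */
static int sketch_is_valid_bool_mask(sketch_bool m)
{
    return m == 0 || m == 1;
}

/* Payload under the discarded layout: bits 0 through 6 of the mask byte. */
static unsigned sketch_discarded_payload(sketch_uint8 m)
{
    return m & 0x7fu;
}

/* Small self-check exercising the helpers above. */
static void sketch_self_check(void)
{
    assert(sketch_is_valid_bool_mask(1));
    assert(!sketch_supports_multi_na(SKETCH_MASK_BOOL));
    assert(sketch_discarded_payload(0x7f) == 0x7f);
}
```

Because the mask kind comes from the dtype rather than from the mask values, code that only handles the initial npy_bool case could check the tag once per array instead of inspecting each element.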