Gabriel Becker created ARROW-3263:
-------------------------------------

             Summary: Use R sentinel values for missingness in addition to bitmask
                 Key: ARROW-3263
                 URL: https://issues.apache.org/jira/browse/ARROW-3263
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Format
            Reporter: Gabriel Becker


R uses sentinel values to indicate missingness within atomic vectors (read: 
arrays in Arrow parlance, AFAIK). 

According to [~wesmckinn], the value stored in the array's memory is currently 
undefined wherever the bitmap marks an element as missing. 

This forces R to copy and modify the data whenever it adopts, as a native 
vector, Arrow data that contains missing values.

If the relevant sentinel values (INT_MIN for 32-bit integers, and NaN with 
payload 1954 for double-precision floats) were written to the missing slots 
_in addition to_ the bitmask, then R would be able to use Arrow as intended 
without breaking any other systems.
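
As a rough illustration of the proposal (a sketch only, not Arrow or R code), 
the following Python snippet overwrites the slots of a raw little-endian 
values buffer that the bitmap marks as missing. It assumes Arrow's validity 
convention (least-significant-bit numbering, a set bit means "valid") and 
takes the double sentinel to be the NaN bit pattern whose low 32-bit word is 
1954. The names is_valid and fill_sentinels are hypothetical helpers invented 
for this example.

```python
import struct

# Sentinels as described above, encoded as little-endian bytes.
R_NA_INTEGER = struct.pack("<i", -2**31)             # INT_MIN, 32-bit integer
R_NA_REAL = struct.pack("<Q", 0x7FF00000000007A2)    # NaN, low word 1954 (0x7A2)


def is_valid(validity: bytes, i: int) -> bool:
    """Assumed Arrow convention: LSB bit numbering, set bit means valid."""
    return (validity[i >> 3] >> (i & 7)) & 1 == 1


def fill_sentinels(values: bytearray, validity: bytes,
                   length: int, sentinel: bytes) -> None:
    """Write `sentinel` into every slot that the bitmap marks as missing.

    The bitmap itself is left untouched, so readers that only consult the
    bitmap see no change in meaning.
    """
    width = len(sentinel)
    for i in range(length):
        if not is_valid(validity, i):
            values[i * width:(i + 1) * width] = sentinel


# Three doubles; the middle slot is missing and holds an arbitrary value.
doubles = bytearray(struct.pack("<3d", 1.5, 99.0, 2.5))
bitmap = bytes([0b00000101])  # elements 0 and 2 valid, element 1 missing
fill_sentinels(doubles, bitmap, 3, R_NA_REAL)

# Same layout with 32-bit integers.
ints = bytearray(struct.pack("<3i", 7, 123, 9))
fill_sentinels(ints, bitmap, 3, R_NA_INTEGER)
```

Because only slots already flagged as missing are rewritten, a consumer that 
ignores those slots is unaffected, while a consumer that relies on sentinels 
could read the buffer without a copy-and-patch step.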



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
