Wes McKinney created ARROW-62:
---------------------------------

             Summary: Format: Are the nulls bits 0 or 1 for null values?
                 Key: ARROW-62
                 URL: https://issues.apache.org/jira/browse/ARROW-62
             Project: Apache Arrow
          Issue Type: Bug
          Components: Format
            Reporter: Wes McKinney


As Dan Robinson pointed out on the mailing list (thank you for catching 
this!), the format documents are inconsistent with the imported ValueVectors 
code in how they represent nulls. Since I drafted the format documents 
initially, I'll take the blame for the inconsistency:

* Drill / ValueVectors uses the value 0 for null data, and 1 for non-null data
* The format document currently states the opposite (values are null if the bit 
is set)

I can see arguments both ways, but one argument for the ValueVectors style is 
that values must be explicitly marked as non-null, so uninitialized values 
cannot be accidentally interpreted as non-null. When initializing a bitmap, 
one can {{memset}} the bits to 0, then set them to 1 as non-null values are 
appended during construction.
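
To illustrate the ValueVectors-style convention (0 = null, 1 = non-null), here 
is a minimal sketch in C. The function names are hypothetical and are not taken 
from the Arrow or Drill code, and the least-significant-bit-first ordering 
within each byte is an assumption made only for this example:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers for illustration; not the Arrow or Drill API. */

/* Allocate a bitmap covering `length` slots with every bit cleared, so
 * every slot starts out null under the 0 = null, 1 = non-null convention. */
static uint8_t *validity_bitmap_new(size_t length) {
    size_t nbytes = (length + 7) / 8;   /* round up to whole bytes */
    if (nbytes == 0) nbytes = 1;        /* avoid a zero-size allocation */
    uint8_t *bits = malloc(nbytes);
    if (bits != NULL) memset(bits, 0, nbytes);
    return bits;
}

/* Mark slot i as non-null. Assumes least-significant-bit-first ordering. */
static void validity_set_non_null(uint8_t *bits, size_t i) {
    bits[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Return 1 if slot i is non-null, 0 if it is null. */
static int validity_is_non_null(const uint8_t *bits, size_t i) {
    return (bits[i / 8] >> (i % 8)) & 1;
}
```

With this layout, a slot that is never touched after allocation reads as null, 
so forgetting to set a bit loses a value rather than exposing uninitialized 
data as valid.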



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
