Hello,

The format spec and the C++ implementation disagree on one point:

* The spec says that dense union offsets should be increasing:
  """The respective offsets for each child value array must be in order / increasing."""
  (from https://arrow.apache.org/docs/format/Columnar.html#dense-union)

* The C++ implementation has long had some tests that use deliberately
  non-increasing (even descending) dense union offsets.
  (see https://issues.apache.org/jira/browse/ARROW-10580)

I don't know what other implementations, especially Java, expect.

There are obviously two possible solutions:

1) Fix the C++ implementation and its tests to conform to the format spec
   (which may break compatibility for code producing / consuming dense
   unions with non-increasing offsets)

2) Relax the format spec to allow arbitrary offsets (which could make dense
   union more like a polymorphic dictionary).

If the first solution is chosen, then another question arises: must the
offsets be strictly increasing? Or can a given offset appear several times
in a row? (The latter is currently exploited by the C++ implementation:
when appending several nulls to a DenseUnionBuilder, only one child null
slot is added and the same offset is appended multiple times.)

Regards

Antoine.
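
PS: to make the "increasing" vs. "strictly increasing" question concrete, here is a minimal sketch in plain Python of the per-child check a validator might apply. It does not use the Arrow API; the function name `offsets_are_increasing` and its `strict` parameter are hypothetical and only serve as illustration.

```python
def offsets_are_increasing(type_ids, offsets, strict=False):
    """Check dense union offsets child by child.

    type_ids[i] selects the child array for slot i and offsets[i] is the
    index into that child. Hypothetical helper, not part of any Arrow
    implementation.
    """
    last_offset = {}  # last offset seen so far, per child type id
    for type_id, offset in zip(type_ids, offsets):
        prev = last_offset.get(type_id)
        if prev is not None:
            if offset < prev:
                return False  # descending offsets for this child
            if strict and offset == prev:
                return False  # repeated offset rejected in strict mode
        last_offset[type_id] = offset
    return True


# Offsets that increase within each child: accepted either way.
interleaved = ([0, 1, 0, 1, 0], [0, 0, 1, 1, 2])

# Descending offsets, as in the C++ tests mentioned above.
descending = ([0, 0, 0], [2, 1, 0])

# Several nulls sharing one child slot, as DenseUnionBuilder produces.
repeated = ([0, 0, 0], [0, 0, 0])
```

Under this sketch, the descending case fails in both modes, while the repeated-offset case passes only when `strict=False`, which is the behaviour the second question is about.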