It sounds like the "span" type could be implemented as a composite of multiple Arrow arrays / schemas:

array 1 (data)
any schema

array 2 (view)
struct < start: int64, stop: int64 >

Unless I'm missing something, this feels like an application-level
concern rather than something that needs to be addressed in the
columnar format / metadata.

On Tue, May 1, 2018 at 9:43 AM, Antoine Pitrou <anto...@python.org> wrote:

>
> IIUC, the point is to have different logical views over the same data.
> So you could have e.g. a "sorted" view. You could also have a view
> spanning a tiny fraction of the original data (you can probably also
> encode that with a null bitmap, but if most values are nulls that is
> less efficient).
>
> Regards
>
> Antoine.
>
>
> On 01/05/2018 at 15:24, Brian Hulette wrote:
>> Yeah I see that difference. I guess my question was really - is there a
>> reason not to re-arrange the actual list data so that an offset array
>> will work?
>>
>> Perhaps they actually want to be able to specify lists with overlap? Or
>> maybe there is meaning to the original order of the list data? I suppose
>> that latter option seems more likely.
>>
>> Brian
>>
>>
>> On 04/30/2018 05:42 PM, Antoine Pitrou wrote:
>>> On 30/04/2018 at 23:39, Brian Hulette wrote:
>>>> Yes my first reaction to both of these requests is
>>>> - would dictionary-encoding work?
>>>> - would a List<T> work?
>>>>
>>>> I think for the former the analogy is more clear, for the latter,
>>>> technically a List encodes start and stop indices with an offset array
>>>> rather than separate arrays for start and stop indices. Is there a
>>>> reason an offset array wouldn't work for the OAMap use-case though?
>>> With an offsets array, spans (lists) are contiguous: span N + 1 starts
>>> off where span N stops. With separate start/stops array, they needn't
>>> be: the logical array can "walk" the physical array in any order.
>>>
>>> Regards
>>>
>>> Antoine.
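
To make the composite layout at the top of this message concrete, here is a rough pure-Python sketch. It assumes plain lists stand in for the two Arrow arrays and that `stop` is exclusive. The helper names `resolve_spans` and `offsets_to_spans` are hypothetical and not part of any Arrow API. It also shows the difference discussed in the quoted thread: an offsets array can only describe contiguous spans, while separate start/stop values can overlap or walk the data in any order.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for struct < start: int64, stop: int64 >,
# with `stop` assumed to be exclusive.
Span = Tuple[int, int]


def resolve_spans(data: Sequence, view: Sequence[Span]) -> List[list]:
    """Materialize each (start, stop) span of `view` as a slice of `data`."""
    return [list(data[start:stop]) for start, stop in view]


def offsets_to_spans(offsets: Sequence[int]) -> List[Span]:
    """Express a List-style offsets array as contiguous (start, stop) spans."""
    return list(zip(offsets[:-1], offsets[1:]))


# array 1 (data): any schema; integers here for simplicity.
data = [10, 20, 30, 40, 50]

# With offsets, span N + 1 starts where span N stops.
contiguous = offsets_to_spans([0, 2, 5])

# array 2 (view): spans may overlap and appear in any order.
view = [(3, 5), (0, 2), (1, 4)]

print(resolve_spans(data, contiguous))  # [[10, 20], [30, 40, 50]]
print(resolve_spans(data, view))        # [[40, 50], [10, 20], [20, 30, 40]]
```

Since every offsets array converts to a list of spans but not the other way around, an application could keep the view as an ordinary struct array next to the data array and resolve it at read time.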