Is it possible to read Parquet columns into an Arrow schema that has variable-width types with 64-bit offsets (LargeBinary, LargeList, etc.)?
For my current use case, I prefer the large types because the data overflow 32-bit offsets, and it is easier to waste memory on eight bytes per offset than it is to work with chunked arrays. (I need to access the Arrow buffers from Java, and the Java library does not yet provide a convenient abstraction for chunked arrays.)

I would like an option to use the large types when reading Parquet files with the Dataset API. More generally, my request could be satisfied by letting users specify type coercion/promotion when mapping Parquet types to Arrow types.
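In the meantime, the closest workaround I know of is to read with the default 32-bit types and cast afterwards. Below is a minimal sketch in pyarrow, assuming a hypothetical file data.parquet with a binary column too large for 32-bit offsets. Table.cast, Table.combine_chunks, and the pa.large_* types are existing pyarrow APIs; the to_large helper is only illustrative.

    import pyarrow as pa
    import pyarrow.parquet as pq

    # With the default types, a column whose values total more than
    # 2 GiB cannot fit in one 32-bit-offset array, so it is read back
    # as a ChunkedArray with multiple chunks.
    table = pq.read_table("data.parquet")  # hypothetical file

    def to_large(t):
        # Map each 32-bit-offset type to its 64-bit counterpart
        # (top level only, for brevity).
        if pa.types.is_binary(t):
            return pa.large_binary()
        if pa.types.is_string(t):
            return pa.large_string()
        if pa.types.is_list(t):
            return pa.large_list(t.value_type)
        return t

    large_schema = pa.schema(
        pa.field(f.name, to_large(f.type)) for f in table.schema
    )

    # With 64-bit offsets the whole column fits in a single chunk, so
    # the buffers can be handed to Java without a chunked-array
    # abstraction.
    table = table.cast(large_schema).combine_chunks()

The cast and the subsequent concatenation copy the data at least once, which is exactly the overhead that a reader-level option would avoid.

Are other users interested in this feature? Is anyone opposed?

Steve Kim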