etseidl commented on PR #564: URL: https://github.com/apache/parquet-format/pull/564#issuecomment-4434359670
Submitted https://github.com/apache/parquet-testing/pull/108

Verified the file is read by the arrow-cpp [PoC](https://github.com/apache/arrow/pull/49707)

```shell
% python
>>> import pyarrow
>>> pyarrow.__version__
'24.0.0.dev298+g24f0f4c9a'
>>> from pyarrow import parquet as pq
>>> df = pq.read_table('src/parquet-testing/data/no_path_in_schema.zstd.parquet')
>>> df
pyarrow.Table
a: map<string, map<int32, bool ('value')> ('a')>
  child 0, a: struct<key: string not null, value: map<int32, bool ('value')>> not null
      child 0, key: string not null
      child 1, value: map<int32, bool ('value')>
          child 0, value: struct<key: int32 not null, value: bool not null> not null
              child 0, key: int32 not null
              child 1, value: bool not null
b: int32 not null
c: double not null
----
a: [[keys:["a"]values:[keys:[1,2]values:[true,false]],keys:["b"]values:[keys:[1]values:[true]],keys:["c"]values:[null],keys:["d"]values:[keys:[]values:[]],keys:["e"]values:[keys:[1]values:[true]],keys:["f"]values:[keys:[3,4,5]values:[true,false,true]]]]
b: [[1,1,1,1,1,1]]
c: [[1,1,1,1,1,1]]
```

Working on java...parquet-cli doesn't like non-string map keys:

```shell
% pqcli cat ~/src/parquet-testing/data/no_path_in_schema.zstd.parquet
Argument error: Map key type must be binary (UTF8): required int32 key
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
