Hi,

We haven't implemented the record batch reader feature for Parquet in the C API yet. It's easy to implement, so we can provide the feature in the next release. Could you open a JIRA issue for this feature? You can find the "Create" button at https://issues.apache.org/jira/projects/ARROW/issues/
If you can use the C++ API, you can use the feature with the current release.

Thanks,
--
kou

In <[email protected]>
  "Joining Parquet & PostgreSQL" on Thu, 15 Nov 2018 12:56:34 -0500,
  Korry Douglas <[email protected]> wrote:

> Hi all, I’m exploring the idea of adding a foreign data wrapper (FDW) that
> will let PostgreSQL read Parquet-format files.
>
> I have just a few questions for now:
>
> 1) I have created a few sample Parquet data files using AWS Glue. Glue split
> my CSV input into many (48) smaller xxx.snappy.parquet files, each about
> 30MB. When I open one of these files using
> gparquet_arrow_file_reader_new_path(), I can then call
> gparquet_arrow_file_reader_read_table() (and then access the content of the
> table). However, …_read_table() seems to read the entire file into memory
> all at once (I say that based on the amount of time it takes for
> gparquet_arrow_file_reader_read_table() to return). That’s not the behavior
> I need.
>
> I have tried to use garrow_memory_mapped_input_stream_new() to open the
> file, followed by garrow_record_batch_stream_reader_new(). The call to
> garrow_record_batch_stream_reader_new() fails with the message:
>
>   [record-batch-stream-reader][open]: Invalid: Expected to read 827474256
>   metadata bytes, but only read 30284162
>
> Does this error occur because Glue split the input data? Or because Glue
> compressed the data using snappy? Do I need to uncompress before I can
> read/open the file? Do I need to merge the files before I can open/read the
> data?
>
> 2) If I use garrow_record_batch_stream_reader_new() instead of
> gparquet_arrow_file_reader_new_path(), will I avoid the overhead of reading
> the entire file into memory before I fetch the first row?
>
> Thanks in advance for help and any advice.
>
> ― Korry
