On Wed, Dec 6, 2023 at 8:32 PM Daniel Verite <dan...@manitou-mail.org> wrote:
>
> Sutou Kouhei wrote:
>
> > * 2022-04: Apache Arrow [2]
> > * 2018-02: Apache Avro, Apache Parquet and Apache ORC [3]
> >
> > (FYI: I want to add support for Apache Arrow.)
> >
> > There were discussions how to add support for more formats. [3][4]
> > In these discussions, we got a consensus about making COPY
> > format extendable.
>
>
> These formats seem all column-oriented whereas COPY is row-oriented
> at the protocol level [1].
> With regard to the protocol, how would it work to support these formats?
>

They have a kind of *RowGroup* concept: a batch of rows goes into a
RowBatch, and within it the data for each column is stored together.
I think that should fit the COPY semantics. There are also FDWs out
there for these modern formats, such as parquet_fdw [1]. If COPY could
handle these formats directly, it would be easier to interact with them
(without creating a server, user mapping, and foreign table).

[1]: https://github.com/adjust/parquet_fdw

>
> [1] https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-COPY
>
>
> Best regards,
> --
> Daniel Vérité
> https://postgresql.verite.pro/
> Twitter: @DanielVerite
>
>

--
Regards
Junwang Zhao
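
P.S. To illustrate the RowGroup point above, here is a minimal Python
sketch. It assumes a toy row group represented as a dict mapping column
name to a list of values; this is not the Arrow or Parquet API, and
rows_from_row_group / rows_from_file are hypothetical helper names. It
only shows how a column-oriented batch can be walked row by row, which
is the shape a row-oriented COPY stream expects.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Hypothetical toy representation: one row group = column name -> list of values.
RowGroup = Dict[str, List[Any]]


def rows_from_row_group(group: RowGroup) -> Iterator[Tuple[Any, ...]]:
    """Yield the rows of a column-oriented row group one at a time."""
    columns = list(group.values())
    # All columns in a row group must hold the same number of rows.
    if len({len(col) for col in columns}) > 1:
        raise ValueError("columns in a row group must have equal length")
    # zip() transposes the columns into row tuples.
    yield from zip(*columns)


def rows_from_file(groups: List[RowGroup]) -> Iterator[Tuple[Any, ...]]:
    """Stream rows from a sequence of row groups, in order."""
    for group in groups:
        yield from rows_from_row_group(group)


# Two row groups holding three rows in total.
groups = [
    {"id": [1, 2], "name": ["a", "b"]},
    {"id": [3], "name": ["c"]},
]
rows = list(rows_from_file(groups))
```

Each row group is transposed independently, so only one group needs to
be held in memory at a time while rows are emitted in order.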