Taher,

I typically use the helpers in the `Parquet` class to create Parquet files.
That's probably the easiest way to create individual files.

`FileWriterFactory` and `FileAppenderFactory` are ways to provide object
model support to common write patterns. Flink and Spark each use a different
in-memory row model, so each provides its own factory so that the common
writers can consume rows in that model. What you probably want is to create
a table using Iceberg generics or Avro objects, and `Parquet` is the easy
path for that.
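For example, a minimal sketch of writing a data file with the `Parquet`
helpers and Iceberg generics might look like the following. The schema,
field, and output path are made up for illustration:

```java
import org.apache.iceberg.Files;
import org.apache.iceberg.Schema;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.data.parquet.GenericParquetWriter;
import org.apache.iceberg.io.FileAppender;
import org.apache.iceberg.parquet.Parquet;
import org.apache.iceberg.types.Types;

public class ParquetWriteSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical one-column schema for illustration
    Schema schema = new Schema(
        Types.NestedField.required(1, "id", Types.LongType.get()));

    Record record = GenericRecord.create(schema);
    record.setField("id", 1L);

    // Parquet.write() returns a builder; createWriterFunc wires in the
    // generic Record object model
    try (FileAppender<Record> appender =
        Parquet.write(Files.localOutput("/tmp/data.parquet"))
            .schema(schema)
            .createWriterFunc(GenericParquetWriter::buildWriter)
            .overwrite()
            .build()) {
      appender.add(record);
    }
  }
}
```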

Ryan

On Sun, Jun 26, 2022 at 10:39 PM Taher Koitawala <taher...@gmail.com> wrote:

> Hi All,
>          I am trying to create a Java service with an Iceberg writer that
> writes data over to FS after reading from various sources. I came across
> these two interfaces and cannot tell when to implement which one.
>
> Both the FileWriterFactory and FileAppenderFactory have an Equality Delete
> Writer method and PositionDeleteWriter. Apart from that,
> FileAppenderFactory has the newDataWriter method also found in
> FileWriterFactory.
>
> Please can you give more clarity on which one to implement for Parquet
> writes?
> I would also appreciate guidance on appending to existing files that will
> be pushed over to S3 later. I suppose I will not be able to append to an
> S3 file.
>
> Regards,
> Taher Koitawala
>


-- 
Ryan Blue
Tabular
