I think the FileIO API is deliberately minimal for a good reason: the goal is
to allow a wide variety of implementations, and that is exactly what we are
seeing here.

All of the implementations have their pros and cons, and the actual use case
determines which one is best for a given user.
For example, my first test showed that HadoopFileIO is actually faster in
some cases than S3FileIO for a Flink job (I had frequent checkpoints and a
high number of partitions, so the progressive multipart upload didn't help
at all).
Another writer for the same table might have benefited from using S3FileIO.
The missing batch delete feature only matters when someone is deleting the
table, but then it is a must-have. The same goes for delegation tokens: some
use cases need this feature, for others it is unimportant.
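To illustrate why the batch delete difference shows up exactly at table drop
time, here is a toy sketch in plain Java. The interfaces and class names are
hypothetical stand-ins, not the real Iceberg FileIO or bulk-delete API; the
point is only the request-count arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

// Toy interface only; NOT the real Iceberg FileIO API.
interface ToyFileIO {
    void deleteFile(String path);                     // one request per file

    default void deleteFiles(List<String> paths) {    // naive fallback: N requests
        for (String p : paths) {
            deleteFile(p);
        }
    }
}

class PerFileDeleteIO implements ToyFileIO {
    int requests = 0;

    @Override
    public void deleteFile(String path) {
        requests++;                                   // e.g. one HTTP DELETE call
    }
}

class BulkDeleteIO extends PerFileDeleteIO {
    static final int BATCH_SIZE = 1000;               // e.g. a multi-object delete limit

    @Override
    public void deleteFiles(List<String> paths) {
        // One request per batch instead of one per file.
        requests += (paths.size() + BATCH_SIZE - 1) / BATCH_SIZE;
    }
}

public class BatchDeleteDemo {
    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            files.add("s3://bucket/warehouse/db/table/data/file-" + i + ".parquet");
        }

        PerFileDeleteIO perFile = new PerFileDeleteIO();
        perFile.deleteFiles(files);

        BulkDeleteIO bulk = new BulkDeleteIO();
        bulk.deleteFiles(files);

        // Dropping a table deletes every data and metadata file, so the
        // request count dominates the cost of the drop: 2500 vs 3 here.
        System.out.println("per-file=" + perFile.requests + " bulk=" + bulk.requests);
    }
}
```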

Side question: do we know of any incompatibilities between different FileIO
implementations which could be used to access the same storage medium? (I
sincerely hope not)

These examples highlight that the best FileIO is not defined by the Table
or the Catalog; it is defined by the various use cases. As long as they
write the same bytes to the same place, I think we should not disallow
using any of them just because another user of the table benefits more
from a different FileIO implementation.
Maybe we can go further and collect a matrix of recommendations to help
users choose the best FileIO implementation for their storage medium and
usage patterns.
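If we end up with such a matrix, acting on it is cheap: as far as I know the
implementation is selected per catalog via the `io-impl` property, so two
writers of the same table can each pick their own. A minimal sketch with
plain property maps, no catalog wiring (the pairing of jobs to
implementations below is just the example from above, not a recommendation):

```java
import java.util.HashMap;
import java.util.Map;

public class IoImplChoice {
    // "io-impl" is the Iceberg catalog property selecting the FileIO class.
    static final String FILE_IO_IMPL = "io-impl";

    static Map<String, String> catalogProps(String fileIoClass) {
        Map<String, String> props = new HashMap<>();
        props.put(FILE_IO_IMPL, fileIoClass);
        return props;
    }

    public static void main(String[] args) {
        // Checkpoint-heavy Flink writer from the example above:
        Map<String, String> flinkJob =
            catalogProps("org.apache.iceberg.hadoop.HadoopFileIO");
        // Batch writer on the same table, where multipart upload pays off:
        Map<String, String> batchJob =
            catalogProps("org.apache.iceberg.aws.s3.S3FileIO");

        // Same table, same bytes on storage; only the transport differs.
        System.out.println(flinkJob.get(FILE_IO_IMPL));
        System.out.println(batchJob.get(FILE_IO_IMPL));
    }
}
```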

That still leaves the question of whether FlinkFileIO is something we want
to have in the Iceberg codebase. I think it would be a nice addition to the
Flink Iceberg connector code, which resides in the Iceberg code base.
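Regarding JB's suggestion below of a "flink:" scheme in ResolvingFileIO: the
scheme-to-implementation dispatch itself is simple to sketch. This is a toy
resolver, not the real ResolvingFileIO code, and the FlinkFileIO package
name is an assumption, not something already merged:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy sketch of scheme-based resolution; not the real ResolvingFileIO code.
public class SchemeResolver {
    private final Map<String, String> schemeToImpl = new HashMap<>();

    void register(String scheme, String fileIoClass) {
        schemeToImpl.put(scheme, fileIoClass);
    }

    String resolve(String location) {
        String scheme = URI.create(location).getScheme();
        String impl = schemeToImpl.get(scheme);
        if (impl == null) {
            throw new IllegalArgumentException("No FileIO registered for scheme: " + scheme);
        }
        return impl;
    }

    public static void main(String[] args) {
        SchemeResolver resolver = new SchemeResolver();
        resolver.register("s3", "org.apache.iceberg.aws.s3.S3FileIO");
        resolver.register("hdfs", "org.apache.iceberg.hadoop.HadoopFileIO");
        // Hypothetical class name for the proposed FlinkFileIO:
        resolver.register("flink", "org.apache.iceberg.flink.FlinkFileIO");

        System.out.println(resolver.resolve("flink://bucket/warehouse/db/table"));
    }
}
```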


On Thu, Apr 25, 2024, 21:08 Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Good point about the schemas. That's true, it would be more complicated.
>
> Agree to have more discussion about that. Personally, I think it's not
> a bad idea to have the catalog as the "source" for FileIO, and let the
> engine/client deal with that.
> I think it's an engine/client responsibility (I remember a kind of
> similar discussion in Apache Beam with the runners).
>
> Agree to discuss more :)
>
> Regards
> JB
>
> On Thu, Apr 25, 2024 at 12:41 PM Daniel Weeks <daniel.c.we...@gmail.com>
> wrote:
> >
> > JB,
> >
> > The ResolvingFileIO is somewhat a different issue and more complicated
> with a concept like FlinkFileIO because the schemes would overlap.
> >
> > The main issue here is around how Flink handles file system operations
> outside of the Iceberg space (e.g. checkpointing) and the confusion it
> causes for people setting up Flink.
> >
> > I'm concerned that the FlinkFileIO approach will ultimately just push
> that problem to the client side, since much of the FileIO configuration for
> a table will come from the catalog (like Ryan pointed out).
> >
> > We need to discuss this a little more and see if there's a way to
> preserve catalog/table managed configuration along with simplifying the
> config for users.
> >
> > -Dan
> >
> > On Thu, Apr 25, 2024 at 9:48 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
> >>
> >> Hi Peter,
> >>
> >> On a similar topic, I created a PR to support custom schemes in
> >> ResolvingFileIO (https://github.com/apache/iceberg/pull/9884). Maybe
> >> the FlinkIO can be a new scheme/extension in the ResolvingFileIO.
> >>
> >> If I agree that it would be interesting to have support for
> >> FlinkFileIO, I'm not sure it's a good idea to have it directly in the
> >> Iceberg. I think it would be great to leverage the extension mechanism
> >> we have in Iceberg (FileIO/ResolvingFileIO).
> >> Iceberg Core should not include engine specific dependency imho.
> >> However, having a "flink:" scheme in ResolvingFileIO where we can
> >> leverage FlinkFileIO could be interesting.
> >>
> >> Just thinking out loud :)
> >>
> >> Regards
> >> JB
> >>
> >> On Fri, Apr 19, 2024 at 12:08 PM Péter Váry <
> peter.vary.apa...@gmail.com> wrote:
> >> >
> >> > Hi Iceberg Team,
> >> >
> >> > Flink has its own FileSystem implementation. See:
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/overview/
> .
> >> > This FileSystem already has several implementations:
> >> >
> >> > Hadoop
> >> > Azure
> >> > S3
> >> > Google Cloud Storage
> >> > ...
> >> >
> >> > As a general rule in Flink, one should use this FileSystem to consume
> and persistently store data.
> >> > If these FileSystems are configured, then Flink makes sure that the
> configurations are consistent and available for the JM/TM.
> >> > Also as an added benefit, delegation tokens are handled and
> distributed for these FileSystems automatically. See:
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/security/security-delegation-token/
> >> >
> >> > In house, some of our new users are struggling with parametrizing
> HadoopFileIO and S3FileIO for Iceberg, trying to wrap their heads around
> that they have to provide different configurations for the checkpointing
> and for the Iceberg table storage (even if they are stored in the same
> bucket, or on the same HDFS cluster)
> >> >
> >> > I have created a PR, which provides a FileIO implementation which
> uses FlinkFileSystem. Very imaginatively I have named it FlinkFileIO. See:
> https://github.com/apache/iceberg/pull/10151
> >> >
> >> > This would allow the users to configure the FileSystem only once, and
> use this FileSystem to access Iceberg tables. Also, if for whatever reason
> the global nature of flink file system config is limiting, the users still
> could revert back using the other FileIO implementations.
> >> >
> >> > What do you think? Would this be a useful addition to the
> Iceberg-Flink integration?
> >> >
> >> > Thanks,
> >> > Peter
>
