FileIO purposely does not support a rename operation because we wanted to keep a minimal API that handled object stores correctly rather than using a FileSystem concept. While we may need some extensions outside of what the core provides for reading and writing tables, I think we still need to be careful here.
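To make the rename point concrete, here is a minimal sketch of a commit that needs only "create if absent" and no rename. Everything in it is an assumption for illustration: `MinimalFileIO`, `try_commit`, and `demo` are hypothetical names, not Iceberg's FileIO API, and a local exclusive-create stands in for whatever conditional-write primitive a given object store may offer.

```python
import os
import tempfile


class MinimalFileIO:
    """Hypothetical rename-free file API (NOT Iceberg's FileIO)."""

    def create(self, path: str, data: bytes) -> None:
        # Mode "xb" raises FileExistsError if the path already exists.
        # Assumption: this plays the role of a "put if absent" on a store.
        with open(path, "xb") as f:
            f.write(data)

    def read(self, path: str) -> bytes:
        with open(path, "rb") as f:
            return f.read()


def try_commit(io: MinimalFileIO, table_dir: str, version: int, metadata: bytes) -> bool:
    """Try to claim metadata version `version`; return False if another writer won."""
    target = os.path.join(table_dir, f"v{version}.metadata.json")
    try:
        io.create(target, metadata)
        return True
    except FileExistsError:
        return False


def demo():
    io = MinimalFileIO()
    with tempfile.TemporaryDirectory() as table_dir:
        # Two writers race to publish version 2; only the first create succeeds.
        first = try_commit(io, table_dir, 2, b'{"writer": "a"}')
        second = try_commit(io, table_dir, 2, b'{"writer": "b"}')
        winner = io.read(os.path.join(table_dir, "v2.metadata.json"))
        return first, second, winner
```

The losing writer would have to refresh and retry against the next version. The sketch does not address readers seeing a partially written file, nor stores that lack a conditional write, which is where a lock would come in.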
We have also been discouraging the use of HadoopTableOperations for several years now. Maybe updating it to use locks and moving it to a separate module is a good compromise, but my strong preference is for removing it.

On Thu, Jul 11, 2024 at 11:08 PM lisoda <lis...@yeah.net> wrote:

> Hi, Sir.
> I've finished extending the usual distributed locks. I don't think we'll
> need to extend distributed locks again for a long time.
>
> PR: https://github.com/apache/iceberg/pull/10688
>
> As a next step, I'm going to try to extend FileIO to support operations
> like rename. It would be great if you could give me your opinion on this.
> Also, please let me know if there is anything I can do to support the
> creation of views.
>
> Tks.
>
> Regards
>
> lisoda
>
> At 2024-07-05 16:09:54, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
> >Hi,
> >
> >Actually the JDBC catalog relies on the RDBMS backend for the lock.
> >That's one of the reasons why we are using a single RDBMS table for
> >both tables and views. So, I don't think we would need a lock
> >mechanism for JDBC, the RDBMS one is OK for now.
> >About FileIO, we can always extend it, but as it's used in different
> >Iceberg layers (like ResolvedFileIO for instance), we have to be
> >careful adding new operations here, especially if it's specific for
> >HadoopCatalog table/view operations. I will take a look.
> >
> >Thanks !
> >Regards
> >JB
> >
> >On Thu, Jul 4, 2024 at 4:49 PM lisoda <lis...@yeah.net> wrote:
> >>
> >> Yea. If I'm not mistaken, the JDBC catalog has the same problem with
> >> concurrent commits. It doesn't have any locks to control concurrency. In
> >> other words, LockManager can be used for JdbcCatalog as well.
> >>
> >> Also, on the part about unbundling Hadoop, I have a suggestion. Can we
> >> extend the FileIO interface so that all operations are implemented using
> >> FileIO?
> >>
> >> At 2024-07-04 23:38:30, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
> >> >Yeah, I agree with the distributed locking service. Maybe we can
> >> >imagine a pluggable (by configuration) lock service depending on the
> >> >user infra.
> >> >
> >> >For the view support, I can take a look (as I worked on the JDBC
> >> >catalog view support).
> >> >
> >> >Anyway, I'm gonna take a look at your PR. Thanks again for your
> >> >contribution !
> >> >
> >> >Regards
> >> >JB
> >> >
> >> >On Thu, Jul 4, 2024 at 4:05 PM lisoda <lis...@yeah.net> wrote:
> >> >>
> >> >> Hello.
> >> >> Yea. Improving the commit mechanism is just the beginning. We also need
> >> >> to implement a distributed locking service for users who use object
> >> >> stores. I think the next step is to support Iceberg views and such.
> >> >> But I've never used Iceberg's views before. It will take me some time to
> >> >> familiarise myself with the functionality of the view section, if I'm
> >> >> to be of any assistance. But if you need my help, I'll do whatever I
> >> >> can.
> >> >> Anyway, I'm glad to hear from you.
> >> >>
> >> >> At 2024-07-04 22:04:17, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
> >> >> >Hi,
> >> >> >
> >> >> >Thanks for the heads up and working on this !
> >> >> >
> >> >> >My understanding of the HadoopCatalog is that we would need more than
> >> >> >an improved commit mechanism to be production ready (I'm thinking of
> >> >> >scalability, or view support). What are your thoughts?
> >> >> >By the way, I'm happy to take a look at adding view support if it
> >> >> >helps.
> >> >> >
> >> >> >Regards
> >> >> >JB
> >> >> >
> >> >> >On Thu, Jul 4, 2024 at 8:27 AM lisoda <lis...@yeah.net> wrote:
> >> >> >>
> >> >> >> Hi Team.
> >> >> >> I've refactored the logic of the commit method in
> >> >> >> HadoopTableOptions. With this refactoring, I believe that
> >> >> >> HadoopCatalog is ready to be used in a production environment. Now
> >> >> >> HadoopTableOptions can implement atomic commits while being
> >> >> >> compatible with the differences in behaviour between block and
> >> >> >> object stores. Concurrency control is also supported. If anyone can
> >> >> >> assist me in reviewing this PR, that would be great.
> >> >> >> Also, any FileSystemCatalog users are welcome to comment on this
> >> >> >> PR. Any advice would be invaluable to me.
> >> >> >> Thank you all.
> >> >> >>
> >> >> >> PR: https://github.com/apache/iceberg/pull/10623
> >> >> >> SLACK: https://apache-iceberg.slack.com/archives/C03LG1D563F/p1719993403208859

--
Ryan Blue
Databricks