Thank you for your reply. I re-examined the JDBC catalog implementation, and it cleverly uses UpdatedRecords for the checksum. So there is nothing wrong with the implementation. That was my mistake; thanks for pointing it out.
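For anyone following along, here is a minimal sketch of the kind of check I mean: a conditional UPDATE whose updated-row count tells the writer whether its commit won. It is only an illustration under assumptions. SQLite stands in for the RDBMS, and the `catalog_tables` table, its columns, and the `commit` helper are hypothetical names, not the actual JDBC catalog schema or code.

```python
import sqlite3


def commit(conn, table_name, expected_location, new_location):
    """Swap the metadata pointer only if it still equals expected_location.

    Hypothetical helper: the table and column names are illustrative only.
    """
    cur = conn.execute(
        "UPDATE catalog_tables SET metadata_location = ? "
        "WHERE table_name = ? AND metadata_location = ?",
        (new_location, table_name, expected_location),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE changed:
    # 1 means this writer won, 0 means another writer moved the pointer first.
    return cur.rowcount == 1


# In-memory database standing in for the catalog's RDBMS backend.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog_tables ("
    "table_name TEXT PRIMARY KEY, metadata_location TEXT)"
)
conn.execute("INSERT INTO catalog_tables VALUES ('db.t', 'v1.metadata.json')")
conn.commit()

# Two writers both read v1 as the base; only the first commit succeeds.
first = commit(conn, "db.t", "v1.metadata.json", "v2.metadata.json")
second = commit(conn, "db.t", "v1.metadata.json", "v3.metadata.json")

current = conn.execute(
    "SELECT metadata_location FROM catalog_tables WHERE table_name = 'db.t'"
).fetchone()[0]
```

The second writer sees zero updated rows and has to refresh and retry, so the database's own row-level atomicity does the work and no separate lock is needed.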
At 2024-07-05 16:09:54, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
>Hi,
>
>Actually, the JDBC catalog relies on the RDBMS backend for the lock.
>That's one of the reasons why we are using a single RDBMS table for
>both tables and views. So, I don't think we would need a lock
>mechanism for JDBC; the RDBMS one is OK for now.
>About FileIO, we can always extend it, but as it's used in different
>Iceberg layers (like ResolvingFileIO for instance), we have to be
>careful adding new operations here, especially if they are specific to
>HadoopCatalog table/view operations. I will take a look.
>
>Thanks!
>Regards
>JB
>
>On Thu, Jul 4, 2024 at 4:49 PM lisoda <lis...@yeah.net> wrote:
>>
>> Yeah. If I'm not mistaken, the JDBC catalog has the same problem with
>> concurrent commits. It doesn't have any locks to control concurrency. In
>> other words, LockManager can be used for JdbcCatalog as well.
>>
>> Also, for the part about unbundling Hadoop, I have a suggestion. Can we
>> extend the FileIO interface so that all operations are implemented using
>> FileIO?
>>
>> At 2024-07-04 23:38:30, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
>> >Yeah, I agree with the distributed locking service. Maybe we can
>> >imagine a pluggable (by configuration) lock service depending on the
>> >user infra.
>> >
>> >For the view support, I can take a look (as I worked on the JDBC
>> >catalog view support).
>> >
>> >Anyway, I'm going to take a look at your PR. Thanks again for your
>> >contribution!
>> >
>> >Regards
>> >JB
>> >
>> >On Thu, Jul 4, 2024 at 4:05 PM lisoda <lis...@yeah.net> wrote:
>> >>
>> >> Hello.
>> >> Yeah. Improving the commit mechanism is just the beginning. We also need
>> >> to implement a distributed locking service for users who use object
>> >> stores. I think the next step is to support Iceberg views and such.
>> >> But I've never used Iceberg's views before. It will take me some time to
>> >> familiarise myself with the functionality of the view section, if I'm to
>> >> be of any assistance. But if you need my help, I'll do whatever I can.
>> >> Anyway, I'm glad to hear from you.
>> >>
>> >> At 2024-07-04 22:04:17, "Jean-Baptiste Onofré" <j...@nanthrax.net> wrote:
>> >> >Hi,
>> >> >
>> >> >Thanks for the heads up and for working on this!
>> >> >
>> >> >My understanding of the HadoopCatalog is that we would need more than
>> >> >an improved commit mechanism to be production ready (I'm thinking of
>> >> >scalability, or view support). What are your thoughts?
>> >> >By the way, I'm happy to take a look at adding view support if it helps.
>> >> >
>> >> >Regards
>> >> >JB
>> >> >
>> >> >On Thu, Jul 4, 2024 at 8:27 AM lisoda <lis...@yeah.net> wrote:
>> >> >>
>> >> >> Hi Team.
>> >> >> I've refactored the logic of the commit method in
>> >> >> HadoopTableOperations. With this refactoring, I believe that
>> >> >> HadoopCatalog is ready to be used in a production environment. Now
>> >> >> HadoopTableOperations can implement atomic commits while being
>> >> >> compatible with the differences in behaviour between block and object
>> >> >> stores. Concurrency control is also supported. If anyone can assist me
>> >> >> in reviewing this PR, that would be great.
>> >> >> Also, any FileSystemCatalog users are welcome to comment on this PR.
>> >> >> Any advice would be invaluable to me.
>> >> >> Thank you all.
>> >> >>
>> >> >> PR: https://github.com/apache/iceberg/pull/10623
>> >> >> SLACK: https://apache-iceberg.slack.com/archives/C03LG1D563F/p1719993403208859