Hi Vladislav,

Thanks for the PR and the update. In addition to the point about internal Delta Kernel API usage, one thing I would like to add is that Delta Kernel Java now prefers TableManager <https://github.com/delta-io/delta/blob/master/kernel/kernel-api/src/main/java/io/delta/kernel/TableManager.java> (the new API) over Table <https://github.com/delta-io/delta/blob/master/kernel/kernel-api/src/main/java/io/delta/kernel/Table.java> (the old one). Happy to help, but let's try to avoid depending on the most up-to-date kernel code.

Thanks
Xin

On Wed, Jun 24, 2026 at 5:43 AM Vladislav Sidorovich via dev <[email protected]> wrote:

> Hi everyone,
>
> Just sharing a quick update on the progress of PR (#15407 <https://github.com/apache/iceberg/pull/15407>).
>
> I have addressed the recent review comments and updated the PR summary to reflect the current state. To quickly recap the last update, the code is completely functional, but *not set* as the default implementation here org.apache.iceberg.delta.DeltaLakeToIcebergMigrationActionsProvider#snapshotDeltaLakeTable <https://github.com/apache/iceberg/blob/main/delta-lake/src/main/java/org/apache/iceberg/delta/DeltaLakeToIcebergMigrationActionsProvider.java#L34>.
>
> Regarding the next steps, I would like to align with @Anoop Johnson <[email protected]> on how we are handling the Delta internal API usage. Anoop, I agree with your earlier points. Let me know your thoughts on how we should proceed with this specific part of the implementation.
>
> Thanks again to everyone for the reviews and feedback so far.
>
> Best,
>
> On Sun, Apr 19, 2026 at 8:06 PM Vladislav Sidorovich <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I wanted to send a quick, gentle reminder regarding the Delta-to-Iceberg conversion PR (#15407 <https://github.com/apache/iceberg/pull/15407>) and share a couple of new updates:
>>
>> - *Current Status:* The code is fully functional. The CLI Client has reached feature parity with the previous version of the tool, adding support for Delta v3 and Deletion Vectors (DVs).
>> - *Motivation & Roadmap:* To provide more context and outline the next steps, I have put together a document detailing the motivation and development plan <https://docs.google.com/document/d/1eWH7N_9Mo2b1cDm1jFM9R7I16iYa5xZUvtU_9eLozjA/edit?tab=t.0>.
>>
>> Could you please take a look and review the current code version when you have a moment?
>>
>> Additionally, if anyone is interested in this area and would like to collaborate, please feel free to check out the document above and join the development!
>>
>> Thanks again for your time and support.
>>
>> Best,
>>
>> Vladislav
>>
>> On Sun, Mar 22, 2026 at 4:06 PM Vladislav Sidorovich <[email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>> I have an update on this PR: I've just implemented support for Deletion Vectors (DVs) conversion.
>>>
>>> The final remaining work for this PR is addressing Anoop's feedback regarding the use of internal Delta Kernel classes and replacing the previous implementation. However, before I tackle that refactoring, I would love to get a community review on the current state of the code.
>>>
>>> Because the PR now covers the majority of Delta-to-Iceberg conversion scenarios, I want to ensure the core logic is correct and we are fully aligned before I swap out the underlying classes.
>>>
>>> Planned for *follow-up* PRs:
>>>
>>> 1. Implement incremental conversion support.
>>> 2. Add more Delta tables from the delta/golden dataset to expand our test coverage.
>>>
>>> Thanks again for your time and ongoing feedback!
>>>
>>> Best,
>>>
>>> Vladislav
>>>
>>> On Sun, Mar 8, 2026 at 6:22 PM Vladislav Sidorovich <[email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> As a quick update on this PR: the current version has now reached full feature parity with the existing code in the main branch, but with the added benefit of supporting Delta reader version 3 and writer version 7.
>>>>
>>>> Since we've hit this baseline milestone, could I please get another round of reviews on the current state of the code?
>>>>
>>>> Once reviewed, my immediate next steps for the PR will be:
>>>>
>>>> 1. Refactoring to remove the internal Delta Kernel classes (addressing Anoop's feedback).
>>>> 2. Adding support for Deletion Vectors (DVs) conversions.
>>>> 3. Implementing incremental conversion.
>>>>
>>>> Thanks in advance for your time and feedback!
>>>>
>>>> Best, Vladislav
>>>>
>>>> On Mon, Mar 2, 2026 at 9:46 PM Vladislav Sidorovich <[email protected]> wrote:
>>>>
>>>>> Nice to hear, so we work on it in parallel.
>>>>>
>>>>> On Mon, Mar 2, 2026 at 8:33 PM Anoop Johnson <[email protected]> wrote:
>>>>>
>>>>>> > A major challenge with UniForm right now is its limitation regarding Deletion Vectors (DVs). Support for this is critical for many users migrating their workloads.
>>>>>>
>>>>>> The reason why Uniform v1/v2 blocked DVs was because Iceberg v1/v2 had a different positional delete representation than Delta Lake. But that changed in Iceberg v3. So the upcoming version of Uniform (IcebergCompatV3 <https://github.com/delta-io/delta/blob/master/protocol_rfcs/iceberg-compat-v3.md>) will lift this restriction.
>>>>>>
>>>>>> On Mon, Mar 2, 2026 at 10:48 AM Vladislav Sidorovich via dev <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Anoop,
>>>>>>>
>>>>>>> Thanks for the feedback and for raising these important points.
>>>>>>>
>>>>>>> Regarding the technical feedback on minimizing the use of internal Delta Kernel classes: I completely agree. Relying on internal APIs like AddFile introduces an unnecessary maintenance burden. My plan is to refactor the code (e.g., transitioning to the Row API) once we have alignment on the core features this PR will support. I will also put together a list of the gaps I've encountered in the Kernel API (such as change detection) so we can file those upstream, as you suggested.
>>>>>>>
>>>>>>> As a quick update on the PR's progress: I’ve recently added support for UPDATE and DELETE operations, along with expanded test coverage. At this stage, the PR is roughly at feature parity with the existing tool (excluding VACUUM) but supports newer Delta versions. As outlined in the PR description, the next features on the roadmap are:
>>>>>>>
>>>>>>> 1. VACUUM support
>>>>>>> 2. Deletion Vectors (DVs) support
>>>>>>> 3. Incremental conversion
>>>>>>>
>>>>>>> *Bigger question*. To address your broader question about whether we should consider sunsetting the Delta Lake module in favor of Delta UniForm: based on my experience and observations, there are still compelling reasons to maintain a native Iceberg-driven conversion tool.
>>>>>>>
>>>>>>> - *Feature Limitations:* A major challenge with UniForm right now is its limitation regarding Deletion Vectors (DVs). Support for this is critical for many users migrating their workloads.
>>>>>>> - *User Preference:* I've observed that teams looking to migrate to Iceberg strongly prefer "native" tooling maintained by the technology they are migrating *to*, rather than relying on the ecosystem they are trying to move *from*. Having an in-house Iceberg tool gives the community more control over the migration experience.
>>>>>>>
>>>>>>> Let me know your thoughts on the above, particularly regarding the long-term need for a native migration path.
>>>>>>>
>>>>>>> Best, Vladislav
>>>>>>>
>>>>>>> On Thu, Feb 26, 2026 at 8:07 PM Anoop Johnson <[email protected]> wrote:
>>>>>>>
>>>>>>>> Vladislav,
>>>>>>>>
>>>>>>>> We should minimize the usage of internal Delta kernel classes as much as possible. There are no guarantees about the stability of the internal APIs, and it will be a maintenance burden on the Iceberg project. For instance, instead of using the internal `AddFile` class use the `Row` API using ordinals defined by the scan file schema. I do recognize that there are some gaps in the kernel API (you mentioned change detection): do you have a list? It would be worth filing an issue against Delta kernel, it is possible some of these like providing file changes might be in their roadmap.
>>>>>>>>
>>>>>>>> *I have a higher level question to the community:* should we consider sunsetting the Delta lake module? Delta Lake's Uniform <https://docs.delta.io/delta-uniform/> can already generate Iceberg metadata: it is incremental, and already handles several features such as column mapping. Do we need to duplicate all of that work?
>>>>>>>> Obviously it is better to have less code and fewer components to maintain.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> Disclosure: I work on Delta also as part of my day job.
>>>>>>>>
>>>>>>>> On Wed, Feb 25, 2026 at 1:44 PM Vladislav Sidorovich <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Anoop,
>>>>>>>>>
>>>>>>>>> Thanks a lot for the initial review.
>>>>>>>>>
>>>>>>>>> Data correctness guards:
>>>>>>>>> 1. I will add support for the Remove action soon; work on the PR is in progress.
>>>>>>>>> 2. Sure, let's reject the `column mapping` feature for now, for safety. Later I will try to add support for this feature as well.
>>>>>>>>>
>>>>>>>>> Yes, the PR depends on the `*internal*` API of the delta-kernel. I do not see a simple way to replace it with the public API. As an option, I can replace these classes with our own `in-house` classes that would rely on the Delta protocol spec. That would be safe in terms of runtime, but it would be additional code that we would need to support.
>>>>>>>>>
>>>>>>>>> What do you think about me continuing to work with the `*internal*` Delta API for now and refactoring this logic before merging the PR, once we agree on a solution?
>>>>>>>>>
>>>>>>>>> On Tue, Feb 24, 2026 at 5:29 AM Anoop Johnson <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, Vladislav -
>>>>>>>>>>
>>>>>>>>>> I've done an initial review of the PR <https://github.com/apache/iceberg/pull/15407>. Moving to the Delta kernel is the right direction, so thank you for doing this.
>>>>>>>>>> Here's a summary of my initial feedback (full details are in the PR):
>>>>>>>>>>
>>>>>>>>>> Data correctness guards:
>>>>>>>>>> 1. If we encounter `Remove` actions, the conversion should fail fast rather than silently skip them. Otherwise tables with DML will produce duplicate rows in the Iceberg table.
>>>>>>>>>> 2. Tables with column mapping enabled will produce silent data corruption because the Parquet files will have physical column names that don't match the logical schema. We should validate this and reject such tables until column mapping support is added (which can be done as a separate PR).
>>>>>>>>>>
>>>>>>>>>> The PR relies heavily on io.delta.kernel.internal.* classes, which can be fragile. We should consider replacing them with the public kernel APIs.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Anoop
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 23, 2026 at 12:29 AM Vladislav Sidorovich via dev <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Iceberg Community,
>>>>>>>>>>>
>>>>>>>>>>> I recently opened a PR to update the existing Delta Lake to Iceberg migration functionality to support recent Delta Lake table versions (read: 3, write: 7). I would appreciate it if anyone could take a look and share thoughts on the architecture and initial implementation.
>>>>>>>>>>>
>>>>>>>>>>> *PR Link:* https://github.com/apache/iceberg/pull/15407
>>>>>>>>>>>
>>>>>>>>>>> The main motivation for sharing this now is to get some early feedback from the community on the approach and the initial implementation.
>>>>>>>>>>>
>>>>>>>>>>> To make reviewing easier, this PR doesn't remove or overwrite the old logic. Instead, I’ve added a new interface implementation utilizing the *Delta Lake Kernel library* (replacing the deprecated Delta Lake standalone library). This side-by-side approach allows for easier comparison and shouldn't introduce any issues with current usage scenarios.
>>>>>>>>>>>
>>>>>>>>>>> *Current PR Scope:*
>>>>>>>>>>>
>>>>>>>>>>> - Maintains support for the existing migration interface.
>>>>>>>>>>> - Migrates the underlying engine to the Delta Lake Kernel library.
>>>>>>>>>>> - Contains the basic migration flow.
>>>>>>>>>>> - Successfully converts all data types, table schemas, and partition specs.
>>>>>>>>>>> - Currently supports INSERT operations only (Delta Lake Add action).
>>>>>>>>>>> - *Testing:* Includes unit tests for all supported data types (including complex arrays and structures) and integration tests for insert-only scenarios using Spark 3.5.
>>>>>>>>>>>
>>>>>>>>>>> *Future Steps (Next PRs):*
>>>>>>>>>>>
>>>>>>>>>>> Once we align on this foundation, I plan to follow up with:
>>>>>>>>>>>
>>>>>>>>>>> - Adding support for UPDATE and DELETE (Delta Lake Remove action).
>>>>>>>>>>> - Supporting all remaining Delta Lake actions.
>>>>>>>>>>> - Handling edge cases for partitions and generated columns.
>>>>>>>>>>> - Adding Schema Evolution support.
>>>>>>>>>>> - Adding Deletion Vector (DV) support.
>>>>>>>>>>> - Enabling Incremental Conversion (from/to specific Delta versions).
>>>>>>>>>>> - Adding all tables from the Delta golden tables for robust testing.
>>>>>>>>>>>   *(Note: The current integration test will be updated for newer Delta Lake versions once the old standalone solution is fully deprecated/deleted).*
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladislav Sidorovich
>>>>>>>>>>>
>>>>>>>>>>> Feedback: *go/feedback-for-vladislav <https://goto.google.com/feedback-for-vladislav>*
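
For reference, the two data correctness guards raised in the initial review (fail fast on `Remove` actions, and reject tables with column mapping enabled) could be sketched roughly as follows. This is only a minimal, self-contained illustration and not the code in the PR: the class `DeltaConversionGuards` and both method names are hypothetical, table properties are passed in as a plain map, and action types are assumed to be the top-level keys of each Delta log entry (`add`, `remove`, `metaData`, and so on). The only name taken from the Delta protocol is the `delta.columnMapping.mode` table property.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical pre-conversion checks; class and method names are illustrative only. */
final class DeltaConversionGuards {
  // Table property that the Delta protocol uses for the column mapping mode.
  static final String COLUMN_MAPPING_MODE = "delta.columnMapping.mode";

  private DeltaConversionGuards() {}

  /** Rejects tables whose column mapping mode is set to anything other than "none". */
  static void checkColumnMapping(Map<String, String> tableProperties) {
    // An unset property is treated as "none", i.e. column mapping is disabled.
    String mode = tableProperties.getOrDefault(COLUMN_MAPPING_MODE, "none");
    if (!"none".equalsIgnoreCase(mode)) {
      throw new UnsupportedOperationException(
          "Column mapping mode '" + mode + "' is not supported yet; refusing to convert");
    }
  }

  /** Fails fast if a commit contains a remove action, instead of silently skipping it. */
  static void checkNoRemoveActions(long version, List<String> actionTypes) {
    for (String type : actionTypes) {
      if ("remove".equals(type)) {
        throw new IllegalStateException(
            "Delta version " + version + " contains a remove action, which is not supported");
      }
    }
  }

  public static void main(String[] args) {
    // Both calls pass: no column mapping, and an insert-only commit.
    checkColumnMapping(Map.of());
    checkNoRemoveActions(0L, List.of("protocol", "metaData", "add"));

    try {
      checkColumnMapping(Map.of(COLUMN_MAPPING_MODE, "name"));
    } catch (UnsupportedOperationException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Throwing from both checks, rather than logging and continuing, matches the intent stated in the review: a conversion that cannot be done correctly should stop before any Iceberg metadata is written.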
