Hi Vladislav,

Thanks for the PR and the update. In addition to the point about internal
Delta Kernel API usage, one thing I would like to add is that Delta Kernel
Java prefers TableManager
<https://github.com/delta-io/delta/blob/master/kernel/kernel-api/src/main/java/io/delta/kernel/TableManager.java> (the new
API) over Table
<https://github.com/delta-io/delta/blob/master/kernel/kernel-api/src/main/java/io/delta/kernel/Table.java> (the
old one). Happy to help, but let's try to avoid depending on the most
up-to-date kernel code.

Thanks
Xin

On Wed, Jun 24, 2026 at 5:43 AM Vladislav Sidorovich via dev <
[email protected]> wrote:

> Hi everyone,
>
> Just sharing a quick update on the progress of PR (#15407
> <https://urldefense.com/v3/__https://github.com/apache/iceberg/pull/15407__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2mt2XNZc$>
> ) .
>
> I have addressed the recent review comments and updated the PR summary to
> reflect the current state. To quickly recap the last update, the code is
> completely functional, but *not set* as the default implementation here
> org.apache.iceberg.delta.DeltaLakeToIcebergMigrationActionsProvider#snapshotDeltaLakeTable
> <https://urldefense.com/v3/__https://github.com/apache/iceberg/blob/main/delta-lake/src/main/java/org/apache/iceberg/delta/DeltaLakeToIcebergMigrationActionsProvider.java*L34__;Iw!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2uTvBAr-$>
> .
>
> Regarding the next steps, I would like to align with @Anoop Johnson
> <[email protected]> on how we are handling the Delta internal API usage.
> Anoop, I agree with your earlier points. Let me know your thoughts on how
> we should proceed with this specific part of the implementation.
>
> Thanks again to everyone for the reviews and feedback so far.
>
> Best,
>
> On Sun, Apr 19, 2026 at 8:06 PM Vladislav Sidorovich <
> [email protected]> wrote:
>
>> Hi everyone,
>>
>> I wanted to send a quick, gentle reminder regarding the Delta-to-Iceberg
>> conversion PR (#15407
>> <https://urldefense.com/v3/__https://github.com/apache/iceberg/pull/15407__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2mt2XNZc$>)
>> and share a couple of new updates:
>>
>>    - *Current Status:* The code is fully functional. The CLI Client has
>>    reached feature parity with the previous version of the tool, adding
>>    support for Delta v3 and Deletion Vectors (DVs).
>>    - *Motivation & Roadmap:* To provide more context and outline the next
>>    steps, I have put together a document detailing the motivation and
>>    development plan
>>    
>> <https://urldefense.com/v3/__https://docs.google.com/document/d/1eWH7N_9Mo2b1cDm1jFM9R7I16iYa5xZUvtU_9eLozjA/edit?tab=t.0__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2uUDJ4qA$>
>>    .
>>
>> Could you please take a look and review the current code version when you
>> have a moment?
>>
>> Additionally, if anyone is interested in this area and would like to
>> collaborate, please feel free to check out the document above and join the
>> development!
>>
>> Thanks again for your time and support.
>>
>> Best,
>>
>> Vladislav
>>
>> On Sun, Mar 22, 2026 at 4:06 PM Vladislav Sidorovich <
>> [email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>> I have an update on this PR: I've just implemented support for Deletion
>>> Vectors (DVs) conversion.
>>>
>>> The final remaining work for this PR is addressing Anoop's feedback
>>> regarding the use of internal Delta Kernel classes and replacing the
>>> previous implementation. However, before I tackle that refactoring, I would
>>> love to get a community review on the current state of the code.
>>>
>>> Because the PR now covers the majority of Delta-to-Iceberg conversion
>>> scenarios, I want to ensure the core logic is correct and we are fully
>>> aligned before I swap out the underlying classes.
>>>
>>> Planned for *follow-up* PRs:
>>>
>>>    1. Implement incremental conversion support.
>>>    2. Add more Delta tables from the delta/golden dataset to expand our
>>>    test coverage.
>>>
>>> Thanks again for your time and ongoing feedback!
>>>
>>> Best,
>>>
>>> Vladislav
>>>
>>> On Sun, Mar 8, 2026 at 6:22 PM Vladislav Sidorovich <
>>> [email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> As a quick update on this PR: the current version has now reached full
>>>> feature parity with the existing code in the main branch, but with the
>>>> added benefit of supporting Delta reader version 3 and writer version 7.
>>>>
>>>> Since we've hit this baseline milestone, could I please get another
>>>> round of reviews on the current state of the code?
>>>>
>>>> Once reviewed, my immediate next steps for the PR will be:
>>>>
>>>>    1. Refactoring to remove the internal Delta Kernel classes
>>>>    (addressing Anoop's feedback).
>>>>    2. Adding support for Deletion Vectors (DVs) conversions.
>>>>    3. Implementing incremental conversion.
>>>>
>>>> Thanks in advance for your time and feedback!
>>>>
>>>> Best, Vladislav
>>>>
>>>> On Mon, Mar 2, 2026 at 9:46 PM Vladislav Sidorovich <
>>>> [email protected]> wrote:
>>>>
>>>>> Nice to hear, so we work on it in parallel.
>>>>>
>>>>> On Mon, Mar 2, 2026 at 8:33 PM Anoop Johnson <[email protected]> wrote:
>>>>>
>>>>>> > A major challenge with UniForm right now is its limitation
>>>>>> regarding Deletion Vectors (DVs). Support for this is critical for many
>>>>>> users migrating their workloads.
>>>>>>
>>>>>> Uniform v1/v2 blocked DVs because Iceberg v1/v2 had a different
>>>>>> positional delete representation than Delta Lake. But that changed in
>>>>>> Iceberg v3. So the upcoming version of Uniform (IcebergCompatV3
>>>>>> <https://urldefense.com/v3/__https://github.com/delta-io/delta/blob/master/protocol_rfcs/iceberg-compat-v3.md__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2iggbUbU$>)
>>>>>> will lift this restriction.
>>>>>>
>>>>>> On Mon, Mar 2, 2026 at 10:48 AM Vladislav Sidorovich via dev <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Anoop,
>>>>>>>
>>>>>>> Thanks for the feedback and for raising these important points.
>>>>>>>
>>>>>>> Regarding the technical feedback on minimizing the use of internal
>>>>>>> Delta Kernel classes: I completely agree. Relying on internal APIs like
>>>>>>> AddFile introduces an unnecessary maintenance burden. My plan is to
>>>>>>> refactor the code (e.g., transitioning to the Row API) once we have
>>>>>>> alignment on the core features this PR will support. I will also put
>>>>>>> together a list of the gaps I've encountered in the Kernel API (such as
>>>>>>> change detection) so we can file those upstream, as you suggested.
>>>>>>>
>>>>>>> As a quick update on the PR's progress: I’ve recently added support
>>>>>>> for UPDATE and DELETE operations, along with expanded test
>>>>>>> coverage. At this stage, the PR is roughly at feature parity with the
>>>>>>> existing tool (excluding VACUUM) but supports newer Delta versions.
>>>>>>> As outlined in the PR description, the next features on the roadmap are:
>>>>>>>
>>>>>>>    1. VACUUM support
>>>>>>>    2. Deletion Vectors (DVs) support
>>>>>>>    3. Incremental conversion
>>>>>>>
>>>>>>>
>>>>>>> *Bigger question*. To address your broader question about whether
>>>>>>> we should consider sunsetting the Delta Lake module in favor of Delta
>>>>>>> UniForm: based on my experience and observations, there are still
>>>>>>> compelling reasons to maintain a native Iceberg-driven conversion tool.
>>>>>>>
>>>>>>>    - *Feature Limitations:* A major challenge with UniForm right now
>>>>>>>    is its limitation regarding Deletion Vectors (DVs). Support for this
>>>>>>>    is critical for many users migrating their workloads.
>>>>>>>    - *User Preference:* I've observed that teams looking to migrate
>>>>>>>    to Iceberg strongly prefer "native" tooling maintained by the
>>>>>>>    technology they are migrating *to*, rather than relying on the
>>>>>>>    ecosystem they are trying to move *from*. Having an in-house Iceberg
>>>>>>>    tool gives the community more control over the migration experience.
>>>>>>>
>>>>>>> Let me know your thoughts on the above, particularly regarding the
>>>>>>> long-term need for a native migration path.
>>>>>>>
>>>>>>> Best, Vladislav
>>>>>>>
>>>>>>> On Thu, Feb 26, 2026 at 8:07 PM Anoop Johnson <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Vladislav,
>>>>>>>>
>>>>>>>> We should minimize the usage of internal Delta kernel classes as
>>>>>>>> much as possible. There are no guarantees about the stability of the
>>>>>>>> internal APIs, and it will be a maintenance burden on the Iceberg
>>>>>>>> project. For instance, instead of using the internal `AddFile` class,
>>>>>>>> use the `Row` API with ordinals defined by the scan file schema. I do
>>>>>>>> recognize that there are some gaps in the kernel API (you mentioned
>>>>>>>> change detection): do you have a list? It would be worth filing an
>>>>>>>> issue against Delta kernel; it is possible that some of these, like
>>>>>>>> providing file changes, are on their roadmap.
>>>>>>>>
>>>>>>>> *I have a higher level question to the community:* should we
>>>>>>>> consider sunsetting the Delta lake module? Delta Lake's Uniform
>>>>>>>> <https://urldefense.com/v3/__https://docs.delta.io/delta-uniform/__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2im3L29G$>
>>>>>>>> can already generate Iceberg metadata: it is incremental, and already
>>>>>>>> handles several features such as column mapping. Do we need to
>>>>>>>> duplicate all of that work? Obviously it is better to have less code
>>>>>>>> and fewer components to maintain.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Anoop
>>>>>>>>
>>>>>>>> Disclosure: I work on Delta also as part of my day job.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 25, 2026 at 1:44 PM Vladislav Sidorovich <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Anoop,
>>>>>>>>>
>>>>>>>>> Thanks a lot for the initial review.
>>>>>>>>>
>>>>>>>>> Data correctness guards:
>>>>>>>>> 1. I will add support for the Remove action soon; work on the PR is
>>>>>>>>> in progress.
>>>>>>>>> 2. Sure, let's reject tables with the `column mapping` feature for
>>>>>>>>> now, for safety. Later I will try to add support for this feature as
>>>>>>>>> well.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, the PR depends on the `*internal*` API of delta-kernel. I do
>>>>>>>>> not see a simple way to replace it with the public API. As an option,
>>>>>>>>> I can replace these classes with our `in-house` classes that would
>>>>>>>>> rely on the Delta protocol spec. That would be safe at runtime, but it
>>>>>>>>> would be additional code that we would need to support.
>>>>>>>>>
>>>>>>>>> What do you think if I continue working with the `*internal*` delta
>>>>>>>>> API for now and refactor this logic before merging the PR, once we
>>>>>>>>> agree on a solution?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 24, 2026 at 5:29 AM Anoop Johnson <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, Vladislav -
>>>>>>>>>>
>>>>>>>>>> I've done an initial review of the PR
>>>>>>>>>> <https://urldefense.com/v3/__https://github.com/apache/iceberg/pull/15407__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2mt2XNZc$>.
>>>>>>>>>> Moving to the Delta kernel is the right direction, so thank you for
>>>>>>>>>> doing this. Here's a summary of my initial feedback (full details are
>>>>>>>>>> in the PR):
>>>>>>>>>>
>>>>>>>>>> Data correctness guards:
>>>>>>>>>> 1. If we encounter `Remove` actions, we should fail fast rather
>>>>>>>>>> than silently skip them. Otherwise tables with DML will produce
>>>>>>>>>> duplicate rows in the Iceberg table.
>>>>>>>>>> 2. Tables with column mapping enabled will produce silent data
>>>>>>>>>> corruption because the Parquet files will have physical column names
>>>>>>>>>> that don't match the logical schema. We should validate this and
>>>>>>>>>> reject such tables until column mapping support is added (which can
>>>>>>>>>> be done as a separate PR).
>>>>>>>>>>
>>>>>>>>>> The PR relies heavily on io.delta.kernel.internal.* classes,
>>>>>>>>>> which can be fragile. We should consider replacing them with the
>>>>>>>>>> public kernel APIs.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Anoop
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 23, 2026 at 12:29 AM Vladislav Sidorovich via dev <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Iceberg Community,
>>>>>>>>>>>
>>>>>>>>>>> I recently opened a PR to update the existing Delta Lake to
>>>>>>>>>>> Iceberg migration functionality to support recent Delta Lake table
>>>>>>>>>>> versions (read: 3, write: 7). I would appreciate it if anyone could
>>>>>>>>>>> take a look and share thoughts on the architecture and initial
>>>>>>>>>>> implementation.
>>>>>>>>>>>
>>>>>>>>>>> *PR Link:* https://github.com/apache/iceberg/pull/15407
>>>>>>>>>>> <https://urldefense.com/v3/__https://github.com/apache/iceberg/pull/15407__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2mt2XNZc$>
>>>>>>>>>>>
>>>>>>>>>>> The main motivation for sharing this now is to get some early
>>>>>>>>>>> feedback from the community on the approach and the initial
>>>>>>>>>>> implementation.
>>>>>>>>>>>
>>>>>>>>>>> To make reviewing easier, this PR doesn't remove or overwrite
>>>>>>>>>>> the old logic. Instead, I’ve added a new interface implementation
>>>>>>>>>>> utilizing the *Delta Lake Kernel library* (replacing the deprecated
>>>>>>>>>>> Delta Lake standalone library). This side-by-side approach allows for
>>>>>>>>>>> easier comparison and shouldn't introduce any issues with current
>>>>>>>>>>> usage scenarios.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *Current PR Scope:*
>>>>>>>>>>>
>>>>>>>>>>>    - Maintains support for the existing migration interface.
>>>>>>>>>>>    - Migrates the underlying engine to the Delta Lake Kernel
>>>>>>>>>>>    library.
>>>>>>>>>>>    - Contains the basic migration flow.
>>>>>>>>>>>    - Successfully converts all data types, table schemas, and
>>>>>>>>>>>    partition specs.
>>>>>>>>>>>    - Currently supports INSERT operations only (Delta Lake Add
>>>>>>>>>>>    action).
>>>>>>>>>>>    - *Testing:* Includes unit tests for all supported data
>>>>>>>>>>>    types (including complex arrays and structures) and integration
>>>>>>>>>>>    tests for insert-only scenarios using Spark 3.5.
>>>>>>>>>>>
>>>>>>>>>>> *Future Steps (Next PRs):*
>>>>>>>>>>>
>>>>>>>>>>> Once we align on this foundation, I plan to follow up with:
>>>>>>>>>>>
>>>>>>>>>>>    - Adding support for UPDATE and DELETE (Delta Lake Remove
>>>>>>>>>>>    action).
>>>>>>>>>>>    - Supporting all remaining Delta Lake actions.
>>>>>>>>>>>    - Handling edge cases for partitions and generated columns.
>>>>>>>>>>>    - Adding Schema Evolution support.
>>>>>>>>>>>    - Adding Deletion Vector (DV) support.
>>>>>>>>>>>    - Enabling Incremental Conversion (from/to specific Delta
>>>>>>>>>>>    versions).
>>>>>>>>>>>    - Adding all tables from the Delta golden tables for robust
>>>>>>>>>>>    testing. *(Note: The current integration test will be
>>>>>>>>>>>    updated for newer Delta Lake versions once the old standalone
>>>>>>>>>>>    solution is fully deprecated/deleted).*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Vladislav Sidorovich
>>>>>>>>>>>
>>>>>>>>>>> Feedback: *go/feedback-for-vladislav
>>>>>>>>>>> <https://urldefense.com/v3/__https://goto.google.com/feedback-for-vladislav__;!!LIr3w8kk_Xxm!o5ujHGgOykJCVLnUl3lDTYNdLmKJd5IJ6KOoXABBQ5IBtGgHs5mAIgZcdvogYuOo9Wem6woDKfuuhxPj2njVpl24$>
>>>>>>>>>>>  *
>>>>>>>>>>> [image: Google Logo]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Vladislav Sidorovich
>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Vladislav Sidorovich
>>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Vladislav Sidorovich
>>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Vladislav Sidorovich
>>>>
>>>
>>> --
>>> Best regards,
>>> Vladislav Sidorovich
>>>
>>
>> --
>> Best regards,
>> Vladislav Sidorovich
>>
>
> --
> Best regards,
> Vladislav Sidorovich
>
