Hey Ryan,

The goal of the deprecation is to keep other implementations from producing
it. PyIceberg, for example, does not support this, and I think it would be
good to spare the others (Rust, Go, etc.) from having to support it as well.
Regarding the removal, Amogh expressed the same concern on the PR
<https://github.com/apache/iceberg/pull/11586#discussion_r1848789823>.

In my quest to make the Java implementation follow the spec as closely as
possible, I noticed that we use a DummyFileIO to mimic a ManifestList. I
ran into this when turning 503: added_snapshot_id into a required field
<https://github.com/apache/iceberg/pull/11626/files#r1853683623>. So the
value is in removing code paths, as Szehon pointed out. Once we remove
support for the embedded manifest list, we can remove all that logic and
keep the codebase nice and tidy.

It would be good to start the discussion about deprecating support for older
format versions at some point. However, for a V2 reader it is fairly easy to
project V1 metadata as V2, except when embedded manifests are being used.
Marking oddities like this as deprecated will, I think, enable readers to
support reading older versions for a longer time. My suggestion would be to
mark the field as deprecated now and revisit the actual removal later. I've
marked it for removal in Java 2.0 for now, to give it enough time.
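
To make the extra code path concrete, here is a rough Python sketch of the
branch a reader needs. It is not taken from PyIceberg or the Java code: the
function name manifest_source is made up for illustration, and the only thing
I am assuming from the spec is the pair of snapshot JSON fields
"manifest-list" and "manifests".

```python
from typing import Any, Dict


def manifest_source(snapshot: Dict[str, Any]) -> str:
    """Classify how a snapshot's JSON metadata points at its manifests.

    Hypothetical helper for illustration only, not an existing API.
    """
    # Common case: the snapshot references a separate manifest-list file.
    if "manifest-list" in snapshot:
        return "manifest-list"
    # V1-only oddity: manifest file locations embedded in the snapshot itself.
    if "manifests" in snapshot:
        return "embedded"
    raise ValueError("snapshot has neither 'manifest-list' nor 'manifests'")
```

A reader that does not want to carry the second branch could simply fail
when it returns "embedded", which is the behaviour Ryan describes.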

Kind regards,
Fokko



Op do 21 nov 2024 om 20:52 schreef rdb...@gmail.com <rdb...@gmail.com>:

> Can we safely deprecate and remove this? The manifest list is required in
> v2, but the spec has stated for a long time that v1 tables can use
> manifests rather than a manifest list. It’s unlikely, but it would be
> valid for other implementations to produce it.
>
> I would understand if other implementations chose to fail tables that
> don’t have a manifest list to avoid adding code to handle manifests, but
> I don’t think that there’s much value in removing support from the Java
> implementation.
>
> Instead, what about discussing how to deprecate support for older format
> versions? That seems like the main issue here. Once the majority of
> implementations move to newer versions, we would like to deprecate the old
> ones.
>
> On Thu, Nov 21, 2024 at 11:01 AM Szehon Ho <szehon.apa...@gmail.com>
> wrote:
>
>> +1, great to have less possible paths.
>>
>> Thanks
>> Szehon
>>
>> On Thu, Nov 21, 2024 at 10:33 AM Steve Zhang
>> <hongyue_zh...@apple.com.invalid> wrote:
>>
>>> +1 to deprecate
>>>
>>> Thanks,
>>> Steve Zhang
>>>
>>>
>>>
>>> On Nov 19, 2024, at 3:32 AM, Fokko Driesprong <fo...@apache.org> wrote:
>>>
>>> Hi everyone,
>>>
>>> I would like to propose to deprecate embedded manifests
>>> <https://github.com/apache/iceberg/pull/11586>. This has been used
>>> before the manifest-list was introduced, but I don't think they are used
>>> since the project has been open-sourced, and it would be good to
>>> officially deprecate them from the spec. It is only supported by Iceberg
>>> Java today, and I haven't seen any requests for PyIceberg to add support
>>> for this.
>>>
>>> Any questions or concerns about deprecating the embedded manifests?
>>>
>>> Kind regards,
>>> Fokko Driesprong
>>>
>>>
>>>
