Appreciate the thoughtful comments!

On Thu, Jul 18, 2024 at 10:29 AM Jack Ye <> wrote:

> Thank you for bringing this up, Ryan. I have also been in the camp of
> saying HadoopCatalog is not recommended, but after thinking about this more
> deeply last night, I now have mixed feelings about this topic. Just to
> comment on the reasons you listed first:
> * For reason 1 & 2, it looks like the root cause is that people try to use
> HadoopCatalog outside native HDFS because there are HDFS connectors to
> other storages like S3AFileSystem. However, the norm for such usage has
> been that those connectors do not strictly follow HDFS semantics, and it is
> assumed that people acknowledge the implication of such usage and accept
> the risk. For example, S3AFileSystem was there even before S3 was strongly
> consistent, but people have been using that to write files.
> * For reason 3, there are multiple catalogs that do not support all
> operations (e.g. Glue for atomic table rename) and people still widely use
> them.
> * For reason 4, I see that more as a missing feature. More features could
> definitely be developed in that catalog implementation.
> So the key question to me is, how can we prevent people from using
> HadoopCatalog outside native HDFS. We know HadoopCatalog is popular because
> it is a storage only solution. For object storages specifically,
> HadoopCatalog is not suitable for 2 reasons:
> (1) file write does not enforce mutual exclusion, thus cannot enforce
> Iceberg optimistic concurrency requirement (a.k.a. cannot do atomic
> compare-and-swap)
> (2) directory-based design is not preferred in object storage and will
> result in bad performance.
> However, looking at these 2 issues now, they are getting outdated.
> (1) object storage is starting to enforce file mutual exclusion. GCS
> supports file generation number [1] that increments monotonically, and can
> use x-goog-if-generation-match [2] to perform atomic swap. Similar feature
> [3] exists in Azure Blob Storage. I cannot speak for the S3 team roadmap.
> But Amazon S3 is clearly falling behind in this domain, and with market
> competition, it is very clear that similar features will come in reasonably
> near future.
> (2) directory bucket is becoming the norm. Amazon S3 announced directory
> bucket in 2023 re:invent [4], which does not have the same performance
> limitation even if you have very nested folders and many objects in a
> folder. GCS also has a similar feature launched in preview [5] right now.
> Azure also already has this feature since 2021 [6].
> With these new developments in the industry, a storage-only Iceberg
> catalog becomes very attractive. It is simple with only one service
> dependency. It can safely perform atomic compare-and-swap. It is performant
> without the need to worry about folder and file organization. If you want
> to add additional features for things like access control, there are also
> integrations like access grant [7] that can be integrated to do it in a
> very scalable way.
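
A minimal sketch of the atomic compare-and-swap commit described above, written in Python. It does not call any real storage API: `InMemoryObjectStore`, `put_if_generation_match`, and `commit` are hypothetical names introduced for illustration, and the store only imitates a generation-match precondition (a write succeeds only if the object's current generation equals the one the writer expects). Real generation numbers are not simple counters; the `+ 1` here is an assumption of the toy model.

```python
import threading
from dataclasses import dataclass


class PreconditionFailed(Exception):
    """Raised when the stored generation does not match the expected one."""


@dataclass(frozen=True)
class StoredObject:
    data: bytes
    generation: int


class InMemoryObjectStore:
    """Toy stand-in for an object store with per-object generation numbers."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def get(self, key):
        """Return the StoredObject for key, or None if it does not exist."""
        with self._lock:
            return self._objects.get(key)

    def put_if_generation_match(self, key, data, expected_generation):
        """Write data only if the current generation equals expected_generation.

        In this toy model, generation 0 means "the object must not exist yet".
        """
        with self._lock:
            current = self._objects.get(key)
            current_generation = current.generation if current else 0
            if current_generation != expected_generation:
                raise PreconditionFailed(
                    f"expected generation {expected_generation}, "
                    f"found {current_generation}"
                )
            new_generation = current_generation + 1
            self._objects[key] = StoredObject(data, new_generation)
            return new_generation


def commit(store, pointer_key, base_generation, new_metadata_location):
    """Optimistic commit: swap the table pointer only if no other writer
    has committed since base_generation was read. Returns True on success."""
    try:
        store.put_if_generation_match(
            pointer_key, new_metadata_location.encode(), base_generation
        )
        return True
    except PreconditionFailed:
        return False
```

With two writers that both read generation 0, only the first `commit` call returns `True`; the second gets `False` and would have to re-read the pointer and retry, which is the mutual exclusion that optimistic concurrency needs.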
> I know the direction in the community so far is to go with the REST
> catalog, and I am personally a big advocate for that. However, that
> requires either building a full REST catalog, or choosing a catalog vendor
> that supports REST. There are many capabilities that REST would unlock, but
> those are visions which I expect will take many years down the road for the
> community to continue to drive consensus and build those features. If I am
> the CTO of a small company and I just want an Iceberg data lake(house)
> right now, do I choose REST, or do I choose (or even just build) a
> storage-only Iceberg catalog? I feel I would actually choose the latter.
> Going back to the discussion points, my current take on this topic is that:
> (1) +1 for clarifying that HadoopCatalog should only work with HDFS in the
> spec.
> (2) +1 if we want to block non-HDFS use cases in HadoopCatalog by default
> (e.g. fail if using S3A), but we should allow a feature flag to unblock the
> usage so that people can use it after understanding the implications and
> risks, just like how people use S3A today.
> (3) +0 for removing HadoopCatalog from the core library. It could be in a
> different module like iceberg-hdfs if that is more suitable.
> (4) -1 for moving HadoopCatalog to tests, because HDFS is still a valid
> use case for Iceberg. After the measures 1-3 above, people actually having
> an HDFS use case should be able to continue to innovate and optimize the
> HadoopCatalog implementation. Although "HDFS is becoming much less common",
> looking at GitHub issues and discussion forums, it still has a pretty big
> user base.
> (5) In general, I propose we separate the discussion of HadoopCatalog from
> a "storage only catalog" that also deals with other object storages when
> evaluating it. With these latest industry developments, we should evaluate
> the direction for building a storage only Iceberg catalog and see if the
> community has an interest in that. I could help raise a thread about it
> after this discussion is closed.
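
A rough sketch of point (2) above, in Python: fail by default when the warehouse location is not on HDFS, and let users opt in through a flag. The property name `hadoop-catalog.allow-non-hdfs` and the function `check_warehouse_location` are hypothetical and exist only for this illustration; no such property is defined in the thread or, as far as this sketch assumes, in Iceberg.

```python
from urllib.parse import urlparse

# Hypothetical property name, used only for this illustration.
ALLOW_NON_HDFS_PROPERTY = "hadoop-catalog.allow-non-hdfs"

# Schemes treated as safe by default in this sketch.
SAFE_SCHEMES = {"hdfs"}


def check_warehouse_location(location, properties):
    """Raise ValueError for non-HDFS locations unless the opt-in flag is set."""
    scheme = urlparse(location).scheme.lower()
    if scheme in SAFE_SCHEMES:
        return
    allowed = properties.get(ALLOW_NON_HDFS_PROPERTY, "false").lower() == "true"
    if not allowed:
        raise ValueError(
            f"Location {location!r} uses scheme {scheme!r}, which is not HDFS. "
            f"Set {ALLOW_NON_HDFS_PROPERTY}=true to accept the risk."
        )
```

Calling `check_warehouse_location("s3a://bucket/warehouse", {})` raises, while passing `{"hadoop-catalog.allow-non-hdfs": "true"}` lets the same location through, so the unsafe path needs an explicit decision.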
> Best,
> Jack Ye
> [1]
> [2]
> [3]
> [4]
> [5]
> [6]
> [7]
> On Thu, Jul 18, 2024 at 7:16 AM Eduard Tudenhöfner <
>> wrote:
>> +1 on deprecating now and removing them from the codebase with Iceberg 2.0
>> On Thu, Jul 18, 2024 at 10:40 AM Ajantha Bhat <>
>> wrote:
>>> +1 on deprecating the `File System Tables` from spec and
>>> `HadoopCatalog`, `HadoopTableOperations` in code for now
>>> and removing them permanently during 2.0 release.
>>> For testing we can use `InMemoryCatalog` as others mentioned.
>>> I am not sure about moving to test or keeping them only for HDFS,
>>> because it leads to confusion for existing users of Hadoop catalog.
>>> I wanted to have it deprecated 2 years ago
>>> <>
>>> and I remember that we discussed it in sync that time and left it as it is.
>>> Also, when the user brought this up in slack
>>> <>
>>> recently about lockmanager and refactoring the HadoopTableOperations,
>>> I asked to open this discussion on the mailing list, so that we
>>> can conclude it once and for all.
>>> - Ajantha
>>> On Thu, Jul 18, 2024 at 12:49 PM Fokko Driesprong <>
>>> wrote:
>>>> Hey Ryan and others,
>>>> Thanks for bringing this up. I would be in favor of removing the
>>>> HadoopTableOperations, mostly because of the reasons that you already
>>>> mentioned, but also because of the fact that it is not fully in line with the
>>>> first principles of Iceberg (being object store native) as it uses
>>>> file-listing.
>>>> I think we should deprecate the HadoopTables to raise the attention of
>>>> their users. I would be reluctant to move it to test to just use it for
>>>> testing purposes, I'd rather remove it and replace its use in tests with
>>>> the InMemoryCatalog.
>>>> Regarding the StaticTable, this is an easy way to have a read-only
>>>> table by directly pointing to the metadata. This also lives in Java under
>>>> StaticTableOperations
>>>> <>.
>>>> It isn't a full-blown catalog where you can list {tables,schemas},
>>>> update tables, etc. As ZENOTME pointed out already, it is all up to the
>>>> user, for example, there is no listing of directories to determine which
>>>> tables are in the catalog.
>>>> is there a probability that the strategy used by HadoopCatalog is not
>>>>> compatible with the table managed by other catalogs?
>>>> Yes, so they are different, you can see in the spec the section on File
>>>> System tables
>>>> <>,
>>>> which is used by the HadoopTable implementation. Whereas the other catalogs
>>>> follow the Metastore Tables
>>>> <>
>>>> .
>>>> Kind regards,
>>>> Fokko
>>>> On Thu, Jul 18, 2024 at 07:19, NOTME ZE <> wrote:
>>>>> According to our requirements, this function is for some users who
>>>>> want to read iceberg tables without relying on any catalogs, I think the
>>>>> StaticTable may be more flexible and clear in semantics. For StaticTable,
>>>>> it's the user's responsibility to decide which metadata of the table to
>>>>> read. But for read-only HadoopCatalog, the metadata may be decided by
>>>>> Catalog, is there a probability that the strategy used by HadoopCatalog is
>>>>> not compatible with the table managed by other catalogs?
>>>>> On Thu, Jul 18, 2024 at 11:39, Renjie Liu <> wrote:
>>>>>> I think there are two ways to do this:
>>>>>> 1. As Xuanwo said, we refactor HadoopCatalog to be read only, and
>>>>>> throw unsupported operation exception for other operations that
>>>>>> manipulate tables.
>>>>>> 2. Totally deprecate HadoopCatalog, and add StaticTable as we did in
>>>>>> pyiceberg or iceberg-rust.
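
The two options above can be sketched in Python as follows. All names here (`StaticTable`, `ReadOnlyCatalog`, `UnsupportedOperation`) are hypothetical stand-ins and not the pyiceberg, iceberg-rust, or Java APIs. The first class models a table that simply points at a metadata file chosen by the user; the second models a catalog that resolves tables itself but rejects every mutating operation.

```python
class UnsupportedOperation(Exception):
    """Raised when a mutating operation is called on a read-only catalog."""


class StaticTable:
    """Option 2: a read-only table pinned to a metadata file the user chose."""

    def __init__(self, metadata_location):
        self.metadata_location = metadata_location


class ReadOnlyCatalog:
    """Option 1: a catalog that can list and load tables but never change them."""

    def __init__(self, tables):
        # Mapping of table name -> metadata file location, decided by the catalog.
        self._tables = dict(tables)

    def list_tables(self):
        return sorted(self._tables)

    def load_table(self, name):
        if name not in self._tables:
            raise LookupError(f"Table not found: {name}")
        return StaticTable(self._tables[name])

    def create_table(self, name, metadata_location):
        raise UnsupportedOperation("create_table is not supported (read-only)")

    def drop_table(self, name):
        raise UnsupportedOperation("drop_table is not supported (read-only)")

    def rename_table(self, old_name, new_name):
        raise UnsupportedOperation("rename_table is not supported (read-only)")
```

The difference between the two is who picks the metadata file: with `StaticTable` the caller passes the location directly, while `ReadOnlyCatalog` makes that choice on the caller's behalf.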
>>>>>> On Thu, Jul 18, 2024 at 11:26 AM Xuanwo <> wrote:
>>>>>>> Hi, Renjie
>>>>>>> Are you suggesting that we refactor HadoopCatalog as a
>>>>>>> FileSystemCatalog to enable direct reading from file systems like HDFS,
>>>>>>> S3, and Azure Blob Storage? This catalog would be read-only and would
>>>>>>> not support write operations.
>>>>>>> On Thu, Jul 18, 2024, at 10:23, Renjie Liu wrote:
>>>>>>> Hi, Ryan:
>>>>>>> Thanks for raising this. I agree that HadoopCatalog is dangerous in
>>>>>>> manipulating tables/catalogs given limitations of different file
>>>>>>> systems. But I see that there are some users who want to read iceberg
>>>>>>> tables without relying on any catalogs. This is also the motivating use
>>>>>>> case of StaticTable in pyiceberg and iceberg-rust. Is there something
>>>>>>> similar in the Java implementation?
>>>>>>> On Thu, Jul 18, 2024 at 7:01 AM Ryan Blue <> wrote:
>>>>>>> Hey everyone,
>>>>>>> There has been some recent discussion about improving
>>>>>>> HadoopTableOperations and the catalog based on those tables, but we've
>>>>>>> discouraged using file system only tables (or "hadoop" tables) for years
>>>>>>> now because of major problems:
>>>>>>> * It is only safe to use hadoop tables with HDFS; most local file
>>>>>>> systems, S3, and other common object stores are unsafe
>>>>>>> * Despite not providing atomicity guarantees outside of HDFS, people
>>>>>>> use the tables in unsafe situations
>>>>>>> * HadoopCatalog cannot implement atomic operations for rename and
>>>>>>> drop table, which are commonly used in data engineering
>>>>>>> * Alternative file names (for instance when using metadata file
>>>>>>> compression) also break guarantees
>>>>>>> While these tables are useful for testing in non-production
>>>>>>> scenarios, I think it's misleading to have them in the core module
>>>>>>> because there's an appearance that they are a reasonable choice. I
>>>>>>> propose we deprecate the HadoopTableOperations and HadoopCatalog
>>>>>>> implementations and move them to tests the next time we can make
>>>>>>> breaking API changes (2.0).
>>>>>>> I think we should also consider similar fixes to the table spec. It
>>>>>>> currently describes how HadoopTableOperations works, which does not
>>>>>>> work in object stores or local file systems. HDFS is becoming much less
>>>>>>> common and I propose that we note that the strategy in the spec should
>>>>>>> ONLY be used with HDFS.
>>>>>>> What do other people think?
>>>>>>> Ryan
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>> Xuanwo
