Re: [Discuss] Replace Hadoop Catalog Examples with JDBC Catalog in Documentation

Renjie Liu Wed, 09 Oct 2024 18:51:09 -0700

I would also vote for the JDBC catalog, ideally using SQLite as the backend,
since it doesn't require setting up a separate database.
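
For context, here is a minimal sketch of what such a quickstart configuration
might look like, written as a small Python helper that only assembles the
Spark properties (it does not start Spark). The catalog name `local`, the file
paths, and the helper names are hypothetical. The property keys follow the
usual `spark.sql.catalog.<name>.*` pattern, and a SQLite JDBC driver is
assumed to be on the classpath.

```python
def jdbc_sqlite_catalog_conf(name: str, db_file: str, warehouse: str) -> dict:
    """Build Spark properties for an Iceberg JDBC catalog backed by a SQLite file.

    All arguments are illustrative; nothing here is taken from the linked PR.
    """
    prefix = f"spark.sql.catalog.{name}"
    return {
        # Register the catalog under the given name.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Select the JDBC catalog implementation.
        f"{prefix}.type": "jdbc",
        # SQLite keeps the catalog metadata in a single local file.
        f"{prefix}.uri": f"jdbc:sqlite:{db_file}",
        # Location where table data and metadata files are written.
        f"{prefix}.warehouse": warehouse,
    }


def as_conf_flags(conf: dict) -> str:
    """Render the properties as `--conf key=value` flags for spark-sql or spark-shell."""
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


if __name__ == "__main__":
    conf = jdbc_sqlite_catalog_conf(
        "local", "/tmp/iceberg_catalog.db", "/tmp/warehouse"
    )
    print(as_conf_flags(conf))
```

The printed flags could be appended to a `spark-sql` or `spark-shell`
invocation; the only local state is the SQLite file and the warehouse
directory.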


On Thu, Oct 10, 2024 at 8:42 AM Manu Zhang <[email protected]> wrote:

> I'd vote for the JDBC catalog as it's simple for a quick-start guide. Setting
> up a REST service with a Docker image could be cumbersome.
> We can have another page for the REST catalog.
>
> Regards,
> Manu
>
> On Thu, Oct 10, 2024 at 2:50 AM Marc Cenac
> <[email protected]> wrote:
>
>> I support the idea of updating the docs to replace the Hadoop catalog
>> example, but I'm wondering why not use a REST Catalog example instead? I
>> saw Ajantha proposed adding Docker images for a REST Catalog adapter [1], so
>> we could potentially use this with a JDBC Catalog backed by a SQLite file as
>> a convenient quickstart example that shows a REST Catalog configuration.
>> I'm thinking the REST Catalog would be preferred to the JDBC catalog as a
>> best practice, since it's technology-agnostic (on the server side) and the
>> protocol allows for more advanced functionality (e.g., multi-table commits,
>> credential vending).
>>
>> [1] https://lists.apache.org/thread/xl1cwq7vmnh6zgfd2vck2nq7dfd33ncq
>>
>> On Tue, Oct 8, 2024 at 1:18 PM Kevin Liu <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I wanted to bring up a suggestion regarding our current documentation.
>>> The existing examples for Iceberg often use the Hadoop catalog, as seen in:
>>>
>>>    - Adding a Catalog - Spark Quickstart [1]
>>>    - Adding Catalogs - Spark Getting Started [2]
>>>
>>> Since we generally advise against using Hadoop catalogs in production
>>> environments, I believe it would be beneficial to replace these examples
>>> with ones that use the JDBC catalog. The JDBC catalog, configured with a
>>> local SQLite database file, offers similar convenience but aligns better
>>> with production best practices.
>>>
>>> I've created an issue [3] and a PR [4] to address this. Please take a
>>> look, and I'd love to hear your thoughts on whether this is a direction we
>>> want to pursue.
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> [1] https://iceberg.apache.org/spark-quickstart/#adding-a-catalog
>>> [2] https://iceberg.apache.org/docs/nightly/spark-getting-started/#adding-catalogs
>>> [3] https://github.com/apache/iceberg/issues/11284
>>> [4] https://github.com/apache/iceberg/pull/11285
>>>
>>>
