peterhunter99001-cyber opened a new issue, #10486:
URL: https://github.com/apache/gravitino/issues/10486
### Version
main branch
### Describe what's wrong
**Body:**
```markdown
## Summary
When connecting ClickHouse's `DataLakeCatalog` engine (with `catalog_type = 'rest'`) to Gravitino's Iceberg REST auxiliary service (`/iceberg/v1/`), the connection always fails with a `NoSuchCatalogException`. This makes Gravitino incompatible with ClickHouse's REST catalog client, while other engines (Trino, StarRocks, Spark) work fine against the same endpoint.
## Environment
| Component | Version |
|-------------|---------|
| Gravitino | 1.2.0 |
| ClickHouse | 26.2 |
| Iceberg REST backend | SQLite (JDBC catalog backend) |
| Object Storage | MinIO / RustFS (S3-compatible, path-style) |
## Steps to Reproduce
**1. Gravitino configuration (iceberg-rest auxiliary service):**
```properties
gravitino.iceberg-rest.catalog-backend = jdbc
gravitino.iceberg-rest.uri = jdbc:sqlite:/catalog/iceberg.db
gravitino.iceberg-rest.warehouse = s3a://iceberg-warehouse/
gravitino.iceberg-rest.io-impl = org.apache.iceberg.aws.s3.S3FileIO
```
**2. ClickHouse SQL to create the DataLakeCatalog database:**
```sql
SET allow_experimental_database_iceberg = 1;
CREATE DATABASE gravitino_catalog
ENGINE = DataLakeCatalog('http://gravitino:9001/iceberg/', '', '')
SETTINGS
catalog_type = 'rest',
storage_endpoint = 'http://minio:9000/iceberg-warehouse',
warehouse = 'lakehouse';
```
**3. Attempt to use the catalog:**
```sql
USE gravitino_catalog;
SHOW TABLES;
```
## Actual Error
```
Code: 1060. DB::Exception: Failed to get config from REST catalog: NoSuchCatalogException: Catalog 'lakehouse' does not exist
```
ClickHouse's REST client sends the following HTTP request on startup:
```
GET /iceberg/v1/config?warehouse=lakehouse
```
Gravitino's Iceberg REST service interprets the `warehouse` query parameter as a **catalog name** within its internal metalake registry and looks for a catalog named `lakehouse`, which does not exist. The request fails before any table operations can be performed.
## Root Cause Analysis
The Apache Iceberg REST Catalog specification defines `warehouse` as:
> *"An optional identifier for the target warehouse"* (from `GET /v1/config`)
The spec intentionally leaves the **semantic interpretation** of this parameter to the server implementation. This has led to divergent behaviors across implementations:
| Implementation | `warehouse` parameter meaning |
|---|---|
| **Nessie** | Logical warehouse name (pre-registered named storage location) |
| **Lakekeeper** | Logical warehouse name (pre-registered) |
| **Gravitino** | Treated as a Gravitino catalog name (internal registry lookup) |
| **AWS Glue REST** | Account ID + S3 table bucket name |
| **Databricks Unity** | Catalog name |
**ClickHouse's `DataLakeCatalog` C++ client** requires `warehouse` as a mandatory routing key and always includes it in `GET /v1/config?warehouse=<value>`, following the Nessie/Lakekeeper convention. Gravitino's behavior of looking up the value as a catalog name in its metalake registry is incompatible with this expectation.
**Why Trino and StarRocks work:** Both use the Apache Iceberg Java SDK's REST catalog client, which treats `warehouse` as **optional**: when it is not configured, the parameter is simply omitted from the request. Gravitino then returns its default configuration without triggering the catalog lookup. This is accidental compatibility, not intentional support.
## Expected Behavior
One of the following would resolve this incompatibility:
**Option A (Recommended): When the `warehouse` parameter is unrecognized, fall back to the default configuration instead of throwing an exception.**
```
GET /iceberg/v1/config?warehouse=<any_value>
→ If <any_value> does not match a known catalog name,
return the default catalog configuration (HTTP 200)
instead of raising NoSuchCatalogException (HTTP 404/400)
```
**Option B: Document a supported `warehouse` value that ClickHouse users can configure to successfully connect.**
For example, if there is a specific string (e.g., the configured warehouse path, or an empty string) that Gravitino accepts without triggering an internal lookup, documenting this would allow ClickHouse users to work around the issue.
**Option C: Add support for a named warehouse identifier in the Iceberg REST auxiliary service**, similar to Nessie's multi-warehouse model, so that `warehouse=<logical_name>` routes to the correct storage configuration.
## Impact
- ClickHouse `DataLakeCatalog` with `catalog_type = 'rest'` **cannot connect to Gravitino's Iceberg REST service** at all
- Users cannot use Gravitino as a shared Iceberg catalog for multi-engine environments that include ClickHouse
- The workaround is to use ClickHouse's `Iceberg` table engine directly (bypassing Gravitino entirely), which prevents true multi-engine metadata sharing
## References
- [ClickHouse DataLakeCatalog: REST Catalog docs](https://clickhouse.com/docs/use-cases/data-lake/rest-catalog)
- [ClickHouse DataLakeCatalog: Lakekeeper docs](https://clickhouse.com/docs/use-cases/data-lake/lakekeeper-catalog) (shows `warehouse = 'demo'` as logical name)
- [Apache Iceberg REST Catalog spec: GET /v1/config](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml)
- [Nessie warehouse semantics](https://projectnessie.org/guides/iceberg-rest/#warehouses--storage-locations)
```
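The fallback proposed as Option A in the report above can be sketched roughly as follows. This is illustrative Python, not Gravitino's actual (Java) implementation: `resolve_config`, `CATALOGS`, and `DEFAULT_CONFIG` are hypothetical names, and the registry contents are placeholders. The only thing taken from the Iceberg REST spec is that a config response carries `defaults` and `overrides` maps.

```python
from typing import Optional

# Hypothetical stand-ins for the server's catalog registry and its
# default configuration; the values are placeholders for illustration.
DEFAULT_CONFIG = {"defaults": {}, "overrides": {}}
CATALOGS = {
    "sales": {"defaults": {}, "overrides": {"prefix": "sales"}},
}


def resolve_config(warehouse: Optional[str]) -> dict:
    """Pick the response body for GET /v1/config under the Option A fallback.

    - No warehouse (or an empty one): return the default configuration.
    - Warehouse matches a known catalog: return that catalog's configuration.
    - Warehouse is unknown: fall back to the default configuration
      instead of raising a NoSuchCatalogException-style error.
    """
    if not warehouse:
        return DEFAULT_CONFIG
    return CATALOGS.get(warehouse, DEFAULT_CONFIG)
```

Under this sketch, `resolve_config("lakehouse")` returns the default configuration, which is what a client that always sends `warehouse` would need in order to proceed.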
### Error message and/or stacktrace
Code: 1060. DB::Exception: Failed to get config from REST catalog: NoSuchCatalogException: Catalog 'lakehouse' does not exist
### How to reproduce
Gravitino 1.2.0
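
To compare the two client behaviours described in the report without involving ClickHouse, the config request URLs can be built by hand. A small sketch, assuming the host, port, and path from the `CREATE DATABASE` statement in the report; `config_url` is a hypothetical helper and not part of any client library.

```python
from typing import Optional
from urllib.parse import urlencode, urljoin


def config_url(base: str, warehouse: Optional[str] = None) -> str:
    """Build the GET /v1/config URL, adding ?warehouse= only when given."""
    # Ensure a trailing slash so urljoin appends rather than replaces
    # the last path segment.
    root = base if base.endswith("/") else base + "/"
    url = urljoin(root, "v1/config")
    if warehouse is not None:
        url += "?" + urlencode({"warehouse": warehouse})
    return url


# Base URL assumed from the report's ClickHouse statement.
BASE = "http://gravitino:9001/iceberg/"

# What a client that always sends the warehouse would request:
print(config_url(BASE, "lakehouse"))
# What a client with no warehouse configured would request:
print(config_url(BASE))
```

Requesting each URL with any HTTP client against a running server should, according to the report, show the first failing with `NoSuchCatalogException` and the second returning the default configuration.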
### Additional context
Gravitino 1.2.0