Thank you all for your valuable input on this topic. I apologize for not being
able to keep up with the discussion last week, as I was traveling during the
U.S. Thanksgiving holiday.
To summarize, I think we have the agreement on following points:
- tableExists API shall indicate whether a table
+1 from my side too.
I wanted to make sure that the community is aware of this change which will
bring behavioral difference compared to other catalogs. This is why I have
asked Steve to start this thread.
On Thu, Nov 28, 2024, 02:10 Szehon Ho wrote:
> Yea, I think that part is definitely kept
Yea, I think that part is definitely kept.
Thanks
Szehon
On Wed, Nov 27, 2024 at 12:02 PM rdb...@gmail.com wrote:
> I'd support changing the behavior if we still have a way to match the
> intent, which is to return true if the table exists in Hive and is an
> Iceberg table.
>
> On Wed, Nov 27,
I'd support changing the behavior if we still have a way to match the
intent, which is to return true if the table exists in Hive and is an
Iceberg table.
On Wed, Nov 27, 2024 at 11:26 AM Szehon Ho wrote:
> Hm I think the thread got a bit sidetracked by the other question.
>
> The initial propos
Hm I think the thread got a bit sidetracked by the other question.
The initial proposal by Steve is a performance improvement for
HiveCatalog's tableExists(). Currently it loads both Hive and Iceberg
table metadata, and if successful returns true. The proposal is to load
from Hive only, and retu
What kind of corruption are you referring to? I would expect corruption to
result in an exception when loading the table, but that the table should
still exist. The problem is likely that we determine if a table exists by
attempting to load it. We could fix that by not attempting to load the
table.
>
> The current behavior's intent is not to check whether the metadata is
> valid, it is to detect whether the table is an Iceberg table.
Is there a way to detect this from HiveCatalog without loading the table?
On Wed, Nov 27, 2024 at 2:01 PM Péter Váry
wrote:
> I think we have an agreement,
I think we have an agreement, not to change the behavior wrt existing
non-Iceberg tables, and throw an exception.
Are we also in agreement with the original proposal to return true when the
table exists but the metadata is somehow corrupted? Note: this is the
proposed change of behavior why the th
I'd argue against changing this. The current behavior's intent is not to
check whether the metadata is valid, it is to detect whether the table is
an Iceberg table. It ignores non-Iceberg tables. Changing that behavior
would be surprising, especially if we started throwing exceptions.
On Fri, Nov
> Should add, my personal preference is probably not to change the existing
behavior for this part
+1. I realized that this is not a new behavior. The `loadTable`
implementation has this problem too.
It would be good to have a test case specifically for this edge case and
maybe call this out in th
Should add, my personal preference is probably not to change the existing
behavior for this part (false, if exists a Hive table with same name) at
the moment, just adding another possibility for consideration.
Thanks
Szehon
On Fri, Nov 22, 2024 at 2:00 AM Szehon Ho wrote:
> Thanks Kevin and Gab
Thanks Kevin and Gabor, this is an interesting discussion. I guess a third
option instead of returning true/false in this case, is to change it to
throw an NoSuchIcebergTableException if its a non-Iceberg table, which I
think is actually what this pr does?
Thanks
Szehon
On Fri, Nov 22, 2024 at 1
Hey,
I think what Kevin says makes sense. However, it would then confuse the
opposite use case of this function. Let's assume that we change the
implementation of tableExists() to not load the table internally:
if (tableExists(table_name)) {
table = loadTable(table_name);
}
Here, you find th
Hi Steve,
This makes sense to me. The semantics of `tableExists` focus on whether a
table's name exists in the catalog. For the Hive catalog, checking the HMS
entry should be sufficient.
I do have a question about usage, though. Typically, I would use `
tableExists` like this:
```
if (!tableExis
Hi,
It's a good performance find and improvement. Left some comment on the PR.
IMO, the behavior actually more matches the API javadoc ("Check whether
table exists"), not whether it is corrupted or not, so I'm supportive of it.
Thanks
Szehon
On Thu, Nov 21, 2024 at 10:57 AM Steve Zhang
wrote
Hi Iceberger,
I have a proposal to simplify the tableExists API in the Hive catalog, which
involves a behavior change, and I’d like to hear your thoughts.
Currently, in our catalog interface[1], the tableExists method is implemented
as a default API by invoking the loadTable method. It retu
16 matches
Mail list logo