Re: Drop table behavior

2021-11-23 Thread Jack Ye
Regarding the object storage mode use case that Yufei mentioned, we did some experiments whose results favor using the root of a bucket as the data location and sharing that data location across multiple tables. That minimizes throttling, but we do see some complaints related to table ownership such

Re: Drop table behavior

2021-11-23 Thread Yan Yan
Thank you all for the feedback! To clarify, the *dropTable* method implementation in the Iceberg library does correctly clean up all data + delete files in normal circumstances, and it's mostly the past metadata.json files and the directories that are not cleaned up. Also after looking u

Re: New gitbox/github repository created: iceberg-docs.git

2021-11-23 Thread Ryan Blue
Hi everyone, I just created an iceberg-docs repository. Since we didn't really discuss this on the dev list yet, I want to start a thread. We've been working lately on updating the ASF Iceberg site so that we have docs for multiple Iceberg versions. Sam has been doing the work and has found a real

Re: location field in TableMetadataParser

2021-11-23 Thread Russell Spitzer
For looking up the details of the Iceberg format it always pays to check the spec at https://iceberg.apache.org/#spec/ . These are the official requirements for an Iceberg table. There you will see that "location" is: The table’s base location. This is used by writers to determine where to store data files

location field in TableMetadataParser

2021-11-23 Thread S P
Hi, I am new to Iceberg and trying to understand the table metadata layout. I see the "location" field in the table metadata. My understanding is that the data files in Iceberg can be

Re: Drop table behavior

2021-11-23 Thread Yufei Gu
Piotr made a good point. The major use case to customize data file paths is the S3 path randomization due to the throttling issue. It looks like an exceptional use case. I’d also prefer to think of it that way: what if the S3 throttling issue is resolved, or mitigated to the point that users can ignore it i

Re: Drop table behavior

2021-11-23 Thread Piotr Findeisen
Hi, When you come from a storage perspective, then the current design of 'not owning' location makes sense. However, if you come from a SQL perspective, then all this is an impractical limitation. Analysts and other SQL users want to be able to delete their data and must have confidence that all the da