has a
different commit time). Will we be able to store additional stats, e.g. commit
times, per data file or partition in the tagged snapshot?
From: Szehon Ho
Sent: Monday, March 7, 2022 1:40 PM
To: Iceberg Dev List
Subject: Re: Getting last modified timestamp/other stats per partition
some recommendation on the amount of history for
>> snapshots.
>>
>> 2. How can we distinguish between snapshots where new data was
>> added vs snapshots where compaction was done?
>>
>>
>>
>> Thanks,
>>
>> Mayur
>>
>>
>
> *From:* Mayur Srivastava
> *Sent:* Thursday, February 24, 2022 7:27 AM
> *To:* dev@iceberg.apache.org
> *Subject:* RE: Getting last modified timestamp/other stats per partition
>
>
>
> Thanks Szehon. I’ll give this a try.
>
>
>
> *From:* Szehon Ho
> *Se
data was added vs
snapshots where compaction was done?
Thanks,
Mayur
From: Mayur Srivastava
Sent: Thursday, February 24, 2022 7:27 AM
To: dev@iceberg.apache.org
Subject: RE: Getting last modified timestamp/other stats per partition
Thanks Szehon. I’ll give this a try.
From: Szehon Ho
Sent: Wednesday, February 23, 2022 1:38 PM
To: Iceberg Dev List
Subject: Re: Getting last modified timestamp/other stats per partition
Hi
Probably the metadata tables can help with this.
For the size/num_rows of partitions, you can query the files table,
https://iceberg.apache.org/docs/latest/spark-queries/#files. (Because
Iceberg keeps stats for files, and not necessarily for partitions).
SELECT partition, sum(file_size_in_bytes),