+1, excited for this one too. We've seen the current metrics maps blow up
memory and hope this can improve that.
On the Geo front, this could allow us to add supplementary metrics that
don't conform to the geo type, like S2 Cell Ids.
Thanks
Szehon
On Mon, Jun 2, 2025 at 6:14 AM Eduard Tudenhöfne
Thanks, Peter, for bringing these ideas forward! You also raised a
great point about clarifying the goal of indexing. I’ve been considering it
with the intention of eventually enabling fast upserts through DVs. To
support that, we need an index that maps primary keys to both the data file
and t
I am interested in this idea and looking forward to collaboration.
Thanks,
Huang-Hsiang
> On Jun 2, 2025, at 10:14 AM, namratha mk wrote:
>
> Hello,
>
> I am interested in contributing to this effort.
>
> Thanks,
> Namratha
>
> On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <2am...@gmail.
Hello,
I am interested in contributing to this effort.
Thanks,
Namratha
On Thu, May 29, 2025 at 1:36 PM Amogh Jahagirdar <2am...@gmail.com> wrote:
> Thanks for kicking this thread off Ryan, I'm interested in helping out
> here! I've been working on a proposal in this area and it would be great
Hi Steven,
Could you please include the following PRs, all related to authentication:
https://github.com/apache/iceberg/pull/13215
https://github.com/apache/iceberg/pull/12562
https://github.com/apache/iceberg/pull/12563
The first one is a fix for a performance degradation in request
signing and
Hi Steven,
I would like to get these PRs in too:
https://github.com/apache/iceberg/pull/13111
https://github.com/apache/iceberg/pull/13212
Thanks
Talat
On Thu, May 29, 2025 at 7:26 PM Prashant Singh
wrote:
> Thank you so much for driving this release !
> It will be really helpful in getting th
So this'll cut down on the # of manifest files read, won't it? And so improve
query planning?
Does anyone have an estimate of what benefit this is likely to have in
production deployments?
On Thu, 29 May 2025 at 21:25, Ryan Blue wrote:
> Hi everyone,
>
> Like Russell’s recent note, I’m starting a threa
Hey everyone,
I'm starting a thread to connect folks interested in improving the existing
way of collecting column-level statistics (often referred to as *metrics*
in the code). I've already started a proposal, which can be found at
https://s.apache.org/iceberg-column-stats.
*Motivation*
Column
Hi Bart,
Thanks for your answer!
I’ve pulled out some text from your thorough and well-organized response to
make it easier to highlight my comments.
> It would be well possible to tune parquet writers to write very large row
> groups when a large string column dominates. [..]
What would you do, i
On Fri, May 30, 2025 at 8:35 PM Péter Váry
wrote:
> Consider this example
> Imagine a table with one large string column and many small numeric
> columns.
>
> Scenario 1: Single File
>
> - All columns are written into a single file.
> - The RowGroup size is small due to the large string col
Hi everyone,
I would like to encourage everybody who wants to participate in the
discussion of the topic to share their thoughts either on the doc, or on
the PRs.
I would like to finalize and merge the API in 1.10, so we can merge the
implementation early in 1.11. This would allow more thorough testin