Community sync notes - 8 October 2019

2019-10-08 Thread Ryan Blue
Hi everyone, here's a link to the notes that Dan took for today's sync: https://docs.google.com/document/d/1YjDIMXg8iZ07U2r2EN7DUemdG2WEYUhsbvf4HvjFwl4/edit?ts=5d9cc0cf Thanks for attending, everyone! I'll send an invite for the next one that is in the evening PDT so that we can include people in

Re: Hidden partitioning clarification

2019-10-08 Thread Ryan Blue
> It is not clear to me how partition keys are distributed with respect to actual files and what constraints exist for partition evolution. The requirement is that a file contains rows that have the same values for all partition columns. If you partition by log_level and date(ts), then for any giv
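
The requirement stated above (every row in a file shares the same values for all partition columns) can be illustrated with a small sketch. This is not Iceberg's API: the function names, the row layout, and the use of in-memory lists to stand in for data files are all assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(row):
    # Hypothetical helper: derives the partition tuple (log_level, date(ts))
    # from a row, mirroring the example partitioning in the message above.
    return (row["log_level"], row["ts"].date())


def group_rows_into_files(rows):
    # Each "file" here is just a list of rows keyed by its partition tuple,
    # so every row in a file has the same values for all partition columns.
    files = defaultdict(list)
    for row in rows:
        files[partition_key(row)].append(row)
    return dict(files)


rows = [
    {"log_level": "INFO", "ts": datetime(2019, 10, 8, 9, 30), "msg": "a"},
    {"log_level": "INFO", "ts": datetime(2019, 10, 8, 17, 5), "msg": "b"},
    {"log_level": "WARN", "ts": datetime(2019, 10, 8, 9, 31), "msg": "c"},
    {"log_level": "INFO", "ts": datetime(2019, 10, 9, 0, 1), "msg": "d"},
]

files = group_rows_into_files(rows)
for key, file_rows in files.items():
    print(key, [r["msg"] for r in file_rows])
```

The two INFO rows from 8 October land in the same group even though their timestamps differ, because only the derived date takes part in the partition tuple.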

Open issues against Vectorized Iceberg Read milestone

2019-10-08 Thread Anjali Norwood
Thank you, Gautam, for the summary of the discussion. Hello Devs, the follow-up vectorized Iceberg tasks are now captured as issues against the milestone. They are listed below for convenience. https://github.com/apache/incubator-iceberg/issues/518 https://github.com/apache/incubator-iceberg/issues/519 http

Re: Hidden partitioning clarification

2019-10-08 Thread Elliot West
‘If would’ → ‘it would’ ‘original schema’ → ‘original scheme’ On Tue, 8 Oct 2019 at 18:00, Elliot West wrote: > Hello, > > I'm trying to understand the underlying partitioning model in Iceberg. It > is not clear to me how partition keys are distributed with respect to > actual files and what con

Hidden partitioning clarification

2019-10-08 Thread Elliot West
Hello, I'm trying to understand the underlying partitioning model in Iceberg. It is not clear to me how partition keys are distributed with respect to actual files and what constraints exist for partition evolution. My expectation is that to achieve reasonable read performance, sets of keys must b