+1 9 am PST on Tues/Wednesday works.
On Mon, Oct 7, 2019 at 4:50 AM Jacques Nadeau wrote:
> Tuesdays work best for me.
>
> On Sun, Oct 6, 2019, 4:18 PM Anton Okolnychyi
> wrote:
>
>> Tuesday/Wednesday/Thursday works fine for me. Anything up to 19:00 UTC /
>> 20:00 BST / 12:00 PDT is OK if 09:0
+1 for once a month. Could we set an alternate time for CCT guys?
On Mon, Oct 7, 2019 at 3:23 PM Gautam wrote:
>
> +1 9 am PST on Tues/Wednesday works.
>
> On Mon, Oct 7, 2019 at 4:50 AM Jacques Nadeau wrote:
>>
>> Tuesdays work best for me.
>>
>> On Sun, Oct 6, 2019, 4:18 PM Anton Okolnychyi
Hi,
We are using the Iceberg Spark data source with Spark Structured Streaming.
The issue, of course, is that the incoming partitions
are not sorted.
We have implemented our own streaming partition writer (extending
DataSourceWriter, StreamWriter).
We started by keeping the F
Okay, let's set it for Tuesday the 8th (tomorrow) at 16:00 UTC since that
works for most people. I'll send out an invite to everyone on this thread.
If you'd like to be included, just send me a direct email. Everyone is
welcome.
We'll schedule the next one for a time when people in CCT can make it
Hello Devs,
We met to discuss progress and next steps on the vectorized
read path in Iceberg. Here are my notes from the sync. Feel free to reply
with clarifications in case I misquoted or missed anything.
*Attendees*:
Anjali Norwood
Padma Pennumarthy
Ryan Blue
Samarth Jain
Gautam Ko
The approach sounds okay to me. It's usually preferable to repartition the
data by your partition dimensions to keep the number of data files that
each writer needs to create to a minimum.
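
To illustrate the point about file counts, here is a minimal sketch in plain Python, not Iceberg or Spark code. It assumes a hypothetical table partitioned by a `day` field and three writer tasks, and counts how many data files each writer would have to open under two ways of routing records. All names in it (`RECORDS`, `round_robin`, `by_partition_key`, and so on) are made up for the example. In Spark, the second routing roughly corresponds to calling `repartition` on the partition column(s) before the write.

```python
import zlib
from collections import defaultdict

# Hypothetical input: 4 partition values (days), 3 records each.
DAYS = ["2019-10-05", "2019-10-06", "2019-10-07", "2019-10-08"]
RECORDS = [{"day": day, "value": n} for day in DAYS for n in range(3)]
NUM_WRITERS = 3


def round_robin(index, record, num_writers):
    # Records arrive in no particular partition order and are spread evenly.
    return index % num_writers


def by_partition_key(index, record, num_writers):
    # All records with the same partition value go to the same writer.
    # crc32 is used only to get a hash that is stable across runs.
    return zlib.crc32(record["day"].encode("utf-8")) % num_writers


def open_files_per_writer(records, num_writers, assign):
    # Each writer needs one open data file per distinct partition value it sees.
    files = defaultdict(set)
    for index, record in enumerate(records):
        files[assign(index, record, num_writers)].add(record["day"])
    return dict(files)


def total_files(files):
    return sum(len(days) for days in files.values())


unsorted_layout = open_files_per_writer(RECORDS, NUM_WRITERS, round_robin)
repartitioned_layout = open_files_per_writer(RECORDS, NUM_WRITERS, by_partition_key)

print(total_files(unsorted_layout), total_files(repartitioned_layout))  # prints "12 4"
```

Without repartitioning, every writer sees every day and the job creates 12 files. With records routed by partition value, each day is written by exactly one writer, so the total drops to 4.
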
Also, if buffering in memory starts taking too much memory, you can switch
to using Avro instead of Parquet f