date:20210525

RE: Non-microsecond timestamps

2021-05-25 Thread Tina Luo

If we were to use two columns (timestamp and nanos as a long), how would partitioning and sorting work? I imagine we’d just partition on the timestamp column but sort on the timestamp and nanos columns? From: Ryan Blue Sent: Friday, May 21, 2021 6:09 PM To: dev@iceberg.apache.org Subject: Re:
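
The two-column idea in the question above can be sketched in plain Python. This is only an illustration of the question's premise, not Iceberg's behavior or API: it assumes the timestamp column keeps microsecond precision, that the nanos column holds just the sub-microsecond remainder (0 to 999), and that partitioning is by day. The helper names `split_nanos`, `partition_key`, and `sort_key` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_nanos(epoch_nanos: int):
    """Split nanoseconds since the epoch into (microsecond timestamp, nanos remainder)."""
    # Assumption: the second column stores only the sub-microsecond part (0..999).
    micros, nanos = divmod(epoch_nanos, 1_000)
    return EPOCH + timedelta(microseconds=micros), nanos


def partition_key(ts: datetime):
    """Partition on the timestamp column only (assumed here to be a day partition)."""
    return ts.date()


def sort_key(row):
    """Sort on both columns: the timestamp first, then the nanos remainder as a tie-breaker."""
    ts, nanos = row
    return (ts, nanos)


# Two events within the same microsecond, plus one on the following day.
events = [1621900800000000999, 1621900800000000001, 1621987200000000500]
rows = sorted((split_nanos(n) for n in events), key=sort_key)
```

Under these assumptions the nanos column never changes which partition a row belongs to; it only breaks ties between rows whose microsecond timestamps are equal.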

Re: Next community sync

2021-05-25 Thread Ryan Blue

I started a Google group to track people that want to be included in the sync invite: https://groups.google.com/g/iceberg-sync I'll add that group to the invite and people can add themselves to the group. I think that will work, so for the next sync I'll remove everyone that was added directly. Ple

Re: Spark DELETE FROM parallelism

2021-05-25 Thread Huadong Liu

Thank you Anton! Follow up questions inline. On Tue, May 25, 2021 at 10:31 AM Anton Okolnychyi wrote: > The performance degradation you see is because Spark cannot push down > subquery expressions. That’s why Iceberg ends up scanning the complete > table. You will still rewrite only files that h

Re: Spark DELETE FROM parallelism

2021-05-25 Thread Anton Okolnychyi

The performance degradation you see is because Spark cannot push down subquery expressions. That’s why Iceberg ends up scanning the complete table. You will still rewrite only files that have matches but I assume joining the complete table will be expensive. In order to benefit from partition a

Re: Next community sync

2021-05-25 Thread Steven Wu

Ryan, please add me to the community sync too. Thanks! On Tue, May 25, 2021 at 9:53 AM Kyle Bendickson wrote: > Hi Ryan, > > Can you please add my new work email to the community sync? kbendickson > [at] apple [dot ]com > > Thanks, > Kyle! > >  > > *Kyle Bendickson* > Software Engineer > Apple

Re: Next community sync

2021-05-25 Thread Kyle Bendickson

Hi Ryan, Can you please add my new work email to the community sync? kbendickson [at] apple [dot ]com Thanks, Kyle!  Kyle Bendickson Software Engineer Apple ACS Data One Apple Park Way, Cupertino, CA 95014, USA kbendick...@apple.com This email and any attachments

Spark DELETE FROM parallelism

2021-05-25 Thread Huadong Liu

Hi iceberg-dev, I have a table that is partitioned by id (custom partitioning at the moment, not iceberg hidden partitioning) and event time. Individual DELETE finishes reasonably fast, for example: *sql("DELETE FROM table where id_shard=111 and id=111456")* *sql("DELETE FROM table where id_shard
