Congratulations Ratandeep!! Keep up the good work!
On Mon, Feb 17, 2020 at 6:26 AM Anjali Norwood wrote:
> Congratulations Ratandeep!!
>
> regards.
> Anjali.
>
> On Mon, Feb 17, 2020 at 12:19 AM Manish Malhotra <
> manish.malhotra.w...@gmail.com> wrote:
>
>> Congratulations 🎉!!
I am trying to prepare a pull request for review, but it looks like the python build
is failing because of an unrelated error.
I have seen some other builds fail with this error too. Does anyone know how to
fix this issue?
https://github.com/apache/incubator-iceberg/pull/1046
https://travis-ci.org/github/apache/in
Thank you!
On Tue, May 19, 2020 at 9:49 PM Ryan Blue wrote:
> Merged! Sorry I didn't notice the problem today. Hopefully you're
> unblocked.
>
> On Tue, May 19, 2020 at 7:41 PM Sud wrote:
>
>> Thanks for the quick reply. I will wait for this to get merged to
Hello Devs,
I would like help reviewing the PR
https://github.com/apache/incubator-iceberg/pull/1046
There can be more complex scenarios for union schemas; please feel free to
make suggestions and I will add tests.
There is no rush to merge this to master, but I wanted to validate the approach and
get feedback fr
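For illustration only (this example is mine, not from the PR): the kind of record the
change is aimed at has a non-optional union field. The sketch below just builds such a
schema with the Avro SchemaBuilder API; the record and field names are made up.

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class UnionExample {
  // Builds a record with a non-nullable int|string union field, i.e. a union
  // that is not just the usual ["null", T] optional pattern.
  public static Schema recordWithUnion() {
    return SchemaBuilder.record("event").fields()
        .name("payload")
        .type(Schema.createUnion(
            Schema.create(Schema.Type.INT),
            Schema.create(Schema.Type.STRING)))
        .noDefault()
        .endRecord();
  }
}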
Hi Iceberg devs,
We are trying to root-cause an issue where the driver gets stuck when trying to
read comparatively large tables (> 2000 snapshots).
When I looked at the thread dump of the driver's main thread, I saw that the
thread is stuck planning tasks. I also noticed that the iceberg-worker-pool
is
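A rough way to reproduce this outside Spark (sketch only; the class and method names
are mine): plan the scan directly against the table and time it. If this alone takes as
long as the stuck driver, the time is going into Iceberg's task planning rather than Spark.

import java.io.IOException;
import org.apache.iceberg.CombinedScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;

public class PlanTiming {
  // Plans the scan in isolation and reports how long it took and how many
  // file tasks were produced.
  public static void timePlanning(Table table) throws IOException {
    long start = System.nanoTime();
    long fileTasks = 0;
    try (CloseableIterable<CombinedScanTask> tasks = table.newScan().planTasks()) {
      for (CombinedScanTask task : tasks) {
        fileTasks += task.files().size();
      }
    }
    System.out.printf("planned %d file tasks in %d ms%n",
        fileTasks, (System.nanoTime() - start) / 1_000_000);
  }
}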
Jingsong Li wrote:
> Hi Sud,
>
> The batch read of an Iceberg table should just read the latest snapshot.
> I think the issue in this case is that your large tables have a large number of
> manifest files.
>
> 1. The simple way is to reduce the number of manifest files:
> - For reducing manifest
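A sketch of that suggestion using the core RewriteManifests API (the size threshold and
the clustering choice below are my assumptions, not from this thread): compacting many
small manifests into fewer, larger ones means scan planning has less metadata to read.

import org.apache.iceberg.DataFile;
import org.apache.iceberg.Table;

public class ManifestCompaction {
  // Rewrites small manifests into fewer, larger ones and commits the result
  // as a new snapshot.
  public static void compact(Table table) {
    table.rewriteManifests()
        // only touch manifests smaller than ~8 MB (hypothetical threshold)
        .rewriteIf(manifest -> manifest.length() < 8L * 1024 * 1024)
        // cluster data files by partition when writing the new manifests
        .clusterBy(DataFile::partition)
        .commit();
  }
}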
On Fri, Jul 17, 2020 at 9:35 AM Sud wrote:
> Thanks @Jingsong for the reply.
>
> Yes, one additional data point about the table:
> this table is an Avro table generated from stream ingestion. We expect a
> couple of thousand
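Not something raised in the visible part of this thread, but a common companion step for
a stream-ingested table is expiring old snapshots so metadata stays bounded. A sketch with
made-up retention settings:

import org.apache.iceberg.Table;

public class SnapshotCleanup {
  // Expires snapshots older than a cutoff while always retaining the most
  // recent ones, so the table does not accumulate thousands of snapshots'
  // worth of metadata.
  public static void expireOldSnapshots(Table table) {
    long cutoffMillis = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000; // 7 days, hypothetical
    table.expireSnapshots()
        .expireOlderThan(cutoffMillis)
        .retainLast(100) // hypothetical floor on retained snapshots
        .commit();
  }
}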
On Fri, Jul 17, 2020 at 12:35 PM Sud wrote:
> OK, after adding more instrumentation I see that Reader::estimateStatistics
> may be the culprit.
>
> It looks like the stats estimation may be performing a full-table estimate, and that's
> why it is so
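Not the actual Reader code, but a sketch of why an estimate like this is expensive:
producing row and size totals this way means planning the whole scan, which reads every
manifest of the table.

import java.io.IOException;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;

public class NaiveStats {
  // Sums record counts and file sizes by planning the full scan; the point is
  // that this touches all of the table's metadata, which is slow on
  // metadata-heavy tables.
  public static void estimate(Table table) throws IOException {
    long rows = 0;
    long bytes = 0;
    try (CloseableIterable<FileScanTask> files = table.newScan().planFiles()) {
      for (FileScanTask task : files) {
        rows += task.file().recordCount();
        bytes += task.file().fileSizeInBytes();
      }
    }
    System.out.printf("~%d rows in ~%d bytes%n", rows, bytes);
  }
}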
On Fri, Jul 17, 2020 at 9:25 PM Jingsong Li wrote:
> Thanks Sud for the in-depth debugging, and thanks Ryan for the explanation.
>
> +1 to have a table property to disable stats estimation.
>
> IIUC, the difference between stats estimation and a scan with filt
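A sketch of what flipping such a switch could look like; the property key below is
hypothetical, since the thread only proposes that a property be added and does not name
one. Only the UpdateProperties call itself is standard.

import org.apache.iceberg.Table;

public class DisableStatsEstimation {
  // Sets a table property; the key is a placeholder for whatever name the
  // proposed property ends up with.
  public static void disable(Table table) {
    table.updateProperties()
        .set("read.stats-estimation.enabled", "false") // hypothetical key
        .commit();
  }
}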
We are using incremental reads for Iceberg tables that get quite a few
appends (~500-1000 per hour), but instead of using timestamps we use
snapshot IDs and track the state of the last-read snapshot ID.
We are using the timestamp as a fallback when the state is incorrect, but as you
mentioned, if timestamps are
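A condensed sketch of that pattern using the TableScan incremental API (class and method
names are mine; assumes the table already has a current snapshot and that the commits in
between are appends):

import java.io.IOException;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;

public class IncrementalReader {
  // Scans only the appends committed since the last-read snapshot id and
  // returns the new snapshot id to persist as reader state.
  public static long readNewAppends(Table table, long lastReadSnapshotId) throws IOException {
    long current = table.currentSnapshot().snapshotId();
    if (current == lastReadSnapshotId) {
      return lastReadSnapshotId; // nothing new since the last read
    }
    try (CloseableIterable<FileScanTask> tasks =
             table.newScan().appendsBetween(lastReadSnapshotId, current).planFiles()) {
      for (FileScanTask task : tasks) {
        // hand task.file() to the downstream processing here
      }
    }
    return current;
  }
}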
Hi Iceberg devs,
I am investigating a connection leak issue we are seeing after upgrading
to the latest Iceberg.
I have narrowed the investigation down to the following PR and am testing a fix now:
https://github.com/apache/iceberg/commit/7060c928390c59e24dc207ec86f99132f6c1a828#diff-9726b2a5391d8755f6c5c849
I submitted a PR with the fix; please review:
https://github.com/apache/iceberg/pull/1474/files
On Thu, Sep 17, 2020 at 3:56 PM Sud wrote:
> Hi Iceberg devs,
>
> I am investigating a connection leak issue we are seeing after upgrading
> to the latest Iceberg.
> I have narrowed d
This feature will definitely help the cases where we saw a FileNotFoundException
after creating a new file using s3a (Spark used to retry the task in
that case).
On Wed, Dec 2, 2020 at 2:11 AM Jungtaek Lim wrote:
> What about the S3FileIO implementation? I see some issues filed that even with
> Hive cata
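For reference, an illustrative configuration for trying S3FileIO with a Spark catalog
(assumes the iceberg-aws module and the AWS SDK are on the classpath; the catalog name
is made up):

import org.apache.spark.sql.SparkSession;

public class S3FileIOSession {
  // Builds a Spark session whose Iceberg catalog reads and writes data files
  // through S3FileIO instead of the Hadoop s3a FileSystem.
  public static SparkSession create() {
    return SparkSession.builder()
        .appName("iceberg-s3fileio")
        .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.my_catalog.type", "hive")
        .config("spark.sql.catalog.my_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
        .getOrCreate();
  }
}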