It works for me.
On first thought, there may be a few concerns about consolidated
storage:
1) Maintaining the consolidated storage may be a bit more complex;
2) It may make collecting the index while writing a data file (i.e., online index
building) more complex (e.g., we need to consid
Great, thank you for planning to join! I definitely want to get your input
on this as well.
On Wed, Mar 3, 2021 at 6:06 PM OpenInx wrote:
> It will be 1:00 AM (China Standard Time) on 18 March, and it works for
> those of us in Asia. I'd love to attend this discussion. Thanks.
>
> On Thu, Mar 4,
It will be 1:00 AM (China Standard Time) on 18 March, and it works for
those of us in Asia. I'd love to attend this discussion. Thanks.
On Thu, Mar 4, 2021 at 9:50 AM Ryan Blue wrote:
> Thanks for putting this together, Guy! I just did a pass over the doc and
> it looks like a really reasonable
Thanks for putting this together, Guy! I just did a pass over the doc and
it looks like a really reasonable proposal for being able to inject custom
file filter implementations.
One of the main things we need to think about is how to store and track the
index data. There's a comment in the doc abo
David, we already have Hive support in Iceberg, so there is no need to
create a separate project. I think the problem is that we can't make
changes to Hive that are needed for that support. We're reaching the limits
of what can be done in an external project, so we can either add/update
interfaces
I agree with the concern about caching splits, but doesn't the API cause us
to collect all of the splits into memory anyway? I thought there was no way
to return splits as an `Iterator` that lazily loads them. If that's the
case, then we primarily need to worry about cleanup and how long they are
k
Hello Team,
I'm not sure how far out you want to scope this, but I think we have enough
sub-projects as it is within the Hive core project. To build the entire
project takes a considerable amount of time.
Would it be possible to roll this out like Jackson or DataNucleus?
https://github.com/apa
Yes, I think we should move forward with reads that don't need to merge
deletes and have a check that there are no deletes to merge. That will work
in many cases and we can add read support for v2 later.
On Wed, Mar 3, 2021 at 3:42 AM Mayur Srivastava <
mayur.srivast...@twosigma.com> wrote:
> >>
I think that this direction sounds reasonable. It makes sense to start
building the integration in Hive because it will be easier to iterate
there. Iceberg is quite different in some areas and I think that would
probably mean that Hive needs to change to provide a really great
experience. That was
On Wed, Mar 3, 2021 at 1:48 AM Peter Vary wrote:
> Quick question @Edgar: Am I right that the table is created by Spark? I
> think if it is created from Hive and we inserted the data from Hive, then
> we should have the basic stats already collected and we should not need the
> estimation (we mig
I did something similar to visualize the snapshots and files. But instead
of using the static website, I was using the Java API to get the metadata
from HDFS and send it back to the frontend.
Something like this:
https://observablehq.com/@capkurmagati/iceberg-metadata-visualization
My actual implem
>> Should we proceed with this PR and later add support for vectorized reads in
>> a separate PR?
I meant support deletes in the vectorized reader.
Thanks,
Mayur
From: Mayur Srivastava
Sent: Wednesday, March 3, 2021 6:41 AM
To: dev@iceberg.apache.org
Cc: Ryan Blue
Subject: RE: Reading data fro
Thanks for finding out, Peter.
Should we proceed with this PR and later add support for vectorized reads in a
separate PR?
There are also some other limitations in the current PR (listed in the PR)
which could be addressed in subsequent PRs.
Thanks,
Mayur
From: Peter Vary
Sent: Tuesday, March
Hi Iceberg and Hive Teams,
As some of you already know, we are working on making Iceberg available as a
first-class storage layer for Hive.
Folks on the Iceberg side did a good job of utilizing the existing Hive SerDe
API for the released Hive 2.3.8 and 3.1.2 versions. Thanks to their efforts w