Awesome! In Impala we have created our own implementations so far, but it
would be nice to join forces and have a common library.
Looking forward to the Slack channel.
Cheers,
Zoltan
On Fri, Nov 22, 2024 at 5:01 PM Gang Wu wrote:
>
> I have created an issue [1] to collect initial ideas for the i
store the ETag; for other catalogs, some other information for the
>> same purpose.
>> 2) I checked this description of ETags, and even though we discussed earlier
>> that it is some server-generated information, it seems to me that it can
>> be basically anything:
>>
just add their clever tricks to make it more
efficient.
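To make the ETag idea concrete, here is a rough sketch of the mechanics
over plain HTTP; the endpoint and the helper are made up for illustration,
not an existing API:

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical check against a REST catalog: echo back the opaque token
// (the ETag) that we persisted alongside the cached table metadata.
static boolean tableChanged(HttpClient client, String storedETag)
    throws IOException, InterruptedException {
  HttpRequest request = HttpRequest.newBuilder()
      .uri(URI.create("https://rest-catalog.example/v1/namespaces/db/tables/t"))
      .header("If-None-Match", storedETag)  // token stored with the cached table
      .build();
  HttpResponse<Void> response =
      client.send(request, HttpResponse.BodyHandlers.discarding());
  // 304 Not Modified: the cached metadata is still current, skip the reload
  return response.statusCode() != 304;
}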
Cheers,
Zoltan
On Thu, Nov 21, 2024 at 9:53 AM Zoltán Borók-Nagy wrote:
>
> Hi,
>
> I agree with Gabor that efficiently reloading Iceberg tables is a generic
> problem that applies to all catalog
> impleme
es and that the cross-language compatible REST catalog becomes the
> primary catalog for Iceberg.
>
> - API Perspective: Given the above, I may not be in the best position to
> comment on Java APIs. However, regarding Gabor’s proposed API (Table
> loadTable(Table existingTable)), I
Hey Everyone,
Thanks Gábor, I think the proposed interface would be very useful to any
engine that employs caching, e.g. Impala.
And it is pretty neat that it is catalog-agnostic, i.e. we just give the
catalog all the information we have about the table and let the
implementation reload it efficiently.
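To make it concrete, a quick sketch of how an engine-side cache could sit
on top of the proposed method; the interface below is only the proposal
under discussion, not part of today's Catalog API, and the cache class is
made up:

import java.util.concurrent.ConcurrentHashMap;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

// The proposed method, stated as an extension for illustration only.
interface ReloadingCatalog extends Catalog {
  Table loadTable(Table existingTable);
}

// Hypothetical engine-side cache built on top of it.
class CachingTables {
  private final ConcurrentHashMap<TableIdentifier, Table> cache = new ConcurrentHashMap<>();
  private final ReloadingCatalog catalog;

  CachingTables(ReloadingCatalog catalog) {
    this.catalog = catalog;
  }

  Table load(TableIdentifier id) {
    Table cached = cache.get(id);
    // on a miss do a full load; on a hit hand over everything we know and
    // let the catalog implementation reload as cheaply as it can
    Table fresh = cached == null ? catalog.loadTable(id) : catalog.loadTable(cached);
    cache.put(id, fresh);
    return fresh;
  }
}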
pala
> > also produced correct Parquet files, but that's beyond our control and
> > there's, no doubt, a ton of data already in that format.
> >
> > This could also be part of our v3 work, where I think we intend to add
> > binary to string type promotion to t
Hey Everyone,
Thank you for raising this issue and reaching out to the Impala community.
Let me clarify that the problem only happens when there is a legacy Hive
table written by Impala, which is then converted to Iceberg. When Impala
writes into an Iceberg table, there is no problem with interoperability.
Although Javac is not an
>>> optimizing compiler and there should not be much difference in performance
>>> of the jars produced by different compilers, it might still be worthwhile
>>> for the project to declare a newer compile-time JDK across all modules, and
>
As a reference, Impala can also do Hive-style CREATE TABLE x LIKE y for
Iceberg tables.
You can see various examples at
https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/iceberg-create-table-like-table.test
- Zoltan
On Wed, Apr 26, 2023 at 4:10 AM
Like Hive, Impala is not compatible with Java 11 right now either. This work
is in progress: https://issues.apache.org/jira/browse/IMPALA-11360
- Zoltan
On Mon, Apr 24, 2023 at 11:07 AM Mass Dosage wrote:
> I agree with Ryan, unless you can change the source version there's not
> that much point
Hi,
I am also interested in the discussion; all those times work for me.
Cheers,
Zoltan
On Wed, Apr 12, 2023 at 4:17 AM Chao Sun wrote:
> We are also interested in this discussion. Internally, we have been
> working on something similar in Rust, so it'd be great if we could
> combine the efforts.
Hi Taher,
I think most of your questions are answered in the Scan Planning section of
the Iceberg spec: https://iceberg.apache.org/spec/#scan-planning
To give you some specific answers as well:
Equality Deletes: data and delete files have sequence numbers from which
readers can infer their relative order.
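Roughly, the rule looks like this; this is only a sketch of the spec's
sequence-number semantics, and the method names may differ from the actual
Java API:

import org.apache.iceberg.DataFile;
import org.apache.iceberg.DeleteFile;
import org.apache.iceberg.FileContent;

// Equality deletes apply to data files written strictly before the delete;
// position deletes also apply to files committed at the same sequence number.
static boolean deleteApplies(DeleteFile delete, DataFile data) {
  long deleteSeq = delete.dataSequenceNumber();
  long dataSeq = data.dataSequenceNumber();
  return delete.content() == FileContent.EQUALITY_DELETES
      ? deleteSeq > dataSeq
      : deleteSeq >= dataSeq;
}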
Hi Iceberg/Impala Team,
We've been working on adding read support for Iceberg V2 tables in Impala.
In the first round we're focusing on position deletes.
We are thinking about different approaches so I've written a design doc
about it:
https://docs.google.com/document/d/1WF_UOanQ61RUuQlM4LaiRWI0Y
Hi,
You can find information about the type mappings here:
https://iceberg.apache.org/spec/#parquet
1. Iceberg timestamps have microsecond precision. In Parquet they are
stored as INT64s with the TIMESTAMP_MICROS annotation.
2. Iceberg limits decimal precision to 38:
https://iceberg.apache.org/spec/#primit
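For illustration, the two mappings above expressed with the parquet-mr
schema builder; this is just a sketch, and the builder methods may vary
across parquet-mr versions:

import org.apache.parquet.schema.LogicalTypeAnnotation;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;
import org.apache.parquet.schema.Types;

// timestamp -> INT64 annotated with microsecond precision,
// decimal(38, s) -> a 16-byte fixed-length binary with a DECIMAL annotation
MessageType schema = Types.buildMessage()
    .required(PrimitiveTypeName.INT64)
        .as(LogicalTypeAnnotation.timestampType(
            /* isAdjustedToUTC= */ true, LogicalTypeAnnotation.TimeUnit.MICROS))
        .named("ts")
    .required(PrimitiveTypeName.FIXED_LEN_BYTE_ARRAY).length(16)
        .as(LogicalTypeAnnotation.decimalType(/* scale= */ 9, /* precision= */ 38))
        .named("d")
    .named("example");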
Hi Yong Yang,
It is supported by Iceberg, and this is exactly how Impala works, i.e.
Impala's Parquet writer writes the data files, then we use Iceberg's API to
append them to the table.
You can find the relevant code here:
https://github.com/apache/impala/blob/822e8373d1f1737865899b80862c2be
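In case it helps, a condensed sketch of that flow on the Iceberg side; the
helper and its parameters are made up, the stats are simplified, and an
unpartitioned table is assumed:

import org.apache.iceberg.DataFile;
import org.apache.iceberg.DataFiles;
import org.apache.iceberg.FileFormat;
import org.apache.iceberg.Table;

// Register an externally written Parquet file with Iceberg; the writer
// (Impala in our case) produces the file first, Iceberg only records it
// in a new snapshot.
static void appendExternalFile(Table table, String path, long sizeInBytes, long rowCount) {
  DataFile dataFile = DataFiles.builder(table.spec())
      .withPath(path)
      .withFormat(FileFormat.PARQUET)
      .withFileSizeInBytes(sizeInBytes)
      .withRecordCount(rowCount)
      // for a partitioned table we would also set the partition data here
      .build();
  table.newAppend()
      .appendFile(dataFile)
      .commit();
}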
> explicit. If you want to overwrite a day, you pass a filter for that day.
> Another way around this problem is to support MERGE INTO, which will detect
> the files that need to be changed and correctly rewrite them, wherever they
> are in the table.
>
> rb
>
> On Fri, Jan 2
Hey everyone,
I'm currently working on the INSERT OVERWRITE statement for Iceberg tables
in Impala.
Seems like ReplacePartitions is the perfect interface for this job:
https://github.infra.cloudera.com/CDH/iceberg/blob/cdpd-master/api/src/main/java/org/apache/iceberg/ReplacePartitions.java
IIUC
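A minimal sketch of how I imagine the commit, assuming Impala has already
written the new data files (the helper name is made up):

import org.apache.iceberg.DataFile;
import org.apache.iceberg.ReplacePartitions;
import org.apache.iceberg.Table;

// Every partition touched by the new files has all of its existing files
// replaced atomically when the operation commits.
static void insertOverwrite(Table table, Iterable<DataFile> newFiles) {
  ReplacePartitions replace = table.newReplacePartitions();
  for (DataFile file : newFiles) {
    replace.addFile(file);
  }
  replace.commit();
}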
Congrats, Peter!
On Tue, Jan 26, 2021 at 5:47 AM ForwardXu wrote:
> Congratulations Peter!
>
>
> -- Original Message --
> *From:* "dev" ;
> *Sent:* Tuesday, January 26, 2021, 4:25 AM
> *To:* "dev";
> *Subject:* Re: Welcoming Peter Vary as a new committer!
>
> Congratulations!
>
> On Mon, Jan 25
se, users don’t know what to do to
>pass table properties from Hive or Impala. If we exclude a prefix or
>specific properties, then everything but the properties reserved for
>locating the table are passed as the user would expect.
>
> I don't have a strong opinion about
Thanks, Peter. I answered inline.
On Mon, Nov 30, 2020 at 3:13 PM Peter Vary
wrote:
> Hi Zoltan,
>
> Answers below:
>
> On Nov 30, 2020, at 14:19, Zoltán Borók-Nagy <
> borokna...@cloudera.com.INVALID> wrote:
>
> Hi,
>
> Thanks for the replies. My take fo
-case basis.
>
>
> Based on this:
>
>- Shall we move the "how to get to" properties to SERDEPROPERTIES?
>- Shall we define a prefix for setting Iceberg table properties from
>Hive queries and omitting other engine specific properties?
>
>
> Tha
Hi,
The above aligns with what we did in Impala, i.e. we store information
about table loading in HMS table properties. We are just a bit more
explicit about which catalog to use.
We have a table property, 'iceberg.catalog', to determine the catalog type;
right now the supported values are 'hadoop.tab
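For illustration, a sketch of dispatching on that property; apart from
'iceberg.catalog' itself, the property names and the helper below are
hypothetical, not necessarily what Impala actually uses:

import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.hadoop.HadoopCatalog;
import org.apache.iceberg.hadoop.HadoopTables;

// Pick the Iceberg catalog implementation from the HMS table properties.
static Table loadFromHmsProps(Map<String, String> props, Configuration conf) {
  switch (props.get("iceberg.catalog")) {
    case "hadoop.tables":
      // location-based table: load it straight from the table location
      return new HadoopTables(conf).load(props.get("table_location"));
    case "hadoop.catalog":
      return new HadoopCatalog(conf, props.get("catalog_location"))
          .loadTable(TableIdentifier.parse(props.get("table_identifier")));
    default:
      throw new IllegalArgumentException(
          "Unsupported catalog: " + props.get("iceberg.catalog"));
  }
}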
Hi Everyone,
In Impala we face the same challenges. I think a strict 1-to-1 type mapping
would be beneficial because that way we could derive the Iceberg schema
from the Hive schema, not just the other way around. So we could just
naturally create Iceberg tables via DDL.
We should use the same ty
Hi,
I'm willing to add INSERT support for Iceberg tables in Impala.
To start with, I created the following design doc:
https://docs.google.com/document/d/1_KL0YptDKwhiXvJyx4Vb-yZjggrPQAW2yjeGV4C0vMU/edit?usp=sharing
All comments are welcome.
Thanks,
Zoltan