Congrats Huaxin!
Best,
Gang
On Thu, Feb 6, 2025 at 5:10 PM Szehon Ho wrote:
> Hi everyone,
>
> The Project Management Committee (PMC) for Apache Iceberg has
> invited Huaxin Gao to become a committer, and I am happy to announce that
> she has accepted. Huaxin has done a lot of impressive work
Generally it makes sense to define separate language-specific
configurations.
I think we need to think about the following items:
1. Is it Python-specific to add the prefix? Should Rust/Go use -rs/-go as
the convention?
2. Which part of the spec is the best place to describe this? It seems that
we
+1 (non-binding)
On Thu, Jan 16, 2025 at 2:30 PM Péter Váry
wrote:
> +1
>
> On Thu, Jan 16, 2025 at 12:46 AM Steven Wu wrote:
>
>> +1
>>
>> On Wed, Jan 15, 2025 at 9:00 AM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> Hi Everyone!
>>>
>>> PR: https://github.com/apache/ic
IIUC, iceberg-parquet depends on iceberg-arrow for the vectorized reader
implementation (though it is only partially supported). Should we relocate
iceberg-arrow together with it?
Since I have mentioned that the vectorized reader implementation is partially
supported, is it a direction that needs to be improved? There i
For C++, I think the aim is a full-featured C++ library (not only a
puffin implementation).
On Thu, Dec 12, 2024 at 6:14 AM rdb...@gmail.com wrote:
> I'll update it. Thanks!
>
> (By the way, the Avro default value support was in the Java section)
>
> On Wed, Dec 11, 2024 at 2:00 PM Matt T
Congrats Matt!
On Tue, Dec 10, 2024 at 8:57 PM Sung Yun wrote:
> Congratulations Matt!
>
> On 2024/12/10 12:49:25 Alex Dutra wrote:
> > Congratulations, Matt! Go!!
> >
> > On Tue, Dec 10, 2024 at 1:08 PM Péter Váry
> > wrote:
> >
> > > Congratulations Matt!
> > >
> > > On Tue, Dec 10, 2024, 12:
umplido wrote:
>
>> This sounds awesome. I am looking forward to the slack channel being
>> available so I can also help!
>>
>> On Fri, Nov 22, 2024 at 10:03, Gang Wu wrote:
>> >
>> > Thanks for the support, Fokko and JB!
>> >
>> >
from
> the Impala community we could add some additional auxiliary functionality
> for the V3 positional deletes later on.
> > 2) I learned that a part of the community is interested in having a C++
> implementation of the Iceberg lib in general for their C++ engine. cc @Gang
>
d that a part of the community is interested in having a C++
> implementation of the Iceberg lib in general for their C++ engine. cc @Gang
> Wu
>
> There seemed to be general support from the community to start up such a
> sub-project, so I'm reaching out now to ask for some gu
Thanks Russell for bringing this up!
+1 on deprecating equality deletes.
IMHO, this is something that should reside only in the ingestion engine.
Best,
Gang
On Thu, Oct 31, 2024 at 5:07 AM Russell Spitzer
wrote:
> Background:
>
> 1) Position Deletes
>
>
> Writers determine what rows are delet
+1 (non-binding)
Best,
Gang
On Wed, Oct 30, 2024 at 5:46 AM Anton Okolnychyi
wrote:
> Hi folks,
>
> We have been discussing the new layout for position deletes in V3 for a
> while now. It seems the community reached consensus. I'd like to start a
> vote on adding deletion vectors to the V3 spec
Hi,
It won't be an issue if there is already an iceberg-cpp implementation.
However, it is unfortunate to see duplicate efforts from different query
engines to implement their own C++ Iceberg readers and writers. Is this a good
chance to add an official C++ implementation by providing a puffin
reader/wr
>>>
>>> This could be developed separately and then be represented in Arrow
>>> using an extension type (perhaps a canonical one as in
>>> https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html).
>>>
>>> What do other Arrow developers
>>
>> Hi Gang,
>>
>> Sorry, but can you give a pointer to the start of this discussion thread
>> in a readable format (for example a mailing-list archive)? It appears
>> that dev@arrow wasn't cc'ed from the start and that can make it
>> difficult to und
usion
> > extension that operates on this [1], and already have some ideas on how
> > such an extension type might be defined. I'm not yet caught up on the
> > shredded specification, but I think having just the binary format would
> be
> > beneficial for in-memory an
Hi Micah,
If we go with the approach that type promotion results in a change in the
field-id, what happens when a certain field has been changed
multiple times? Does it mean that we end up tracking the lineage of
field changes?
Thanks,
Gang
On Tue, Aug 20, 2024 at 7:34 AM Micah Kornf
.
>
> It is worth noting that we also need to standardize many functions
> related to it.
>
> A neutral place to maintain it is a great choice.
>
> - As Gang Wu said, a standalone project is good, just like RoaringBitmap
> [1].
> - As Ryan said, Parquet community is a ne
different and I don't think this should
>> block forking the spec, but we should make sure that the decision is
>> publicly documented within both communities.
>>
>> Thanks,
>> Micah
>>
>> On Thu, Aug 15, 2024 at 7:47 AM Russell Spitzer <
>> russel
Sorry for chiming in late.
From the discussion in
https://lists.apache.org/thread/xcyytoypgplfr74klg1z2rgjo6k5b0sq, I don't
quite understand why it is logistically complicated to create a sub-project
to hold the variant spec and impl.
IMHO, copying the variant type spec into Apache Iceberg has so
Just my two cents: not all tables have a partition definition, and
table-level stats would benefit these tables. In addition, NDV might not be
easily populated from partition-level statistics.
Thanks,
Gang
On Tue, Aug 6, 2024 at 9:48 PM Xianjin YE wrote:
> Thanks for raising the discussion Hu
Congrats!
On Tue, Jul 23, 2024 at 10:17 PM Russell Spitzer
wrote:
> "so many" :)
>
> On Tue, Jul 23, 2024 at 9:14 AM Russell Spitzer
> wrote:
>
>> This is truly an exciting day. To have to many qualified folks being
>> recognized by the Iceberg project fills me with pride. I can't wait to see
>
> The min/max stats are discussed in the doc (Phase 2), depending on the
non-trivial encoding.
Just want to add that min/max stats filtering could be supported natively by
the file format. Adding a geometry type to the Parquet spec
is under discussion: https://github.com/apache/parquet-format/pull/240
Best
> We may need some guidance on just how many we need to look at;
> we were planning on Spark and Trino, but weren't sure how much
> further down the rabbit hole we needed to go.
There are some engines living outside the Java world. It would be
good if the proposal could cover the effort it takes t
Hi,
This sounds very interesting!
IIUC, the current variant type in Apache Spark stores data in the
BINARY type. When it comes to subcolumnarization, does it require the file
format (e.g. Apache Parquet/ORC/Avro) to support variant type natively?
Best,
Gang
On Sat, May 11, 2024 at 1:07 PM T