Hi folks,

I forgot to provide some background about this thread. I started it
because I think it's important to give our community visibility, not
necessarily with firm dates, but with a rough idea of what can be
expected and when. Without this, it's pretty hard for our users to
define their own roadmap.

We have this page https://iceberg.apache.org/roadmap/. I'm not sure
it's actually up to date.
I also proposed this https://github.com/apache/iceberg/pull/9666 to
give a rough idea.

So I think it would be good to reach consensus on the roadmap and
update the roadmap page on the website to give some visibility (it
would be helpful for us too :)).

Thoughts ?

Regards
JB

On Thu, Mar 7, 2024 at 7:43 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Hi Ryan
>
> Yeah I agree to separate discussions on each topic. Actually that was my 
> intention ;)
>
> I just wanted to have thoughts from everyone about roadmap/timeline.
>
> Jack and I will start a dedicated thread about REST catalog.
>
> Thanks !
>
> Regards
> JB
>
>
> On Thu, Mar 7, 2024 at 6:34 PM Ryan Blue <b...@tabular.io> wrote:
>>
>> Hi JB,
>>
>> Specs and libraries are versioned separately. In fact, the v2 spec has 
>> already been voted on and adopted. The next spec version is v3.
>>
>> I think we do want to get to a 2.0 of the Java library sometime soon to drop 
>> some deprecated APIs and clean up a few things, but I don't think that we're 
>> quite ready to take that on right now, which is likely why there has been 
>> little activity on this thread.
>>
>> I also think that most of these things are going to be discussion points 
>> that we cover as separate topics, rather than one big "everything 2.0" 
>> thread. It just doesn't seem manageable to me to cover them all at once. 
>> Maybe that's just me though.
>>
>> Ryan
>>
>> On Thu, Mar 7, 2024 at 7:49 AM Jean-Baptiste Onofré <j...@nanthrax.net> 
>> wrote:
>>>
>>> Hi guys,
>>>
>>> Let me ping again on this thread ;)
>>>
>>> I think it would be great to give some visibility to the community,
>>> especially about Spec v3 and Iceberg 2.0.0.
>>>
>>> Any comments about Spec V2 / Iceberg 2.0.0 ?
>>>
>>> Thanks !
>>> Regards
>>> JB
>>>
>>> On Fri, Feb 16, 2024 at 4:52 PM Jean-Baptiste Onofré <j...@nanthrax.net> 
>>> wrote:
>>> >
>>> > Hi guys,
>>> >
>>> > During the last community meeting, we briefly started to discuss
>>> > Iceberg 2.0.
>>> > I was quite surprised it came up during the community meeting because
>>> > I don't remember having a previous discussion (on the mailing list)
>>> > about that.
>>> >
>>> > So, I would like to start an open discussion about our
>>> > community-driven roadmap.
>>> >
>>> > I see the following topics that should be discussed (maybe, as proposed
>>> > by Brian, we can have corresponding GitHub issues tagged with a
>>> > "discussion" flag). These are open questions, feel free to add points I
>>> > missed:
>>> >
>>> > * Spec v3
>>> >     We have the discussion about ts_nanosecond, and other enhancements
>>> > in the spec. Do we plan to have Iceberg 2.0 with Spec v3 ? What do we
>>> > plan to include in spec v3 as a target ?
>>> > * Catalogs
>>> >     We have a consensus that we have too many catalogs, especially
>>> > with different capabilities/issues. Jack already started the
>>> > discussion to deprecate DynamoDBCatalog. The discussion is:
>>> >      - Where do we want the catalogs to live (repository) ?
>>> >      - What catalogs do we want to deprecate (HadoopCatalog for instance 
>>> > :)) ?
>>> >      - Do we want to have the REST Catalog as a kind of façade for
>>> > other catalog/backend ?
>>> > * REST Catalog
>>> >    If we want to use the REST Catalog as a façade, what are the
>>> > requirements to have it even more pluggable for both backend (other
>>> > catalogs) and the REST itself (authentication/authorization, runtime,
>>> > etc) ? Jack also started a discussion about permission on the REST
>>> > catalog.
>>> > * Engines
>>> >    What engines (and versions) do we plan to still support ? What new
>>> > engines do we plan (for instance I can work on an Apache Beam and an
>>> > Apache Karaf powered engine) ?
>>> > * Data file formats / Table formats
>>> >    Do we plan to add/remove/update data file formats for 2.0 (Parquet,
>>> > ORC, ...) ?
>>> >    Same question about table formats ? Do we plan a kind of "tool" to
>>> > move data from table formats to Iceberg ?
>>> > * Data Ingestion (e.g. Kafka Connect sink)
>>> >    Iceberg 1.5.0 will include the first building blocks of Kafka
>>> > Connect, new ones will come with 1.6+.
>>> >    What do we plan for Iceberg 2.0 on this front ? Do we plan an
>>> > additional layer next to Kafka Connect (for instance why not provide
>>> > an Apache Camel component to read/write data to Iceberg) ?
>>> > * Rough date: depending on all previous points (and maybe others :)),
>>> > when do we target 2.0.0 ?
>>> >
>>> > This is a rough start for the discussion. I propose to create a GitHub
>>> > "Discussion" issue (flagged with the 2.0.0 milestone) for each topic
>>> > where we have consensus.
>>> >
>>> > Thoughts ?
>>> >
>>> > Regards
>>> > JB
>>
>>
>>
>> --
>> Ryan Blue
>> Tabular
