I'd also like to propose adding the following to the External section:
1. The PrestoDB equivalent of each item listed for Trino. I am not sure
what the best way to track them is, but I feel it's better to list and track
them separately. I have talked with the people currently maintaining
the PrestoDB Iceberg connector (mostly in Twitter), and they would like to
take a different route from Trino to fully remove Hive dependencies in the
connector. This means the two connectors will likely diverge in
implementation in the near future.
2. A medium item for Trino and PrestoDB Avro support.
3. A small item for Trino and PrestoDB full system table support
(the system table schemas in them are diverging from core, and a few of the
latest system tables are missing).

For the items listed with "Spec" and "Spec v3", what are the key
differences? I thought we were treating any new spec changes after the
format v2 vote as v3.

Best,
Jack Ye

On Mon, Sep 13, 2021 at 7:13 AM Gidon Gershinsky <gg5...@gmail.com> wrote:

> Hi Ryan,
>
> I just wonder if the encryption should be a Spec v3 category. We have the
> key_metadata fields in both data_file and manifest_file structs, which
> might be sufficient for a reasonable basic encryption support.
> But I certainly agree this is an L-sized project.
>
> Cheers, Gidon
>
>
> On Sat, Sep 11, 2021 at 12:38 AM Ryan Blue <b...@tabular.io> wrote:
>
>> Hi everyone,
>>
>> At the last sync meeting, we brought up publishing a community roadmap
>> and brainstormed the many features and initiatives that the community is
>> working on. In this thread, I want to make sure that we have a good list of
>> what people are thinking about and I think we should try to categorize the
>> projects by size and general priority. When we reach a rough agreement,
>> I’ll write this up and post it on the ASF site along with links to some
>> projects in GitHub.
>>
>> My rationale for attempting to prioritize projects is that if we try to
>> do too many things, it will be slower progress across everything rather
>> than getting a few important items done. I know that priorities don’t align
>> very cleanly in practice, but it is hopefully worth trying. To come up with
>> a priority, I’m trying to keep top priority items to a minimum by including
>> only one from each group (Spark, Flink, Python, etc.). The remaining items
>> are split between priority 2 and 3. Priority 3 is not urgent, including
>> things that can be plugged in (like other IO libraries), docs, etc.
>> Everything else is priority 2.
>>
>> That something isn’t priority 1 doesn’t mean it isn’t important or
>> progressing, just that it isn’t the current focus. I think of it this way:
>> if someone has extra time to review something, what should be next? That’s
>> top priority.
>>
>> Here’s my rough categorization. If you disagree, please speak up:
>>
>>    - If you think that something should be top priority, what gets moved
>>    to priority 2?
>>    - Should the priority for a project in 2 or 3 change?
>>    - Is the S/M/L size of a project wrong?
>>
>> Top priority, 1:
>>
>>    - API: Iceberg 1.0 [medium]
>>    - Spark: Merge-on-read plans [large]
>>    - Maintenance: Delete file compaction [medium]
>>    - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
>>    - Python: Pythonic refactor [medium]
>>
>> Priority 2:
>>
>>    - ORC: Support delete files stored as ORC [small]
>>    - Spark: DSv2 streaming improvements [small]
>>    - Flink: Inline file compaction [small]
>>    - Flink: Support UPSERT [small]
>>    - Views: Spec [medium]
>>    - Spec: Z-ordering / Space-filling curves [medium]
>>    - Spec: Snapshot tagging and branching [small]
>>    - Spec: Secondary indexes [large]
>>    - Spec v3: Encryption [large]
>>    - Spec v3: Relative paths [large]
>>    - Spec v3: Default field values [medium]
>>
>> Priority 3:
>>
>>    - Docs: versioned docs [medium]
>>    - IO: Support Aliyun OSS/DLF [medium]
>>    - IO: Support Dell ECS [medium]
>>
>> External:
>>
>>    - Trino: Bucketed joins [small]
>>    - Trino: Row-level delete support [medium]
>>    - Trino: Merge-on-read plans [medium]
>>    - Trino: Multi-catalog support [small]
>>
>> --
>> Ryan Blue
>> Tabular
>>
>
