I'd also like to propose adding the following to the External section:

1. The PrestoDB equivalent of each item listed for Trino. I am not sure what the best way to track them is, but I feel it's better to list and track them separately. I have talked with the people currently maintaining the PrestoDB Iceberg connector (mostly at Twitter), and they would like to take a different route from Trino to fully remove Hive dependencies in the connector. This means the two connectors will likely diverge in implementation in the near future.
2. A medium item for Trino and PrestoDB Avro support.
3. A small item for Trino and PrestoDB full system table support (their system table schemas are diverging from core, and they are missing a few of the latest system tables).

For the items listed with "Spec" and "Spec v3", what are the key differences? I thought we are treating any new spec changes after the format v2 vote as v3.

Best,
Jack Ye

On Mon, Sep 13, 2021 at 7:13 AM Gidon Gershinsky <gg5...@gmail.com> wrote:

> Hi Ryan,
>
> I just wonder if the encryption should be a Spec v3 category. We have the
> key_metadata fields in both data_file and manifest_file structs, which
> might be sufficient for a reasonable basic encryption support.
> But I certainly agree this is an L-sized project.
>
> Cheers, Gidon
>
>
> On Sat, Sep 11, 2021 at 12:38 AM Ryan Blue <b...@tabular.io> wrote:
>
>> Hi everyone,
>>
>> At the last sync meeting, we brought up publishing a community roadmap
>> and brainstormed the many features and initiatives that the community is
>> working on. In this thread, I want to make sure that we have a good list of
>> what people are thinking about and I think we should try to categorize the
>> projects by size and general priority. When we reach a rough agreement,
>> I’ll write this up and post it on the ASF site along with links to some
>> projects in Github.
>>
>> My rationale for attempting to prioritize projects is that if we try to
>> do too many things, it will be slower progress across everything rather
>> than getting a few important items done. I know that priorities don’t align
>> very cleanly in practice, but it is hopefully worth trying. To come up with
>> a priority, I’m trying to keep top priority items to a minimum by including
>> only one from each group (Spark, Flink, Python, etc.). The remaining items
>> are split between priority 2 and 3. Priority 3 is not urgent, including
>> things that can be plugged in (like other IO libraries), docs, etc.
>> Everything else is priority 2.
>>
>> That something isn’t priority 1 doesn’t mean it isn’t important or
>> progressing, just that it isn’t the current focus. I think of it this way:
>> if someone has extra time to review something, what should be next? That’s
>> top priority.
>>
>> Here’s my rough categorization. If you disagree, please speak up:
>>
>> - If you think that something should be top priority, what gets moved
>>   to priority 2?
>> - Should the priority for a project in 2 or 3 change?
>> - Is the S/M/L size of a project wrong?
>>
>> Top priority, 1:
>>
>> - API: Iceberg 1.0 [medium]
>> - Spark: Merge-on-read plans [large]
>> - Maintenance: Delete file compaction [medium]
>> - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
>> - Python: Pythonic refactor [medium]
>>
>> Priority 2:
>>
>> - ORC: Support delete files stored as ORC [small]
>> - Spark: DSv2 streaming improvements [small]
>> - Flink: Inline file compaction [small]
>> - Flink: Support UPSERT [small]
>> - Views: Spec [medium]
>> - Spec: Z-ordering / Space-filling curves [medium]
>> - Spec: Snapshot tagging and branching [small]
>> - Spec: Secondary indexes [large]
>> - Spec v3: Encryption [large]
>> - Spec v3: Relative paths [large]
>> - Spec v3: Default field values [medium]
>>
>> Priority 3:
>>
>> - Docs: versioned docs [medium]
>> - IO: Support Aliyun OSS/DLF [medium]
>> - IO: Support Dell ECS [medium]
>>
>> External:
>>
>> - Trino: Bucketed joins [small]
>> - Trino: Row-level delete support [medium]
>> - Trino: Merge-on-read plans [medium]
>> - Trino: Multi-catalog support [small]
>>
>> --
>> Ryan Blue
>> Tabular
>>
>