Hi Ryan,

I'm just wondering whether encryption should be in the Spec v3 category. We already have the key_metadata fields in both the data_file and manifest_file structs, which might be sufficient for reasonable basic encryption support. But I certainly agree this is an L-sized project.

Cheers,
Gidon

On Sat, Sep 11, 2021 at 12:38 AM Ryan Blue <b...@tabular.io> wrote:

> Hi everyone,
>
> At the last sync meeting, we brought up publishing a community roadmap and
> brainstormed the many features and initiatives that the community is
> working on. In this thread, I want to make sure that we have a good list of
> what people are thinking about and I think we should try to categorize the
> projects by size and general priority. When we reach a rough agreement,
> I’ll write this up and post it on the ASF site along with links to some
> projects in Github.
>
> My rationale for attempting to prioritize projects is that if we try to do
> too many things, it will be slower progress across everything rather than
> getting a few important items done. I know that priorities don’t align very
> cleanly in practice, but it is hopefully worth trying. To come up with a
> priority, I’m trying to keep top priority items to a minimum by including
> only one from each group (Spark, Flink, Python, etc.). The remaining items
> are split between priority 2 and 3. Priority 3 is not urgent, including
> things that can be plugged in (like other IO libraries), docs, etc.
> Everything else is priority 2.
>
> That something isn’t priority 1 doesn’t mean it isn’t important or
> progressing, just that it isn’t the current focus. I think of it this way:
> if someone has extra time to review something, what should be next? That’s
> top priority.
>
> Here’s my rough categorization. If you disagree, please speak up:
>
> - If you think that something should be top priority, what gets moved
>   to priority 2?
> - Should the priority for a project in 2 or 3 change?
> - Is the S/M/L size of a project wrong?
>
> Top priority, 1:
>
> - API: Iceberg 1.0 [medium]
> - Spark: Merge-on-read plans [large]
> - Maintenance: Delete file compaction [medium]
> - Flink: Upgrade to 1.13.2 (document compatibility) [medium]
> - Python: Pythonic refactor [medium]
>
> Priority 2:
>
> - ORC: Support delete files stored as ORC [small]
> - Spark: DSv2 streaming improvements [small]
> - Flink: Inline file compaction [small]
> - Flink: Support UPSERT [small]
> - Views: Spec [medium]
> - Spec: Z-ordering / Space-filling curves [medium]
> - Spec: Snapshot tagging and branching [small]
> - Spec: Secondary indexes [large]
> - Spec v3: Encryption [large]
> - Spec v3: Relative paths [large]
> - Spec v3: Default field values [medium]
>
> Priority 3:
>
> - Docs: versioned docs [medium]
> - IO: Support Aliyun OSS/DLF [medium]
> - IO: Support Dell ECS [medium]
>
> External:
>
> - Trino: Bucketed joins [small]
> - Trino: Row-level delete support [medium]
> - Trino: Merge-on-read plans [medium]
> - Trino: Multi-catalog support [small]
>
> --
> Ryan Blue
> Tabular