Thanks, Yan! To summarize that doc a bit, the main blockers are:

* Finish updating the spec for NaN counters and behavior
* Fix the issue with partition transforms and values before 1970 (#1680)
* Partition evolution: Add lastPartitionFieldId to table metadata and update docs
* Add order id column to manifest files
* Track the schema of each snapshot

Only the last one is a somewhat large task, but even that should be fairly quick. I think we can take care of those in the first couple of months of 2021, after the 0.11.0 release is out.

On Fri, Dec 18, 2020 at 12:59 AM OpenInx <open...@gmail.com> wrote:

> Thanks Yan for the document, I will take a look at it, and see what I can
> do.
>
> On Fri, Dec 18, 2020 at 3:38 AM Yan Yan <yyany...@gmail.com> wrote:
>
>> Hi OpenInx,
>>
>> Thanks for bringing this up. I am currently working on Format v2 blocking
>> tasks, and am maintaining a full list of blocking tasks with their
>> description and current status here
>> <https://docs.google.com/document/d/1FyLJyvzcZbfbjwDMEZd6Dj-LYCfrzK1zC-Bkb3OiICc/edit?usp=sharing>
>> after speaking with Ryan a while ago, which covers all open issues listed
>> in the GitHub milestone <https://github.com/apache/iceberg/milestone/7> plus
>> some others brought up by people during community sync. It would be great
>> if you are interested in collaborating/code reviewing!
>>
>> Everyone please feel free to let me know/update the doc if you see any
>> item missing/described inaccurately.
>>
>> Thanks,
>> Yan
>>
>> On Wed, Dec 16, 2020 at 11:03 PM OpenInx <open...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> I wrote this email to align with the community about the time to expose
>>> format v2 to end users.
>>>
>>> In Iceberg format v2, we've accomplished the row-level delete. It's
>>> designed for two use cases:
>>>
>>> 1. Execute a single query to update or delete lots of rows. It's a
>>> typical batch update/delete job, which is suitable for GDPR or the case
>>> that we want to correct the wrong data.
>>> 2. Write the real-time CDC/UPSERT stream to the Iceberg table, so that
>>> the upper layer compute engines could analyze the change log in minutes.
>>> It's almost ready in the current master branch for Flink integration.
>>>
>>>
>>> I'm not quite sure what the blockers for Iceberg format v2 are now.
>>> I'd love to resolve those blockers if there are any.
>>>
>>> Thanks.
>>>
>>

--
Ryan Blue
Software Engineer
Netflix