> I'm reluctant to do this without an explicit call from the user or in a
> service. The problem is when to expire snapshots. Iceberg is called regularly
> to read and write tables. That might seem like a good time to expire
> snapshots, but it doesn't make sense for either one to have a side effect of
> physically deleting data files and discarding metadata. That's going beyond
> user expectations to do destructive tasks. Plus, it changes the guarantees of
> those operations: reads should be as fast as possible, and there may be
> guarantees relying on writes not doing additional operations that could cause
> failures.

Yep, makes sense. It is better to explain the need to expire snapshots to the
user and let them decide.
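For anyone following along, the explicit call looks roughly like the sketch
below, using Iceberg's ExpireSnapshots API. This is a minimal example, not a
recommendation: the HadoopTables location and the 7-day/10-snapshot retention
settings are placeholder assumptions.

    import java.util.concurrent.TimeUnit;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.hadoop.HadoopTables;

    public class ExpireOldSnapshots {
      public static void main(String[] args) {
        // Placeholder table location; substitute a real warehouse path.
        Table table = new HadoopTables(new Configuration())
            .load("hdfs://nn/warehouse/db/events");

        // Explicitly expire snapshots older than 7 days, keeping the last 10
        // so recent history stays available for rollback and time travel.
        long cutoff = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(7);
        table.expireSnapshots()
            .expireOlderThan(cutoff)
            .retainLast(10)
            .commit();
      }
    }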
> For Flink, we're creating a UUID for each checkpoint that writes files,
> writing that into the snapshot summary, and then checking whether a known
> snapshot had that ID when the write resumes after a failure. That sounds like
> what you're suggesting here, but using queryId/epochId as the write ID.
> Sounds like a good plan to me.

Alright, I created two issues:

- https://github.com/apache/incubator-iceberg/issues/178 (sink)
- https://github.com/apache/incubator-iceberg/issues/179 (source)

Thanks,
Anton

> On 6 May 2019, at 23:30, Ryan Blue <rb...@netflix.com.INVALID> wrote:
>
> Replies inline.
>
> On Mon, May 6, 2019 at 3:01 PM Anton Okolnychyi <aokolnyc...@apple.com> wrote:
>
> I am also wondering whether it makes sense to have a config that limits the
> number of snapshots we want to track. This config can be based on the number
> of snapshots (e.g. keep only 10000 snapshots) or based on time (e.g. keep
> snapshots for the last 7 days). We can implement both, actually. AFAIK, the
> expiration of snapshots is manual right now. Would it make sense to control
> this via config options or do we expect that users do this?
>
> I'm reluctant to do this without an explicit call from the user or in a
> service. The problem is when to expire snapshots. Iceberg is called regularly
> to read and write tables. That might seem like a good time to expire
> snapshots, but it doesn't make sense for either one to have a side effect of
> physically deleting data files and discarding metadata. That's going beyond
> user expectations to do destructive tasks. Plus, it changes the guarantees of
> those operations: reads should be as fast as possible, and there may be
> guarantees relying on writes not doing additional operations that could cause
> failures.
>
> Spark provides queryId and epochId/batchId to all sinks, which must ensure
> that all writes are idempotent. Spark might try to commit the same batch
> multiple times, so we need to know the latest committed batchId for every
> query. One option is to store this information in the table metadata.
> However, that breaks time travel and rollbacks, so we need this mapping per
> snapshot. The snapshot summary seems like a reasonable choice. Would it make
> sense to do something similar to “total-records” and “total-files” to keep
> the latest committed batch id for each query? Any other ideas are welcome.
>
> For Flink, we're creating a UUID for each checkpoint that writes files,
> writing that into the snapshot summary, and then checking whether a known
> snapshot had that ID when the write resumes after a failure. That sounds like
> what you're suggesting here, but using queryId/epochId as the write ID.
> Sounds like a good plan to me.
>
> rb
>
> --
> Ryan Blue
> Software Engineer
> Netflix
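For reference, the idempotent-commit pattern discussed above could look
roughly like this sketch. The summary keys "spark.query-id" and
"spark.epoch-id" are hypothetical placeholders, not an agreed convention, and
the scan over all snapshots is the simplest possible check.

    import org.apache.iceberg.AppendFiles;
    import org.apache.iceberg.DataFile;
    import org.apache.iceberg.Snapshot;
    import org.apache.iceberg.Table;

    public class IdempotentEpochCommit {
      // Hypothetical summary property names; the real keys are up to the sink.
      private static final String QUERY_ID = "spark.query-id";
      private static final String EPOCH_ID = "spark.epoch-id";

      /** Scans existing snapshots for one already tagged with this epoch. */
      static boolean alreadyCommitted(Table table, String queryId, long epochId) {
        for (Snapshot snapshot : table.snapshots()) {
          if (queryId.equals(snapshot.summary().get(QUERY_ID))
              && String.valueOf(epochId).equals(snapshot.summary().get(EPOCH_ID))) {
            return true;
          }
        }
        return false;
      }

      /** Appends files for an epoch, tagging the snapshot so retries are no-ops. */
      static void commitEpoch(Table table, String queryId, long epochId,
                              Iterable<DataFile> files) {
        if (alreadyCommitted(table, queryId, epochId)) {
          return; // Spark retried an epoch that is already committed; skip it.
        }
        AppendFiles append = table.newAppend()
            .set(QUERY_ID, queryId)
            .set(EPOCH_ID, String.valueOf(epochId));
        for (DataFile file : files) {
          append.appendFile(file);
        }
        append.commit();
      }
    }

The Flink approach Ryan describes is the same pattern with a per-checkpoint
UUID in place of the queryId/epochId pair.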