Not directly related to this topic, but still pretty interesting as we mentioned the PR for rewriting manifests.
Ryan, could you also share some insights on how you do compactions? Do you
compact metadata separately from bin-packing files? How frequently do you
expire snapshots? Do you expose SQL APIs for this, or does it all happen
automatically?

Thanks,
Anton

> On 3 Jun 2019, at 22:28, Erik Wright <erik.wri...@shopify.com.INVALID> wrote:
>
> Thanks for sharing those observations. They are very pertinent.
>
> On Mon, Jun 3, 2019 at 5:19 PM Ryan Blue <rb...@netflix.com> wrote:
> Repeated conflicts are something that we keep an eye on in our
> infrastructure. We have streaming tables that are written to every 10
> minutes from multiple regions, commits to move the files back to a single
> region, and compaction all happening at the same time. We don't really see a
> significant problem with several writers. The manifest list files are
> generally small enough that it's okay. Definitely better than keeping all
> that information in the root metadata file.
>
> On Mon, Jun 3, 2019 at 2:13 PM Erik Wright <erik.wri...@shopify.com> wrote:
> Thanks for the response, Ryan. I can certainly see what the benefits of
> manifest files are. I can see that with potentially long lists of valid
> snapshots, each having long lists of manifest files, the mere process of
> committing a new snapshot could, itself, become costly and increase the
> likelihood of commit conflicts.
>
> I gather that the potential for repeated commit conflicts due to the cost of
> rewriting the manifest list file after each failed attempt is not something
> that has really materialized yet.
>
> On Mon, Jun 3, 2019 at 4:50 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
> Hi Erik,
>
> Manifest lists serve two purposes:
>
> • Reduce the amount of data tracked by the root metadata file
> • Provide a rough index over manifest files to cut down on planning time
>
> Manifests are reused to cut down on the amount of work required in a commit,
> but by doing this we end up with a large number of manifests. That list gets
> expensive if it is added to the root metadata, which includes all valid
> snapshots. So moving that list to its own file allows Iceberg to avoid
> reading the list unless it is used, and to avoid re-writing the list for
> every valid snapshot.
>
> As long as the list is written to its own file, we may as well write metadata
> about partitions in each manifest so that we can skip manifests that don’t
> match a query. That’s where the rough index comes from, and it really does
> speed up queries. In fact, we have a new PR out to rewrite manifests to take
> advantage of this: https://github.com/apache/incubator-iceberg/pull/200/files
>
> Does that answer your question?
>
> On Mon, Jun 3, 2019 at 1:38 PM Erik Wright <erik.wri...@shopify.com.invalid>
> wrote:
> In the process of following up on the "Updates/Deletes/Upserts" thread, I'm
> re-reading the table spec. I have a question about Manifest List files.
>
> If I understand correctly, the manifest list files are separate files that
> are created prior to attempting to commit a new snapshot. Each snapshot may
> have a single manifest list file. The manifest list file references _all_
> manifest files included in the snapshot.
>
> During a commit collision, two writers will produce new manifest list files.
> Assuming the two writes are compatible (one is append, one is replace, for
> example), the loser should be able to re-process their commit without
> rewriting any data files but will, nonetheless, need to rewrite their
> manifest list file in addition to rewriting their snapshot file.
>
> I was under the impression that it was a design objective to minimize the
> amount of work required in order to retry a commit. The inability to compose
> multiple manifest list files together seems like it adds mandatory read and
> write steps to almost every commit collision.
>
> Can someone clarify what the philosophy is with regard to minimizing the
> cost of commit retries?
>
> Thanks!
>
> -Erik
>
> --
> Ryan Blue
> Software Engineer
> Netflix
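
For readers following the thread, the two ideas discussed above (using per-manifest partition summaries as a rough index, and rewriting only the manifest list when a commit is retried) can be sketched roughly as follows. This is a toy in-memory model, not Iceberg's API: `ManifestEntry`, `Table`, `prune_manifests`, `append_commit`, and the `before_swap` hook are all hypothetical names, and the integer `lower`/`upper` range is an assumed stand-in for the partition metadata stored with each manifest.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ManifestEntry:
    """One row of a hypothetical manifest list: a path plus a partition range."""
    path: str
    lower: int  # smallest partition value covered by the manifest (assumed integer)
    upper: int  # largest partition value covered by the manifest


def prune_manifests(manifest_list: Sequence[ManifestEntry],
                    query_lower: int, query_upper: int) -> List[ManifestEntry]:
    """Keep only manifests whose range overlaps [query_lower, query_upper]."""
    return [m for m in manifest_list
            if m.upper >= query_lower and m.lower <= query_upper]


@dataclass
class Table:
    """In-memory stand-in for the root metadata pointer."""
    version: int = 0
    manifest_list: Tuple[ManifestEntry, ...] = ()

    def compare_and_swap(self, expected_version: int,
                         new_list: Sequence[ManifestEntry]) -> bool:
        """Install new_list only if nobody committed since expected_version."""
        if self.version != expected_version:
            return False
        self.version += 1
        self.manifest_list = tuple(new_list)
        return True


def append_commit(table: Table, new_manifest: ManifestEntry, max_attempts: int = 3,
                  before_swap: Optional[Callable[[int], None]] = None) -> int:
    """Append one manifest; returns the number of attempts that were needed.

    On a conflict only the (small) manifest list is rebuilt from the latest
    table state. The already-written new_manifest is reused unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        base_version = table.version
        candidate = list(table.manifest_list) + [new_manifest]
        if before_swap is not None:
            before_swap(attempt)  # hook used only to simulate a concurrent writer
        if table.compare_and_swap(base_version, candidate):
            return attempt
    raise RuntimeError("commit failed after %d attempts" % max_attempts)
```

In this model a losing writer repeats only the list concatenation and the swap, which is the retry cost Erik asks about; the data files and the new manifest are never rewritten. A real implementation would also have to check that the two commits are compatible before retrying, which this sketch leaves out.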