Also, a small correction from earlier: there are 4 volumes of 256 GiB each, so that's 1 TiB total.
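
To see which of the 4 volumes actually grows when a savepoint is triggered, something like the following could be run periodically alongside the job. This is only a rough sketch: the mount points under `/mnt` are hypothetical placeholders, and it assumes each volume is mounted at its own path and that summing file sizes under that path is a good enough proxy for disk usage.

```python
import os

GIB = 1024 ** 3


def dir_usage_bytes(root):
    """Sum the sizes of all files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # Files can vanish mid-walk, e.g. when compaction deletes them.
                pass
    return total


def usage_report(roots):
    """Return a {root: bytes used} mapping for each directory in roots."""
    return {root: dir_usage_bytes(root) for root in roots}


if __name__ == "__main__":
    # Hypothetical mount points for the 4 volumes; replace with the real ones.
    volumes = ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"]
    for root, used in usage_report(volumes).items():
        print(f"{root}: {used / GIB:.2f} GiB of 256 GiB")
```

Comparing the output from just before and during a savepoint should show whether the growth is confined to a single volume.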

On Sat, Dec 12, 2020 at 10:08 AM Rex Fenley <r...@remind101.com> wrote:

> Our first big test run we wanted to eliminate as many variables as
> possible, so this is on 1 machine with 1 task manager and 1 parallelism.
> The machine has 4 disks though, and as you can see, they mostly all use
> around the same space for storage until a savepoint is triggered.
>
> Could it be that given a parallelism of 1, certain operator's states are
> pinned to specific drives and as it's doing compaction it's moving
> everything over to that drive into a single file?
> In which case, would greater parallelism distribute the work more evenly?
>
> Thanks!
>
>
> On Sat, Dec 12, 2020 at 2:35 AM David Anderson <dander...@apache.org>
> wrote:
>
>> RocksDB does do compaction in the background, and incremental checkpoints
>> simply mirror to S3 the set of RocksDB SST files needed by the current set
>> of checkpoints.
>>
>> However, unlike checkpoints, which can be incremental, savepoints are
>> always full snapshots. As for why one host would have much more state than
>> the others, perhaps you have significant key skew, and one task manager is
>> ending up with more than its share of state to manage.
>>
>> Best,
>> David
>>
>> On Sat, Dec 12, 2020 at 12:31 AM Rex Fenley <r...@remind101.com> wrote:
>>
>>> Hi,
>>>
>>> We're using the Rocks state backend with incremental checkpoints and
>>> savepoints setup for S3. We notice that every time we trigger a savepoint,
>>> one of the local disks on our host explodes in disk usage.
>>> What is it that savepoints are doing which would cause so much disk to
>>> be used?
>>> Our checkpoints are a few GiB in size, is the savepoint combining all
>>> the checkpoints together at once on disk?
>>> I figured that incremental checkpoints would compact over time in the
>>> background, is that correct?
>>>
>>> Thanks
>>>
>>> Graph here. Parallelism is 1 and volume size is 256 GiB.
>>> [image: Screen Shot 2020-12-11 at 2.59.59 PM.png]

--

Rex Fenley | Software Engineer - Mobile and Backend

Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/> | FOLLOW US <https://twitter.com/remindhq> | LIKE US <https://www.facebook.com/remindhq>