Hi Jacob,

As I said previously, I am not 100% sure what could be causing this behavior, but there is a related thread here:
https://lists.apache.org/thread.html/r3bfa2a3368a9c7850cba778e4decfe4f6dba9607f32addb69814f43d%40%3Cuser.flink.apache.org%3E
where you can re-post your problem and monitor for answers.

Cheers,
Kostas

On Wed, Mar 4, 2020 at 7:02 PM Jacob Sevart <jsev...@uber.com> wrote:
>
> Kostas and Gordon,
>
> Thanks for the suggestions! I'm on RocksDB. We don't have that setting
> configured so it should be at the default 1024b. This is the full "state.*"
> section showing in the JobManager UI.
>
>
>
> Jacob
>
> On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
> wrote:
>>
>> Hi Jacob,
>>
>> Apart from what Klou already mentioned, one slightly possible reason:
>>
>> If you are using the FsStateBackend, it is also possible that your state is
>> small enough to be considered to be stored inline within the metadata file.
>> That is governed by the "state.backend.fs.memory-threshold" configuration,
>> with a default value of 1024 bytes, or can also be configured with the
>> `fileStateSizeThreshold` argument when constructing the `FsStateBackend`.
>> The purpose of that threshold is to ensure that the backend does not create
>> a large amount of very small files, where potentially the file pointers are
>> actually larger than the state itself.
>>
>> Cheers,
>> Gordon
>>
>>
>>
>> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>
>>> Hi Jacob,
>>>
>>> Could you specify which StateBackend you are using?
>>>
>>> The reason I am asking is that, from the documentation in [1]:
>>>
>>> "Note that if you use the MemoryStateBackend, metadata and savepoint
>>> state will be stored in the _metadata file. Since it is
>>> self-contained, you may move the file and restore from any location."
>>>
>>> I am also cc'ing Gordon who may know a bit more about state formats.
>>>
>>> I hope this helps,
>>> Kostas
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/savepoints.html
>>>
>>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com> wrote:
>>> >
>>> > Per the documentation:
>>> >
>>> > "The meta data file of a Savepoint contains (primarily) pointers to all
>>> > files on stable storage that are part of the Savepoint, in form of
>>> > absolute paths."
>>> >
>>> > I somehow have a _metadata file that's 1.9GB. Running strings on it I
>>> > find 962 strings, most of which look like HDFS paths, which leaves a lot
>>> > of that file-size unexplained. What else is in there, and how exactly
>>> > could this be happening?
>>> >
>>> > We're running 1.6.
>>> >
>>> > Jacob
>
>
>
> --
> Jacob Sevart
> Software Engineer, Safety
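
The check described in the original message (running strings on the _metadata file and comparing the result with the file size) can be made a bit more quantitative. The following is a minimal Python sketch, not anything shipped with Flink: the function names `printable_run_stats` and `summarize` are made up for illustration, and a "string" is assumed to be a run of at least 4 printable ASCII bytes, which only approximates the default behavior of the Unix `strings` tool. It reports how many bytes sit inside such runs, so the remainder is an estimate of the binary (non-path) content.

```python
import mmap
import re


def printable_run_stats(data, min_len=4):
    """Return (number of runs, total bytes in runs) for printable-ASCII runs.

    A run is `min_len` or more consecutive bytes in the range 0x20-0x7e.
    This is an approximation of what `strings` prints, not an exact clone.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    count = 0
    total = 0
    for match in pattern.finditer(data):
        count += 1
        total += match.end() - match.start()
    return count, total


def summarize(path):
    """Print how much of the file at `path` is covered by printable runs.

    The file is memory-mapped so that a multi-GB file is not read into
    memory at once. Note that mmap raises ValueError on an empty file.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            size = len(mm)
            count, total = printable_run_stats(mm)
    print(f"file size:            {size} bytes")
    print(f"printable runs:       {count}")
    print(f"bytes inside runs:    {total}")
    print(f"bytes outside runs:   {size - total}")
```

Calling `summarize("/path/to/_metadata")` (a placeholder path) would print the four numbers. The "bytes inside runs" figure is an upper bound on what the paths can account for, since short printable runs also occur by chance in binary data; a large "bytes outside runs" figure would be consistent with state being stored inline in the metadata file, though it does not by itself show why.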