Thanks, I will monitor that thread. I'm having a hard time following the serialization code, but if you know anything about the layout, tell me if this makes sense. What I see in the hex editor is, first, many HDFS paths. Then gigabytes of unreadable data. Then finally another HDFS path at the end.
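To pin down where the paths sit relative to the unreadable data, something like the following could report byte offsets instead of eyeballing a hex editor. It is only a rough sketch and knows nothing about Flink's actual serialization format: it mimics `strings` by finding runs of printable ASCII, then lists the largest byte ranges between those runs. The function names, the `hdfs://` filter, and the minimum run length of 4 are all assumptions made for illustration.

```python
import mmap
import re

# Runs of at least MIN_LEN printable ASCII bytes, similar to what `strings` reports.
MIN_LEN = 4
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)


def printable_runs(buf):
    """Yield (offset, text) for each printable ASCII run in a bytes-like object."""
    for match in PRINTABLE_RUN.finditer(buf):
        yield match.start(), match.group().decode("ascii")


def largest_gaps(buf, top=5):
    """Return the `top` largest byte ranges not covered by any printable run,
    as (length, start, end) tuples sorted from largest to smallest."""
    gaps = []
    cursor = 0
    for match in PRINTABLE_RUN.finditer(buf):
        if match.start() > cursor:
            gaps.append((match.start() - cursor, cursor, match.start()))
        cursor = match.end()
    if len(buf) > cursor:
        gaps.append((len(buf) - cursor, cursor, len(buf)))
    return sorted(gaps, reverse=True)[:top]


def scan_file(path, needle="hdfs://"):
    """Print the offset of every printable run containing `needle`
    (an assumed marker for HDFS paths), then the largest non-printable gaps."""
    with open(path, "rb") as f:
        # mmap avoids reading a multi-gigabyte file into memory at once.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            for offset, text in printable_runs(buf):
                if needle in text:
                    print(f"{offset:>12}  {text}")
            for length, start, end in largest_gaps(buf):
                print(f"gap of {length} bytes at [{start}, {end})")
```

Calling `scan_file` on the `_metadata` file would show whether the paths really cluster at the start and end. Binary data can contain short printable runs by accident, which splits the gaps, so raising `MIN_LEN` may give a cleaner picture.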
If it is putting state in there, under normal circumstances, does it make sense that it would be interleaved with metadata? I would expect all the metadata to come first, and then state.

Jacob

On Thu, Mar 5, 2020 at 10:53 AM Kostas Kloudas <kklou...@gmail.com> wrote:

> Hi Jacob,
>
> As I said previously I am not 100% sure what can be causing this behavior, but this is a related thread here:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_r3bfa2a3368a9c7850cba778e4decfe4f6dba9607f32addb69814f43d-2540-253Cuser.flink.apache.org-253E&d=DwIBaQ&c=r2dcLCtU9q6n0vrtnDw9vg&r=lTq5mEceM-U-tVfWzKBngg&m=awEv6FqKY6dZ8NIA4KEFc_qQ6aadR_jTAWnO17wtAus&s=P3Xd0IFKJTDIG2MMeP-hOSfY4ohoCEUMQEJhvGecSlI&e=
>
> Which you can re-post your problem and monitor for answers.
>
> Cheers,
> Kostas
>
> On Wed, Mar 4, 2020 at 7:02 PM Jacob Sevart <jsev...@uber.com> wrote:
> >
> > Kostas and Gordon,
> >
> > Thanks for the suggestions! I'm on RocksDB. We don't have that setting configured so it should be at the default 1024b. This is the full "state.*" section showing in the JobManager UI.
> >
> > Jacob
> >
> > On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
> >>
> >> Hi Jacob,
> >>
> >> Apart from what Klou already mentioned, one slightly possible reason:
> >>
> >> If you are using the FsStateBackend, it is also possible that your state is small enough to be considered to be stored inline within the metadata file.
> >> That is governed by the "state.backend.fs.memory-threshold" configuration, with a default value of 1024 bytes, or can also be configured with the `fileStateSizeThreshold` argument when constructing the `FsStateBackend`.
> >> The purpose of that threshold is to ensure that the backend does not create a large amount of very small files, where potentially the file pointers are actually larger than the state itself.
> >>
> >> Cheers,
> >> Gordon
> >>
> >> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com> wrote:
> >>>
> >>> Hi Jacob,
> >>>
> >>> Could you specify which StateBackend you are using?
> >>>
> >>> The reason I am asking is that, from the documentation in [1]:
> >>>
> >>> "Note that if you use the MemoryStateBackend, metadata and savepoint state will be stored in the _metadata file. Since it is self-contained, you may move the file and restore from any location."
> >>>
> >>> I am also cc'ing Gordon who may know a bit more about state formats.
> >>>
> >>> I hope this helps,
> >>> Kostas
> >>>
> >>> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.apache.org_projects_flink_flink-2Ddocs-2Drelease-2D1.6_ops_state_savepoints.html&d=DwIBaQ&c=r2dcLCtU9q6n0vrtnDw9vg&r=lTq5mEceM-U-tVfWzKBngg&m=awEv6FqKY6dZ8NIA4KEFc_qQ6aadR_jTAWnO17wtAus&s=fw0c-Ct21HHJv4MzZRicIaltqHLQOrNvqchzNgCdwkA&e=
> >>>
> >>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com> wrote:
> >>> >
> >>> > Per the documentation:
> >>> >
> >>> > "The meta data file of a Savepoint contains (primarily) pointers to all files on stable storage that are part of the Savepoint, in form of absolute paths."
> >>> >
> >>> > I somehow have a _metadata file that's 1.9GB. Running strings on it I find 962 strings, most of which look like HDFS paths, which leaves a lot of that file-size unexplained. What else is in there, and how exactly could this be happening?
> >>> >
> >>> > We're running 1.6.
> >>> >
> >>> > Jacob
> >
> > --
> > Jacob Sevart
> > Software Engineer, Safety
>

--
Jacob Sevart
Software Engineer, Safety