Re: Very large _metadata file

Jacob Sevart Fri, 13 Mar 2020 19:23:00 -0700

Running *Checkpoints.loadCheckpointMetadata* under a debugger, I found
something:
*subtaskState.managedOperatorState[0].stateNameToPartitionOffsets("startup-times").offsets.value*
weighs 43MB (5.3 million longs).


"startup-times" is an operator state of mine (union list of
java.time.Instant). I see a way to end up with fewer items in the list, but I'm
not sure how the actual size is related to the number of offsets. Can you
elaborate on that?

Incidentally, 42.5MB is the number I got out of
https://issues.apache.org/jira/browse/FLINK-14618. So I think my two
problems are closely related.
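
A quick sanity check on those two figures, as a minimal sketch: it assumes
each offset is a plain 8-byte Java long and ignores array and object
overhead. The class and method names (OffsetSize, bytesForOffsets,
toMegabytes) are made up for illustration and are not Flink APIs.

```java
public class OffsetSize {
    // Size in bytes of one Java long (always 8).
    public static final int BYTES_PER_LONG = Long.BYTES;

    // Raw payload of an offsets array holding `count` longs.
    // Assumption: one long per offset, no headers or padding counted.
    public static long bytesForOffsets(long count) {
        return count * BYTES_PER_LONG;
    }

    // Decimal megabytes (1 MB = 1,000,000 bytes).
    public static double toMegabytes(long bytes) {
        return bytes / 1_000_000.0;
    }

    public static void main(String[] args) {
        long offsets = 5_300_000L; // approximate count seen in the debugger
        long bytes = bytesForOffsets(offsets);
        System.out.println(offsets + " longs -> " + bytes + " bytes ("
                + toMegabytes(bytes) + " MB)");
    }
}
```

With roughly 5.3 million offsets this comes to about 42.4 million bytes,
which is in the same range as both the 43MB seen in the debugger and the
42.5MB figure from the JIRA issue.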

Jacob

On Mon, Mar 9, 2020 at 6:36 AM Congxian Qiu <qcx978132...@gmail.com> wrote:

> Hi
>
> As Gordon said, the metadata will contain the ByteStreamStateHandle. When
> the ByteStreamStateHandle is written out, its handle name is written too,
> which is a path (as you saw). A ByteStreamStateHandle is created when the
> state size is smaller than `state.backend.fs.memory-threshold` (default is
> 1024).
>
> If you want to verify this, you can refer to the unit test
> `CheckpointMetadataLoadingTest#testLoadAndValidateSavepoint` and load the
> metadata. You will find many `ByteStreamStateHandle` instances, and
> their names are the strings you saw in the metadata.
>
> Best,
> Congxian
>
>
> Jacob Sevart <jsev...@uber.com> wrote on Fri, Mar 6, 2020 at 3:57 AM:
>
>> Thanks, I will monitor that thread.
>>
>> I'm having a hard time following the serialization code, but if you know
>> anything about the layout, tell me if this makes sense. What I see in the
>> hex editor is, first, many HDFS paths. Then gigabytes of unreadable data.
>> Then finally another HDFS path at the end.
>>
>> If it is putting state in there, under normal circumstances, does it make
>> sense that it would be interleaved with metadata? I would expect all the
>> metadata to come first, and then state.
>>
>> Jacob
>>
>>
>> On Thu, Mar 5, 2020 at 10:53 AM Kostas Kloudas <kklou...@gmail.com>
>> wrote:
>>
>>> Hi Jacob,
>>>
>>> As I said previously I am not 100% sure what can be causing this
>>> behavior, but this is a related thread here:
>>>
>>> https://lists.apache.org/thread.html/r3bfa2a3368a9c7850cba778e4decfe4f6dba9607f32addb69814f43d%40%3Cuser.flink.apache.org%3E
>>>
>>> There you can re-post your problem and monitor for answers.
>>>
>>> Cheers,
>>> Kostas
>>>
>>> On Wed, Mar 4, 2020 at 7:02 PM Jacob Sevart <jsev...@uber.com> wrote:
>>> >
>>> > Kostas and Gordon,
>>> >
>>> > Thanks for the suggestions! I'm on RocksDB. We don't have that setting
>>> configured so it should be at the default 1024b. This is the full "state.*"
>>> section showing in the JobManager UI.
>>> >
>>> >
>>> >
>>> > Jacob
>>> >
>>> > On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <
>>> tzuli...@apache.org> wrote:
>>> >>
>>> >> Hi Jacob,
>>> >>
>>> >> Apart from what Klou already mentioned, another, if less likely, possibility:
>>> >>
>>> >> If you are using the FsStateBackend, it is also possible that your
>>> state is small enough to be stored inline within the
>>> metadata file.
>>> >> That is governed by the "state.backend.fs.memory-threshold"
>>> configuration, with a default value of 1024 bytes, or can also be
>>> configured with the `fileStateSizeThreshold` argument when constructing the
>>> `FsStateBackend`.
>>> >> The purpose of that threshold is to ensure that the backend does not
>>> create a large number of very small files, where potentially the file
>>> pointers are actually larger than the state itself.
>>> >>
>>> >> Cheers,
>>> >> Gordon
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com>
>>> wrote:
>>> >>>
>>> >>> Hi Jacob,
>>> >>>
>>> >>> Could you specify which StateBackend you are using?
>>> >>>
>>> >>> The reason I am asking is that, from the documentation in [1]:
>>> >>>
>>> >>> "Note that if you use the MemoryStateBackend, metadata and savepoint
>>> >>> state will be stored in the _metadata file. Since it is
>>> >>> self-contained, you may move the file and restore from any location."
>>> >>>
>>> >>> I am also cc'ing Gordon who may know a bit more about state formats.
>>> >>>
>>> >>> I hope this helps,
>>> >>> Kostas
>>> >>>
>>> >>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/savepoints.html
>>> >>>
>>> >>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com>
>>> wrote:
>>> >>> >
>>> >>> > Per the documentation:
>>> >>> >
>>> >>> > "The meta data file of a Savepoint contains (primarily) pointers
>>> to all files on stable storage that are part of the Savepoint, in form of
>>> absolute paths."
>>> >>> >
>>> >>> > I somehow have a _metadata file that's 1.9GB. Running strings on
>>> it I find 962 strings, most of which look like HDFS paths, which leaves a
>>> lot of that file-size unexplained. What else is in there, and how exactly
>>> could this be happening?
>>> >>> >
>>> >>> > We're running 1.6.
>>> >>> >
>>> >>> > Jacob
>>> >
>>> >
>>> >
>>> > --
>>> > Jacob Sevart
>>> > Software Engineer, Safety
>>>
>>
>>
>> --
>> Jacob Sevart
>> Software Engineer, Safety
>>
>

-- 
Jacob Sevart
Software Engineer, Safety
