Re: Very large _metadata file

Jacob Sevart Fri, 13 Mar 2020 19:24:00 -0700

Oh, I should clarify that's 43MB per partition, so with 48 partitions it
explains my 2GB.
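
As a rough sanity check of those numbers, here is a minimal sketch. It assumes each offset is stored as a plain 8-byte long with no per-entry overhead, and that all 48 partitions carry the same 5.3 million offsets. Both are assumptions for illustration, not something confirmed in this thread, and the constant and function names are made up.

```python
# Back-of-the-envelope size estimate for the offsets stored in _metadata.
# Assumption: one 8-byte long per offset, no serialization overhead.
BYTES_PER_LONG = 8
OFFSETS_PER_PARTITION = 5_300_000  # "5.3 million longs" (approximate figure)
PARTITIONS = 48


def offsets_bytes(num_offsets: int, partitions: int = 1) -> int:
    """Return the raw byte size of `num_offsets` longs repeated per partition."""
    return num_offsets * BYTES_PER_LONG * partitions


per_partition = offsets_bytes(OFFSETS_PER_PARTITION)
total = offsets_bytes(OFFSETS_PER_PARTITION, PARTITIONS)

print(f"per partition: {per_partition / 1e6:.1f} MB")
print(f"total: {total / 1e9:.2f} GB ({total / 2**30:.2f} GiB)")
```

Under those assumptions the per-partition figure lands close to the 43MB seen in the debugger, and the total is in the same range as the size of the _metadata file.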


On Fri, Mar 13, 2020 at 7:21 PM Jacob Sevart <jsev...@uber.com> wrote:

> Running *Checkpoints.loadCheckpointMetadata* under a debugger, I found
> something:
> *subtaskState.managedOperatorState[0].stateNameToPartitionOffsets("startup-times").offsets.value*
> weighs 43MB (5.3 million longs).
>
> "startup-times" is an operator state of mine (union list of
> java.time.Instant). I see a way to end up with fewer items in the list, but I'm
> not sure how the actual size is related to the number of offsets. Can you
> elaborate on that?
>
> Incidentally, 42.5MB is the number I got out of
> https://issues.apache.org/jira/browse/FLINK-14618. So I think my two
> problems are closely related.
>
> Jacob
>
> On Mon, Mar 9, 2020 at 6:36 AM Congxian Qiu <qcx978132...@gmail.com>
> wrote:
>
>> Hi
>>
>> As Gordon said, the metadata will contain the ByteStreamStateHandle. When
>> a ByteStreamStateHandle is written out, its handle name is written too,
>> which is a path (as you saw). A ByteStreamStateHandle is created when the
>> state size is smaller than `state.backend.fs.memory-threshold` (default is
>> 1024).
>>
>> If you want to verify this, you can refer to the unit test
>> `CheckpointMetadataLoadingTest#testLoadAndValidateSavepoint` and load the
>> metadata. You will find that there are many `ByteStreamStateHandle`s, and
>> their names are the strings you saw in the metadata.
>>
>> Best,
>> Congxian
>>
>>
>> On Fri, Mar 6, 2020 at 3:57 AM Jacob Sevart <jsev...@uber.com> wrote:
>>
>>> Thanks, I will monitor that thread.
>>>
>>> I'm having a hard time following the serialization code, but if you know
>>> anything about the layout, tell me if this makes sense. What I see in the
>>> hex editor is, first, many HDFS paths. Then gigabytes of unreadable data.
>>> Then finally another HDFS path at the end.
>>>
>>> If it is putting state in there, under normal circumstances, does it
>>> make sense that it would be interleaved with metadata? I would expect all
>>> the metadata to come first, and then state.
>>>
>>> Jacob
>>>
>>> On Thu, Mar 5, 2020 at 10:53 AM Kostas Kloudas <kklou...@gmail.com>
>>> wrote:
>>>
>>>> Hi Jacob,
>>>>
>>>> As I said previously I am not 100% sure what can be causing this
>>>> behavior, but this is a related thread here:
>>>>
>>>> https://lists.apache.org/thread.html/r3bfa2a3368a9c7850cba778e4decfe4f6dba9607f32addb69814f43d%40%3Cuser.flink.apache.org%3E
>>>>
>>>> There you can re-post your problem and monitor for answers.
>>>>
>>>> Cheers,
>>>> Kostas
>>>>
>>>> On Wed, Mar 4, 2020 at 7:02 PM Jacob Sevart <jsev...@uber.com> wrote:
>>>> >
>>>> > Kostas and Gordon,
>>>> >
>>>> > Thanks for the suggestions! I'm on RocksDB. We don't have that
>>>> setting configured, so it should be at the default of 1024 bytes. This is the full
>>>> "state.*" section showing in the JobManager UI.
>>>> >
>>>> >
>>>> >
>>>> > Jacob
>>>> >
>>>> > On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <
>>>> tzuli...@apache.org> wrote:
>>>> >>
>>>> >> Hi Jacob,
>>>> >>
>>>> >> Apart from what Klou already mentioned, one other possible reason:
>>>> >>
>>>> >> If you are using the FsStateBackend, it is also possible that your
>>>> state is small enough to be stored inline within the
>>>> metadata file.
>>>> >> That is governed by the "state.backend.fs.memory-threshold"
>>>> configuration, with a default value of 1024 bytes, or can also be
>>>> configured with the `fileStateSizeThreshold` argument when constructing the
>>>> `FsStateBackend`.
>>>> >> The purpose of that threshold is to ensure that the backend does not
>>>> create a large number of very small files, where potentially the file
>>>> pointers are actually larger than the state itself.
>>>> >>
>>>> >> Cheers,
>>>> >> Gordon
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com>
>>>> wrote:
>>>> >>>
>>>> >>> Hi Jacob,
>>>> >>>
>>>> >>> Could you specify which StateBackend you are using?
>>>> >>>
>>>> >>> The reason I am asking is that, from the documentation in [1]:
>>>> >>>
>>>> >>> "Note that if you use the MemoryStateBackend, metadata and savepoint
>>>> >>> state will be stored in the _metadata file. Since it is
>>>> >>> self-contained, you may move the file and restore from any
>>>> location."
>>>> >>>
>>>> >>> I am also cc'ing Gordon who may know a bit more about state formats.
>>>> >>>
>>>> >>> I hope this helps,
>>>> >>> Kostas
>>>> >>>
>>>> >>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/savepoints.html
>>>> >>>
>>>> >>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com>
>>>> wrote:
>>>> >>> >
>>>> >>> > Per the documentation:
>>>> >>> >
>>>> >>> > "The meta data file of a Savepoint contains (primarily) pointers
>>>> to all files on stable storage that are part of the Savepoint, in form of
>>>> absolute paths."
>>>> >>> >
>>>> >>> > I somehow have a _metadata file that's 1.9GB. Running strings on
>>>> it I find 962 strings, most of which look like HDFS paths, which leaves a
>>>> lot of that file-size unexplained. What else is in there, and how exactly
>>>> could this be happening?
>>>> >>> >
>>>> >>> > We're running 1.6.
>>>> >>> >
>>>> >>> > Jacob
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Jacob Sevart
>>>> > Software Engineer, Safety
>>>>
>>>
>>>
>>> --
>>> Jacob Sevart
>>> Software Engineer, Safety
>>>
>>
>
> --
> Jacob Sevart
> Software Engineer, Safety
>


-- 
Jacob Sevart
Software Engineer, Safety
