Hi Jacob,

As I said previously, I am not 100% sure what is causing this
behavior, but there is a related thread here:
https://lists.apache.org/thread.html/r3bfa2a3368a9c7850cba778e4decfe4f6dba9607f32addb69814f43d%40%3Cuser.flink.apache.org%3E

You can re-post your problem there and monitor it for answers.

Cheers,
Kostas

On Wed, Mar 4, 2020 at 7:02 PM Jacob Sevart <jsev...@uber.com> wrote:
>
> Kostas and Gordon,
>
> Thanks for the suggestions! I'm on RocksDB. We don't have that setting 
> configured, so it should be at the default of 1024 bytes. This is the full 
> "state.*" section shown in the JobManager UI.
>
>
>
> Jacob
>
> On Wed, Mar 4, 2020 at 2:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> 
> wrote:
>>
>> Hi Jacob,
>>
>> Apart from what Klou already mentioned, one other possible reason:
>>
>> If you are using the FsStateBackend, it is also possible that your state is 
>> small enough to be stored inline within the metadata file.
>> That is governed by the "state.backend.fs.memory-threshold" configuration, 
>> with a default value of 1024 bytes; it can also be configured with the 
>> `fileStateSizeThreshold` argument when constructing the `FsStateBackend`.
>> The purpose of that threshold is to ensure that the backend does not create 
>> a large number of very small files, where potentially the file pointers are 
>> actually larger than the state itself.
>>
>> Cheers,
>> Gordon
>>
>>
>>
>> On Wed, Mar 4, 2020 at 6:17 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>
>>> Hi Jacob,
>>>
>>> Could you specify which StateBackend you are using?
>>>
>>> The reason I am asking is that, from the documentation in [1]:
>>>
>>> "Note that if you use the MemoryStateBackend, metadata and savepoint
>>> state will be stored in the _metadata file. Since it is
>>> self-contained, you may move the file and restore from any location."
>>>
>>> I am also cc'ing Gordon who may know a bit more about state formats.
>>>
>>> I hope this helps,
>>> Kostas
>>>
>>> [1] 
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/savepoints.html
>>>
>>> On Wed, Mar 4, 2020 at 1:25 AM Jacob Sevart <jsev...@uber.com> wrote:
>>> >
>>> > Per the documentation:
>>> >
>>> > "The meta data file of a Savepoint contains (primarily) pointers to all 
>>> > files on stable storage that are part of the Savepoint, in form of 
>>> > absolute paths."
>>> >
>>> > I somehow have a _metadata file that's 1.9GB. Running strings on it I 
>>> > find 962 strings, most of which look like HDFS paths, which leaves a lot 
>>> > of that file-size unexplained. What else is in there, and how exactly 
>>> > could this be happening?
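
The check described above (running strings and comparing against the file
size) can be approximated with a short Python sketch. The function names
`printable_runs` and `unexplained_bytes` are hypothetical, and the definition
of "printable" (ASCII 0x20 to 0x7e, runs of at least 4 characters) is a
simplification of what the `strings` tool does, so counts may differ slightly.

```python
import re
from typing import List


def printable_runs(data: bytes, min_len: int = 4) -> List[bytes]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)


def unexplained_bytes(data: bytes, min_len: int = 4) -> int:
    """Bytes in data that are not part of any printable run."""
    explained = sum(len(run) for run in printable_runs(data, min_len))
    return len(data) - explained


# Small in-memory example: one path-like string surrounded by binary bytes.
sample = b"\x00\x01hdfs://nn/path/a\x00\x00\x02ab\x00"
runs = printable_runs(sample)
```

For a real file, one could read it in binary mode (in pieces, for a file as
large as 1.9GB) and pass the bytes to `unexplained_bytes`. A large result would
suggest that most of the file is binary payload rather than path strings.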
>>> >
>>> > We're running 1.6.
>>> >
>>> > Jacob
>
>
>
> --
> Jacob Sevart
> Software Engineer, Safety
