Thank you for your answer to my question, Chiwan :)  
Can I ask another question?  


> On Jun 22, 2016, at 7:22 PM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
> Hi Tae-Geon,
> 
> AFAIK, spilling *data* to disk happens only when managed memory is used. 
> Currently, the streaming API (DataStream) doesn’t use managed memory yet. 
> `MutableHashTable` is a representative user of managed memory with disk 
> spilling. Note that some special structures such as `CompactingHashTable` 
> don’t spill data to disk even though they use managed memory to achieve 
> high performance.

As far as I understand, data is spilled only in batch mode. 
Do you know why streaming mode does not use managed memory? 
Is it because the performance gain would be negligible?
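
To check that I have the right mental model of spilling, here is a minimal 
Java sketch. It is not Flink code: `SpillingBuffer`, its record-count budget, 
and the line-based temp file are all hypothetical names and choices of mine, 
and I assume records contain no line breaks. It only shows the general idea of 
flushing buffered records to disk once an in-memory budget is exhausted.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of spilling; this is NOT how Flink implements it. */
final class SpillingBuffer implements AutoCloseable {
    private final int maxInMemory;                 // assumed budget, counted in records
    private final List<String> inMemory = new ArrayList<>();
    private final Path spillFile;                  // temp file that receives spilled records
    private int spilledCount = 0;

    SpillingBuffer(int maxInMemory) throws IOException {
        this.maxInMemory = maxInMemory;
        this.spillFile = Files.createTempFile("spill-", ".tmp");
    }

    /** Adds a record; if the in-memory budget is already full, spill it first. */
    void add(String record) throws IOException {
        if (inMemory.size() >= maxInMemory) {
            spill();
        }
        inMemory.add(record);
    }

    /** Appends all buffered records to the spill file and frees the buffer. */
    private void spill() throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(
                spillFile, StandardCharsets.UTF_8, StandardOpenOption.APPEND)) {
            for (String r : inMemory) {
                w.write(r);
                w.newLine();
            }
        }
        spilledCount += inMemory.size();
        inMemory.clear();
    }

    /** Returns spilled records (oldest first) followed by those still in memory. */
    List<String> readAll() throws IOException {
        List<String> all = new ArrayList<>(
                Files.readAllLines(spillFile, StandardCharsets.UTF_8));
        all.addAll(inMemory);
        return all;
    }

    int spilledCount() { return spilledCount; }

    int inMemoryCount() { return inMemory.size(); }

    @Override
    public void close() throws IOException {
        Files.deleteIfExists(spillFile);
    }

    public static void main(String[] args) throws IOException {
        try (SpillingBuffer buf = new SpillingBuffer(2)) {
            for (String r : new String[] {"a", "b", "c", "d", "e"}) {
                buf.add(r);
            }
            System.out.println(buf.readAll() + " spilled=" + buf.spilledCount());
        }
    }
}
```

With a budget of 2 records and 5 inserts, four records end up in the temp file 
and one stays in memory. A real system would budget in bytes and use a binary 
format, which this sketch deliberately ignores.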

> 
> About spilling *states*, I think that it depends on how the state backend is 
> implemented. For example, `FsStateBackend` saves state to the file system but 
> `MemoryStateBackend` doesn’t. `RocksDBStateBackend` uses memory first and 
> can also spill state to disk.

I’ve found a nice document on state backends [1]. I will read it to learn 
the details. 
Thanks! 

Taegeon

[1]: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#state-backends

> 
> Regards,
> Chiwan Park
> 
>> On Jun 22, 2016, at 3:27 PM, Tae-Geon Um <taegeo...@gmail.com> wrote:
>> 
>> I have another question. 
>> Is spilling only performed in batch mode? 
>> What happens in streaming mode?  
>> 
>>> On Jun 22, 2016, at 1:48 PM, Tae-Geon Um <taegeo...@gmail.com> wrote:
>>> 
>>> Hi, all
>>> 
>>> As far as I know, Flink spills data (states?) to disk if the data exceeds 
>>> a memory threshold or when there is memory pressure.
>>> I’d like to know the details of how Flink spills data to disk. 
>>> 
>>> Could you please let me know which code I should look at? 
>>> 
>>> Thanks,
>>> Taegeon
>> 
> 
