Thank you for your answer to my question, Chiwan :) Can I ask another question?
> On Jun 22, 2016, at 7:22 PM, Chiwan Park <chiwanp...@apache.org> wrote:
>
> Hi Tae-Geon,
>
> AFAIK, spilling *data* to disk happens only when managed memory is used.
> Currently, the streaming API (DataStream) doesn’t use managed memory yet.
> `MutableHashTable` is one representative use of managed memory with disk
> spilling. Note that some special structures such as `CompactingHashTable`
> don’t spill data to disk even though they use managed memory to achieve
> high performance.

As far as I understand, then, data is spilled only in batch mode.
Do you know why streaming mode does not use managed memory?
Is it because the performance gain would be negligible?

> About spilling *states*, I think that it depends on how the state backends are
> implemented. For example, `FsStateBackend` saves states to the file system but
> `MemoryStateBackend` doesn’t. `RocksDBStateBackend` uses memory first and
> can also spill states to disk.

I’ve found a nice document on the state backends [1]. I will read it to learn the details.

Thanks!
Taegeon

[1]: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#state-backends

> Regards,
> Chiwan Park
>
>> On Jun 22, 2016, at 3:27 PM, Tae-Geon Um <taegeo...@gmail.com> wrote:
>>
>> I have another question.
>> Is spilling only executed in batch mode?
>> What happens in streaming mode?
>>
>>> On Jun 22, 2016, at 1:48 PM, Tae-Geon Um <taegeo...@gmail.com> wrote:
>>>
>>> Hi, all
>>>
>>> As far as I know, Flink spills data (states?) to disk if the data exceeds
>>> a memory threshold or there is memory pressure.
>>> I’d like to know the details of how Flink spills data to disk.
>>>
>>> Could you please let me know which code I should investigate?
>>>
>>> Thanks,
>>> Taegeon
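
For readers new to the idea, here is a minimal sketch of what "spilling to disk under a memory budget" means in general. It is not Flink code and does not reflect how `MutableHashTable` or managed memory are actually implemented. The class `SpillingBuffer`, its `max_in_memory` record-count budget, and the pickle-based temp file are all hypothetical, chosen only for illustration.

```python
import os
import pickle
import tempfile


class SpillingBuffer:
    """Toy buffer (not Flink's design): spills records to a temp file past a budget."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory  # hypothetical budget, counted in records
        self.in_memory = []
        self.spill_path = None
        self.spilled_count = 0

    def add(self, record):
        self.in_memory.append(record)
        # Once the in-memory part exceeds the budget, move all of it to disk.
        if len(self.in_memory) > self.max_in_memory:
            self._spill()

    def _spill(self):
        if self.spill_path is None:
            fd, self.spill_path = tempfile.mkstemp(suffix=".spill")
            os.close(fd)
        # Append the buffered records to the spill file, then free the memory.
        with open(self.spill_path, "ab") as f:
            for record in self.in_memory:
                pickle.dump(record, f)
        self.spilled_count += len(self.in_memory)
        self.in_memory.clear()

    def __iter__(self):
        # Read spilled records first (they are older), then the in-memory tail.
        if self.spill_path is not None:
            with open(self.spill_path, "rb") as f:
                for _ in range(self.spilled_count):
                    yield pickle.load(f)
        yield from self.in_memory

    def close(self):
        # Delete the spill file, if one was created.
        if self.spill_path is not None:
            os.remove(self.spill_path)
            self.spill_path = None
            self.spilled_count = 0


buf = SpillingBuffer(max_in_memory=3)
for i in range(10):
    buf.add(i)
result = list(buf)           # records come back in insertion order
spilled = buf.spilled_count  # how many records went to disk
buf.close()
```

A real engine would budget in bytes rather than record counts and would use its own binary serialization; the sketch only shows the control flow of "fill memory, spill, read back".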