Hi Tae-Geon,

AFAIK, spilling *data* to disk happens only when managed memory is used. Currently, the streaming API (DataStream) doesn’t use managed memory. `MutableHashTable` is a representative user of managed memory with disk spilling. Note that some special structures such as `CompactingHashTable` don’t spill data to disk even though they use managed memory to achieve high performance.
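
To make the general idea of spilling concrete, here is a minimal sketch in plain Java. It is not Flink's code and does not reflect how `MutableHashTable` is implemented; the class `SpillingBuffer`, its record-count budget, and the temp-file layout are all hypothetical, chosen only to show the "keep in memory until the budget is hit, then write out to disk" pattern.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: NOT a Flink class.
// Assumes records are single-line strings (no embedded line breaks).
class SpillingBuffer {
    private final int maxInMemory;                          // assumed budget, counted in records
    private final List<String> inMemory = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();

    SpillingBuffer(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    // Adds a record; if the in-memory budget is already full, spill first.
    void add(String record) throws IOException {
        if (inMemory.size() >= maxInMemory) {
            spill();
        }
        inMemory.add(record);
    }

    // Writes all in-memory records to a new temp file and frees the memory.
    private void spill() throws IOException {
        Path file = Files.createTempFile("spill-", ".txt");
        file.toFile().deleteOnExit();
        Files.write(file, inMemory, StandardCharsets.UTF_8);
        spillFiles.add(file);
        inMemory.clear();
    }

    // Reads back spilled records (oldest file first), then the in-memory tail.
    List<String> readAll() throws IOException {
        List<String> all = new ArrayList<>();
        for (Path file : spillFiles) {
            all.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
        all.addAll(inMemory);
        return all;
    }

    int spillFileCount() {
        return spillFiles.size();
    }
}
```

With a budget of 2 records, adding five records triggers two spills and leaves one record in memory; `readAll()` still returns all five in insertion order. Real implementations budget by bytes or memory segments rather than record counts.
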
About spilling *states*, I think it depends on how the state backend is implemented. For example, `FsStateBackend` saves states to a file system, but `MemoryStateBackend` doesn’t. `RocksDBStateBackend` uses memory first and can also spill states to disk.

Regards,
Chiwan Park

> On Jun 22, 2016, at 3:27 PM, Tae-Geon Um <taegeo...@gmail.com> wrote:
>
> I have another question.
> Is spilling only performed in batch mode?
> What happens in streaming mode?
>
>> On Jun 22, 2016, at 1:48 PM, Tae-Geon Um <taegeo...@gmail.com> wrote:
>>
>> Hi, all
>>
>> As far as I know, Flink spills data (states?) to disk if the data exceeds
>> a memory threshold or there is memory pressure.
>> I’d like to know the details of how Flink spills data to disk.
>>
>> Could you please let me know which code I should investigate?
>>
>> Thanks,
>> Taegeon