RE: the design of spilling to disk

2017-09-20 Thread Newport, Billy
: the design of spilling to disk Copied from my earlier response to some similar question: "Here is a short description for how it works: there are totally 3 threads working together, one for reading, one for sorting partial data in memory, and the last one is responsible for spilling. Flink

Re: the design of spilling to disk

2017-09-19 Thread Kurt Young
oudas > *Sent:* Tuesday, September 19, 2017 18:10 > *To:* Florin Dinu > *Cc:* user@flink.apache.org; fhue...@apache.org > *Subject:* Re: the design of spilling to disk > > Hi Florin, > > Unfortunately, there is no design document. > > The UnilateralSortMerger.java is used

Re: the design of spilling to disk

2017-09-19 Thread Florin Dinu
From: Kostas Kloudas Sent: Tuesday, September 19, 2017 18:10 To: Florin Dinu Cc: user@flink.apache.org; fhue...@apache.org Subject: Re: the design of spilling to disk Hi Florin, Unfortunately, there is no design document. The UnilateralSortMerger.java is used in the batch

Re: the design of spilling to disk

2017-09-19 Thread Kostas Kloudas
Hi Florin, Unfortunately, there is no design document. The UnilateralSortMerger.java is used in the batch processing mode (not is streaming) and, in fact, the code dates some years back. I cc also Fabian as he may have more things to say on this. Now for the streaming side, Flink uses 3 state

the design of spilling to disk

2017-09-19 Thread Florin Dinu
Hello everyone, In our group at EPFL we're doing research on understanding and potentially improving the performance of data-parallel frameworks that use secondary storage. I was looking at the Flink code to understand how spilling to disk actually works. So far I got to the UnilateralSortMe