Minor update: I have executed the flink-runtime tests on XFS, Lustre and
DVS (Cray DataWarp), and I observe divergences on XFS and Lustre, but not
on DVS. It turns out that cached reads are reported by the file systems as
well, so I don't think caching is an issue here. There might still be some
th
Hey Martin,
I'm still on it. I have switched to analyzing the flink-runtime tests, as I
observe similar divergence there. I'm not sure how long it'll take, but if
I find something I'll make sure to let you all know :)
Robert
On Sat, Apr 29, 2017 at 3:12 PM, Martin Eden wrote:
Hi Robert,
Any updates on the below for the community?
Thanks,
M
On Tue, Apr 25, 2017 at 8:50 AM, Robert Schmidtke wrote:
Hi Ufuk, thanks for coming back to me on this.
The records are 100 bytes in size, the benchmark being TeraSort, so that
should not be an issue. I have played around with the input size, and here
are my observations:
128 GiB input: 0 Spilling in Flink.
256 GiB input: 88 GiB Spilling in Flink (so 8
Hey Robert,
for batch that should cover the relevant spilling code. If the records
are >= 5 MB, the SpillingAdaptiveSpanningRecordDeserializer will spill
incoming records as well. But that should also be covered by the
FileChannel instrumentation, right?
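
For illustration, here is a minimal sketch of what counting bytes at the FileChannel level could look like. It is not Flink code and not the instrumentation used in this thread: the class CountingFileChannelDemo and the method writeAndCount are hypothetical names, and the sketch only assumes that spilled data ultimately passes through java.nio.channels.FileChannel.write, as described for SegmentWriteRequest.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; the names are illustrative and not part of Flink.
final class CountingFileChannelDemo {

    // Writes `records` zero-filled records of `recordSize` bytes through a
    // FileChannel and returns the number of bytes the channel reported as written.
    static long writeAndCount(Path file, int records, int recordSize) throws IOException {
        AtomicLong written = new AtomicLong();
        try (FileChannel channel = FileChannel.open(
                file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.allocate(recordSize);
            for (int i = 0; i < records; i++) {
                buffer.clear(); // position = 0, limit = capacity
                // write() may write fewer bytes than remaining, so loop and
                // count exactly what the channel reports.
                while (buffer.hasRemaining()) {
                    written.addAndGet(channel.write(buffer));
                }
            }
        }
        return written.get();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("spill-demo", ".bin");
        try {
            // 100-byte records, as in TeraSort.
            long counted = writeAndCount(tmp, 1000, 100);
            System.out.println("counted=" + counted + " onDisk=" + Files.size(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Comparing a counter like this with what the file system itself reports is one way to check whether any write path bypasses the instrumented channel.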
– Ufuk
On Tue, Apr 18, 2017 at 3:57 PM, Robe
Hi,
I have already looked at the UnilateralSortMerger, concluding that all I/O
eventually goes via SegmentReadRequest and SegmentWriteRequest (which in
turn use java.nio.channels.FileChannel) in AsynchronousFileIOChannel. Are
there more interaction points between Flink and the underlying file syst
Hi,
I'm currently examining the I/O patterns of Flink, and I'd like to know
when/how Flink goes to disk. Let me give an introduction of what I have
done so far.
I am running TeraGen (from the Hadoop examples package) + TeraSort (
https://github.com/robert-schmidtke/terasort) on a 16-node cluster,