Alexey, this is a great feature. Can you explain what you meant by "warm-up" when iterating through pages? Do you have this feature already implemented?
D.

On Sun, Sep 16, 2018 at 6:44 PM Alexei Scherbakov <[email protected]> wrote:

> Igniters,
>
> My use case involves scenario where it's necessary to iterate over
> large(many TBs) persistent cache doing some calculation on read data.
>
> The basic solution is to iterate cache using ScanQuery.
>
> This turns out to be slow because iteration over cache involves a lot of
> random disk access for reading data pages referenced from leaf pages by
> links.
>
> This is especially true when data is stored on disks with slow random
> access, like SAS disks. In my case on modern SAS disks array reading speed
> was like several MB/sec while sequential read speed in perf test was about
> GB/sec.
>
> I was able to fix the issue by using ScanQuery with explicit partition set
> and running simple warmup code before each partition scan.
>
> The code pins cold pages in memory in sequential order thus eliminating
> random disk access. Speedup was like x100 magnitude.
>
> I suggest adding the improvement to the product's core by always
> sequentially preloading pages for all internal partition iterations (cache
> iterators, scan queries, sql queries with scan plan) if partition is cold
> (low number of pinned pages).
>
> This also should speed up rebalancing from cold partitions.
>
> Ignite JIRA ticket [1]
>
> Thoughts ?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-8873
>
> --
>
> Best regards,
> Alexei Scherbakov
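
To make the "warm-up" idea in the quoted message concrete, here is a minimal toy model in plain Java. It is not Ignite code and does not use any Ignite API: ToyPageStore, leafLinks, coldScanSeeks and warmedScanSeeks are hypothetical names invented for this illustration. The model assumes that a partition's data pages are numbered in on-disk order, that the leaf pages reference them in an effectively random order, and that the whole partition fits in memory once pinned. It only counts how many non-sequential disk reads (seeks) a scan causes with and without a sequential preload pass.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class WarmupSketch {
    /** Toy page store (hypothetical): counts a seek whenever a disk read is not adjacent to the previous one. */
    static final class ToyPageStore {
        private final Set<Integer> pinned = new HashSet<>();
        private int lastRead = -2; // sentinel: the very first read always counts as a seek
        int seeks;

        /** Accesses a page; goes to "disk" only if the page is not pinned in memory yet. */
        void touch(int pageId) {
            if (pinned.contains(pageId))
                return; // already in memory, no disk access
            if (pageId != lastRead + 1)
                seeks++; // non-sequential read
            lastRead = pageId;
            pinned.add(pageId);
        }
    }

    /** Links from leaf pages to data pages, in an assumed random order (fixed seed for repeatability). */
    static List<Integer> leafLinks(int pageCount, long seed) {
        List<Integer> links = new ArrayList<>();
        for (int i = 0; i < pageCount; i++)
            links.add(i);
        Collections.shuffle(links, new Random(seed));
        return links;
    }

    /** Scan of a cold partition: every link is followed straight to disk. */
    static int coldScanSeeks(int pageCount, long seed) {
        ToyPageStore store = new ToyPageStore();
        for (int pageId : leafLinks(pageCount, seed))
            store.touch(pageId);
        return store.seeks;
    }

    /** Same scan, preceded by a warm-up pass that pins all pages in on-disk order. */
    static int warmedScanSeeks(int pageCount, long seed) {
        ToyPageStore store = new ToyPageStore();
        for (int pageId = 0; pageId < pageCount; pageId++)
            store.touch(pageId); // sequential preload
        for (int pageId : leafLinks(pageCount, seed))
            store.touch(pageId); // all hits, no further disk reads
        return store.seeks;
    }

    public static void main(String[] args) {
        System.out.println("cold scan seeks:   " + coldScanSeeks(1000, 42L));
        System.out.println("warmed scan seeks: " + warmedScanSeeks(1000, 42L));
    }
}
```

In this model the warmed scan pays for a single seek at the start of the preload pass, while the cold scan seeks on almost every link. A real implementation would also have to handle partitions larger than the available memory, which the sketch ignores.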
