Great to hear!

On Fri, 9 Dec 2016 at 01:02 Cliff Resnick <cre...@gmail.com> wrote:

> It turns out that most of the time in RocksDBFoldingState was spent on
> serialization/deserialization. RocksDB read/write was performing well. By
> moving from Kryo to custom serialization we were able to increase
> throughput dramatically. Load is now where it should be.
>
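> For illustration, a hand-rolled Flink TypeSerializer of this kind might
> look like the sketch below. "SmallTuple" and its fields are hypothetical
> stand-ins for our actual state type, written against the 1.2-era
> TypeSerializer API:
>
>     import org.apache.flink.api.common.typeutils.base.TypeSerializerSingleton;
>     import org.apache.flink.core.memory.DataInputView;
>     import org.apache.flink.core.memory.DataOutputView;
>
>     import java.io.IOException;
>
>     // Hypothetical state type: one String plus two fixed-width fields.
>     class SmallTuple {
>         String key;
>         int count;
>         long ts;
>
>         SmallTuple() {}
>
>         SmallTuple(String key, int count, long ts) {
>             this.key = key;
>             this.count = count;
>             this.ts = ts;
>         }
>     }
>
>     public class SmallTupleSerializer extends TypeSerializerSingleton<SmallTuple> {
>
>         @Override
>         public boolean isImmutableType() { return false; }
>
>         @Override
>         public SmallTuple createInstance() { return new SmallTuple(); }
>
>         @Override
>         public SmallTuple copy(SmallTuple from) {
>             return new SmallTuple(from.key, from.count, from.ts);
>         }
>
>         @Override
>         public SmallTuple copy(SmallTuple from, SmallTuple reuse) { return copy(from); }
>
>         @Override
>         public int getLength() { return -1; } // variable length because of the String
>
>         @Override
>         public void serialize(SmallTuple record, DataOutputView target) throws IOException {
>             // Fields are written directly; no per-record class metadata as with Kryo.
>             target.writeUTF(record.key);
>             target.writeInt(record.count);
>             target.writeLong(record.ts);
>         }
>
>         @Override
>         public SmallTuple deserialize(DataInputView source) throws IOException {
>             return new SmallTuple(source.readUTF(), source.readInt(), source.readLong());
>         }
>
>         @Override
>         public SmallTuple deserialize(SmallTuple reuse, DataInputView source) throws IOException {
>             return deserialize(source);
>         }
>
>         @Override
>         public void copy(DataInputView source, DataOutputView target) throws IOException {
>             target.writeUTF(source.readUTF());
>             target.writeInt(source.readInt());
>             target.writeLong(source.readLong());
>         }
>
>         @Override
>         public boolean canEqual(Object obj) { return obj instanceof SmallTupleSerializer; }
>     }
>
> The serializer can then be handed to the state descriptor, e.g.
> new ValueStateDescriptor<>("state", new SmallTupleSerializer()), so the
> RocksDB backend uses it instead of falling back to Kryo.
>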
> On Mon, Dec 5, 2016 at 1:15 PM, Robert Metzger <rmetz...@apache.org>
> wrote:
>
> Another Flink user with large RocksDB state on SSDs recently posted
> this video on optimizing the performance of RocksDB on SSDs:
> https://www.youtube.com/watch?v=pvUqbIeoPzM
> That could be relevant for you.
>
> For how long did you look at iotop? It could be that the IO access happens
> in bursts, depending on how data is cached.
>
> I'll also add Stefan Richter to the conversation; he may have some more
> ideas about what we can do here.
>
>
> On Mon, Dec 5, 2016 at 6:19 PM, Cliff Resnick <cre...@gmail.com> wrote:
>
> Hi Robert,
>
> We're following 1.2-SNAPSHOT, using event time. I have tried "iotop" and
> I usually see less than 1% IO. The most I've seen was a quick flash here
> or there of something substantial (e.g. 19%, 52%), then back to nothing. I
> also assumed we were disk-bound, but to use your metaphor, I'm having
> trouble finding any smoke. However, I'm not very experienced in sussing out
> IO issues, so perhaps there is something else I'm missing.
>
> I'll keep investigating. If I continue to come up empty, then I guess my
> next step may be to stage some independent tests directly against RocksDB.
>
> -Cliff
>
>
> On Mon, Dec 5, 2016 at 5:52 AM, Robert Metzger <rmetz...@apache.org>
> wrote:
>
> Hi Cliff,
>
> Which Flink version are you using?
> Are you using event-time or processing-time windows?
>
> I suspect that your disks are "burning" (= your job is IO bound). Can you
> check with a tool like "iotop" how much disk IO Flink is producing?
> Then I would set this number in relation to the theoretical maximum of
> your SSDs (a good rough estimate is to use dd for that).
>
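> (For the dd estimate, something along the lines of "dd if=/dev/zero
> of=/mnt/ssd1/ddtest bs=1M count=4096 oflag=direct" gives a rough
> sequential-write number; the path is only an example, and oflag=direct
> bypasses the page cache so the result is closer to the raw device speed.)
>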
> If you find that your disk bandwidth is saturated by Flink, you could look
> into tuning the RocksDB settings so that it uses more memory for caching.
>
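> In case it's useful, a rough sketch of that kind of tuning with the
> 1.2-era RocksDB backend could look like the following (the checkpoint URI
> and the cache/buffer sizes are made up for illustration, and "env" is the
> StreamExecutionEnvironment):
>
>     import org.apache.flink.contrib.streaming.state.OptionsFactory;
>     import org.apache.flink.contrib.streaming.state.PredefinedOptions;
>     import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
>     import org.rocksdb.BlockBasedTableConfig;
>     import org.rocksdb.ColumnFamilyOptions;
>     import org.rocksdb.DBOptions;
>
>     RocksDBStateBackend backend =
>         new RocksDBStateBackend("hdfs:///flink/checkpoints"); // illustrative URI
>     // Start from the SSD-tuned defaults, then enlarge the caches.
>     backend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
>     backend.setOptions(new OptionsFactory() {
>         @Override
>         public DBOptions createDBOptions(DBOptions currentOptions) {
>             return currentOptions; // leave the DB-level options unchanged
>         }
>
>         @Override
>         public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
>             // A bigger block cache and write buffer trade memory for fewer
>             // disk reads; 256 MB / 64 MB are illustrative values only.
>             return currentOptions
>                 .setTableFormatConfig(
>                     new BlockBasedTableConfig().setBlockCacheSize(256L * 1024 * 1024))
>                 .setWriteBufferSize(64L * 1024 * 1024);
>         }
>     });
>     env.setStateBackend(backend);
>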
> Regards,
> Robert
>
>
> On Fri, Dec 2, 2016 at 11:34 PM, Cliff Resnick <cre...@gmail.com> wrote:
>
> In tests comparing RocksDB to the fs state backend, we observe much lower
> throughput, around 10x slower. While lower throughput is expected,
> what's perplexing is that machine load is also very low with RocksDB,
> typically falling to <25% CPU with negligible IO wait (around 0.1%). Our
> test instances are EC2 c3.xlarge, which have 4 virtual CPUs and 7.5G RAM,
> each running a single TaskManager in YARN with 6.5G allocated memory per
> TaskManager. The instances also have 2x40G attached SSDs, which we have
> mapped to `taskmanager.tmp.dir`.
>
> With FS state and 4 slots per TM, we easily max out with a load
> average around 5 or 6, so we actually need to throttle the slots down to
> 3. With RocksDB using Flink's predefined SSD options, we see a load
> average of around 1. Also, load (and actual throughput) remain more or less
> constant no matter how many slots we use. The weak load is spread over all
> CPUs.
>
> Here is a sample top:
>
> Cpu0  : 20.5%us,  0.0%sy,  0.0%ni, 79.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu1  : 18.5%us,  0.0%sy,  0.0%ni, 81.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu2  : 11.6%us,  0.7%sy,  0.0%ni, 87.0%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu3  : 12.5%us,  0.3%sy,  0.0%ni, 86.8%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
>
> Our pipeline uses tumbling windows, each with a ValueState keyed to a
> 3-tuple of one string and two ints. Each ValueState comprises a small set
> of tuples with around 5-7 fields each. The WindowFunction simply diffs
> against the set and updates state if there is a diff.
>
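> For concreteness, the shape of the job is roughly as follows. This is a
> simplified sketch against the 1.2-era APIs; "Event", its fields, and the
> window size are made up, and the diff logic is elided:
>
>     import java.util.HashSet;
>
>     import org.apache.flink.api.common.state.ValueState;
>     import org.apache.flink.api.common.state.ValueStateDescriptor;
>     import org.apache.flink.api.common.typeinfo.TypeHint;
>     import org.apache.flink.api.common.typeinfo.TypeInformation;
>     import org.apache.flink.api.java.functions.KeySelector;
>     import org.apache.flink.api.java.tuple.Tuple3;
>     import org.apache.flink.configuration.Configuration;
>     import org.apache.flink.streaming.api.datastream.DataStream;
>     import org.apache.flink.streaming.api.functions.windowing.RichWindowFunction;
>     import org.apache.flink.streaming.api.windowing.time.Time;
>     import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
>     import org.apache.flink.util.Collector;
>
>     DataStream<Event> events = ...; // source elided
>
>     events
>         .keyBy(new KeySelector<Event, Tuple3<String, Integer, Integer>>() {
>             @Override
>             public Tuple3<String, Integer, Integer> getKey(Event e) {
>                 return Tuple3.of(e.name, e.idA, e.idB);
>             }
>         })
>         .timeWindow(Time.minutes(1))
>         .apply(new RichWindowFunction<Event, Event,
>                 Tuple3<String, Integer, Integer>, TimeWindow>() {
>
>             // Keyed state holding the previously seen set for this key.
>             private transient ValueState<HashSet<Event>> seen;
>
>             @Override
>             public void open(Configuration parameters) {
>                 seen = getRuntimeContext().getState(new ValueStateDescriptor<>(
>                     "seen", TypeInformation.of(new TypeHint<HashSet<Event>>() {})));
>             }
>
>             @Override
>             public void apply(Tuple3<String, Integer, Integer> key, TimeWindow window,
>                               Iterable<Event> input, Collector<Event> out) throws Exception {
>                 HashSet<Event> previous = seen.value(); // null on the first window
>                 // ... diff `input` against `previous`, emit changes, update `seen`
>             }
>         });
>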
> Any ideas as to what the bottleneck is here? Any suggestions welcomed!
>
> -Cliff
>