Hey all,

Another note from Yi:

""" Here are a few more links regarding SSD performance:

A comprehensive overview:
http://codecapsule.com/2014/02/12/coding-for-ssds-part-5-access-patterns-and-system-optimizations/

A quick note on filesystem configuration:
http://superuser.com/questions/228657/which-linux-filesystem-works-best-with-ssd/

A blog of test results on different I/O schedulers on SSD:
http://www.phoronix.com/scan.php?page=article&item=linux_iosched_2012&num=1

If we are sure that the main workload generated by the RocksDB store is
sequential READ/WRITE, can we check whether we are using the filesystem
configuration mentioned in the second link above? """

Cheers,
Chris
On Mon, Jan 26, 2015 at 8:14 AM, Jon Bringhurst <
jbringhu...@linkedin.com.invalid> wrote:

> Right now we're mostly running with noop for our most recently installed
> SSDs. Some older ones are running with cfq.
>
> Early on in the development of samza-kv, I tried deadline and noop (in
> place of cfq) and didn't notice a significant change in performance.
> However, I don't have any numbers to back this up, so this observation is
> probably worthless. :) That was also when we were still using
> LevelDB-backed KV and a different SSD model and brand, so I agree that
> testing the different schedulers (mostly noop vs. deadline) is worth
> revisiting.
>
> -Jon
>
> On Jan 25, 2015, at 12:29 PM, Roger Hoover <roger.hoo...@gmail.com> wrote:
>
> > FYI, for Linux with SSDs, changing the I/O scheduler to deadline or noop
> > can make a 500x improvement. I haven't tried this myself.
> >
> > http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/hardware.html#_disks
> >
> > On Tue, Jan 20, 2015 at 9:28 AM, Chris Riccomini <
> > criccom...@linkedin.com.invalid> wrote:
> >
> >> Hey Roger,
> >>
> >> We did some benchmarking, and discovered very similar performance to
> >> what you've described. We saw ~40k writes/sec and ~20k reads/sec,
> >> per-container, on a Virident SSD. This was without any changelog. Are
> >> you using a changelog on the store?
> >>
> >> When we attached a changelog to the store, the writes dropped
> >> significantly (~1000 writes/sec). When we hooked up VisualVM, we saw
> >> that the container was spending >99% of its time in
> >> KafkaSystemProducer.send().
> >>
> >> We're currently doing two things:
> >>
> >> 1. Working with our performance team to understand and tune RocksDB
> >> properly.
> >> 2. Upgrading the Kafka producer to use the new Java-based API.
> >> (SAMZA-227)
> >>
> >> For (1), it seems like we should be able to get a lot higher throughput
> >> from RocksDB. Anecdotally, we've heard that RocksDB requires many
> >> threads in order to max out an SSD, and since Samza is single-threaded,
> >> we could just be hitting a RocksDB bottleneck. We won't know until we
> >> dig into the problem (which we started investigating last week). The
> >> current plan is to start by benchmarking RocksDB JNI outside of Samza
> >> and see what we can get. From there, we'll know our "speed of light",
> >> and can try to get Samza as close as possible to it. If RocksDB JNI
> >> can't be made to go "fast", then we'll have to understand why.
> >>
> >> (2) should help with the changelog issue. I believe the slowness with
> >> the changelog comes from the changelog using a sync producer to send to
> >> Kafka, which blocks when a batch is flushed. In the new API, the concept
> >> of a "sync" producer is removed. All writes are handled on an async
> >> writer thread (though we can still guarantee writes are safely written
> >> before checkpointing, which is what we need).
> >>
> >> In short, I agree, it seems slow. We see this behavior, too. We're
> >> digging into it.
> >>
> >> Cheers,
> >> Chris
> >>
> >> On 1/17/15 12:58 PM, "Roger Hoover" <roger.hoo...@gmail.com> wrote:
> >>
> >>> Michael,
> >>>
> >>> Thanks for the response. I used VisualVM and YourKit and see that the
> >>> CPU is not being used (0.1%). I took a few thread dumps and see the
> >>> main thread blocked on the flush() method inside the KV store.
> >>>
> >>> On Sat, Jan 17, 2015 at 7:09 AM, Michael Rose <elementat...@gmail.com>
> >>> wrote:
> >>>
> >>>> Is your process at 100% CPU? I suspect you're spending most of your
> >>>> time in JSON deserialization, but profile it and check.
> >>>>
> >>>> Michael
> >>>>
> >>>> On Friday, January 16, 2015, Roger Hoover <roger.hoo...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi guys,
> >>>>>
> >>>>> I'm testing a job that needs to load 40M records (6GB in Kafka as
> >>>>> JSON) from a bootstrap topic. The topic has 4 partitions, and I'm
> >>>>> running the job using the ProcessJobFactory, so all four tasks are in
> >>>>> one container.
> >>>>>
> >>>>> Using RocksDB, it's taking 19 minutes to load all the data, which
> >>>>> amounts to 35k records/sec or 5MB/s based on input size. I ran iostat
> >>>>> during this time and see that the disk write throughput is 14MB/s.
> >>>>>
> >>>>> I didn't tweak any of the storage settings.
> >>>>>
> >>>>> A few questions:
> >>>>> 1) Does this seem low? I'm running on a MacBook Pro with an SSD.
> >>>>> 2) Do you have any recommendations for improving the load speed?
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Roger
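
For the standalone RocksDB JNI benchmark mentioned in the thread above, the
skeleton would look roughly like the sketch below. The path, value size,
record count, and the choice to disable the WAL are all illustrative, not the
actual benchmark, and it assumes a recent rocksdbjni artifact on the
classpath:

    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.WriteOptions;

    public class RocksBench {
        public static void main(String[] args) throws Exception {
            RocksDB.loadLibrary();
            Options options = new Options().setCreateIfMissing(true);
            // Disable the WAL so we measure memtable/SST write throughput only.
            WriteOptions writeOptions = new WriteOptions().setDisableWAL(true);
            try (RocksDB db = RocksDB.open(options, "/tmp/rocksbench")) {
                byte[] value = new byte[150];  // made-up value size
                int n = 1_000_000;             // made-up record count
                long start = System.currentTimeMillis();
                for (int i = 0; i < n; i++) {
                    db.put(writeOptions, String.valueOf(i).getBytes(), value);
                }
                long elapsedMs = Math.max(System.currentTimeMillis() - start, 1);
                System.out.println((n * 1000L / elapsedMs) + " writes/sec");
            }
        }
    }

Running it with a few different value sizes and thread counts should show how
much of the gap is RocksDB itself versus the single-threaded container.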