Got it. Thanks, Mark!

Kind regards,
Charles Alva
Sent from Gmail Mobile

On Fri, Apr 12, 2019 at 10:53 PM Mark Nelson <mnel...@redhat.com> wrote:

> They have the same issue, but depending on the SSD they may be better at
> absorbing the extra IO if network or CPU are bigger bottlenecks. That's
> one of the reasons a lot of folks like to put the DB on flash for
> HDD-based clusters. It's still possible to oversubscribe them, but
> you've got more headroom.
>
> Mark
>
> On 4/12/19 10:25 AM, Charles Alva wrote:
> > Thanks Mark,
> >
> > This is interesting. I'll take a look at the links you provided.
> >
> > Does the rocksdb compaction issue only affect HDDs, or do SSDs have
> > the same issue?
> >
> > Kind regards,
> >
> > Charles Alva
> > Sent from Gmail Mobile
> >
> > On Fri, Apr 12, 2019, 9:01 PM Mark Nelson <mnel...@redhat.com> wrote:
> >
> > Hi Charles,
> >
> > Basically the goal is to reduce write-amplification as much as
> > possible. The deeper the rocksdb hierarchy gets, the worse the
> > write-amplification for compaction is going to be. If you look at the
> > OSD logs you'll see the write-amp factors for compaction in the
> > rocksdb compaction summary sections that periodically pop up. There
> > are a couple of things we are trying on our end to see if we can
> > improve the situation:
> >
> > 1) Adam has been working on experimenting with sharding data across
> > multiple column families. The idea here is that it might be better to
> > have multiple L0 and L1 levels rather than L0, L1, L2 and L3. I'm not
> > sure if this will pan out or not, but that was one of the goals
> > behind trying this.
> >
> > 2) Toshiba recently released trocksdb, which could have a really big
> > impact on compaction write amplification:
> >
> > Code: https://github.com/ToshibaMemoryAmerica/trocksdb/tree/TRocksRel
> > Wiki: https://github.com/ToshibaMemoryAmerica/trocksdb/wiki
> >
> > I recently took a look to see if our key/value size distribution
> > would work well with the approach that trocksdb is taking to reduce
> > write-amplification:
> >
> > https://docs.google.com/spreadsheets/d/1fNFI8U-JRkU5uaRJzgg5rNxqhgRJFlDB4TsTAVsuYkk/edit?usp=sharing
> >
> > The good news is that it sounds like the "Trocks Ratio" for the data
> > we put in rocksdb is sufficiently high that we'd see some benefit,
> > since it should greatly reduce write-amplification during compaction
> > for data (but not keys). This doesn't help your immediate problem,
> > but I wanted you to know that you aren't the only one, and we are
> > thinking about ways to reduce the compaction impact.
> >
> > Mark
> >
> > On 4/10/19 2:07 AM, Charles Alva wrote:
> > > Hi Ceph Users,
> > >
> > > Is there a way to minimize rocksdb compaction events so that they
> > > don't eat all of the spinning disk's IO and cause the OSD to be
> > > marked down because it fails to send heartbeats to the others?
> > >
> > > Right now we see high disk IO utilization every 20-25 minutes, when
> > > rocksdb reaches level 4 with 67GB of data to compact.
> > >
> > > Kind regards,
> > >
> > > Charles Alva
> > > Sent from Gmail Mobile
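For anyone who wants to pull those write-amp factors out of an OSD log without scrolling by hand, here is a rough sketch. It only assumes the stock RocksDB stats dump ("** Compaction Stats **" headers followed by a per-level table that includes a W-Amp column) and an example log path, so adjust both for your own deployment:

import sys

# Print each RocksDB "Compaction Stats" summary found in an OSD log, plus the
# dozen lines that follow it, which is where the per-level table (and its
# W-Amp column) normally lives. Log path and line count are illustrative.
log_path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/ceph/ceph-osd.0.log"
TAIL = 12

countdown = 0
with open(log_path, errors="replace") as f:
    for line in f:
        if "Compaction Stats" in line:
            countdown = TAIL
        if countdown:
            print(line.rstrip())
            countdown -= 1

The W-Amp figures in those tables are the per-level write-amplification factors Mark is referring to.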
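A back-of-the-envelope sketch of why the column-family sharding in point 1 could help: with leveled compaction, every extra level costs roughly another fan-out's worth of rewrites, and splitting the metadata across several column families keeps each one shallower. The 256MB L1 target, the fan-out of 10, and the reuse of the 67GB figure below are illustrative assumptions, not BlueStore's actual tuning:

import math

def levels_needed(total_bytes, l1_bytes=256 << 20, fanout=10):
    """Smallest number of levels (L1..Ln) whose capacity covers total_bytes."""
    if total_bytes <= l1_bytes:
        return 1
    return math.ceil(math.log(total_bytes / l1_bytes, fanout)) + 1

def rough_write_amp(levels, fanout=10):
    """Very rough textbook estimate: ~fanout rewrites per level, plus the flush."""
    return 1 + levels * fanout

total = 67 << 30  # borrow the ~67GB figure from the thread for scale
for shards in (1, 4, 8):
    lv = levels_needed(total / shards)
    print(f"{shards} shard(s): ~{lv} levels each, write-amp ~{rough_write_amp(lv)}x")

With those made-up numbers a single 67GB column family ends up four levels deep while eight shards stay at three each, which is the direction Mark's multiple-L0/L1 idea pushes in.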
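And a toy illustration of the trocksdb point: if values live outside the LSM tree, compaction rewrites keys but not values, so the share of compaction traffic you could avoid is roughly the fraction of bytes sitting in values. The key/value sizes here are made-up placeholders, not the measured numbers from Mark's spreadsheet:

def value_share(avg_key_bytes, avg_value_bytes):
    # Fraction of bytes that sit in values rather than keys.
    return avg_value_bytes / (avg_key_bytes + avg_value_bytes)

# Placeholder key/value sizes in bytes, purely for illustration.
for key_b, val_b in [(50, 500), (50, 4096), (100, 100)]:
    print(f"key={key_b}B value={val_b}B -> roughly {value_share(key_b, val_b):.0%} "
          f"of compaction bytes are value data")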
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com