Got it. Thanks, Mark!

Kind regards,

Charles Alva
Sent from Gmail Mobile


On Fri, Apr 12, 2019 at 10:53 PM Mark Nelson <mnel...@redhat.com> wrote:

> They have the same issue, but depending on the SSD they may be better at
> absorbing the extra IO if network or CPU are bigger bottlenecks.  That's
> one of the reasons a lot of folks like to put the DB on flash for
> HDD-based clusters.  It's still possible to oversubscribe them, but
> you've got more headroom.
>
>
> Mark
>
> On 4/12/19 10:25 AM, Charles Alva wrote:
> > Thanks Mark,
> >
> > This is interesting. I'll take a look at the links you provided.
> >
> > Does the rocksdb compaction issue only affect HDDs, or do SSDs have
> > the same issue?
> >
> > Kind regards,
> >
> > Charles Alva
> > Sent from Gmail Mobile
> >
> > On Fri, Apr 12, 2019, 9:01 PM Mark Nelson <mnel...@redhat.com
> > <mailto:mnel...@redhat.com>> wrote:
> >
> >     Hi Charles,
> >
> >
> >     Basically the goal is to reduce write-amplification as much as
> >     possible.  The deeper the rocksdb hierarchy gets, the worse the
> >     write-amplification for compaction is going to be.  If you look at
> >     the OSD logs you'll see the write-amp factors for compaction in the
> >     rocksdb compaction summary sections that periodically pop up.
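> >
> >     For a rough sense of why depth matters, a back-of-the-envelope
> >     sketch (the fanout and the formula are generic leveled-compaction
> >     assumptions, not numbers measured from any particular cluster):
> >
> >     // Rough leveled-compaction write amplification vs. LSM depth.
> >     // Assumes a size ratio (fanout) of 10 between levels; every byte
> >     // is written once to the WAL, once on the L0 flush, and is then
> >     // rewritten roughly `fanout` times per level below L0.
> >     #include <cstdio>
> >
> >     int main() {
> >       const int fanout = 10;
> >       for (int levels = 1; levels <= 4; ++levels) {
> >         int write_amp = 1 + 1 + fanout * levels;
> >         std::printf("levels below L0: %d -> approx write amp: %dx\n",
> >                     levels, write_amp);
> >       }
> >       return 0;
> >     }
> >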
> >     There are a couple of things we are trying on our end to see if
> >     we can improve the situation:
> >
> >
> >     1) Adam has been experimenting with sharding data across multiple
> >     column families.  The idea is that it might be better to have
> >     several shallow L0/L1 hierarchies rather than a single L0, L1, L2,
> >     and L3 hierarchy.  I'm not sure if this will pan out or not, but
> >     that was one of the goals behind trying it.
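> >
> >     Purely to illustrate the idea, a minimal sketch using the stock
> >     rocksdb column-family API (this is not BlueStore's actual sharding
> >     code; the shard names, shard count, and level limit are made up):
> >
> >     #include <rocksdb/db.h>
> >     #include <rocksdb/options.h>
> >     #include <cassert>
> >     #include <string>
> >     #include <vector>
> >
> >     int main() {
> >       rocksdb::Options opts;
> >       opts.create_if_missing = true;
> >       opts.create_missing_column_families = true;
> >
> >       // One shallow column family per shard instead of one deep tree.
> >       std::vector<rocksdb::ColumnFamilyDescriptor> cfs;
> >       cfs.emplace_back(rocksdb::kDefaultColumnFamilyName,
> >                        rocksdb::ColumnFamilyOptions(opts));
> >       for (int i = 0; i < 4; ++i) {
> >         rocksdb::ColumnFamilyOptions cf_opts(opts);
> >         cf_opts.num_levels = 2;  // keep each shard at L0 + L1 only
> >         cfs.emplace_back("shard-" + std::to_string(i), cf_opts);
> >       }
> >
> >       std::vector<rocksdb::ColumnFamilyHandle*> handles;
> >       rocksdb::DB* db = nullptr;
> >       rocksdb::Status s =
> >           rocksdb::DB::Open(opts, "/tmp/cf-shard-demo", cfs, &handles, &db);
> >       assert(s.ok());
> >
> >       // Writes get routed to a shard via its column family handle.
> >       s = db->Put(rocksdb::WriteOptions(), handles[1], "some-key", "value");
> >       assert(s.ok());
> >
> >       for (auto* h : handles) db->DestroyColumnFamilyHandle(h);
> >       delete db;
> >       return 0;
> >     }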
> >
> >
> >     2) Toshiba recently released trocksdb, which could have a really
> >     big impact on compaction write amplification:
> >
> >
> >     Code: https://github.com/ToshibaMemoryAmerica/trocksdb/tree/TRocksRel
> >
> >     Wiki: https://github.com/ToshibaMemoryAmerica/trocksdb/wiki
> >
> >
> >     I recently took a look to see if our key/value size distribution
> >     would work well with the approach that trocksdb is taking to reduce
> >     write-amplification:
> >
> >     https://docs.google.com/spreadsheets/d/1fNFI8U-JRkU5uaRJzgg5rNxqhgRJFlDB4TsTAVsuYkk/edit?usp=sharing
> >
> >
> >     The good news is that it sounds like the "Trocks Ratio" for the
> >     data we put in rocksdb is sufficiently high that we'd see some
> >     benefit, since it should greatly reduce write-amplification during
> >     compaction for data (but not keys).  This doesn't help your
> >     immediate problem, but I wanted you to know that you aren't the
> >     only one and we are thinking about ways to reduce the compaction
> >     impact.
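> >
> >     To make the data-vs-keys point a bit more concrete, a toy
> >     calculation.  Everything in it is assumed (the sizes, the
> >     amplification factor, and my reading that trocksdb keeps values out
> >     of the compaction path, WiscKey-style); it only shows the shape of
> >     the saving:
> >
> >     #include <cstdio>
> >
> >     int main() {
> >       // Assumed per-entry sizes and a rough leveled-compaction factor.
> >       const double key_bytes = 10.0;
> >       const double value_bytes = 90.0;
> >       const double compaction_amp = 30.0;
> >
> >       // Classic rocksdb: keys and values both get rewritten by compaction.
> >       double classic = (key_bytes + value_bytes) * compaction_amp;
> >       // Key/value separation: only keys churn; values are written once.
> >       double separated = key_bytes * compaction_amp + value_bytes;
> >
> >       std::printf("bytes rewritten per entry: %.0f vs %.0f\n",
> >                   classic, separated);
> >       return 0;
> >     }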
> >
> >
> >     Mark
> >
> >
> >     On 4/10/19 2:07 AM, Charles Alva wrote:
> >     > Hi Ceph Users,
> >     >
> >     > Is there a way to minimize the rocksdb compaction events so that
> >     > they don't consume all of the spinning disk's IO and cause the OSD
> >     > to be marked down for failing to send heartbeats to its peers?
> >     >
> >     > Right now we see high disk IO utilization every 20-25 minutes,
> >     > when rocksdb reaches level 4 with 67GB of data to compact.
> >     >
> >     >
> >     > Kind regards,
> >     >
> >     > Charles Alva
> >     > Sent from Gmail Mobile
> >     >
> >     > _______________________________________________
> >     > ceph-users mailing list
> >     > ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> >     > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >     _______________________________________________
> >     ceph-users mailing list
> >     ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> >     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
