> compaction is usually the limiter for most clusters, so the difference between async versus unlogged batch ends up being minor or, worse, nonexistent, because the hardware and data model combination results in compaction being the main throttle.
If your number of records to load per second is predetermined (as it would be in any production use case), then this makes no difference to compaction whether you load batches or single statements: your cluster needs to support the same number and shape of mutates either way.

> if you add in token awareness to your batch, you’ve basically eliminated the primary complaint of using unlogged batches, so why not do that.

This is absolutely the right idea if your driver supports it, but the gain is smaller than I would have expected based on the warnings of imminent doom when we've had this conversation before. If your driver supports token awareness, use it to group statements by primary replica and execute those groups concurrently. Here's the code we're using (in Scala using the Java driver):

import com.datastax.driver.core.Host
import scala.util.control.NonFatal

// statements is a field of the enclosing class: the logical batch's statements.
def groupByFirstReplica()(implicit session: CQLSession): Map[Host, CQLBatch] = {
  val meta = session.getCluster.getMetadata
  statements.groupBy { st =>
    try {
      // First (primary) replica for this statement's partition key
      meta.getReplicas(st.getKeyspace, st.getRoutingKey).iterator().next
    } catch {
      case NonFatal(e) => null // couldn't resolve a replica for this statement
    }
  } mapValues { sts => CQLBatch(sts) }
}

We now have a map of primary host to sub-batch for all the statements in our logical batch. We can then do either of these, depending on how greedy we want to be in our client (Future.traverse is preferred and nicer; Future.sequence is greedier and more resource intensive):

Future.sequence(groupByFirstReplica().values.map(_.execute())).map(_.flatten)

Future.traverse(groupByFirstReplica().values) { _.execute() }.map(_.flatten)

Either way we get back Future[Iterable[ResultSet]], which completes when all of the logical batch's sub-batches have completed.

Note that with the DSE Java driver, for the above to succeed in its intent, the statements need to be prepared statements (so that st.getRoutingKey returns non-null), and either the keyspace has to be fully qualified in the CQL, or you have to have set the correct keyspace when you created the connection (so that st.getKeyspace returns non-null). Otherwise the values given to meta.getReplicas will fail to resolve a primary host, and you end up doing token-unaware batches (i.e. you'll get back a Map(null -> allStatements)). However, those same criteria are required for single statements to be token aware.
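For concreteness, here's a minimal sketch of meeting those two criteria with the same driver API as above. The keyspace and table names are made up for illustration, and CQLSession/CQLBatch above are our own thin wrappers rather than driver types:

import com.datastax.driver.core.{BoundStatement, Cluster}

val cluster = Cluster.builder().addContactPoint("10.0.0.1").build()
val session = cluster.connect() // no default keyspace; we qualify it in the CQL instead

// Fully qualified table, so st.getKeyspace resolves (my_ks.events is made up):
val insert = session.prepare(
  "INSERT INTO my_ks.events (id, payload) VALUES (?, ?)")

val st: BoundStatement = insert.bind("some-id", "some-payload")

// Both of these must be non-null, or groupByFirstReplica degrades to
// Map(null -> allStatements), i.e. token-unaware batching:
require(st.getKeyspace != null)   // from the qualified keyspace in the CQL
require(st.getRoutingKey != null) // from the bound partition key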
On Fri, Sep 25, 2015 at 7:30 AM, Ryan Svihla <r...@foundev.pro> wrote:

> Generally this is all correct, but I cannot emphasize enough how much this “just depends”, and today I generally move people to async inserts first before trying to micro-optimize. Some things to keep in mind:
>
> - compaction is usually the limiter for most clusters, so the difference between async versus unlogged batch ends up being minor or, worse, nonexistent, because the hardware and data model combination results in compaction being the main throttle.
> - if you add in token awareness to your batch, you’ve basically eliminated the primary complaint of using unlogged batches, so why not do that. When I was at DataStax I made some similar suggestions for token-aware batch after seeing the perf improvements with Spark writes using unlogged batch. Several others did as well, so I’m not the first one with this idea.
> - write size makes, in my experience, BY FAR the largest difference in which is faster, and the number of statements is largely irrelevant compared to the total payload size. Depending on the hardware etc., a good rule of thumb is that writes below 1k bytes tend to get really inefficient and writes over 100k bytes tend to slow down total throughput. I’ll reemphasize that this magic number has been different on almost every cluster I’ve tuned.
>
> In summary all this means is: writes that are too small or too large are slow, and unlogged batches may involve some extra hops. If you eliminate the extra hops through token awareness, then it just comes down to write size optimization.
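> As an illustrative sketch of that write-size point (hypothetical: the thresholds are cluster specific, and payloadSize is a caller-supplied size estimator that nothing here defines for you), chunking statements by accumulated payload bytes rather than by statement count looks roughly like:
>
> def chunkByPayload[S](statements: Seq[S],
>                       payloadSize: S => Int,
>                       maxChunkBytes: Int = 100 * 1024): Seq[Seq[S]] = {
>   val (done, current, _) =
>     statements.foldLeft((Vector.empty[Seq[S]], Vector.empty[S], 0)) {
>       case ((chunks, cur, bytes), st) =>
>         val sz = payloadSize(st)
>         if (cur.nonEmpty && bytes + sz > maxChunkBytes)
>           (chunks :+ cur, Vector(st), sz) // close this chunk, start the next
>         else
>           (chunks, cur :+ st, bytes + sz) // keep accumulating
>     }
>   if (current.isEmpty) done else done :+ current
> }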
> On Sep 24, 2015, at 5:18 PM, Eric Stevens <migh...@gmail.com> wrote:
>
> > I ran some quick side benchmarks and stumbled on the observation that unlogged inserts are *A LOT* faster than the async counterparts.
>
> My own testing agrees very strongly with this. When this topic came up on this list before, there was a concern that batch coordination produces GC pressure in your cluster, because you're involving nodes which, *strictly speaking*, don't need to be involved.
>
> Our own testing shows some small impact on this front, but really lightweight GC tuning mitigated the effects by putting a little more room in Xmn (if you're still on the CMS garbage collector). On G1GC (which is what we run in production) we weren't able to measure a difference.
>
> Our testing shows data loads being as much as 5x to 8x faster when using small concurrent batches than when using single statements concurrently. We tried three different concurrency models.
>
> To save on coordinator overhead, we group the statements in our "batch" by replica (using the functionality exposed by the DataStax Java driver), and do essentially token-aware batching. This still has a *small* amount of additional coordinator overhead (since the unit of work is larger in data size, and sits in memory on the coordinator longer). We've been running this way successfully for months with *sustained* rates north of 50,000 mutates per second. We burst *much* higher.
>
> Through trial and error we determined that we get diminishing returns somewhere around 100 statements per token-aware batch. It looks like your own data bears that out as well, though I'm sure it's workload dependent.
>
> I've been disagreed with on this topic on this list in the past, despite the numbers I was able to post. Nobody has shown me numbers (nor anything else concrete) that contradict my position, though, so I stand by it. There's no question in my mind: if your mutates are of any significant volume and you care about their performance, token-aware unlogged batching is the right strategy. When we reduce our batch sizes or switch to single async statements, we fall over immediately.
>
> On Tue, Sep 22, 2015 at 7:54 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
>
>> General advice advocates individual async inserts as the fastest way to insert data into Cassandra. Our insertion mechanism is based on that model, and recently we have been evaluating its performance, looking to measure and optimize our ingestion rate.
>>
>> I ran some quick side benchmarks and stumbled on the observation that unlogged inserts are *A LOT* faster than the async counterparts.
>>
>> In our tests, unlogged batch shows increased throughput and lower cluster CPU usage, so I'm wondering where the tradeoff might be.
>>
>> I compiled those observations in this document, which I'm sharing and opening up for comments. Are we observing some artifact, or should we set the record straight on unlogged batches achieving better insertion throughput?
>>
>> https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI
>>
>> Let me know.
>>
>> Kind regards,
>>
>> Gerard.
>
> Regards,
>
> Ryan Svihla