I ended up working around this by allowing the host to connect to its own
fronted port.
Figured it’s a reasonable solution.
On Fri, Dec 12, 2014 at 12:38 PM, Ryan Svihla wrote:
> Well, did you restart Cassandra after changing the JVM_OPTS to match your
> desired address?
>
> On Fri, Dec 12, 2014
Not a problem - it's good to hash this stuff out and understand the
technical reasons why something works or doesn't work.
On Sat Dec 13 2014 at 10:07:10 AM Jonathan Haddad wrote:
> On Sat Dec 13 2014 at 10:00:16 AM Eric Stevens wrote:
>
>> Isn't the net effect of coordination overhead incurred by batches
>> basically the same as the overhead incurred by RoundRobin or other
>> non-token-aware request routing?
On Sat Dec 13 2014 at 10:00:16 AM Eric Stevens wrote:
> Isn't the net effect of coordination overhead incurred by batches
> basically the same as the overhead incurred by RoundRobin or other
> non-token-aware request routing? As the cluster size increases, each node
> would coordinate the same percentage of writes in batches under token
> awareness as they would
And by the way, Jon and Ryan, I want to thank you for engaging in the
conversation. I hope I'm not coming across as argumentative or combative
or anything like that. But I would definitely love to reconcile my
measurements with recommended practices so that I can make good decisions
about how to m
Isn't the net effect of coordination overhead incurred by batches basically
the same as the overhead incurred by RoundRobin or other non-token-aware
request routing? As the cluster size increases, each node would coordinate
the same percentage of writes in batches under token awareness as they
would
Here is my test code; this was written as disposable code, so it's not
especially well documented, and it includes some chunks copied from
elsewhere in our stack, but hopefully it's readable.
https://gist.github.com/MightyE/1c98912fca104f6138fc
Here's some test runs after I reduced RF to 1, to int
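In rough outline, the comparison looks something like this (a minimal Java
sketch against the 2.1 driver, not the gist's actual contents; the schema,
contact point, and row counts here are all invented):

    import com.datastax.driver.core.*;
    import com.google.common.util.concurrent.Futures;
    import com.google.common.util.concurrent.ListenableFuture;
    import java.util.ArrayList;
    import java.util.List;

    public class BatchVsAsync {
      public static void main(String[] args) throws Exception {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("test");
        PreparedStatement ps = session.prepare(
            "INSERT INTO events (aid, bckt, end, payload) VALUES (?, ?, ?, ?)");

        // Variant 1: one unlogged batch of 100 rows -- a single coordinator
        // handles the whole thing.
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        for (int i = 0; i < 100; i++) {
          batch.add(ps.bind("a" + i, 1L, (long) i, "x"));
        }
        long t0 = System.nanoTime();
        session.execute(batch);
        System.out.println("batch: " + (System.nanoTime() - t0) / 1_000_000 + " ms");

        // Variant 2: 100 independent async writes -- token-aware routing can
        // send each row straight to one of its replicas.
        List<ListenableFuture<ResultSet>> futures = new ArrayList<>();
        t0 = System.nanoTime();
        for (int i = 0; i < 100; i++) {
          futures.add(session.executeAsync(ps.bind("a" + i, 1L, (long) i, "x")));
        }
        Futures.allAsList(futures).get(); // wait for every write to finish
        System.out.println("async: " + (System.nanoTime() - t0) / 1_000_000 + " ms");

        cluster.close();
      }
    }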
One thing to keep in mind is that the overhead of a batch goes up as the number
of servers increases. Talking to 3 is going to have a much different
performance profile than talking to 20. Keep in mind that the coordinator
is going to be talking to every server in the cluster with a big batch.
The amo
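To put rough numbers on that (back-of-envelope, assuming RF=3 and uniformly
distributed partition keys): a 100-statement batch on a 3-node cluster fans
out to at most 2 peers, because every node is a replica for everything. On a
20-node cluster those 100 partitions land on essentially all 20 nodes, so a
single coordinator ends up talking to ~19 peers and holding the entire batch
on its heap until the slowest replica answers, while the async version spreads
that same coordination work across the whole cluster.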
You can see what the partition key strategies are for each of the tables;
test5 shows the least improvement. The set (aid, end) should be unique,
and bckt is derived from end. Some of these layouts result in clustering
on the same partition keys; that's actually tunable with the "~15 per
bucket"
Also, what happens when you turn on shuffle with TokenAwarePolicy?
http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/policies/TokenAwarePolicy.html
On Sat, Dec 13, 2014 at 8:21 AM, Jonathan Haddad wrote:
>
> To add to Ryan's (extremely valid!) point, your test works because the
> coordinator is always a replica. Try again using 20 (or 50) nodes.
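For reference, shuffle is (if I'm reading the 2.1 javadoc right) just the
second constructor argument on TokenAwarePolicy; a minimal setup sketch, with
the contact point invented:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    // shuffleReplicas = true: rather than always preferring the first
    // replica for a partition, the policy randomizes the replica order,
    // spreading load across the whole replica set.
    Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .withLoadBalancingPolicy(
            new TokenAwarePolicy(new DCAwareRoundRobinPolicy(), true))
        .build();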
To add to Ryan's (extremely valid!) point, your test works because the
coordinator is always a replica. Try again using 20 (or 50) nodes.
Batching works great at RF=N=3 because it always gets to write locally and
talk to exactly 2 other servers on every request. Consider what happens
when the co
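Concretely (my arithmetic, assuming a single data center and a randomly
chosen coordinator): at RF=3 on 3 nodes, the coordinator is a replica for
every partition, so a batch never pays an extra hop. At RF=3 on 20 nodes, a
given coordinator is a replica for only 3/20 = 15% of partitions, so roughly
85% of the rows in a multi-partition batch take an extra network hop that
token-aware async writes would avoid.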
Also, to piggyback on what Jon is saying: what is the cost of retrying a
failed batch versus retrying a single failed record in your logical batch?
I'd suggest that in the example you provide, retrying the whole batch costs
roughly 100x as much as retrying a single record.
Regardless, I'm glad you're testing this out, alw
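To make that retry asymmetry concrete, here's a hedged sketch of the async
approach, where only the statements that actually failed get re-executed
(a failed 100-statement batch, by contrast, must be replayed in full; the
class and method names are mine):

    import com.datastax.driver.core.BoundStatement;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    class PerRowRetry {
      // Fire all writes async, then retry only the rows that failed.
      static void write(Session session, List<BoundStatement> rows) {
        Map<BoundStatement, ResultSetFuture> inFlight = new LinkedHashMap<>();
        for (BoundStatement row : rows) {
          inFlight.put(row, session.executeAsync(row));
        }
        List<BoundStatement> failed = new ArrayList<>();
        for (Map.Entry<BoundStatement, ResultSetFuture> e : inFlight.entrySet()) {
          try {
            e.getValue().getUninterruptibly(); // block until this write finishes
          } catch (Exception ex) {
            failed.add(e.getKey());            // only this row needs a retry
          }
        }
        for (BoundStatement row : failed) {
          session.execute(row);                // one-row retry, not a 100-row batch
        }
      }
    }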
Are the batches to the same partition key (which results in a single mutation,
and obviously eliminates the primary problem)? Is your client network
and/or CPU bound?
Remember, the coordinator node is _just_ doing what your client is doing
with executeAsync, only now it's dealing with the heap pressur
There are cases where it can. For instance, if you batch multiple
mutations to the same partition (and talk to a replica for that partition)
they can reduce network overhead because they're effectively a single
mutation in the eye of the cluster. However, if you're not doing that (and
most people
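As a quick illustration of the case where a batch does collapse into a single
mutation (sketch only; the table and values are invented, though the column
names echo Eric's schema):

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    class SinglePartitionBatch {
      // Every statement shares the partition key aid='a1', so the whole
      // batch is applied as one mutation against one replica set.
      static void write(Session session) {
        PreparedStatement ps = session.prepare(
            "INSERT INTO events (aid, end, payload) VALUES (?, ?, ?)");
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        for (long end = 0; end < 15; end++) {
          batch.add(ps.bind("a1", end, "row-" + end)); // same partition every time
        }
        session.execute(batch);
      }
    }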
Jon,
> The really important thing to really take away from Ryan's original post
is that batches are not there for performance.
> tl;dr: you probably don't want batch, you most likely want many async
calls
My own rudimentary testing does not bear this out - at least not if you
mean to say that bat
Jonathan and Ryan,
Jonathan says “It is absolutely not going to help you if you're trying to lump
queries together to reduce network & server overhead - in fact it'll do the
opposite”, but I would note that the CQL3 spec says “The BATCH statement ...
serves several purposes: 1. It saves network