Make sure the driver is configured for token-aware routing, otherwise the
coordinator node may have to forward your write to a replica, adding a network hop.
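
With the python driver, token-aware routing means wrapping the
load-balancing policy when building the Cluster. A minimal sketch
(contact point and data-center name are placeholders); note that the
driver can only compute a routing key for prepared/bound statements, or
for statements whose routing key is set explicitly:

    from cassandra.cluster import Cluster
    from cassandra.policies import TokenAwarePolicy, DCAwareRoundRobinPolicy

    # Route each request straight to a replica that owns the partition,
    # avoiding the extra coordinator hop.
    cluster = Cluster(
        ['127.0.0.1'],  # placeholder contact point
        load_balancing_policy=TokenAwarePolicy(
            DCAwareRoundRobinPolicy(local_dc='dc1')),  # placeholder DC name
    )
    session = cluster.connect()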

To be absolutely clear, Cassandra uses the distributed, parallel model for
Big Data - lots of multi-threaded clients with lots of nodes. Clusters with
fewer than six or eight nodes, driven by a single, single-threaded client,
are not a representative usage of Cassandra. Replication is presumed as
well: anything less than RF=3 is simply not a representative or recommended
usage of Cassandra. Similarly, writes at less than QUORUM are neither
representative nor recommended.
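
For reference, a representative setup with the python driver would look
roughly like the sketch below (keyspace name, data-center name and
contact point are placeholders):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])  # placeholder contact point
    session = cluster.connect()

    # RF=3 keyspace; 'dc1' is a placeholder data-center name
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo
        WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}
    """)

    # Default all requests from this session to QUORUM rather than ONE
    session.default_consistency_level = ConsistencyLevel.QUORUM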

A write at CL=ONE still has to update the memtable as well, not just the
commit log. Flushing to SSTables occurs once the memtables reach some
threshold size. See:
http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html


-- Jack Krupansky

On Thu, Dec 31, 2015 at 11:13 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> The limitation is on the driver side. Try looking at
> execute_concurrent_with_args in the cassandra.concurrent module to get
> parallel writes with prepared statements.
>
> https://datastax.github.io/python-driver/api/cassandra/concurrent.html
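>
> For illustration, a minimal sketch of that approach, assuming a
> hypothetical table demo.t with columns pk, ck and v:
>
>     from cassandra.cluster import Cluster
>     from cassandra.concurrent import execute_concurrent_with_args
>
>     cluster = Cluster(['127.0.0.1'])  # placeholder contact point
>     session = cluster.connect('demo')  # placeholder keyspace
>
>     # Prepare once, then let the driver run many executions in parallel
>     insert = session.prepare(
>         "INSERT INTO t (pk, ck, v) VALUES (?, ?, ?)")  # placeholder table
>     params = [(i, 0, 'x') for i in range(200000)]
>
>     results = execute_concurrent_with_args(
>         session, insert, params, concurrency=100)
>     failures = [r for ok, r in results if not ok]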
>
> On Wed, Dec 30, 2015 at 11:34 PM Alexandre Beaulne <
> alexandre.beau...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> First and foremost thanks to everyone involved with making C* available
>> to the world, it is a great technology to have access to.
>>
>> I'm experimenting with C* for one of our projects and I cannot reproduce
>> the write speeds C* is lauded for. I would appreciate some guidance as to
>> what I'm doing wrong.
>>
>> *Setup*: I have one single-threaded python client (using Datastax's
>> python driver), writing (no reads) to a C* cluster. All C* nodes are
>> launched by running the official Docker container. There's a single
>> keyspace with a replication factor of 1, and the client is set to
>> consistency level LOCAL_ONE. In that keyspace there is a single table
>> with ~40 columns of mixed types. Two columns form the partition key and
>> two more are clustering columns. The partition key is close to uniformly
>> distributed in the dataset. The writer is in a tight loop, building CQL3
>> INSERT statements one by one and executing them against the C* cluster.
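>>
>> In pseudocode, the loop is roughly the sketch below (table and column
>> names are simplified placeholders; rows stands for the dataset):
>>
>>     from cassandra.cluster import Cluster
>>
>>     session = Cluster(['127.0.0.1']).connect('demo')  # placeholder keyspace
>>
>>     # Simplified sketch of the writer loop; the real table has ~40 columns.
>>     for row in rows:  # rows: iterable of tuples from the dataset
>>         cql = ("INSERT INTO t (pk1, pk2, ck1, ck2, v) "
>>                "VALUES (%s, %s, %s, %s, %s)") % row
>>         session.execute(cql)  # scenarios 5 & 7; execute_async in 4 & 6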
>>
>> *Specs*: Cassandra v3.0.1, python-driver v3.0.0, host is CentOS 7 with
>> 40 cores @ 3 GHz and 66 GB of RAM.
>>
>> In the course of my experimentation I came up with 7 scenarios trying to
>> isolate the performance bottleneck:
>>
>> *Scenario 1*: the writer simply builds the insert statement strings
>> without doing anything with them.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 0.00 - [95] 0.01 -
>> [99] 0.01 [100] 0.05
>>
>> *Scenario 2*: the writer opens a TCP socket and sends the insert statement
>> string to a simple reader running on the same host. The reader then appends
>> that insert statement string to a file on disk, mimicking a commit log of
>> some sort.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 0.01 - [95] 0.02 -
>> [99] 0.03 [100] 63.33
>>
>> *Scenario 3*: identical to scenario 2, but the reader is run inside a
>> Docker container, to measure whether there is any overhead from running in
>> the container.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 0.01 - [95] 0.01 -
>> [99] 0.01 [100] 4.45
>>
>> *Scenario 4*: the writer asynchronously executes the insert statements
>> against a single-node C* cluster.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 0.07 - [95] 0.15 -
>> [99] 0.56 [100] 534.09
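>>
>> ("Asynchronously" here means roughly the sketch below: fire
>> execute_async for each statement, collect the futures, and resolve them
>> afterwards; the real code is more involved.)
>>
>>     # Rough sketch of the asynchronous variant: send each request without
>>     # blocking, then resolve the futures to surface any write errors.
>>     # "statements" stands for the INSERT strings built in the loop above.
>>     futures = [session.execute_async(cql) for cql in statements]
>>     for f in futures:
>>         f.result()  # blocks until the write completes; raises on error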
>>
>> *Scenario 5*: the writer synchronously executes the insert statements
>> against a single-node C* cluster.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 1.40 - [95] 1.46 -
>> [99] 1.54 [100] 41.75
>>
>> *Scenario 6*: the writer asynchronously executes the insert statements
>> against a four-node C* cluster.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 0.09 - [95] 0.14 -
>> [99] 0.16 [100] 838.83
>>
>> *Scenario 7*: the writer synchronously executes the insert statements
>> against a four-node C* cluster.
>>
>> Results: sample size: 200002, percentiles (ms): [50] 1.73 - [95] 1.89 -
>> [99] 2.15 [100] 50.94
>>
>> Looking at scenarios 3 & 5, a synchronous write statement to C* is about
>> 150x slower than appending to a flat file. Now I understand a write to a
>> DB is more involved than appending to a file, but I'm surprised by the
>> magnitude of the difference. I thought all C* did for a write at
>> consistency level 1 was to append the write to its commit log and
>> return, then distribute the write across the cluster in an eventually
>> consistent manner. More than 1 ms per write is fewer than 1,000 writes
>> per second, far from big data velocity.
>>
>> What am I doing wrong? Are writes supposed to be batched before being
>> inserted? Instead of appending rows to the table, would it be more
>> efficient to append columns to the rows? Why are writes so slow?
>>
>> Thanks for your time,
>> Alex
>>
>
