Hi all,

I'm trying to write a large binary file (> 500 MB) to a C* cluster as fast as
I can, but I get many WriteTimeoutExceptions.

I created a small POC that isolates the problem I'm facing. You can find the
code here: https://github.com/giampaolotrapasso/cassandratest

*Main details about it:*

   - I split the file into chunks (the *data* field), each <= 1 MB (1 MB is
   the recommended maximum size for a single cell),


   - Chunks are grouped into buckets; each bucket is a separate partition,
   - Buckets are grouped by UUID (all buckets of the same file share one UUID).


   - Chunk size and bucket size are configurable from the app, so I can try
   different configurations and see what happens.


   - To maximize throughput I execute the inserts asynchronously; however, to
   avoid putting too much pressure on the db, once a threshold of in-flight
   writes is reached I wait for at least one insert to finish before
   submitting another (this part is quite raw in my code, but I don't think
   it's the important bit; see the sketch just below this list). This
   threshold is also configurable, to test different combinations.
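
Roughly, the throttling idea is the following (a simplified sketch, not the
exact POC code; it assumes the DataStax Java driver's ResultSetFuture plus a
plain java.util.concurrent.Semaphore, the POC does it a bit differently):

    import java.util.concurrent.Semaphore
    import com.datastax.driver.core.{BoundStatement, ResultSet, Session}
    import com.google.common.util.concurrent.{FutureCallback, Futures}

    class ThrottledWriter(session: Session, maxConcurrentWrites: Int) {
      // Permits bound the number of in-flight async inserts.
      private val permits = new Semaphore(maxConcurrentWrites)

      def write(bound: BoundStatement): Unit = {
        permits.acquire() // blocks once maxConcurrentWrites inserts are in flight
        val future = session.executeAsync(bound)
        Futures.addCallback(future, new FutureCallback[ResultSet] {
          def onSuccess(rs: ResultSet): Unit = permits.release()
          def onFailure(t: Throwable): Unit = {
            // this is where the WriteTimeoutExceptions show up
            permits.release()
          }
        })
      }

      // Block until every in-flight insert has completed.
      def awaitAll(): Unit = permits.acquire(maxConcurrentWrites)
    }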

This is the table in the db:

CREATE TABLE blobtest.store (
    uuid uuid,
    bucket bigint,
    start bigint,
    data blob,
    end bigint,
    PRIMARY KEY ((uuid, bucket), start)
)
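
Just to give an idea of the sizes involved (example numbers only, both values
are configurable in the POC):

    // Example sizing: 1 MB chunks, 64 chunks per bucket.
    val chunkSize  = 1 * 1024 * 1024      // bytes per chunk (<= 1 MB cell)
    val bucketSize = 64                   // chunks (rows) per partition
    val fileSize   = 512L * 1024 * 1024   // a 512 MB file

    val chunks     = (fileSize + chunkSize - 1) / chunkSize   // 512 chunks/rows
    val partitions = (chunks + bucketSize - 1) / bucketSize   // 8 partitions of ~64 MB

So each (uuid, bucket) partition holds bucketSize rows of roughly chunkSize bytes.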

and this is the main code (Scala, but I hope it's generally readable)

    val statement = client.session.prepare(
      "INSERT INTO blobTest.store(uuid, bucket, start, end, data) " +
        "VALUES (?, ?, ?, ?, ?) IF NOT EXISTS;")

    val blob = new Array[Byte](MyConfig.blobSize)
    scala.util.Random.nextBytes(blob)

    write(client,
      numberOfRecords = MyConfig.recordNumber,
      bucketSize = MyConfig.bucketSize,
      maxConcurrentWrites = MyConfig.maxFutures,
      blob,
      statement)

where write is

  def write(database: Database, numberOfRecords: Int, bucketSize: Int,
            maxConcurrentWrites: Int, blob: Array[Byte],
            statement: PreparedStatement): Unit = {

    val uuid: UUID = UUID.randomUUID()
    var count = 0;

    //Javish loop
    while (count < numberOfRecords) {
      val record = Record(
        uuid = uuid,
        bucket = count / bucketSize,
        start = ((count % bucketSize)) * blob.length,
        end = ((count % bucketSize) + 1) * blob.length,
        bytes = blob
      )
      asynchWrite(database, maxConcurrentWrites, statement, record)
      count += 1
    }

    waitDbWrites()
  }

and asynchWrite just binds the record to the prepared statement and executes
it asynchronously.
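
Simplified, it looks like this (assuming Record's numeric fields are Longs and
that Database just wraps the driver Session; the real code also does the
throttling described above):

    import java.nio.ByteBuffer

    def asynchWrite(database: Database, maxConcurrentWrites: Int,
                    statement: PreparedStatement, record: Record): Unit = {
      val bound = statement.bind(
        record.uuid,                       // uuid
        record.bucket: java.lang.Long,     // bucket (bigint)
        record.start: java.lang.Long,      // start  (bigint)
        record.end: java.lang.Long,        // end    (bigint)
        ByteBuffer.wrap(record.bytes))     // data   (blob)
      database.session.executeAsync(bound) // throttling omitted in this sketch
    }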

*Problem*

The problem is that when I increase the chunk size, the number of concurrent
async inserts, or the bucket size (i.e. the number of chunks per bucket), the
app becomes unstable because the db starts throwing WriteTimeoutExceptions.

I've tested this on CCM (4 nodes) and on an EC2 cluster (5 nodes, 8 GB heap).
The problem seems the same in both environments.

On my local cluster, I've changed the following settings with respect to the
default configuration:

concurrent_writes: 128

write_request_timeout_in_ms: 200000

The rest of the configuration is here:
https://gist.github.com/giampaolotrapasso/ca21a83befd339075e07

*Other*

The exceptions seem random; sometimes they happen right at the beginning of
the write.

*Questions:*

1. Is my model wrong? Am I missing some important detail?

2. What information is important to look at for this kind of problem?

3. Why are the exceptions so random?

4. Is there some other C* parameter I can set to ensure that
WriteTimeoutExceptions do not occur?

I hope I provided enough information to get some help.

Thank you in advance for any reply.


Giampaolo
