Try breaking the query up into smaller chunks using multiple threads and
sub-ranges. 86400 rows is a pretty large single read; I've found ~1000 rows
per query works well. This will spread the load across the servers a little
more evenly.
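To illustrate the suggestion above, here is a minimal sketch of the range-splitting part in Python. The function names (`split_day`), the 1000-rows-per-chunk figure from the reply, and the one-row-per-second assumption (from the 86400-points-per-day figure in the quoted message) are all illustrative; the actual queries would be issued with a Cassandra driver, which is only sketched in comments.

```python
from datetime import datetime, timedelta

def split_day(day_start, chunk_rows=1000, rows_per_second=1):
    """Split one day's clustering-key range into sub-ranges of roughly
    chunk_rows rows each (one row per second, per the thread)."""
    chunk = timedelta(seconds=chunk_rows // rows_per_second)
    end_of_day = day_start + timedelta(days=1)
    ranges, start = [], day_start
    while start < end_of_day:
        end = min(start + chunk, end_of_day)
        ranges.append((start, end))
        start = end
    return ranges

# Each (start, end) pair then becomes one query against the table from
# the quoted message, restricted on the clustering column, e.g.:
#   SELECT "UtcDate", "Value" FROM "Metric_OneSec"
#   WHERE "MetricId" = ? AND "Day" = ?
#     AND "UtcDate" >= ? AND "UtcDate" < ?
# These queries can then run concurrently, e.g. with
# concurrent.futures.ThreadPoolExecutor around session.execute, or with
# the driver's execute_async.

chunks = split_day(datetime(2015, 5, 5))
print(len(chunks))  # 87 chunks (86400 seconds / 1000, rounded up)
```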

On Thu, May 7, 2015 at 4:27 AM, Alprema <alpr...@alprema.com> wrote:

> Hi,
>
> I am writing an application that will periodically read large amounts of
> data from Cassandra, and I am seeing odd performance.
>
> My column family is a classic time series one, with series ID and Day as
> partition key and a timestamp as clustering key, the value being a double.
>
> The query I run gets all the values for a given time series for a given
> day (so about 86400 points):
>
> SELECT "UtcDate", "Value"
> FROM "Metric_OneSec"
> WHERE "MetricId" = 12215ece-6544-4fcf-a15d-4f9e9ce1567e
>   AND "Day" = '2015-05-05 00:00:00+0000'
> LIMIT 86400;
>
>
> This takes about 450ms to run and when I trace the query I see that it
> takes about 110ms to read the data from disk and 224ms to send the data
> from the responsible node to the coordinator (full trace in attachment).
>
> I did a quick estimation of the requested data (correct me if I'm wrong):
> 86400 * (column name + column value + timestamp + ttl)
> = 86400 * (8 + 8 + 8 + 8?)
> = 2.6 MB
>
> Let's say about 3 MB with misc. overhead, so these timings seem pretty
> slow to me for a modern SSD and a 1 Gb/s NIC.
>
> Do those timings seem normal? Am I missing something?
>
> Thank you,
>
> Kévin
>
>
>
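The back-of-the-envelope estimate in the quoted message can be checked with a few lines of arithmetic. The 32-bytes-per-row figure is the poster's own assumption; the wire-time calculation below simply applies the stated 1 Gb/s link speed to it:

```python
ROW_BYTES = 8 + 8 + 8 + 8      # name + value + timestamp + ttl, per the estimate
rows = 86400                   # one point per second for a day
payload = rows * ROW_BYTES     # bytes
print(payload)                 # 2764800 bytes, about 2.6 MiB

# At a 1 Gb/s line rate, the raw wire time for that payload would be:
wire_ms = payload * 8 / 1e9 * 1000
print(round(wire_ms, 1))       # ~22.1 ms, an order of magnitude under the
                               # observed 224 ms node-to-coordinator time
```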