Yep. So read performance will remain constant in this case?
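For reference, a quick back-of-envelope check on the bloom filter numbers Dean quotes below, using the textbook sizing formula bits/key = -ln(p) / (ln 2)^2 (the formula is the standard one, not something from this thread, and the exact per-key overhead varies by Cassandra version):

```python
import math

def bloom_bits_per_key(p):
    """Bits per key for a Bloom filter with false-positive chance p
    (standard formula: m/n = -ln(p) / (ln 2)^2)."""
    return -math.log(p) / (math.log(2) ** 2)

def bloom_ram_gb(num_keys, p):
    """Approximate Bloom filter RAM for num_keys keys at fp chance p."""
    return num_keys * bloom_bits_per_key(p) / 8 / 1e9

rows = 1_000_000_000
# The 0.000744 default Dean mentions, vs. the 0.1 he is moving to:
for p in (0.000744, 0.01, 0.1):
    print(f"p={p}: {bloom_bits_per_key(p):.1f} bits/key, "
          f"~{bloom_ram_gb(rows, p):.2f} GB for 1B rows")
# p=0.000744 -> ~15.0 bits/key, ~1.87 GB  (matches the ~2 GB per 1B rows figure)
# p=0.01     -> ~9.6 bits/key,  ~1.20 GB
# p=0.1      -> ~4.8 bits/key,  ~0.60 GB
```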
-----Original Message-----
From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
Sent: 26 February 2013 09:32
To: user@cassandra.apache.org
Subject: Re: Read Perf

In that case, make sure you don't plan on going into the millions, or test the limit, as I'm pretty sure it can't go above 10 million (from previous posts on this list).

Dean

On 2/26/13 8:23 AM, "Kanwar Sangha" <kan...@mavenir.com> wrote:

>Thanks. For our case, the number of rows will stay more or less the same. The
>only thing that changes is the columns, and they keep getting added.
>
>-----Original Message-----
>From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
>Sent: 26 February 2013 09:21
>To: user@cassandra.apache.org
>Subject: Re: Read Perf
>
>To find data on disk, there is a bloom filter for each SSTable held in memory.
>Per the docs, 1 billion rows takes about 2 GB of RAM, so memory use will
>depend heavily on your number of rows. As you get more rows, you may
>need to raise the bloom filter false-positive chance to use less RAM, but that
>means slower reads. I.e., as you add more rows, you will have slower
>reads on a single machine.
>
>We hit the RAM limit on one machine with 1 billion rows, so we are in
>the process of raising the ratio from 0.000744 (the default) to 0.1 to
>buy us more time to solve it. Since we see almost no I/O load on our
>machines, we plan on moving to leveled compaction, where 0.1 is the default
>in new releases; the new size-tiered default is, I think, 0.01.
>
>I.e., if you store more data per row, this is less of an issue, but it is
>still something to consider. (Rows also have a data-size limit, I think,
>but I'm not sure what it is. I know the column limit on a
>row is in the millions, somewhere below 10 million.)
>
>Later,
>Dean
>
>From: Kanwar Sangha <kan...@mavenir.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Monday, February 25, 2013 8:31 PM
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: Read Perf
>
>Hi - I am doing a performance run using a modified YCSB client. I
>populated 8 TB on a node and then ran some read workloads. I am
>seeing an average of 930 ops/sec for random reads, with no key
>cache or row cache. Question -
>
>Will the read TPS degrade if the data size grows to, say, 20 TB, 50
>TB, or 100 TB? If I understand correctly, reads should remain constant
>irrespective of the data size, since we eventually have sorted SSTables
>and a binary search is done on the index to find the row?
>
>
>Thanks,
>Kanwar
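For anyone wanting to try the two changes Dean describes, here is a sketch of how they might look as CQL issued through the DataStax Python driver (the keyspace/table names and contact point are hypothetical, and this thread predates some of this tooling, so check the docs for your version):

```python
from cassandra.cluster import Cluster  # DataStax Python driver

cluster = Cluster(["127.0.0.1"])  # hypothetical contact point
session = cluster.connect()

# Raise the bloom filter false-positive chance to shrink its RAM
# footprint, at the cost of extra SSTable reads on misses...
session.execute(
    "ALTER TABLE myks.mytable WITH bloom_filter_fp_chance = 0.1"
)

# ...and/or switch to leveled compaction, where 0.1 is already the
# default fp chance and reads touch fewer SSTables per key.
session.execute(
    "ALTER TABLE myks.mytable "
    "WITH compaction = {'class': 'LeveledCompactionStrategy'}"
)

cluster.shutdown()
```

Note that bloom filters are built per SSTable, so a new fp chance only takes effect as SSTables are rewritten by compaction (or by forcing a rewrite, e.g. with nodetool upgradesstables).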
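And on Kanwar's constancy question, a toy cost model (my simplification, not Cassandra's actual read-path accounting; it ignores the key cache, partition index granularity, and compaction strategy) of why per-read cost tracks the SSTable count and the fp chance p rather than raw data volume:

```python
# Each read checks one in-memory bloom filter per SSTable; each false
# positive costs an extra disk read. So expected SSTables touched per
# lookup is ~1 (the one holding the row) plus p false positives each
# for the remaining SSTables.
def expected_sstable_reads(num_sstables, p, row_present=True):
    hits = 1 if row_present else 0
    false_positives = p * (num_sstables - hits)
    return hits + false_positives

for p in (0.000744, 0.1):
    print(f"p={p}: ~{expected_sstable_reads(20, p):.2f} "
          "SSTables read per lookup with 20 SSTables on disk")
# p=0.000744 -> ~1.01 reads; p=0.1 -> ~2.90 reads per lookup
```

Under this model, growing the data from 8 TB to 100 TB leaves reads roughly constant only as long as the SSTable count per read stays bounded and the bloom filters still fit in RAM, which is exactly the trade-off Dean raises.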