Yep. So read performance will remain constant in this case?
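For reference, a quick back-of-envelope check on the bloom filter numbers Dean quotes below, using the textbook sizing formula bits/key = -ln(p) / (ln 2)^2 (the formula is the standard one, not something from this thread, and the exact per-key overhead varies by Cassandra version):

```python
import math

def bloom_bits_per_key(p):
    """Bits per key for a Bloom filter with false-positive chance p
    (standard formula: m/n = -ln(p) / (ln 2)^2)."""
    return -math.log(p) / (math.log(2) ** 2)

def bloom_ram_gb(num_keys, p):
    """Approximate Bloom filter RAM for num_keys keys at fp chance p."""
    return num_keys * bloom_bits_per_key(p) / 8 / 1e9

rows = 1_000_000_000
# The 0.000744 default Dean mentions, vs. the 0.1 he is moving to:
for p in (0.000744, 0.01, 0.1):
    print(f"p={p}: {bloom_bits_per_key(p):.1f} bits/key, "
          f"~{bloom_ram_gb(rows, p):.2f} GB for 1B rows")
# p=0.000744 -> ~15.0 bits/key, ~1.87 GB  (matches the ~2 GB per 1B rows figure)
# p=0.01     -> ~9.6 bits/key,  ~1.20 GB
# p=0.1      -> ~4.8 bits/key,  ~0.60 GB
```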
-----Original Message-----
From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
Sent: 26 February 2013 09:32
To: user@cassandra.apache.org
Subject: Re: Read Perf

In that case, make sure you don't plan on going into the millions, or test the limit, as I'm pretty sure it can't go above 10 million (from previous posts on this list).

Dean

On 2/26/13 8:23 AM, "Kanwar Sangha" <kan...@mavenir.com> wrote:

>Thanks. For our case, the number of rows will stay more or less the same. The
>only thing that changes is the columns, and they keep getting added.
>
>-----Original Message-----
>From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
>Sent: 26 February 2013 09:21
>To: user@cassandra.apache.org
>Subject: Re: Read Perf
>
>To find data on disk, there is a bloom filter for each SSTable held in memory.
>Per the docs, 1 billion rows takes about 2 GB of RAM, so memory use will
>depend heavily on your number of rows. As you get more rows, you may
>need to raise the bloom filter false-positive chance to use less RAM, but that
>means slower reads. I.e., as you add more rows, you will have slower
>reads on a single machine.
>
>We hit the RAM limit on one machine with 1 billion rows, so we are in
>the process of raising the ratio from 0.000744 (the default) to 0.1 to
>buy us more time to solve it. Since we see almost no I/O load on our
>machines, we plan on moving to leveled compaction, where 0.1 is the default
>in new releases; the new size-tiered default is, I think, 0.01.
>
>I.e., if you store more data per row, this is less of an issue, but it is
>still something to consider. (Rows also have a data-size limit, I think,
>but I'm not sure what it is. I know the column limit on a
>row is in the millions, somewhere below 10 million.)
>
>Later,
>Dean
>
>From: Kanwar Sangha <kan...@mavenir.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Monday, February 25, 2013 8:31 PM
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: Read Perf
>
>Hi - I am doing a performance run using a modified YCSB client. I
>populated 8 TB on a node and then ran some read workloads. I am
>seeing an average of 930 ops/sec for random reads, with no key
>cache or row cache. Question -
>
>Will the read TPS degrade if the data size grows to, say, 20 TB, 50
>TB, or 100 TB? If I understand correctly, reads should remain constant
>irrespective of the data size, since we eventually have sorted SSTables
>and a binary search is done on the index to find the row?
>
>
>Thanks,
>Kanwar
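For anyone wanting to try the two changes Dean describes, here is a sketch of how they might look as CQL issued through the DataStax Python driver (the keyspace/table names and contact point are hypothetical, and this thread predates some of this tooling, so check the docs for your version):

```python
from cassandra.cluster import Cluster  # DataStax Python driver

cluster = Cluster(["127.0.0.1"])  # hypothetical contact point
session = cluster.connect()

# Raise the bloom filter false-positive chance to shrink its RAM
# footprint, at the cost of extra SSTable reads on misses...
session.execute(
    "ALTER TABLE myks.mytable WITH bloom_filter_fp_chance = 0.1"
)

# ...and/or switch to leveled compaction, where 0.1 is already the
# default fp chance and reads touch fewer SSTables per key.
session.execute(
    "ALTER TABLE myks.mytable "
    "WITH compaction = {'class': 'LeveledCompactionStrategy'}"
)

cluster.shutdown()
```

Note that bloom filters are built per SSTable, so a new fp chance only takes effect as SSTables are rewritten by compaction (or by forcing a rewrite, e.g. with nodetool upgradesstables).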
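And on Kanwar's constancy question, a toy cost model (my simplification, not Cassandra's actual read-path accounting; it ignores the key cache, partition index granularity, and compaction strategy) of why per-read cost tracks the SSTable count and the fp chance p rather than raw data volume:

```python
# Each read checks one in-memory bloom filter per SSTable; each false
# positive costs an extra disk read. So expected SSTables touched per
# lookup is ~1 (the one holding the row) plus p false positives each
# for the remaining SSTables.
def expected_sstable_reads(num_sstables, p, row_present=True):
    hits = 1 if row_present else 0
    false_positives = p * (num_sstables - hits)
    return hits + false_positives

for p in (0.000744, 0.1):
    print(f"p={p}: ~{expected_sstable_reads(20, p):.2f} "
          "SSTables read per lookup with 20 SSTables on disk")
# p=0.000744 -> ~1.01 reads; p=0.1 -> ~2.90 reads per lookup
```

Under this model, growing the data from 8 TB to 100 TB leaves reads roughly constant only as long as the SSTable count per read stays bounded and the bloom filters still fit in RAM, which is exactly the trade-off Dean raises.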