We have a UI that needs this data for rendering, so the efficiency of fetching it matters a lot: the data should be fetched within a minute. Is there a way to achieve that?

On Wed, Mar 18, 2015 at 4:06 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:

> Perhaps just fetch them in batches of 1000 or 2000? For 1m rows, it seems
> like the difference would only be a few minutes. Do you have to do this all
> the time, or only once in a while?
>
> On Wed, Mar 18, 2015 at 12:34 PM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>
>> yes it works for 1000 but not more than that.
>> How can I fetch all rows using this efficiently?
>>
>> On Wed, Mar 18, 2015 at 3:29 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>>
>>> Have you tried a smaller fetch size, such as 5k - 2k ?
>>>
>>> On Wed, Mar 18, 2015 at 12:22 PM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>>>
>>>> Hi Jens,
>>>>
>>>> I have tried with fetch size of 10000 still its not giving any results.
>>>> My expectations were that Cassandra can handle a million rows easily.
>>>>
>>>> Is there any mistake in the way I am defining the keys or querying them.
>>>>
>>>> Thanks
>>>> Mehak
>>>>
>>>> On Wed, Mar 18, 2015 at 3:02 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Try setting fetchsize before querying. Assuming you don't set it too
>>>>> high, and you don't have too many tombstones, that should do it.
>>>>>
>>>>> Cheers,
>>>>> Jens
>>>>>
>>>>> –
>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>>>>
>>>>> On Wed, Mar 18, 2015 at 2:58 AM, Mehak Mehta <meme...@cs.stonybrook.edu> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have requirement to fetch million row as result of my query which
>>>>>> is giving timeout errors.
>>>>>> I am fetching results by selecting clustering columns, then why the
>>>>>> queries are taking so long. I can change the timeout settings but I need
>>>>>> the data to fetched faster as per my requirement.
>>>>>>
>>>>>> My table definition is:
>>>>>> *CREATE TABLE images.results (uuid uuid, analysis_execution_id varchar,
>>>>>> analysis_execution_uuid uuid, x double, y double, loc varchar, w double,
>>>>>> h double, normalized varchar, type varchar, filehost varchar,
>>>>>> filename varchar, image_uuid uuid, image_uri varchar, image_caseid varchar,
>>>>>> image_mpp_x double, image_mpp_y double, image_width double,
>>>>>> image_height double, objective double, cancer_type varchar, Area float,
>>>>>> submit_date timestamp, points list<double>,
>>>>>> PRIMARY KEY ((image_caseid),Area,uuid));*
>>>>>>
>>>>>> Here each row is uniquely identified on the basis of unique uuid. But
>>>>>> since my data is generally queried based upon *image_caseid* I have
>>>>>> made it partition key.
>>>>>> I am currently using Java Datastax api to fetch the results. But the
>>>>>> query is taking a lot of time resulting in timeout errors:
>>>>>>
>>>>>> Exception in thread "main"
>>>>>> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
>>>>>> tried for query failed (tried: localhost/127.0.0.1:9042
>>>>>> (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for
>>>>>> server response))
>>>>>>   at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
>>>>>>   at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:289)
>>>>>>   at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:205)
>>>>>>   at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
>>>>>>   at QueryDB.queryArea(TestQuery.java:59)
>>>>>>   at TestQuery.main(TestQuery.java:35)
>>>>>> Caused by:
>>>>>> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
>>>>>> tried for query failed (tried: localhost/127.0.0.1:9042
>>>>>> (com.datastax.driver.core.exceptions.DriverException: Timed out waiting for
>>>>>> server response))
>>>>>>   at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:108)
>>>>>>   at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:179)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>   at java.lang.Thread.run(Thread.java:744)
>>>>>>
>>>>>> Also when I try the same query on console even while using limit of
>>>>>> 2000 rows:
>>>>>>
>>>>>> cqlsh:images> select count(*) from results where
>>>>>> image_caseid='TCGA-HN-A2NL-01Z-00-DX1' and Area<100 and Area>20 limit 2000;
>>>>>> errors={}, last_host=127.0.0.1
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Mehak
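
For reference, the batching idea suggested in the quoted thread (fetching 1000 or 2000 rows at a time rather than everything in one request) could be sketched roughly as below. This is only an illustration under assumptions: `Row`, `PageQuery`, `inMemory` and `fetchAll` are hypothetical names invented here, not part of the DataStax driver, and an in-memory list sorted by (Area, uuid) stands in for the real table. In a real setup each `fetch` call would presumably be a bounded query against a single `image_caseid` partition.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BatchFetchSketch {

    /** Minimal stand-in for a result row: only the two clustering columns. */
    static final class Row {
        final double area;
        final long id; // stands in for the uuid clustering column
        Row(double area, long id) { this.area = area; this.id = id; }
    }

    /** Clustering order assumed from the table definition: Area first, then uuid. */
    static final Comparator<Row> CLUSTERING_ORDER =
        Comparator.<Row>comparingDouble(r -> r.area).thenComparingLong(r -> r.id);

    /** Hypothetical page request: up to `limit` rows strictly after `after` (null = start). */
    interface PageQuery {
        List<Row> fetch(Row after, int limit);
    }

    /** In-memory PageQuery over a list that is already sorted in clustering order. */
    static PageQuery inMemory(List<Row> sorted) {
        return (after, limit) -> {
            List<Row> page = new ArrayList<>();
            for (Row r : sorted) {
                if (after != null && CLUSTERING_ORDER.compare(r, after) <= 0) {
                    continue; // already delivered in an earlier page
                }
                page.add(r);
                if (page.size() == limit) {
                    break;
                }
            }
            return page;
        };
    }

    /** Drains the source page by page, resuming after the last row seen. */
    static List<Row> fetchAll(PageQuery query, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        List<Row> all = new ArrayList<>();
        Row cursor = null;
        while (true) {
            List<Row> page = query.fetch(cursor, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // a short page means nothing is left
            }
            cursor = page.get(page.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<Row> data = new ArrayList<>();
        for (long i = 0; i < 2500; i++) {
            data.add(new Row(20 + (i / 10), i)); // ten rows share each area value
        }
        List<Row> result = fetchAll(inMemory(data), 1000);
        System.out.println(result.size());
    }
}
```

The cursor here is the full (area, id) pair rather than the area alone, so rows that share an Area value are not skipped when a page boundary falls between them. Whether this gets a million rows to the UI within a minute would still need to be measured against the actual cluster.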