> [pp] no, I didn't look at proxyhistograms; in fact I don't know how to run it.
> Can you give me insight into how to run it?
It's available in nodetool, but I cannot remember the version it was added in.
If it's not there, the information has always been available on the StorageProxyMBean.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 22/03/2013, at 5:15 PM, Pushkar Prasad <pushkar.pra...@airtightnetworks.net> wrote:

> Answers prefixed with [PP]
>
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: 21 March 2013 23:11
> To: user@cassandra.apache.org
> Subject: Re: Unable to fetch large amount of rows
>
> + Did run cfhistograms, the results are interesting (Note: row cache is disabled):
> SSTables in cfhistograms is a friend here. It tells you how many sstables were read from per read; if it's above 3, I then take a look at the data model. In your case I would be wondering how long that row with the timestamp is written to. Is it spread over many sstables?
>
> [PP] Just one SSTable
>
>> + 75% time is spent on disk latency
> Do you mean 75% of the latency reported by proxyhistograms is also reported by cfhistograms?
>
> [pp] no, I didn't look at proxyhistograms; in fact I don't know how to run it. Can you give me insight into how to run it?
>
>> +++ When query made on node on which all the records are not present
> Do you mean the co-ordinator for the request was not a replica for the row?
>
> [PP] Correct
>
>> + If my query is
>> - select * from schema where timestamp = '..' ORDER BY MacAddress,
>> would that be faster than, say
>> - select * from schema where timestamp = '..'
> As usual in a DB, it's faster to not re-order things. I'd have to check whether the ORDER BY will no-op if it's the same as the clustering columns; for now let's just keep it out.
>
>> 2) Why does response time suffer when query is made on a node on which records to be returned are not present? In order to be able to get better response when queried from a different node, can something be done?
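On the proxyhistograms question: `nodetool proxyhistograms` takes no keyspace or table arguments (it reports co-ordinator latencies node-wide), while `nodetool cfhistograms <keyspace> <cf>` gives the per-column-family view. Where an older nodetool lacks the subcommand, the StorageProxyMBean Aaron mentions can be read directly over JMX. A minimal sketch, assuming Cassandra's default JMX port 7199; the MBean object name is standard, but the attribute names here are the 1.x-era ones and changed between releases, so verify them in jconsole first:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ProxyLatency {
    // 7199 is Cassandra's default JMX port; adjust for your cluster.
    static String jmxUrl(String host) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi";
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        JMXConnector jmxc = null;
        try {
            jmxc = JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl(host)));
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName proxy = new ObjectName("org.apache.cassandra.db:type=StorageProxy");
            // Attribute names are from 1.x-era builds; confirm with jconsole.
            for (String attr : new String[] {"ReadOperations", "RecentReadLatencyMicros"}) {
                System.out.println(attr + " = " + mbs.getAttribute(proxy, attr));
            }
        } catch (Exception e) {
            // No cluster reachable: degrade gracefully rather than crash.
            System.out.println("No Cassandra JMX endpoint reachable: " + e.getMessage());
        } finally {
            if (jmxc != null) jmxc.close();
        }
    }
}
```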
> During a read, one node is asked to return the data, and the others to return a digest of their data. When the read runs on a node that is a replica, the data read is done locally and the others are asked for a digest; this can lead to better performance. If you are asking for a large row this will have a larger impact.
>
> Astyanax can direct reads to nodes which are replicas.
>
> Cheers
>
> On 21/03/2013, at 4:48 PM, Pushkar Prasad <pushkar.pra...@airtightnetworks.net> wrote:
>
> Yes, I'm reading from a single partition.
>
> -----Original Message-----
> From: Hiller, Dean [mailto:dean.hil...@nrel.gov]
> Sent: 21 March 2013 01:38
> To: user@cassandra.apache.org
> Subject: Re: Unable to fetch large amount of rows
>
> Is your use case reading from a single partition? If so, you may want to switch to something like PlayOrm, which does virtual partitions so you still get the performance of multiple disks when reading from a single partition. My understanding is that a single Cassandra partition exists on a single node. Anyway, just an option if that is your use case.
>
> Later,
> Dean
>
> From: Pushkar Prasad <pushkar.pra...@airtightnetworks.net>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wednesday, March 20, 2013 11:41 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: RE: Unable to fetch large amount of rows
>
> Hi Aaron,
>
> I added pagination, and things seem to have started performing much better. With a 1000 page size, I'm now able to fetch 500K records in 25-30 seconds.
> However, I'd like to point you to some interesting observations:
>
> + Did run cfhistograms; the results are interesting (Note: row cache is disabled):
> +++ When the query is made on a node on which all the records are present
> + 75% of the time is spent on disk latency
> + Example: when 50K entries were fetched, it took 2.65 seconds, out of which 1.92 seconds were spent in disk latency
> +++ When the query is made on a node on which all the records are not present
> + A considerable amount of time is spent on things other than disk latency (probably deserialization/serialization, network, etc.)
> + Example: when 50K entries were fetched, it took 5.74 seconds, out of which 2.21 seconds were spent in disk latency.
>
> I've used Astyanax to run the above queries. The results were the same when run with different data points. Compaction has not been done since data population yet.
>
> I've a few questions:
> 1) Is it necessary to fetch the records in the natural order of the comparator column in order to get high throughput? I'm trying to fetch all the records for a particular partition ID without any ordering on the comparator column. Would that slow down the response? Consider that timestamp is the partition ID, and MacAddress is the natural comparator column.
> + If my query is
> - select * from schema where timestamp = '..' ORDER BY MacAddress,
> would that be faster than, say,
> - select * from schema where timestamp = '..'
> 2) Why does response time suffer when the query is made on a node on which the records to be returned are not present? In order to get a better response when querying from a different node, can something be done?
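The pagination that fixed Pushkar's timeouts (Astyanax with a 1000-column page size) boils down to one pattern: fetch a bounded page, then resume the next page just after the last clustering column seen. A driver-independent sketch of that loop; `fetchPage` is a hypothetical stand-in for the real client call (in Astyanax this is handled by auto-paginating a row query), and the in-memory map stands in for one wide partition:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class PagedRead {
    // Stand-in for one wide partition: clustering column (MAC) -> value.
    static final SortedMap<String, String> partition = new TreeMap<>();

    /** Hypothetical page fetch: up to pageSize columns strictly after 'start'. */
    static List<String> fetchPage(String start, int pageSize) {
        List<String> page = new ArrayList<>();
        for (String mac : partition.tailMap(start).keySet()) {
            if (mac.equals(start)) continue; // exclusive start: resume after it
            page.add(mac);
            if (page.size() == pageSize) break;
        }
        return page;
    }

    /** Read the whole partition in constant-size requests; returns total columns read. */
    static int readAll(int pageSize) {
        int total = 0;
        String last = ""; // empty string sorts before any MAC address
        while (true) {
            List<String> page = fetchPage(last, pageSize);
            if (page.isEmpty()) break;
            total += page.size();
            last = page.get(page.size() - 1); // next page resumes from here
        }
        return total;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5000; i++) partition.put(String.format("mac-%06d", i), "row");
        System.out.println(readAll(1000)); // prints 5000
    }
}
```

Each request is small and constant-size, so no single request can hit the RPC timeout, which is exactly Aaron's "make many smaller requests" advice.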
> Thanks,
> Pushkar
> ________________________________
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: 20 March 2013 15:02
> To: user@cassandra.apache.org
> Subject: Re: Unable to fetch large amount of rows
>
> The query returns fine if I request a smaller number of entries (takes 15 seconds to return 20K records).
> That feels a little slow, but it depends on the data model, the query type, the server, and a bunch of other things.
>
> However, as I increase the limit on the number of entries, the response begins to slow down. It results in TimedOutException.
> Make many smaller requests. This is often faster.
>
> Isn't it the case that all the data for a partition ID is stored sequentially on disk?
> Yes and no. In each file all the columns of one partition / row are stored in comparator order. But there may be many files.
>
> If that is so, then why does fetching this data take such a long amount of time?
> You need to work out where the time is being spent. Add timing to your app, use nodetool proxyhistograms to see how long the request takes at the co-ordinator, and use nodetool cfhistograms to see how long it takes at the disk level.
>
> Look at your data model: are you reading data in the natural order of the comparator?
>
> If disk throughput is 40 MB/s, then assuming sequential reads, the response should come pretty quickly.
> There is more involved than doing one read from disk and returning it.
>
> If it is stored sequentially, why does C* take so much time to return the records?
> It is always going to take time to read 500,000 columns. It will take time on the client to allocate the 2 to 4 million objects needed to represent them. And once it comes to allocating those objects, it will probably take more than 40MB of RAM.
>
> Do some tests at a smaller scale, start with 500 or 1000 columns and then get bigger, to get a feel for what is practical in your environment.
> Often it's better to make many smaller / constant-size requests.
>
> Cheers
>
> On 19/03/2013, at 9:38 PM, Pushkar Prasad <pushkar.pra...@airtightnetworks.net> wrote:
>
> Aaron,
>
> Thanks for your reply. Here are the answers to the questions you had asked:
>
> I am trying to read all the rows which have a particular TimeStamp. In my database, there are 500K entries for a particular TimeStamp. That means about 40 MB of data.
>
> The query returns fine if I request a smaller number of entries (it takes 15 seconds to return 20K records). However, as I increase the limit on the number of entries, the response begins to slow down. It results in TimedOutException.
>
> Isn't it the case that all the data for a partition ID is stored sequentially on disk? If that is so, then why does fetching this data take such a long amount of time? If disk throughput is 40 MB/s, then assuming sequential reads, the response should come pretty quickly. Is it not the case that the data I am trying to fetch would be sequentially stored? If it is stored sequentially, why does C* take so much time to return the records? And if data is stored sequentially, is there any alternative that would allow me to fetch all the records quickly (by sequential disk fetch)?
>
> Thanks,
> Pushkar
>
> -----Original Message-----
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: 19 March 2013 13:11
> To: user@cassandra.apache.org
> Subject: Re: Unable to fetch large amount of rows
>
>> I have 1000 timestamps, and for each timestamp, I have 500K different MACAddresses.
> So you are trying to read about 2 million columns? 500K MACAddresses, each with 3 other columns?
>> When I run the following query, I get RPC Timeout exceptions:
> What is the exception? Is it a client-side socket timeout or a server-side TimedOutException?
>
> If my understanding is correct, then try reading fewer columns and/or check the server side for logs. It sounds like you are trying to read too much, though.
>
> Cheers
>
> On 19/03/2013, at 3:51 AM, Pushkar Prasad <pushkar.pra...@airtightnetworks.net> wrote:
>
> Hi,
>
> I have the following schema:
>
> TimeStamp
> MACAddress
> Data Transfer
> Data Rate
> LocationID
>
> PKEY is (TimeStamp, MACAddress). That means partitioning is on TimeStamp, and data is ordered by MACAddress and stored together physically (let me know if my understanding is wrong). I have 1000 timestamps, and for each timestamp, I have 500K different MACAddresses.
>
> When I run the following query, I get RPC Timeout exceptions:
>
> Select * from db_table where Timestamp='...'
>
> From my understanding, this should give all the rows with just one disk seek, as all the records for a particular timestamp are stored together. This should be very quick; however, clearly that doesn't seem to be the case. Is there something I am missing here? Your help would be greatly appreciated.
>
> Thanks,
> PP
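Pushkar's reading of the compound key is right: with PRIMARY KEY (TimeStamp, MACAddress), TimeStamp is the partition key and the rows within one partition are clustered, i.e. kept sorted, by MACAddress. A toy in-memory model of that layout (the values here are made-up sample data):

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class PartitionLayout {
    // partition key (TimeStamp) -> clustering column (MACAddress) -> other columns
    static final Map<String, SortedMap<String, String>> table = new TreeMap<>();

    static void insert(String timestamp, String mac, String data) {
        table.computeIfAbsent(timestamp, k -> new TreeMap<>()).put(mac, data);
    }

    public static void main(String[] args) {
        insert("t1", "aa:bb:cc:00:00:02", "rate=54");
        insert("t1", "aa:bb:cc:00:00:01", "rate=11");
        insert("t2", "aa:bb:cc:00:00:03", "rate=48");
        // All MACs for one timestamp live in a single partition, already sorted,
        // which is why "where timestamp = 't1'" is a mostly-sequential read.
        System.out.println(table.get("t1").keySet());
        // prints [aa:bb:cc:00:00:01, aa:bb:cc:00:00:02]
    }
}
```

This also answers the ORDER BY question from earlier in the thread: since MACAddress is the clustering column, the partition is already stored in that order, so `ORDER BY MacAddress` cannot return results any faster than the plain query.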