Re: Getting the most-recent version from time-series data

2014-02-28 Thread Tupshin Harper

You are correct that with that schema, all data for a given key would be in a single partition, and hence on the same node(s). I missed that before.

-Tupshin

On Fri, Feb 28, 2014 at 12:47 PM, Clint Kelly wrote:
> Hi Tupshin,
>
> Thanks for your help once again, I really appreciate it. Quick

Re: Getting the most-recent version from time-series data

2014-02-28 Thread Clint Kelly

Hi Tupshin,

BTW, you asked earlier about the number of distinct "family" values. There could easily be millions of different families, each with many different values. Right now I see two options:

1. Query the table once just to get all of the distinct families, then do separate

Re: Getting the most-recent version from time-series data

2014-02-28 Thread Clint Kelly

Hi Tupshin,

Thanks for your help once again, I really appreciate it. Quick question regarding the issue of token-aware routing, etc. Let's say that I am using the table described earlier:

CREATE TABLE time_series_stuff (
    key text,
    family text,
    version int,
    val text,
    PRIMARY KEY (key,

Re: Getting the most-recent version from time-series data

2014-02-26 Thread Tupshin Harper

And one last clarification. Where I said "stored procedure" earlier, I meant "prepared statement". Sorry for the confusion. Too much typing while tired.

-Tupshin

On Tue, Feb 25, 2014 at 10:36 PM, Tupshin Harper wrote:
> I failed to address the matter of not knowing the families in advance.

Re: Getting the most-recent version from time-series data

2014-02-25 Thread Tupshin Harper

I failed to address the matter of not knowing the families in advance. I can't really recommend any solution to that other than storing the list of families in another structure that is readily queryable. I don't know how many families you are thinking, but if it is in the millions or more, You mi

Re: Getting the most-recent version from time-series data

2014-02-25 Thread Tupshin Harper

Hi Clint,

What you are describing could actually be accomplished with the Thrift API and a multiget_slice with a slicerange having a count of 1. Initially I was thinking that this was an important feature gap between Thrift and CQL, and was going to suggest that it should be implemented (possible

Re: Getting the most-recent version from time-series data

2014-02-25 Thread Clint Kelly

Hi Jonathan,

Thanks for the suggestion! I see a couple of problems with this approach:

1. I do not know a priori all of the family names (so I still would not know what value to use for LIMIT).
2. The "versions" here are similar to timestamps, so one "family" may get updated far more often than

Re: Getting the most-recent version from time-series data

2014-02-25 Thread Jonathan Lacefield

Clint,

One approach would be to create a copy of this table and switch the clustering columns around so version precedes family. This way you could easily grab the 1st, 2nd, ..., Nth version rows. Would this help you in your situation?

Jonathan

> On Feb 25, 2014, at 7:49 PM, Clint Kelly wrote:
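To make the trade-off in this suggestion concrete, here is a minimal in-memory sketch in Python. It is not Cassandra code. It assumes the original table clusters rows by (family, version), which the archived snippet does not confirm because the PRIMARY KEY clause is cut off, and the rows and helper names are hypothetical. It compares picking the newest row per family with taking the first N rows of a copy ordered by version first.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows as (key, family, version, val); values are made up.
ROWS = [
    ("k1", "a", 1, "a-v1"),
    ("k1", "a", 2, "a-v2"),
    ("k1", "a", 3, "a-v3"),
    ("k1", "b", 1, "b-v1"),
]


def by_family_then_version(rows):
    """Order rows as if clustered by (family, version DESC). Assumed layout."""
    return sorted(rows, key=lambda r: (r[1], -r[2]))


def by_version_then_family(rows):
    """Order rows as if clustered by (version DESC, family), i.e. the swapped copy."""
    return sorted(rows, key=lambda r: (-r[2], r[1]))


def latest_per_family(rows):
    """Return the highest-version row of each family."""
    ordered = by_family_then_version(rows)
    # Within each family group the first row has the highest version.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(1))]


def newest_n(rows, n):
    """Return the first n rows of the version-first ordering."""
    return by_version_then_family(rows)[:n]


if __name__ == "__main__":
    print(latest_per_family(ROWS))
    print(newest_n(ROWS, 2))
```

In this toy data, family "a" has three versions and family "b" has one, so the first two rows of the version-first ordering both belong to "a" and family "b" is not returned at all. That is the same concern Clint raises in his reply: when one family is updated far more often than another, a fixed row limit over a version-first ordering need not cover every family.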

8 matches
