Hi Vidur, I had a go at your solution but the problem is that it doesn't match routes which are valid all throughtout the range queried. For example if I have route that is valid for all of Jan 2016. I will have a table that looks something like this:
start | end | valid New York Washington 2016-01-01 New York Washington 2016-01-31 So if I query for ranges that have at least one bound outside Jan (e.g Jan 15 - Feb 15) then the query you gave will work fine. If, however, I query for a range that is completely inside Jan e.g all routes valid on Jan 15th, The I think I'll end up with a query like: SELECT * from routes where start = 'New York' and end = 'Washington' and valid <= 2016-01-15 and valid >= 2016-01-15. which will return 0 results as it would only match routes that have a valid of 2016-01-15 exactly. thanks, Chris On Wed, Mar 23, 2016 at 11:19 PM, Vidur Malik <vi...@shopkeep.com> wrote: > Flip the problem over. Instead of storing validTo and validFrom, simply > store a valid field and partition by (start, end). This may sound wasteful, > but disk is cheap: > > CREATE TABLE routes ( > start text, > end text, > valid timestamp, > PRIMARY KEY ((start, end), valid) > ); > > Now, you can execute something like: > > SELECT * from routes where start = 'New York' and end = 'Washington' and > valid <= 2016-01-31 and valid >= 2016-01-01. > > > On Wed, Mar 23, 2016 at 5:08 PM, Chris Martin <ch...@cmartinit.co.uk> > wrote: > >> Hi all, >> >> I have a table that represents a train timetable and looks a bit like >> this: >> >> CREATE TABLE routes ( >> start text, >> end text, >> validFrom timestamp, >> validTo timestamp, >> PRIMARY KEY (start, end, validFrom, validTo) >> ); >> >> In this case validFrom is the date that the route becomes valid and >> validTo is the date that the route that stops becoming valid. >> >> If this was SQL I could write a query to find all valid routes between >> New York and Washington from Jan 1st 2016 to Jan 31st 2016 using something >> like: >> >> SELECT * from routes where start = 'New York' and end = 'Washington' and >> validFrom <= 2016-01-31 and validTo >= 2016-01-01. >> >> As far as I can tell such a query is impossible with CQL and my current >> table structure. I'm considering running a query like: >> >> SELECT * from routes where start = 'New York' and end = 'Washington' and >> validFrom <= 2016-01-31 >> >> And then filtering the rest of the data app side. This doesn't seem >> ideal though as I'm going to end up fetching much more data (probably >> around an order of magnitude more) from Cassandra than I really want. >> >> Is there a better way to model the data? >> >> thanks, >> >> Chris >> >> >> >> > > > -- > > Vidur Malik > > [image: ShopKeep] <http://www.shopkeep.com> > > 800.820.9814 > <8008209814> > [image: ShopKeep] <https://www.facebook.com/ShopKeepPOS> [image: ShopKeep] > <https://twitter.com/shopkeep> [image: ShopKeep] > <https://instagram.com/shopkeep/> >