You can change the table to support Multi-column slice restrictions

CREATE TABLE routes (
start text,
end text,
year int,
month int,
day int,
PRIMARY KEY (start, end, year, month, day)
);

Then using Multi-column slice restrictions you can query:

SELECT * from routes where start = 'New York' and end = 'Washington' and 
(year,month,day) >= (2016,1,1) and (year,month,day) <= (2016,1,31);

For more details about Multi-column slice restrictions read 
http://www.datastax.com/dev/blog/a-deep-look-to-the-cql-where-clause

Oded

From: Chris Martin [mailto:ch...@cmartinit.co.uk]
Sent: Thursday, March 24, 2016 9:40 AM
To: user@cassandra.apache.org
Subject: Re: DataModelling to query date range

Ah- that looks interesting!  I'm actaully still on cassandra 2.x but I was 
planning on updgrading anyway.  Once I do so I'll check this one out.


Chris


On Thu, Mar 24, 2016 at 2:57 AM, Henry M 
<henrymanm...@gmail.com<mailto:henrymanm...@gmail.com>> wrote:
I haven't tried the new SASI indexer but it may help: 
https://github.com/apache/cassandra/blob/trunk/doc/SASI.md


On Wed, Mar 23, 2016 at 2:08 PM, Chris Martin 
<ch...@cmartinit.co.uk<mailto:ch...@cmartinit.co.uk>> wrote:
Hi all,

I have a table that represents a train timetable and looks a bit like this:


CREATE TABLE routes (

start text,

end text,

validFrom timestamp,

validTo timestamp,

PRIMARY KEY (start, end, validFrom, validTo)

);

In this case validFrom is the date that the route becomes valid and validTo is 
the date that the route that stops becoming valid.

If this was SQL I could write a query to find all valid routes between New York 
and Washington from Jan 1st 2016 to Jan 31st 2016 using something like:

SELECT * from routes where start = 'New York' and end = 'Washington' and 
validFrom <= 2016-01-31 and validTo >= 2016-01-01.

As far as I can tell such a query is impossible with CQL and my current table 
structure.  I'm considering running a query like:

SELECT * from routes where start = 'New York' and end = 'Washington' and 
validFrom <= 2016-01-31
And then filtering the rest of the data app side.  This doesn't seem ideal 
though as I'm going to end up fetching much more data (probably around an order 
of magnitude more) from Cassandra than I really want.

Is there a better way to model the data?

thanks,

Chris







Reply via email to