Hello,

I'm new to the list and also to Cassandra. I found it when I was searching
for something to replace our busy mysql server.

One of the things we use the server for is filtering IPs based on a list of
IP ranges. These ranges can be small and big, and there are about 50k of
them in the database.

In mysql this is pretty quick: they are stored as integers, and the query
basically looks like (say ip is the ip we want to find the all the ranges
for):

select range from rangelist where ip_start<=ip and ip_end>=ip;

I tried to move this schema to Cassandra, but it turned out to be very
slow, even with indexes on both columns. Since I also had to have an EQ
expression in the query, I added an indexed text field which was the same
for all rows, so the query in cassandra was something like this:

select range from rangelist where type='ip' and ip_start<=ip and ip_end>=ip;

This was very slow, and I imagine it is because it has to scan through all
the rows, making the index useless.

The second thing I tried was to just expand the ranges and store individual
IPs as the keys to a column family. This is very fast to query, but the
problem is that I now have over 2.7 million rows, because some of the
ranges are quite large.

As the number of ranges could change, this method could be a problem -
imagine we add a whole A-class range, it would explode into millions of
rows.

My question is, is there a more sane way to store this information, while
still being able to find all the IP ranges that have the given IP in them?

I've been only dealing with Cassandra for a week or two, so I don't know
about the inner details of what can be done, but I do have programming
experience and am not afraid to get my hands dirty, in case it can be
solved by writing some extension to Cassandra.

Looking forward to any suggestions.

Thanks,
Tamas

Reply via email to