General questions about Cassandra

Alessio Cecchi Fri, 17 Feb 2012 04:43:25 -0800

Hi,

we have developed software that stores logs from mail servers in MySQL, but for huge environments we are developing a version that stores this data in HBase. Once a day the raw logs are first normalized, so that the output looks like this:


username,date of login, IP Address, protocol
username,date of login, IP Address, protocol
username,date of login, IP Address, protocol
[...]

and then inserted into the database.
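To make the normalized format concrete, here is a minimal Python sketch of how one such line could be parsed before insertion. It is only an illustration, not our actual code: the `LoginRecord` type, the `parse_normalized` function and the ISO 8601 date format are assumptions made for the example.

```python
import csv
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class LoginRecord(NamedTuple):
    """One normalized login event (hypothetical type for this example)."""
    username: str
    login_time: datetime
    ip_address: str
    protocol: str


def parse_normalized(lines: Iterable[str]) -> Iterator[LoginRecord]:
    """Parse 'username,date of login, IP Address, protocol' lines.

    Assumes the date is ISO 8601 (e.g. '2012-02-17 04:43:25');
    lines without exactly four fields are skipped.
    """
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != 4:
            continue  # ignore malformed lines
        username, login_time, ip_address, protocol = (f.strip() for f in row)
        yield LoginRecord(
            username,
            datetime.fromisoformat(login_time),
            ip_address,
            protocol,
        )
```

Each yielded record would then correspond to one row written to the database.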

As I was saying, for huge installations (from 1 to 10 million logins per day, kept for 12 months) we are working with HBase, but I would also consider Cassandra.

The advantage of HBase is MapReduce, which makes searching the logs very fast by splitting the "query" concurrently across multiple hosts.

Queries will be launched from a web interface (there will be only a few requests per day), and the search keys are user and time range.
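To illustrate that access pattern, here is a small in-memory Python sketch of a lookup by user and time range. It assumes, purely for the example, that entries are kept sorted by a composite key (username, login time); the `search` function is hypothetical and does not use any HBase or Cassandra API.

```python
import bisect
from datetime import datetime


def search(sorted_keys, username, start, end):
    """Return the keys for `username` with start <= login time < end.

    `sorted_keys` is a sorted list of (username, login_time) tuples,
    so all entries of one user are contiguous and ordered by time.
    """
    lo = bisect.bisect_left(sorted_keys, (username, start))
    hi = bisect.bisect_left(sorted_keys, (username, end))
    return sorted_keys[lo:hi]
```

With keys ordered this way, a query for one user and one time range touches a single contiguous slice of the data.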

But Cassandra seems less complex to manage and simpler to run, so I want to evaluate it instead of HBase.

My question is: can Cassandra also split a "query" over the cluster like MapReduce? From what I read online, Cassandra seems fast at inserting data but slower than HBase at "querying". Is that really so?


We do not want to install Hadoop on top of Cassandra.

Any suggestions are welcome :-)

--
Alessio Cecchi is:
@ ILS ->  http://www.linux.it/~alessice/
on LinkedIn ->  http://www.linkedin.com/in/alessice
GNU/Linux Systems Support ->  http://www.cecchi.biz/
@ PLUG ->  former President, now senator for life, http://www.prato.linux.it
@ LOLUG ->  Member http://www.lolug.net
