I suppose he should buy http://shop.oreilly.com/product/0636920010852.do
to get an idea of what Cassandra can and can't do. That's just my personal
opinion.

--  
francesco.tangari....@gmail.com
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Friday, 17 February 2012, at 17:59, Chris Gerken wrote:

> In response to an offline question…
>  
> There are two usage patterns for Cassandra column families, static and 
> dynamic. With both approaches you store objects of a given type into a column 
> family.
>  
> With static usage the object type you're persisting has a single key and each 
> row in the column family maps to a single object. The value of an object's 
> key is stored in the row key and each of the object's properties is stored in 
> a column whose name is the name of the property and whose value is the 
> property value. There are the same number of columns in a row as there are 
> non-null property values. This usage is very much like traditional relational 
> database usage.
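
A minimal sketch of the static pattern described above, using a plain Python dict as a stand-in for a column family (row key -> {column name -> value}). This is only a toy model of the layout, not a Cassandra client API; the names `to_static_row` and `users_cf` and the sample user are hypothetical.

```python
# Toy in-memory model of a column family: row key -> {column name -> value}.
# Assumption: this only illustrates the layout; it is not a Cassandra API.

def to_static_row(obj, key_field):
    """Map one object to one row.

    The value of the key field becomes the row key; every other
    non-null property becomes a column named after the property.
    """
    row_key = obj[key_field]
    columns = {
        name: value
        for name, value in obj.items()
        if name != key_field and value is not None
    }
    return row_key, columns


users_cf = {}  # hypothetical "users" column family

user = {"username": "alice", "email": "alice@example.org", "phone": None}
row_key, columns = to_static_row(user, "username")
users_cf[row_key] = columns  # one row per object; the null "phone" gets no column
```
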
>  
> With dynamic usage the object type to be persisted has two keys (I'll get to 
> composite keys in a bit). With this approach the value of an object's primary 
> key is stored as a row key and the entire object is stored in a single column 
> whose name is the value of the object's secondary key and whose value is the 
> entire object (serialized into a ByteBuffer). This results in persisting 
> potentially many objects in a single row. All of those objects have the same 
> primary key and there are as many columns as there are objects with the same 
> primary key. An example of this approach is a time series column family in 
> which each row holds weather readings for a different city and each column in 
> a row holds all of the weather observations for that city at a certain time. 
> The timestamp is used as a column name and an object holding all the 
> observations is serialized and stored in the corresponding column value.
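
The weather example above could be sketched roughly as follows. It is again a toy in-memory model rather than real Cassandra code: JSON-encoded bytes stand in for the serialized ByteBuffer, integer epoch seconds stand in for the timestamp column names, and `TimeSeriesRow`, `record` and the sample readings are made up for illustration. The columns of a row are kept sorted by name, which is what makes a time-range read a simple slice.

```python
import bisect
import json


class TimeSeriesRow:
    """One wide row whose columns are kept sorted by name (a timestamp)."""

    def __init__(self):
        self._names = []   # sorted column names (timestamps)
        self._values = {}  # column name -> serialized object (bytes)

    def put(self, ts, obj):
        """Store the whole object, serialized, in the column named ts."""
        if ts not in self._values:
            bisect.insort(self._names, ts)
        self._values[ts] = json.dumps(obj).encode("utf-8")

    def slice(self, start, end):
        """Return (timestamp, object) pairs with start <= timestamp <= end."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_right(self._names, end)
        return [(ts, json.loads(self._values[ts])) for ts in self._names[lo:hi]]


weather_cf = {}  # row key (city) -> TimeSeriesRow


def record(city, ts, observations):
    """Add one column to the city's row, creating the row if needed."""
    weather_cf.setdefault(city, TimeSeriesRow()).put(ts, observations)


record("Austin", 1329498000, {"temp_c": 18.5, "humidity": 0.61})
record("Austin", 1329501600, {"temp_c": 19.0, "humidity": 0.58})
record("Dallas", 1329498000, {"temp_c": 16.0, "humidity": 0.70})

# All Austin readings in a time window: one row, one contiguous column range.
readings = weather_cf["Austin"].slice(1329498000, 1329500000)
```
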
>  
> Cassandra is a powerful database in general, but performance-wise it excels 
> at reading and writing time series data stored in a dynamic column family.
>  
> There are variations of the above patterns. For example, you can use 
> composite types to define a row key or column name that is made up of the 
> values of multiple keys.
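
As a rough illustration of the composite idea, Python tuples can play the role of composite row keys and column names, since tuples compare component by component. Everything here is an assumption made for the example: the (city, month) row key, the (timestamp, station) column name, and the helper `put` are hypothetical and do not reflect Cassandra's actual composite type API.

```python
import bisect


def put(cf, row_key, column_name, value):
    """Insert into a toy column family, keeping each row's columns sorted by name."""
    row = cf.setdefault(row_key, [])  # list of (column name, value) pairs
    names = [name for name, _ in row]
    i = bisect.bisect_left(names, column_name)
    if i < len(row) and row[i][0] == column_name:
        row[i] = (column_name, value)      # overwrite an existing column
    else:
        row.insert(i, (column_name, value))


readings_cf = {}

# Composite row key: (city, month). Composite column name: (timestamp, station).
put(readings_cf, ("Austin", "2012-02"), (1329501600, "station-a"), b"obs-3")
put(readings_cf, ("Austin", "2012-02"), (1329498000, "station-b"), b"obs-2")
put(readings_cf, ("Austin", "2012-02"), (1329498000, "station-a"), b"obs-1")

# Columns sort by timestamp first, then by station within the same timestamp.
column_names = [name for name, _ in readings_cf[("Austin", "2012-02")]]
```
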
>  
> I recently gave a presentation on Cassandra patterns to the Austin 
> Cassandra Meetup. You can find my charts there in the archives or posted to 
> my box at the LinkedIn site below, or contact me offline.
>  
> To bring this back to the original question: asking for the ability to apply 
> a Java method to selected rows makes sense for static column families, but I 
> think the more general need is to be able to apply a Java method to selected 
> persisted objects in a column family regardless of static or dynamic usage. 
> While I'm on my soapbox, I think this requirement applies to Pig support as 
> well.
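
To make the "objects, not rows" point concrete, here is a small sketch of a persistence layer that hides the static/dynamic difference behind object iterators, so that one function can be applied to selected objects either way. It runs locally and sequentially over toy dicts; it says nothing about how such a method would be distributed across Cassandra nodes. All names (`iter_static`, `iter_dynamic`, `apply_to_selected`) and the sample data are hypothetical.

```python
import json


def iter_static(cf, key_field):
    """Static usage: each row is one object, rebuilt from row key plus columns."""
    for row_key, columns in cf.items():
        obj = dict(columns)
        obj[key_field] = row_key
        yield obj


def iter_dynamic(cf, primary_field, secondary_field):
    """Dynamic usage: each column holds one serialized object."""
    for row_key, columns in cf.items():
        for column_name, blob in columns.items():
            obj = json.loads(blob.decode("utf-8"))
            obj[primary_field] = row_key
            obj[secondary_field] = column_name
            yield obj


def apply_to_selected(objects, predicate, method):
    """Apply method to every object that satisfies predicate."""
    return [method(obj) for obj in objects if predicate(obj)]


users_cf = {
    "alice": {"email": "alice@example.org"},
    "bob": {"email": "bob@example.org"},
}
weather_cf = {
    "Austin": {
        1329498000: b'{"temp_c": 18.5}',
        1329501600: b'{"temp_c": 19.0}',
    },
}

# The caller works with objects and never sees rows or columns.
emails = apply_to_selected(
    iter_static(users_cf, "username"),
    lambda u: u["username"] == "alice",
    lambda u: u["email"],
)
warm = apply_to_selected(
    iter_dynamic(weather_cf, "city", "timestamp"),
    lambda o: o["temp_c"] > 18.7,
    lambda o: (o["city"], o["timestamp"]),
)
```
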
>  
> thx
>  
> Chris Gerken
>  
> chrisger...@mindspring.com (mailto:chrisger...@mindspring.com)
> 512.587.5261
> http://www.linkedin.com/in/chgerken
>  
>  
>  
> On Feb 17, 2012, at 10:07 AM, Chris Gerken wrote:
>  
> > Don,
> >  
> > That's a good idea, but you have to be careful not to preclude the use of 
> > dynamic column families (e.g. CFs with time series-like schemas), which is 
> > what Cassandra's best at. The right approach is to build your own 
> > "ORM"/persistence layer (or generate one with some tools) that can hide the 
> > API differences between static and dynamic CFs. Once you're there, Hadoop 
> > and Pig both come very close to what you're asking for.
> >  
> > In other words, you should be asking for a means to apply a Java method to 
> > selected objects (not rows) that are persisted in a Cassandra column family.
> >  
> > thx
> >  
> > - Chris
> >  
> > Chris Gerken
> >  
> > chrisger...@mindspring.com (mailto:chrisger...@mindspring.com)
> > 512.587.5261
> > http://www.linkedin.com/in/chgerken
> >  
> >  
> >  
> > On Feb 17, 2012, at 9:35 AM, Don Smith wrote:
> >  
> > > Are there plans to build some sort of map-reduce framework into 
> > > Cassandra and CQL? It seems that users should be able to apply a Java 
> > > method to selected rows in parallel on the distributed Cassandra JVMs. I 
> > > believe Solandra uses such an integration.
> > >  
> > > Don
> > > ________________________________________
> > > From: Alessio Cecchi [ales...@skye.it (mailto:ales...@skye.it)]
> > > Sent: Friday, February 17, 2012 4:42 AM
> > > To: user@cassandra.apache.org (mailto:user@cassandra.apache.org)
> > > Subject: General questions about Cassandra
> > >  
> > > Hi,
> > >  
> > > We have developed software that stores logs from mail servers in MySQL,
> > > but for huge environments we are developing a version that stores this
> > > data in HBase. Once a day, the raw logs are first normalized, so the
> > > output looks like this:
> > >  
> > > username,date of login, IP Address, protocol
> > > username,date of login, IP Address, protocol
> > > username,date of login, IP Address, protocol
> > > [...]
> > >  
> > > and then inserted into the database.
> > >  
> > > As I was saying, for huge installations (from 1 to 10 million logins
> > > per day, kept for 12 months) we are working with HBase, but I would also
> > > consider Cassandra.
> > >  
> > > The advantage of HBase is MapReduce which makes searching the logs very
> > > fast by splitting the "query" concurrently on multiple hosts.
> > >  
> > > Queries will be launched from a web interface (there will be only a few
> > > requests per day), and the search keys are user and time range.
> > >  
> > > But Cassandra seems less complex to manage and simpler to run, so I want
> > > to evaluate it instead of HBase.
> > >  
> > > My question is: can Cassandra also split a "query" over the cluster like
> > > MapReduce? From what I read online, Cassandra seems fast at inserting
> > > data but slower than HBase at "querying". Is that really so?
> > >  
> > > We do not want to install Hadoop on top of Cassandra.
> > >  
> > > Any suggestion is welcome :-)
> > >  
> > > --
> > > Alessio Cecchi is:
> > > @ ILS -> http://www.linux.it/~alessice/
> > > on LinkedIn -> http://www.linkedin.com/in/alessice
> > > Assistenza Sistemi GNU/Linux -> http://www.cecchi.biz/
> > > @ PLUG -> former President, now senator for life, http://www.prato.linux.it
> > > @ LOLUG -> Member http://www.lolug.net
> > >  
> >  
> >  
>  
>  
>  

