Re: pig_cassandra problem - "Incompatible field schema" error

2011-10-17 Thread Pete Warden
I've dug deeper into this, since this got my script running but still left me at sea when dealing with the actual data. It's looking like there may be a mismatch between the schema that's being reported by CassandraStorage.java, and the data that's actually returned. Here's an example: rows = LOAD

Re: Storing pre-sorted data

2011-10-17 Thread Matthias Pfau
David, thanks for your nice summary on this topic. We would be very happy if cassandra would give us an option to maintain the sort order on our own (application logic). That is why it would be interesting to hear from any of the developers if it would be easily possible to add such a feature

One CF vs several CFs

2011-10-17 Thread Chintana Wilamuna
Hi, Does anyone have an idea about the pros/cons with modeling your data in the following way. First is you write all your data within a single CF. Using the infamous blog example, Posts = { // CF slug-1: { // key to the row inside CF title: "...", body: ".

Re: Storing pre-sorted data

2011-10-17 Thread Matthias Pfau
Thanks for that hint! However, it seems like soundex is a very language specific algorithm (US English). We have to get into this topic further... Kind regards Matthias On 10/13/2011 10:43 PM, Stephen Connolly wrote: Then just use a soundex function on the first word in the text... that will s
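To make the language-specificity concern concrete, here is a minimal sketch of classic American Soundex. It is only an illustration of the general algorithm, not code from this thread; the function name `soundex` and the decision to drop every non-ASCII letter are assumptions made for the example.

```python
# Sketch of American Soundex. The letter groups below are tuned to English
# consonant sounds, which is why the scheme transfers poorly to other languages.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code for word ('' if it has no A-Z letters)."""
    # Assumption for this sketch: anything outside A-Z (umlauts, accents) is dropped.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return "".join(out)[:4].ljust(4, "0")
```

Because accented characters are simply discarded here, names that differ only in such characters collapse to the same code, which is one way the English bias shows up.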

Re: pig_cassandra problem - "Incompatible field schema" error

2011-10-17 Thread Pete Warden
JIRA filed, with a messy patch too: https://issues.apache.org/jira/browse/CASSANDRA-3371 cheers, Pete On Mon, Oct 17, 2011 at 2:27 AM, Pete Warden wrote: > I've dug deeper into this, since this got my script running but still left > me at sea when dealing with the actual data. It's l

Re: Cassandra server log continuously the same message

2011-10-17 Thread Jonathan Ellis
It's a debug message. Don't log at debug if you don't want lots of output... 2011/10/17 Thibaut Détrée : > Hello, > > > > I’ve currently upgraded my cluster to the new Cassandra 0.8.7 stable > release. However, I get a strange message in the Cassandra server console > and I'm not able to find out

Re: One CF vs several CFs

2011-10-17 Thread Konstantin Naryshkin
Method 1 may also result in very wide rows if you have lots and lots of tags and comments. This is a very drastic inefficiency for Cassandra (but again, it depends on your data). On Mon, Oct 17, 2011 at 05:40, Chintana Wilamuna wrote: > Hi, > > Does anyone have an idea about the pros/cons with mod

Re: Cassandra server log continuously the same message

2011-10-17 Thread Thibaut Détrée
Ok thank you for your answer, however is it possible to use an xml log4j configuration file instead of the properties one (deprecated) ? Thanks, Thibaut Détrée > Message du 17/10/11 15:22 > De : "Jonathan Ellis" > A : user@cassandra.apache.org, "Thibaut Détrée" > Copie à : > Objet : R

Re: Cassandra server log continuously the same message

2011-10-17 Thread Jonathan Ellis
Right now AbstractCassandraDaemon only uses PropertyConfigurator but we'd be happy to review a patch to add xml support. 2011/10/17 Thibaut Détrée : > Ok thank you for your answer, however is it possible to use an xml log4j > configuration file instead of the properties one (deprecated) ? > > Than

Re: show schema fails

2011-10-17 Thread aaron morton
Hi there, If you start cassandra-cli with --debug it will output a stack trace if the error is client side. Otherwise check the server log, by default it's in /var/log/cassandra/system.log Thanks. - Aaron Morton Freelance Developer @aaronmorton http://www.the

Re: CassandraDaemon deactivate doesn't shutdown Cassandra

2011-10-17 Thread aaron morton
What measure are you using to say Cassandra does not shut down ? Can you get a thread dump to see what's still running ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 16/10/2011, at 8:50 AM, Shimi Kiviti wrote: > The problem

Re: Storing pre-sorted data

2011-10-17 Thread aaron morton
Sort order is determined by the Comparator, which is an implementation of the o.a.c.db.marshal.AbstractType class. If you wish to order column (names) in a row based on an opaque (to cassandra) byte value you can create your own implementation. You would then need to decrypt and compare colum
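The idea of ordering by an opaque value that only the comparator can interpret can be sketched outside Cassandra as follows. This is not the AbstractType API; `decrypt`, `compare_column_names`, and the XOR "cipher" are hypothetical stand-ins chosen so the example runs on its own.

```python
from functools import cmp_to_key


def decrypt(name, key=0x5A):
    """Toy XOR transform standing in for real decryption (hypothetical)."""
    return bytes(b ^ key for b in name)


def compare_column_names(a, b):
    """Compare two stored (opaque) names by their decrypted content.

    Returns a negative, zero, or positive int, like a comparator would.
    """
    pa, pb = decrypt(a), decrypt(b)
    return (pa > pb) - (pa < pb)


# XOR is its own inverse, so the same function "encrypts" the sample names.
stored = [decrypt(n) for n in (b"pear", b"apple", b"fig")]
ordered = sorted(stored, key=cmp_to_key(compare_column_names))
```

The stored bytes remain opaque to the storage layer; only the comparison function needs the key, which is the trade-off the message describes.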

Re: One CF vs several CFs

2011-10-17 Thread aaron morton
It depends on what your workload is and how you want to read the data. If you want to get all the data for an article every time, and the number of comments is not huge go with option 1. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.c

Re: Massive writes when only reading from Cassandra

2011-10-17 Thread Jeremy Hanna
Even after disabling hinted handoff and setting read_repair_chance to 0 on all our column families, we were still experiencing massive writes. Apparently the read_repair_chance is completely ignored at any CL higher than CL.ONE. So we were doing CL.QUORUM on reads and writes and seeing massive

Using elasticsearch on cassandra nodes

2011-10-17 Thread Anthony Ikeda
I've already posted to the elasticsearch groups and thought it prudent to also ask here. We are looking at using elastic search to index our data that we currently store to Cassandra. I was wondering if there are any concerns running elastic search on the same nodes that we use for Cassandra? We h

Re: how to reduce disk read? (and bloom filter performance)

2011-10-17 Thread Mohit Anchlia
On Sun, Oct 16, 2011 at 2:20 AM, Radim Kolar wrote: > Dne 10.10.2011 18:53, Mohit Anchlia napsal(a): >> >> Does it mean you are not updating a row or deleting them? > > yes. i have 350m rows and only about 100k of them are updated. >> >>  Can you look at JMX values of >> >> BloomFilter* ? > > i co

Re: show schema fails

2011-10-17 Thread Radim Kolar
Dne 17.10.2011 22:06, aaron morton napsal(a): Hi there, If you start cassandra-cli with --debug it will output a stack trace if the error is client side. A long is exactly 8 bytes: 5 java.lang.RuntimeException: A long is exactly 8 bytes: 5 at org.apache.cassandra.cli.CliClient.
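For readers unfamiliar with this kind of error, the sketch below shows why a value that is not 8 bytes long cannot be decoded as a 64-bit long. It assumes the trailing number in the message is the byte length of the offending value; that reading, and the helper names `encode_long` and `decode_long`, are assumptions for illustration rather than Cassandra code.

```python
import struct


def encode_long(value):
    """Pack an integer as a big-endian signed 64-bit value (always 8 bytes)."""
    return struct.pack(">q", value)


def decode_long(raw):
    """Unpack 8 bytes into an integer, rejecting any other length."""
    if len(raw) != 8:
        # Mirrors the wording of the reported message; the number is len(raw).
        raise ValueError("A long is exactly 8 bytes: %d" % len(raw))
    return struct.unpack(">q", raw)[0]
```

Under that assumption, a 5-byte value such as a short text string stored where a long is expected would trigger exactly this kind of failure.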

Re: how to reduce disk read? (and bloom filter performance)

2011-10-17 Thread Radim Kolar
Look in jconsole -> org.apache.cassandra.db -> ColumnFamilies. The bloom filter false ratio on this server is 0.0018, and 0.06% of reads hit more than 1 sstable. From Cassandra's point of view, it looks good.

Re: Storing pre-sorted data

2011-10-17 Thread David Jeske
On Mon, Oct 17, 2011 at 2:39 AM, Matthias Pfau wrote: > We would be very happy if cassandra would give us an option to maintain the > sort order on our own (application logic). That is why it would be > interesting to hear from any of the developers if it would be easily > possible to add such a