Re: Instability and memory problems

2010-06-21 Thread Peter Schuller
>> (1) Is the machine swapping? (Actively swapping in/out as reported by >> e.g. vmstat) > > Yes, somewhat, although swappiness is set to 0. Ok. While I have no good suggestion to fix it other than moving away from mmap(), given that a low swappiness didn't help, I'd say that as long as you're swa

Re: Instability and memory problems

2010-06-21 Thread Peter Schuller
> How much of your physical RAM is dedicated to the JVM? > > I forgot to say that you probably should consider lowering it > significantly (to be continued, getting off the subway...). So, it occurred to me you reported a 16 GB maximum heap size. If that is a substantial portion of your total physi

Is there a penalty to a SuperColumn?

2010-06-21 Thread David Boxenhorn
I have a column family that doesn't need to be a supercolumn family right now, but I think it *might* need to be one in the future. I'm considering making it a supercolumn family with only one supercolumn per row to give me flexibility going forward. My question: Is there a penalty to this? If the

Re: Is there a penalty to a SuperColumn?

2010-06-21 Thread aaron morton
The only one I know is the one listed in http://wiki.apache.org/cassandra/CassandraLimitations The sub columns in a super column are not indexed, so the entire super column must be read into memory when accessed. I've tried using Super Columns and namespacing columns in a standard column, e.g
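The namespacing alternative mentioned in this message can be sketched roughly as follows. This is a plain-Python illustration using in-memory data only, not Cassandra client code; the `:` separator, the helper names `flatten` and `slice_prefix`, and the sample data are all assumptions made for the example, since the original message is cut off before it shows its own naming scheme.

```python
from bisect import bisect_left

# Hypothetical stand-in for one row of a super column family:
# each super column name maps to its sub columns.
super_row = {
    "address": {"city": "Oslo", "zip": "0150"},
    "phone": {"home": "555-0100"},
}


def flatten(row, sep=":"):
    """Turn {group: {sub: value}} into a sorted list of
    ("group<sep>sub", value) pairs, mimicking a standard row whose
    column names carry the group as a prefix (sep is an assumption)."""
    return sorted(
        (f"{group}{sep}{sub}", value)
        for group, subs in row.items()
        for sub, value in subs.items()
    )


def slice_prefix(sorted_columns, prefix):
    """Return only the columns whose name starts with prefix.
    Because the names are sorted, the scan can start at the first
    candidate and stop at the first non-matching name."""
    names = [name for name, _ in sorted_columns]
    start = bisect_left(names, prefix)
    result = []
    for name, value in sorted_columns[start:]:
        if not name.startswith(prefix):
            break
        result.append((name, value))
    return result


flat_row = flatten(super_row)
address_columns = slice_prefix(flat_row, "address:")
```

With the namespaced layout, a read for one group touches only the matching run of sorted column names, whereas the super column layout described above has to load the whole super column.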

New to cassandra

2010-06-21 Thread Ajay Singh
Hi, I am a PHP developer and I am new to Cassandra. Is there a getting-started guide or tutorial from which I can begin? Thanks, Ajay

Re: bulk loading

2010-06-21 Thread aaron morton
You should be using the thrift API, or a wrapper around the thrift API. It looks like you're using internal cassandra classes. There is a Java wrapper called Hector, and there was another talked about on the mail list recently. There is also a bulk import / export tool see http://wiki.apache.o

Re: django or pylons

2010-06-21 Thread Eugenio Minardi
Hi, I had a look at django + cassandra and found the twissandra project (a django version of twitter based on cassandra). But since I am new to django I couldn't make it work. If you find it interesting please give me a hint on how to proceed to make it work :) Eugenio On Mon, Jun 21, 2010 at

Re: bulk loading

2010-06-21 Thread Torsten Curdt
> You should be using the thrift API, or a wrapper around the thrift API. It > looks like you're using internal cassandra classes. The goal is to get around using the overhead of the Thrift API for a bulk import. > There is a Java wrapper called Hector, and there was another talked about on > t

Re: Uneven distribution using RP

2010-06-21 Thread aaron morton
According to http://wiki.apache.org/cassandra/Operations nodetool repair is used to perform a major compaction and compare data between the nodes, repairing any conflicts. Not sure that would improve the load balance, though it may reduce some wasted space on the nodes. nodetool loadbalance wil

Re: bulk loading

2010-06-21 Thread Oleg Anastasjev
Torsten Curdt writes: > > First I tried with my one "cassandra -f" instance then I saw this > requires a separate IP. (Why?) This is because your import program becomes a special member of the cassandra cluster so that it can speak the internal protocol. And each member of cassandra cluster

Re: Instability and memory problems

2010-06-21 Thread James Golick
Just an update here. We're now entirely on standard IO mode, and everything is stable and happy. There hasn't been much of a performance hit, if at all. - James On Mon, Jun 21, 2010 at 3:30 AM, Peter Schuller wrote: > > How much of your physical RAM is dedicated to the JVM? > > > > I forgot to s

Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-06-21 Thread Pieter Maes
I have the same problem (same lib). I'm using the phpcassa lib on Thrift, but from time to time (very randomly) I get "timed out reading 4 bytes from..". I applied the patch and changed to framed transport, so I'm not sure what goes wrong :/ I'll try to get some extra debug output and work out how to replicate the problem.. b

java.lang.OutOfMemoryError: Map failed

2010-06-21 Thread jeff
I am using Lucandra to write Lucene documents to my Cassandra server. I am processing a MySQL table of about 700k records, 10k at a time. All goes well until I reach about the 220k mark. I figure it has something to do with my lack of a correct memory configuration for the JVM, keyspace or Cassandra. The

how to implement the function similar to inbox search?

2010-06-21 Thread hu wei
in datamodel wiki: You can think of each super column name as a term and the columns within as the docids with rank info and other attributes being a part of it. If you have keys as the userids then you can have a per-user index stored in this form. This is how the per user index for term search i

Re: Instability and memory problems

2010-06-21 Thread Peter Schuller
> Just an update here. We're now entirely on standard IO mode, and everything > is stable and happy. There hasn't been much of a performance hit, if at all. Cool. Just be aware that if my speculation was correct that you're (1) dedicating a very large portion of system memory to cassandra, but (2)

Re: Instability and memory problems

2010-06-21 Thread James Golick
On Mon, Jun 21, 2010 at 12:24 PM, Peter Schuller < peter.schul...@infidyne.com> wrote: > > Just an update here. We're now entirely on standard IO mode, and > everything > > is stable and happy. There hasn't been much of a performance hit, if at > all. > > Cool. Just be aware that if my speculation

unsubscribe

2010-06-21 Thread Dean Steele
unsubscribe

unsubscribe

2010-06-21 Thread Steven Haar
unsubscribe

Re: java.lang.OutOfMemoryError: Map failed

2010-06-21 Thread Daniel
I am no expert in Cassandra, but it looks like you might get your answer from reading this thread: http://www.mail-archive.com/user@cassandra.apache.org/msg03702.html Daniel. On 06/21/2010 06:35 PM, j...@javajet.com wrote: I am using Lucandra to write Lucene documents to my cassandra server. I

Re: Is there a penalty to a SuperColumn?

2010-06-21 Thread Benjamin Black
If there is ambiguity, something else is wrong and you should probably stick with a regular CF. If you are indexing a regular CF with an SCF you are probably doing it right. If you are trying to model some hierarchical structure from your problem domain, I really recommend just using composite ke

Re: Is there a penalty to a SuperColumn?

2010-06-21 Thread Gavan Hood
I have been exploring Cassandra, Thrift and Hector recently. Very interesting technology. I am still looking for complete examples of code for Thrift and Hector that exercise and explain the API parameters. I am also looking for direct API documentation. I found some at the locations below and on t

get_range_slices confused about token ranges after decommissioning a node

2010-06-21 Thread Joost Ouwerkerk
We're seeing very strange behaviour after decommissioning a node: when requesting a get_range_slices with a KeyRange by token, we are getting back tokens that are out of range. As a result, ColumnFamilyRecordReader gets confused, since it uses the last token from the result set to set the start tok
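To make "tokens that are out of range" concrete, here is a minimal sketch of a range check a client could apply to returned tokens. It assumes integer tokens and assumes a requested token range is start-exclusive, end-inclusive and wraps around the ring when the start is not below the end; the function names are hypothetical and this is not the actual ColumnFamilyRecordReader logic.

```python
def in_token_range(token, start, end):
    """Return True if token lies in the ring range (start, end].

    Assumption: start is exclusive and end is inclusive. When
    start >= end the range wraps past the top of the ring, and
    start == end is treated as covering the whole ring.
    """
    if start < end:
        return start < token <= end
    return token > start or token <= end


def out_of_range_tokens(tokens, start, end):
    """Collect returned tokens that fall outside the requested range."""
    return [t for t in tokens if not in_token_range(t, start, end)]


# Example with small made-up tokens on a ring, requesting (90, 10]:
suspicious = out_of_range_tokens([95, 3, 50, 10], start=90, end=10)
```

A check like this only detects the symptom described in the message; it does not explain why the server returns such tokens after a node is removed.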

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-21 Thread Rob Coli
On 6/21/10 4:57 PM, Joost Ouwerkerk wrote: We're seeing very strange behaviour after decommissioning a node: when requesting a get_range_slices with a KeyRange by token, we are getting back tokens that are out of range. What sequence of actions did you take to "decommission" the node? What ver

Re: java.lang.OutOfMemoryError: Map failed

2010-06-21 Thread jeff
Daniel: Thanks. That thread helped me solve my problem. I was able to run a 700k MySQL record import without a single memory error.  I changed the following sections in storage-conf.xml to fix the OutOfMemory errors:  standard batch  1-Daniel wrote: -To: "user@cassandra.apache.org" From: D

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-21 Thread Joost Ouwerkerk
I believe we did nodetool removetoken on nodes that were already down (due to hardware failure), but I will check to make sure. We're running Cassandra 0.6.2. On Mon, Jun 21, 2010 at 9:59 PM, Joost Ouwerkerk wrote: > Greg, can you describe the steps we took to decommission the nodes? > > > --

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-21 Thread Joost Ouwerkerk
I should add that we have a replication factor of 3 and a cluster with 30 nodes. On Mon, Jun 21, 2010 at 10:02 PM, Joost Ouwerkerk wrote: > I believe we did nodetool removetoken on nodes that were already down (due > to hardware failure), but I will check to make sure. We're running Cassandra > 0

Re: New to cassandra

2010-06-21 Thread Shahan Khan
The wiki is a great place: http://wiki.apache.org/cassandra/FrontPage Getting Started: http://wiki.apache.org/cassandra/GettingStarted [1] Cassandra interfaces with PHP via thrift http://wiki.apache.org/cassandra/ThriftExamples [2] Shahan On Mon, 21 Jun 2010 15:16:51 +0530, Ajay Singh

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-21 Thread Benjamin Black
Did you forget to run repair? On Mon, Jun 21, 2010 at 7:02 PM, Joost Ouwerkerk wrote: > I believe we did nodetool removetoken on nodes that were already down (due > to hardware failure), but I will check to make sure. We're running Cassandra > 0.6.2. > > On Mon, Jun 21, 2010 at 9:59 PM, Joost Ouw

Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-21 Thread Joost Ouwerkerk
Yes, although "forget" implies that we once knew we were supposed to do so. Given the following before-and-after states, on which nodes are we supposed to run repair? Should the cluster be restarted? Is there anything else we should be doing, or not doing? 1. Node is down due to hardware failure