Re: Help with getting Key range with some column limitations

2010-08-19 Thread Chen Xinli
Hi, If read latency is tolerable, you can fetch 700 columns at a time, using the end key of the last iteration as the start key of the next iteration, until all the data is retrieved. Or you can implement a Cassandra plugin that does the column filtering and returns only the data you want. The computation is done locally in cassa
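The paging approach in this reply can be sketched as follows. This is a minimal illustration, not Cassandra client code: the `TreeMap` stands in for one wide row with columns sorted by name, and `ColumnPager`, `getSlice` and `readAll` are hypothetical names. It assumes the slice start is inclusive, so every page after the first begins with the column that ended the previous page and that duplicate is skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of paging through a wide row. The NavigableMap is a stand-in for one
// Cassandra row (columns sorted by name); it is NOT a Cassandra client API.
class ColumnPager {

    // Hypothetical helper: up to `count` column names >= `start` (inclusive).
    // An empty start means "from the beginning of the row".
    static List<String> getSlice(NavigableMap<String, String> row, String start, int count) {
        List<String> page = new ArrayList<>();
        NavigableMap<String, String> tail = start.isEmpty() ? row : row.tailMap(start, true);
        for (String name : tail.keySet()) {
            if (page.size() == count) {
                break;
            }
            page.add(name);
        }
        return page;
    }

    // Read the whole row one page at a time, reusing the last column of each
    // page as the start of the next one.
    static List<String> readAll(NavigableMap<String, String> row, int pageSize) {
        if (pageSize < 2) {
            // With an inclusive start, a page of 1 would never make progress.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";
        while (true) {
            List<String> page = getSlice(row, start, pageSize);
            // After the first page, element 0 repeats the previous page's last column.
            int from = start.isEmpty() ? 0 : 1;
            all.addAll(page.subList(Math.min(from, page.size()), page.size()));
            if (page.size() < pageSize) {
                break; // short page: nothing left to read
            }
            start = page.get(page.size() - 1);
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 1650; i++) {
            row.put(String.format("col%05d", i), "v" + i);
        }
        System.out.println(readAll(row, 700).size());
    }
}
```

With a real client the same loop shape applies, with the slice call replaced by whatever range query the client exposes; the page size of 700 is just the figure from the message.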

RE: kundera: Open source JPA 1.0 compliant ORM for Cassandra

2010-08-19 Thread Animesh Kumar
Hi Jonathan, Glad that you liked the effort. The Kundera team is constantly watching Cassandra developments to pick features up as soon as they become available, for example, secondary indexes, schema generator support etc. I would like to highlight a few important things about Kundera: 1. Kundera h

Data Corruption

2010-08-19 Thread Waqas Badar
We are observing strange behavior in Cassandra. We have a ring of two nodes. When we insert data into Cassandra, old data vanishes after some entries. Please note that it is not data loss: when we move that data to a separate node, all the data is shown. We are using Cassandra 0.6.3 an

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-19 Thread Julie
Peter Schuller writes: > Without necessarily dumping all the information - approximately what > do they contain? Do they contain anything about compactions, > anti-compactions, streaming, etc? > > With an idle node after taking writes, I *think* the only expected > disk I/O (once i

Re: kundera: Open source JPA 1.0 compliant ORM for Cassandra

2010-08-19 Thread Chen Xinli
I have a question about Lucandra: are the row IDs from the reversed index computed in Cassandra, or does a client fetch the reversed index first and compute them on the client machine? 2010/8/19 Animesh Kumar > Hi Jonathan, > > Glad that you liked the effort. Kundera team is constantly watching > Cassandra de

Re: Data Corruption

2010-08-19 Thread Jonathan Ellis
You're moving data around manually? That sounds like a good way to confuse Cassandra's replication. On Thu, Aug 19, 2010 at 4:33 AM, Waqas Badar wrote: > We are observing a strange behavior of Cassandra. We have a ring of two > nodes. When we inserts data in cassandra then old data after some en

Re: Pig + Cassandra = Connection errors

2010-08-19 Thread Christian Decker
In the hope of better understanding the problem, I took the liberty of putting the storage-conf.xml on the net [1]. I even tried starting from scratch again, taking care about which interfaces I use and which ports I bind to, but so far nothing has really gotten me anywhere. [1] http://pastebin.com/

Cassandra w/ Hadoop

2010-08-19 Thread Mark
Are there any examples/tutorials on the web for reading/writing from Cassandra into/from Hadoop? I found the example in contrib/word_count but I really can't make sense of it... a tutorial/explanation would help.

Re: Cassandra w/ Hadoop

2010-08-19 Thread Jeremy Hanna
I would check out http://wiki.apache.org/cassandra/HadoopSupport for more info. I'll try to explain a bit more here, but I don't think there's a tutorial out there yet. For input: - configure your main class where you're starting the mapreduce job the way the word_count is configured (with eit

Re: Thrift + PHP: help!

2010-08-19 Thread Dave Viner
I am a user of the perl api - so I'd like to lurk in case there are things that can benefit both perl & php. Dave Viner On Wed, Aug 18, 2010 at 1:35 PM, Gabriel Sosa wrote: > I would like to help with this too! > > > On Wed, Aug 18, 2010 at 5:15 PM, Bas Kok wrote: > >> I have some experience

Re: Cassandra w/ Hadoop

2010-08-19 Thread Christian Decker
If, like me, you prefer to write your jobs on the fly try taking a look at Pig. Cassandra provides a loadfunc under contrib/pig/ in the source package which allows you to load data directly from Cassandra. -- Christian Decker Software Architect http://blog.snyke.net On Thu, Aug 19, 2010 at 7:23 P

Re: Cassandra w/ Hadoop

2010-08-19 Thread Mark
On 8/19/10 10:34 AM, Christian Decker wrote: If, like me, you prefer to write your jobs on the fly try taking a look at Pig. Cassandra provides a loadfunc under contrib/pig/ in the source package which allows you to load data directly from Cassandra. -- Christian Decker Software Architect http

Re: Cassandra w/ Hadoop

2010-08-19 Thread Mark
On 8/19/10 10:23 AM, Jeremy Hanna wrote: I would check out http://wiki.apache.org/cassandra/HadoopSupport for more info. I'll try to explain a bit more here, but I don't think there's a tutorial out there yet. For input: - configure your main class where you're starting the mapreduce job the

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-19 Thread Robert Coli
On Thu, Aug 19, 2010 at 7:23 AM, Julie wrote: > At this point, I logged in.  The data distribution on this node was 122GB.  I > started performing a manual nodetool cleanup. Check the size of the Hinted Handoff CF? If your nodes are flapping under sustained write, they could be storing a non-triv

Re: Node OOM Problems

2010-08-19 Thread Edward Capriolo
On Thu, Aug 19, 2010 at 2:48 PM, Wayne wrote: > I am having some serious problems keeping a 6 node cluster up and running > and stable under load. Any help would be greatly appreciated. > > Basically it always comes back to OOM errors that never seem to subside. > After 5 minutes or 3 hours of hea

Re: Node OOM Problems

2010-08-19 Thread Edward Capriolo
On Thu, Aug 19, 2010 at 4:13 PM, Wayne wrote: > We are using the random partitioner. The tokens we defined manually and data > is almost totally equal among nodes, 15GB per node when the trouble started. > System vitals look fine. CPU load is ~500% for java, iostats are low, > everything for all p

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
So, these: >  INFO [GC inspection] 2010-08-19 16:34:46,656 GCInspector.java (line 116) GC > for ConcurrentMarkSweep: 41615 ms, 192522712 reclaimed leaving 8326856720 > used; max is 8700035072 [snip] > INFO [GC inspection] 2010-08-19 16:36:00,786 GCInspector.java (line 116) GC > for ConcurrentMark
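The GCInspector lines quoted in this message carry enough numbers to judge heap pressure. In the first one, 8326856720 bytes remain used out of a maximum of 8700035072 after a 41615 ms ConcurrentMarkSweep, which is roughly 95.7% of the heap. A small sketch that pulls those numbers out of such a line is below; the regular expression is written against the format as quoted here and is an assumption about other Cassandra versions, and `GcLine` is a hypothetical helper name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract figures from a GCInspector log line as quoted in the thread.
class GcLine {
    // Groups: 1 = collector name, 2 = duration in ms, 3 = bytes reclaimed,
    // 4 = bytes still used, 5 = maximum heap size in bytes.
    private static final Pattern GC = Pattern.compile(
            "GC for (\\w+): (\\d+) ms, (\\d+) reclaimed leaving (\\d+) used; max is (\\d+)");

    // Fraction of the maximum heap still in use after the collection,
    // or -1 if the line does not match the expected format.
    static double occupancyAfterGc(String line) {
        Matcher m = GC.matcher(line);
        if (!m.find()) {
            return -1.0;
        }
        long used = Long.parseLong(m.group(4));
        long max = Long.parseLong(m.group(5));
        return (double) used / max;
    }

    // Reported collection time in milliseconds, or -1 if the line does not match.
    static long durationMillis(String line) {
        Matcher m = GC.matcher(line);
        return m.find() ? Long.parseLong(m.group(2)) : -1L;
    }

    public static void main(String[] args) {
        String line = "GC for ConcurrentMarkSweep: 41615 ms, 192522712 reclaimed "
                + "leaving 8326856720 used; max is 8700035072";
        System.out.printf("%.3f of max heap still used after %d ms%n",
                occupancyAfterGc(line), durationMillis(line));
    }
}
```

A collection that takes tens of seconds and still leaves the heap almost full is the pattern to look for when scanning a log this way.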

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
> of a rogue large row is one I never considered. The largest row on the other > nodes is as much as 800megs. I can not get a cfstats reading on the bad node With 0.6 I can definitely see this being a problem if I understand its behavior correctly (I have not actually used 0.6 even for testing). I

Re: Node OOM Problems

2010-08-19 Thread Wayne
What is my "live set"? Is the system CPU bound given the few statements below? This is from running 4 concurrent processes against the node...do I need to throttle back the concurrent read/writers? I do all reads/writes as Quorum. (Replication factor of 3). The memtable threshold is the default o

Errors with Cassandra 0.7

2010-08-19 Thread Alaa Zubaidi
Hi, I am trying to run Cassandra 0.7 and I am getting different errors: First it was while calling client.insert and now while calling set_keyspace (see below). Any help is appreciated. Exception in thread "main" org.apache.thrift.transport.TTransportException at org.apache.thrift.transpo

Re: Errors with Cassandra 0.7

2010-08-19 Thread Peter Schuller
> I am trying to run Cassandra 0.7 and I am getting different errors: First it > was while calling client.insert and now while calling set_keyspace (see > below). Are you perhaps not using a framed transport with thrift? Cassandra 0.7 uses framed by default; 0.6 did not. -- / Peter Schuller
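For readers unfamiliar with the term, a framed Thrift transport prefixes every message with its length as a 4-byte big-endian integer, so a server expecting frames will misread the first bytes of an unframed client's request. The sketch below only illustrates that wire layout with plain `java.nio`; it is not Thrift library code, and `FrameSketch` is a hypothetical name. In Thrift's Java library the usual client-side fix is to wrap the `TSocket` in a `TFramedTransport` before building the protocol (the exact package depends on the Thrift version).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration of length-prefixed framing; not the Thrift implementation.
class FrameSketch {

    // Prefix the payload with its length as a 4-byte big-endian integer
    // (ByteBuffer is big-endian by default).
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Read one frame back: the length first, then exactly that many bytes.
    static byte[] unframe(byte[] framed) {
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("bad frame length: " + length);
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] framed = frame("abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(Arrays.toString(framed));
    }
}
```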

Re: Node OOM Problems

2010-08-19 Thread Edward Capriolo
On Thu, Aug 19, 2010 at 4:49 PM, Wayne wrote: > What is my "live set"? Is the system CPU bound given the few statements > below? This is from running 4 concurrent processes against the node...do I > need to throttle back the concurrent read/writers? > > I do all reads/writes as Quorum. (Replicatio

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
> What is my "live set"? Sorry; that meant the "set of data actually live (i.e., not garbage) in the heap". In other words, the amount of memory truly "used". > Is the system CPU bound given the few statements > below? This is from running 4 concurrent processes against the node...do I > need to t

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
> Sorry; that meant the "set of data actually live (i.e., not garbage) in > the heap". In other words, the amount of memory truly "used". And to clarify further, this is not the same as the 'used' reported by GC statistics, except as printed after a CMS concurrent mark/sweep has completed (and even
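The distinction drawn here, between what is in use right now and what was still in use just after a collection, can be observed from inside a JVM with the standard `java.lang.management` API. The following is a rough sketch only: `LiveSetEstimate` is a hypothetical name, and summing `getCollectionUsage()` across heap pools mixes young-generation and old-generation collections, so the result is an approximation of the live set rather than an exact figure. The old-generation number after a full or CMS cycle is the closest match to what the message describes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Sketch: compare current heap usage with usage measured right after GC.
class LiveSetEstimate {

    // Sum of heap-pool usage as recorded after each pool's most recent
    // collection. Pools that do not support this report null and are skipped.
    static long usedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    // Heap in use at this instant, garbage included.
    static long usedNow() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("used now:      " + usedNow());
        System.out.println("used after GC: " + usedAfterLastGc());
    }
}
```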

recovering a failed node does not seem to recover replicas

2010-08-19 Thread Scott Dworkis
following the failure handling process described here: http://wiki.apache.org/cassandra/Operations i don't at the end seem to have all the data... i have half as much being reported by nodetool ring as i started with. i'm guessing the replicas are not being recovered. however if i take the e

Re: Errors with Cassandra 0.7

2010-08-19 Thread Alaa Zubaidi
That was it.. Thanks Peter Peter Schuller wrote: I am trying to run Cassandra 0.7 and I am getting different errors: First it was while calling client.insert and now while calling set_keyspace (see below). Are you perhaps not using a framed transport with thrift? Cassandra 0.7 uses framed

Replication factor and other schema changes in >= 0.7

2010-08-19 Thread Andres March
How should we go about changing the replication factor and other keyspace settings now that it and other KSMetaData are no longer managed in cassandra.yaml? I found makeDefinitionMutation() in the Migration class and see that it is called for the other schema migrations. There just seems to

Re: Cassandra w/ Hadoop

2010-08-19 Thread Mark
On 8/19/10 11:14 AM, Mark wrote: On 8/19/10 10:23 AM, Jeremy Hanna wrote: I would check out http://wiki.apache.org/cassandra/HadoopSupport for more info. I'll try to explain a bit more here, but I don't think there's a tutorial out there yet. For input: - configure your main class where yo

test

2010-08-19 Thread ChingShen
test

Re: Job Opportunity in Europe (Nosql, hadoop, crawling)

2010-08-19 Thread sharanabasava raddi
Hi, I have worked with Cassandra, Thrift and Hector for 3 months, and I have written code for loading and fetching data from Cassandra on a single node. I want to work in this domain. Will you provide me the environment to work on this? On Wed, Aug 18, 2010 at 8:29 PM, Thibaut Britz wrote: >

Re: questions regarding read and write in cassandra

2010-08-19 Thread Benjamin Black
More recent. Newest timestamp always wins. And I am moving this to the user list (again) so it can be with all its friendly threads on the exact same topic. On Thu, Aug 19, 2010 at 10:22 AM, Maifi Khan wrote: > Hi David > Thanks for your reply. > But what happens if I read and get 2 nodes has v
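The rule stated in this reply, that the newest timestamp always wins, can be sketched as a reconciliation step over the versions returned by different replicas. This is an illustration only: `TimestampReconcile` and `Versioned` are hypothetical names, not Cassandra classes, and keeping the first-seen value on a timestamp tie is an assumption of the sketch rather than a statement about Cassandra's tie-breaking.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of "newest timestamp wins" across replica responses.
class TimestampReconcile {

    // One replica's view of a column: a value plus the client-supplied timestamp.
    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Return the version with the highest timestamp, or null for no responses.
    // On equal timestamps the earlier list entry is kept (sketch assumption).
    static Versioned newest(List<Versioned> replicas) {
        Versioned winner = null;
        for (Versioned v : replicas) {
            if (winner == null || v.timestamp > winner.timestamp) {
                winner = v;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        List<Versioned> responses = Arrays.asList(
                new Versioned("old", 100L),
                new Versioned("new", 200L),
                new Versioned("old", 100L));
        System.out.println(newest(responses).value);
    }
}
```

So when two nodes hold one value and a third holds a more recent one, the count of nodes does not matter; the value carrying the higher timestamp is the one returned.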

Re: test

2010-08-19 Thread mallikarj...@iss-global.com
Don't keep sending these test mails and don't waste the time of others. ChingShen wrote: test

Re: Data Corruption

2010-08-19 Thread Waqas Badar
In fact, the hard disk on one node filled up, so that's why we had to move Cassandra manually to another machine. Can you please suggest a workaround to restore the data? On Thu, 2010-08-19 at 09:56 -0500, Jonathan Ellis wrote: > You're moving data around manually? That sounds like a good way to > confu

SV: Replication factor and other schema changes in >= 0.7

2010-08-19 Thread Thorvaldsson Justus
KsDef and CfDef hold the metadata, and perhaps ColumnDef. How to make a KsDef: KsDef k = new KsDef(); k.setName(keyspacename); k.setReplication_factor(replicafactor); k.setStrategy_class("org.apache.cassandra.locator.RackUnawareStrategy"); List<CfDef> cfDefs = new ArrayList<CfDef>(); k.setCf_defs(cfDefs); c.syst

Re: Node OOM Problems

2010-08-19 Thread Wayne
The NullPointerException does not crash the node. It only makes it flap/go down for a short period and then it comes back up. I do not see anything abnormal in the system log, only that single error in the cassandra.log. On Thu, Aug 19, 2010 at 11:42 PM, Peter Schuller < peter.schul...@infidyne.c

Re: SV: Help with getting Key range with some column limitations

2010-08-19 Thread Jone Lura
Thanks for your suggestions. I tried to iterate over them, but I could not get it to work (pretty sure it's my code). I'm still not too familiar with Cassandra, so could you provide a small example? The key count could be up to at least 20k and maybe more, and users should not wait for more than