Re: Installing Thrift with Solandra

2011-06-06 Thread Jean-Nicolas Boulay Desjardins
I just saw a post you made on Stackoverflow, where you said: "The Solandra project which is replacing Lucandra no longer uses thrift, only Solr." So I use Solr to access my data in Cassandra? Thanks again... On Tue, Jun 7, 2011 at 1:39 AM, Jean-Nicolas Boulay Desjardins < jnbdzjn...@gmail.com>

Re: Installing Thrift with Solandra

2011-06-06 Thread Jean-Nicolas Boulay Desjardins
Thanks again :) Ok... But in the tutorial it says that I need to build a Thrift interface for Cassandra: ./compiler/cpp/thrift -gen php ../PATH-TO-CASSANDRA/interface/cassandra.thrift How do I do this? Where is the interface folder? Again, tjake, thanks a lot for your time and help. On Mon, J

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-06 Thread Benjamin Coverston
Hi AJ, inline: On 6/6/11 11:03 PM, AJ wrote: Hi, I am working on a backup strategy and am trying to understand what is going on in the data directory. I notice that after a write to a CF and then flush, a new set of data files are created with an index number incremented in their names, s

Backups, Snapshots, SSTable Data Files, Compaction

2011-06-06 Thread AJ
Hi, I am working on a backup strategy and am trying to understand what is going on in the data directory. I notice that after a write to a CF and then flush, a new set of data files are created with an index number incremented in their names, such as: Initially: Users-e-1-Filter.db Users-e-
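For readers following the backup question, here is a minimal Python sketch that splits data file names of the shape quoted in this thread (`Users-e-1-Filter.db`, `CFTest-g-21-Data.db`) into their parts, so that files belonging to the same flush can be grouped. The layout `<ColumnFamily>-<letter>-<number>-<Component>.db` is inferred only from those quoted names. Reading the letter as an on-disk format version is an assumption, and `parse_sstable_name` and `group_by_generation` are hypothetical helper names, not part of any Cassandra tool.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class SSTableFile(NamedTuple):
    column_family: str
    version: str     # assumption: format version letter(s), e.g. "e" or "g"
    generation: int  # the index number that increments with each new set of files
    component: str   # e.g. "Data" or "Filter"


# Pattern inferred from the file names quoted in the thread (an assumption).
_NAME = re.compile(
    r"^(?P<cf>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>[A-Za-z]+)\.db$"
)


def parse_sstable_name(name: str) -> Optional[SSTableFile]:
    """Return the parsed parts, or None if the name does not fit the pattern."""
    m = _NAME.match(name)
    if m is None:
        return None
    return SSTableFile(m["cf"], m["version"], int(m["gen"]), m["component"])


def group_by_generation(names: List[str]) -> Dict[Tuple[str, int], List[str]]:
    """Group component names by (column family, generation)."""
    groups: Dict[Tuple[str, int], List[str]] = {}
    for name in names:
        parsed = parse_sstable_name(name)
        if parsed is not None:
            key = (parsed.column_family, parsed.generation)
            groups.setdefault(key, []).append(parsed.component)
    return groups
```

Calling `group_by_generation` on a directory listing gives one entry per flushed set of files, which is the unit a backup script would want to copy together.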

Re: Installing Thrift with Solandra

2011-06-06 Thread Jake Luciani
Accessing Cassandra in Solandra is the same as with regular Cassandra. To access Solr, use one of the PHP Solr libraries: http://wiki.apache.org/solr/SolPHP On Mon, Jun 6, 2011 at 11:04 PM, Jean-Nicolas Boulay Desjardins < jnbdzjn...@gmail.com> wrote: > I am trying to install Thrift with Soland

Installing Thrift with Solandra

2011-06-06 Thread Jean-Nicolas Boulay Desjardins
I am trying to install Thrift with Solandra. Normally, when I just want to install Thrift with Cassandra, I follow this tutorial: https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP But how can I do the same for Solandra? Thrift with PHP... Using Ubuntu Server. Thanks in advance!

Re: Troubleshooting IO performance ?

2011-06-06 Thread aaron morton
There is a big IO queue and reads are spending a lot of time in the queue. Some more questions: - what version are you on? - what is the concurrent_reads config setting? - what is nodetool tpstats showing during the slowdown? - exactly how much data are you asking for? how many rows and

Re: [RELEASE] 0.8.0

2011-06-06 Thread Terje Marthinussen
Yes, I am aware of it but it was not an alternative for this project which will face production soon. The patch I have is fairly non-intrusive (especially vs. 674) so I think it can be interesting depending on how quickly 674 will be integrated into cassandra releases. I plan to take a closer loo

Re: multiple clusters communicating

2011-06-06 Thread Jonathan Ellis
Set the internal port to be different. On Mon, Jun 6, 2011 at 7:01 PM, Jeffrey Wang wrote: > Hey all, > > > > We’re seeing a strange issue in which two completely separate clusters > (0.7.3) on the same subnet (X.X.X.146 through X.X.X.150) with 3 machines > (146-148) and 2 machines (149-150). Bot

multiple clusters communicating

2011-06-06 Thread Jeffrey Wang
Hey all, We're seeing a strange issue in which two completely separate clusters (0.7.3) on the same subnet (X.X.X.146 through X.X.X.150) with 3 machines (146-148) and 2 machines (149-150). Both of them are seeded with the respective machines in their cluster, yet when we run them they end up go

Re: CQL How to do

2011-06-06 Thread Nate McCall
It is specific to the Hector client API, but I just started on a guide that may be of some help, particularly in regards to column configuration and query encoding: https://github.com/rantav/hector/wiki/Using-CQL 2011/6/4 Yonder : > Hi, > > In Cassandra 0.8, CQL become the primary client interface

Re: hector-jpa

2011-06-06 Thread Nate McCall
All tests pass and everything compiles on a clean checkout of hector-jpa for me. I'm not sure how you got that error from hector-jpa, there is no reference in that project to me.prettyprint.hom.CassandraPersistenceProvider. Are you sure you don't have a reference to hector-object-mapper (which is

Re: hector-jpa

2011-06-06 Thread Patrick Julien
yeah, I saw that one, I was more interested in hector-jpa because I don't think this one supports @OneToMany. Was hoping the test failures were a temporary thing On Mon, Jun 6, 2011 at 6:41 PM, Ed Anuff wrote: > I'd recommend looking into Hector Object Mapper, it provides > annotation-based mapp

Re: hector-jpa

2011-06-06 Thread Ed Anuff
I'd recommend looking into Hector Object Mapper, it provides annotation-based mapping of Java object fields to columns: https://github.com/rantav/hector/wiki/Hector-Object-Mapper-%28HOM%29 On Mon, Jun 6, 2011 at 3:27 PM, Patrick Julien wrote: > So what's recommended for right now? The data nuc

Re: hector-jpa

2011-06-06 Thread Patrick Julien
So what's recommended for right now? The DataNucleus plugin? I don't need the query parts or anything, I just don't want to have to translate columns to Java fields and vice versa On Mon, Jun 6, 2011 at 6:25 PM, Ed Anuff wrote: > That's a work in progress and actually represents the next g

Re: hector-jpa

2011-06-06 Thread Ed Anuff
That's a work in progress and actually represents the next generation of JPA in Hector. There is a more lightweight version present in the release version of Hector called Hector Object Mapper. I'm sure Nate or Todd who've worked more on hector-jpa can elaborate. Ed On Mon, Jun 6, 2011 at 2:58

Re: Troubleshooting IO performance ?

2011-06-06 Thread Philippe
Ok, here it goes again... No swapping at all... procs ---memory-- ---swap-- --io-- -system-- --cpu-- r b swpd free buff cache si so bi bo in cs us sy id wa 1 63 32044 88736 37996 7116524 0 0 227156 0 18314 5607 30 5 11 53 1 63 32044

Re: hector-jpa

2011-06-06 Thread Patrick Julien
It also doesn't run, I get Exception in thread "main" javax.persistence.PersistenceException: Failed to load provider from META-INF/services at javax.persistence.spi.PersistenceProviderResolverHolder$DefaultPersistenceProviderResolver.getPersistenceProviders(PersistenceProviderResolverHol

hector-jpa

2011-06-06 Thread Patrick Julien
https://github.com/riptano/hector-jpa Is this solution usable? I had problems building it, now tests won't pass.

Re: problems with many columns on a row

2011-06-06 Thread aaron morton
Can you upgrade to the official 0.8 release and try again with logging set to DEBUG ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 6 Jun 2011, at 23:41, Mario Micklisch wrote: > :-) > > There are several data Files: > > # l

Re: Replication-aware compaction

2011-06-06 Thread aaron morton
You should consider upgrading to 0.7.6 to get a fix to Gossip. Earlier 0.7 releases were prone to marking nodes up and down when they should not have been. See https://github.com/apache/cassandra/blob/cassandra-0.7/CHANGES.txt#L22 Are the TimedOutExceptions to the client for read or write reques

Re: [RELEASE] 0.8.0

2011-06-06 Thread Ryan King
On Mon, Jun 6, 2011 at 6:09 AM, Terje Marthinussen wrote: > Of course I talked too soon. > I saw a corrupted commitlog some days back after killing cassandra and I > just came across a committed hints file after a cluster restart for some > config changes :( > Will look into that. > Otherwise, not

Re: working with time uuid

2011-06-06 Thread Patrick Julien
thanks On Mon, Jun 6, 2011 at 11:52 AM, Paul Loy wrote: > private static int compareTimestampBytes(ByteBuffer o1, ByteBuffer o2) > { > int o1Pos = o1.position(); > int o2Pos = o2.position(); > > int d = (o1.get(o1Pos+6) & 0xF) - (o2.get(o2Pos+6) & 0xF); > i

Re: working with time uuid

2011-06-06 Thread Jonathan Ellis
... although it does break "ties" by comparing the other bytes. On Mon, Jun 6, 2011 at 10:52 AM, Paul Loy wrote: > private static int compareTimestampBytes(ByteBuffer o1, ByteBuffer o2) > { > int o1Pos = o1.position(); > int o2Pos = o2.position(); > > int d = (o1.g

Re: working with time uuid

2011-06-06 Thread Paul Loy
Well, to clarify, it first checks the timestamp bytes, then the rest, so it doesn't say they're the same if they came from 2 different servers. On Mon, Jun 6, 2011 at 4:52 PM, Paul Loy wrote: > private static int compareTimestampBytes(ByteBuffer o1, ByteBuffer o2) > { > int o1Pos

Re: working with time uuid

2011-06-06 Thread Paul Loy
private static int compareTimestampBytes(ByteBuffer o1, ByteBuffer o2) { int o1Pos = o1.position(); int o2Pos = o2.position(); int d = (o1.get(o1Pos+6) & 0xF) - (o2.get(o2Pos+6) & 0xF); if (d != 0) return d; d = (o1.get(o1Pos+7) & 0xFF) - (o2.get(o2
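The Java preview above is cut off, so the following is only a small Python sketch of the ordering described in this thread: compare the embedded timestamp first, then break ties on the remaining bytes so that UUIDs generated on different hosts never compare as equal. It is not Cassandra's implementation. `make_v1` is a hypothetical helper used to build test values, and the tie-break used here (plain unsigned byte order) is an assumption.

```python
import uuid


def make_v1(timestamp: int, clock_seq: int, node: int) -> uuid.UUID:
    """Build a version-1 UUID from explicit parts (helper for this example only)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # set version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def timeuuid_key(u: uuid.UUID):
    """Sort key: 60-bit timestamp first, raw bytes as the tie-break."""
    return (u.time, u.bytes)


# Two UUIDs with the same timestamp from different nodes, plus an older one.
a = make_v1(100, 0, node=1)
b = make_v1(100, 0, node=2)
c = make_v1(50, 0, node=9)
ordered = sorted([b, a, c], key=timeuuid_key)
```

The timestamp has to be extracted rather than compared as raw bytes because a version-1 UUID stores the low 32 bits of the time first; sorting on `u.bytes` alone would not give time order.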

Re: [RELEASE] 0.8.0

2011-06-06 Thread Sylvain Lebresne
On Mon, Jun 6, 2011 at 4:17 PM, Terje Marthinussen wrote: > How did that typo happen... > "across a committed hints file" > should be > "across a corrupted hints file" > Seems like the last supercolumn in the hints file has 0 subcolumns. > This actually seem to be correctly serialized, but my code

working with time uuid

2011-06-06 Thread Patrick Julien
How does this work exactly? If you're using generation 1 time uuids for your keys to get ordering, doesn't this mean the keys need to be generated all on the same host when you either query or insert? Or does cassandra only inspect the bits that represent the time stamp of the UUID when performin

Re: [RELEASE] 0.8.0

2011-06-06 Thread Terje Marthinussen
How did that typo happen... "across a committed hints file" should be "across a corrupted hints file". Seems like the last supercolumn in the hints file has 0 subcolumns. This actually seems to be correctly serialized, but my code has a bug and fails to read it. That said, I wonder why the h

Re: Troubleshooting IO performance ?

2011-06-06 Thread Philippe
Hmm, no, it wasn't swapping. Cassandra was the only thing running on that server and I was querying the same keys over and over. I restarted Cassandra and am doing the same thing; IO is now down to zero while CPU is up, which doesn't surprise me as much. I'll report if it happens again. On 5 June 2011

Re: [RELEASE] 0.8.0

2011-06-06 Thread Marcos Ortiz
On 6/6/2011 1:00 AM, Terje Marthinussen wrote: 0.8 under load may turn out to be more stable and well behaved than any release so far. Been doing a few test runs stuffing more than 1 billion records into a 12-node cluster and things look better than ever. VMs stable and nice at 11GB. No da

Re: [RELEASE] 0.8.0

2011-06-06 Thread Terje Marthinussen
Of course I talked too soon. I saw a corrupted commitlog some days back after killing cassandra and I just came across a committed hints file after a cluster restart for some config changes :( Will look into that. Otherwise, not defaults, but close. The dataset is fed from scratch so yes, memtable

Re: [SPAM] Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Donal Zang
On 06/06/2011 14:29, David Boxenhorn wrote: Jonathan, are Donal Zang's results (10x slowdown) typical? On Mon, Jun 6, 2011 at 3:14 PM, Jonathan Ellis wrote: On Mon, Jun 6, 2011 at 6:28 AM, Donal Zang wrote: > Another thing I notice

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Jonathan Ellis
If the rows you are updating are not cached, yes. (Otherwise maybe 10% slower.) On Mon, Jun 6, 2011 at 7:29 AM, David Boxenhorn wrote: > Jonathan, are Donal Zang's results (10x slowdown) typical? > > On Mon, Jun 6, 2011 at 3:14 PM, Jonathan Ellis wrote: >> >> On Mon, Jun 6, 2011 at 6:28 AM, Don

Re: Replication-aware compaction

2011-06-06 Thread David Boxenhorn
Version 0.7.3. Yes, I am talking about minor compactions. I have three nodes, RF=3. 3G data (before replication). Not many users (yet). It seems like 3 nodes should be plenty. But when all 3 nodes are compacting, I sometimes get timeouts on the client, and I see in my logs that each one is full of

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread David Boxenhorn
Jonathan, are Donal Zang's results (10x slowdown) typical? On Mon, Jun 6, 2011 at 3:14 PM, Jonathan Ellis wrote: > On Mon, Jun 6, 2011 at 6:28 AM, Donal Zang wrote: > > Another thing I noticed is : if you first do insertion, and then build > the > > secondary index use "update column family ...

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Jonathan Ellis
On Mon, Jun 6, 2011 at 6:28 AM, Donal Zang wrote: > Another thing I noticed is : if you first do insertion, and then build the > secondary index use "update column family ...", and then do select based on > the index, the result is not right (seems the index is still being built > though the "upda

Re: [RELEASE] 0.8.0

2011-06-06 Thread Jonathan Ellis
Has this been running w/ default settings (i.e. relying on the new memtable_total_space_in_mb) or was this an upgrade from 0.7 (or otherwise had the per-CF memtable settings applied?) On Mon, Jun 6, 2011 at 12:00 AM, Terje Marthinussen wrote: > 0.8 under load may turn out to be more stable and we

Re: Setting up cluster and nodetool ring in 0.8.0

2011-06-06 Thread David McNelis
Just to close this out, in case anyone was interested... my problem was firewall related, in that I didn't have my messaging/data port (7000) open on my seed node. Allowing traffic on this port resolved my issues. On Fri, Jun 3, 2011 at 1:43 PM, David McNelis wrote: > Thanks, Jonathan. Both
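A quick way to check for the firewall problem described above is to try opening a TCP connection to the messaging port from another node. The sketch below is a generic reachability test, not a Cassandra tool. `port_reachable` is a hypothetical helper name, and 7000 is used as the default only because that is the port named in the message.

```python
import socket


def port_reachable(host: str, port: int = 7000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `port_reachable("<seed-host>")` from each non-seed node (with the real seed address substituted) would show whether traffic on that port gets through. A `False` result does not by itself distinguish a firewall from a stopped process.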

Re: Migration question

2011-06-06 Thread aaron morton
Sounds like you are OK to turn off the existing cluster first. Assuming so, deliver any hints using JMX, then do a nodetool flush to write out all the memtables and checkpoint the commit logs. You can then copy the data directories. The System data directory contains the node's token and the sc

Re: problems with many columns on a row

2011-06-06 Thread Mario Micklisch
:-) There are several data Files: # ls -al *-Data.db -rw-r--r-- 1 cassandra cassandra 53785327 2011-06-05 14:44 CFTest-g-21-Data.db -rw-r--r-- 1 cassandra cassandra 56474656 2011-06-05 18:04 CFTest-g-38-Data.db -rw-r--r-- 1 cassandra cassandra 21705904 2011-06-05 20:02 CFTest-g-45-Data.db -rw

Re: Replication-aware compaction

2011-06-06 Thread aaron morton
Are you talking about minor (automatic) compactions? Can you provide some more information on what's happening to make the node unusable and what version you are using? It's not a lightweight process, but it should not hurt the node that badly. It is considered an online operation. Delaying co

Migration question

2011-06-06 Thread Eric Czech
Hi, I have a quick question about migrating a cluster. We have a cassandra cluster with 10 nodes that we'd like to move to a new DC and what I was hoping to do is just copy the SSTables for each node to a corresponding node in the new DC (the new cluster will also have 10 nodes). Is there any reas

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Donal Zang
On 06/06/2011 10:15, David Boxenhorn wrote: Is there really a 10x difference between indexed CFs and non-indexed CFs? Well, as for my test, it is! I'm using 0.7.6-2, 9 nodes, 3 replicas, write_consistency_level QUORUM, about 90,000,000 rows (~ 1K per row). I use 20 processes, 20 rows for each inse

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread David Boxenhorn
Is there really a 10x difference between indexed CFs and non-indexed CFs? On Mon, Jun 6, 2011 at 11:05 AM, Donal Zang wrote: > On 06/06/2011 05:38, Jonathan Ellis wrote: > >> Index updates require read-before-write (to find out what the prior >> version was, if any, and update the index accordin

Replication-aware compaction

2011-06-06 Thread David Boxenhorn
Is there some deep architectural reason why compaction can't be replication-aware? What I mean is, if one node is doing compaction, its replicas shouldn't be doing compaction at the same time. Or, at least a quorum of nodes should be available at all times. For example, if RF=3, and one node is d
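To make the proposal concrete, here is a toy Python sketch of the rule the question describes: a node may start compacting only if a quorum of the replicas would still be idle afterwards. This is purely an illustration of the idea in the post. It is not a Cassandra feature, and `quorum` and `may_start_compaction` are hypothetical names.

```python
from typing import Iterable


def quorum(replication_factor: int) -> int:
    """Majority of replicas, e.g. 2 when RF=3."""
    return replication_factor // 2 + 1


def may_start_compaction(replicas: Iterable[str],
                         compacting: Iterable[str],
                         candidate: str) -> bool:
    """Allow `candidate` to compact only if a quorum of replicas stays idle.

    replicas:   nodes holding the same data (assumed to include candidate)
    compacting: nodes that are compacting right now
    """
    replica_set = set(replicas)
    busy = (set(compacting) & replica_set) | {candidate}
    idle = len(replica_set) - len(busy)
    return idle >= quorum(len(replica_set))
```

With RF=3 and replicas `a`, `b`, `c`, the first node to ask is allowed (two idle replicas remain), while a second concurrent request is refused because only one idle replica would be left.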

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Donal Zang
On 06/06/2011 05:38, Jonathan Ellis wrote: Index updates require read-before-write (to find out what the prior version was, if any, and update the index accordingly). This is random i/o. Index creation on the other hand is a lot of sequential i/o, hence more efficient. So, the classic bulk loa
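The read-before-write point can be illustrated with a toy in-memory model in Python. It is not how Cassandra stores data: `IndexedTable` is a hypothetical class that only shows why each indexed update needs a lookup of the prior value, while building the index after a bulk load is a single pass over the rows.

```python
from typing import Dict, List, Set


class IndexedTable:
    """Toy key/value table with a secondary index on the value."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.index: Dict[str, Set[str]] = {}
        self.reads = 0  # counts the per-write lookups of the prior value

    def put(self, key: str, value: str) -> None:
        # Read before write: find the prior value so its index entry
        # can be removed. In a disk-backed store this is a random read.
        self.reads += 1
        old = self.rows.get(key)
        if old is not None:
            self.index[old].discard(key)
            if not self.index[old]:
                del self.index[old]
        self.rows[key] = value
        self.index.setdefault(value, set()).add(key)

    def find(self, value: str) -> List[str]:
        return sorted(self.index.get(value, ()))

    @classmethod
    def bulk_load(cls, rows: Dict[str, str]) -> "IndexedTable":
        # Load first, then build the index in one sequential pass,
        # with no per-row lookup of a prior value.
        table = cls()
        table.rows = dict(rows)
        for key, value in table.rows.items():
            table.index.setdefault(value, set()).add(key)
        return table
```

In this model every `put` costs one extra read, whereas `bulk_load` costs none, which mirrors the random versus sequential I/O contrast made in the message.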