Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
My nodes all have themselves in their list of seeds - always did - and everything works. (You may ask why I did this. I don't know, I must have copied it from an example somewhere.) On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory wrote: > I was able to make the node join the ring but I'm confused. >

Cassandra 0.7 - Query on network topology

2011-01-05 Thread Narendra Sharma
Hi, We are working on defining the ring topology for our cluster. One of the plans under discussion is to have an RF=2 and perform read/write operations with CL=ONE. I know this could be an issue since it doesn't satisfy R+W > RF. This will work if we can always force the clients to go to the first
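As a quick arithmetic check of the R+W > RF rule mentioned in this message, here is a minimal Python sketch. The helper names `is_strongly_consistent` and `quorum` are hypothetical and are not part of any Cassandra API; the sketch only encodes the inequality and the usual majority definition of a quorum.

```python
def is_strongly_consistent(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """Return True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


def quorum(rf: int) -> int:
    """Majority of replicas for a given replication factor (assumed definition)."""
    return rf // 2 + 1


# The setup described in the message: RF=2, reads and writes at CL=ONE.
rf = 2
one_one = is_strongly_consistent(1, 1, rf)  # 1 + 1 > 2 does not hold
quorum_quorum = is_strongly_consistent(quorum(rf), quorum(rf), rf)  # 2 + 2 > 2 holds
print(one_one, quorum_quorum)
```

With RF=2 a quorum is both replicas, so quorum reads and writes satisfy the rule but tolerate no replica being down, which may be why CL=ONE is under discussion here.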

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
My conclusion is lame: I tried this on several hosts and saw the same behavior, the only way I was able to join new nodes was to first start them when they are *not in* their own seeds list and after they finish transferring the data, then restart them with themselves *in* their own seeds list. Aft

Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
I started all my nodes the first time with seeds in their own lists, and it worked. I think I started them in 0.6.1, but I'm not sure. (I'm now using 0.6.8). On Wed, Jan 5, 2011 at 2:07 PM, Ran Tavory wrote: > My conclusion is lame: I tried this on several hosts and saw the same > behavior, the

Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Have you tried not bootstrapping but setting the token and manually calling repair? On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory wrote: > My conclusion is lame: I tried this on several hosts and saw the same > behavior, the only way I was able to join new nodes was to first start them > when they

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I haven't tried repair. Should I? On Jan 5, 2011 3:48 PM, "Jake Luciani" wrote: > Have you tried not bootstrapping but setting the token and manually calling > repair? > > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory wrote: > >> My conclusion is lame: I tried this on several hosts and saw the same

Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Well, your ring issues don't make sense to me; the seed list should be the same across the cluster. I'm just thinking of other things to try: non-bootstrapped nodes should join the ring instantly, but reads will fail if you aren't using quorum. On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory wrote: > I hav

The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread David Boxenhorn
The CLI sometimes gets only 100 results (even though there are more) - and sometimes gets all the results, even when there are more than 100! What is going on here? Is there some logic that says if there are too many results return 100, even though "too many" can be more than 100?

The size of the data, I must be doing smth wrong....

2011-01-05 Thread nicolas lattuada
Hi, I have some data size issues: I am storing super columns with the following content: {a=>1, b=>2, c=>3...n=>14}. I am storing it 300 000 times and the data size on disk is about 283 MB. On the other side, I have a MySQL table which stores a bunch of data; the schema follows: 6 varch

Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Jonathan Ellis
On Wed, Jan 5, 2011 at 3:37 AM, Narendra Sharma wrote: > What I am looking for is: > 1. Some way to send requests for keys whose token fall between 0-25 to B and > never to C even though C will have the data due to it being replica of B. > 2. Only when B is down or not reachable, the request shoul

Re: The size of the data, I must be doing smth wrong....

2011-01-05 Thread Jonathan Ellis
It's normal for Cassandra to use more disk space than MySQL. It's part of what we trade for not having to rewrite every row when you add a new column. "SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs a GC." http://wiki.apache.org/cassandra/MemtableSSTa

Re: The size of the data, I must be doing smth wrong....

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 9:52 AM, Jonathan Ellis wrote: > It's normal for Cassandra to use more disk space than MySQL.  It's > part of what we trade for not having to rewrite every row when you add > a new column. > > "SSTables that are obsoleted by a compaction are deleted > asynchronously when the

Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
If "seed list should be the same across the cluster" that means that nodes *should* have themselves as a seed. If that doesn't work for Ran, then that is the first problem, no? On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani wrote: > Well your ring issues don't make sense to me, seed list should b

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
In storage-conf I see this comment [1] from which I understand that the recommended way to bootstrap a new node is to set AutoBootstrap=true and remove itself from the seeds list. Moreover, I did try to set AutoBootstrap=true and have the node in its own seeds list, but it would not bootstrap. I do

Re: Bootstrapping taking long

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory wrote: > In storage-conf I see this comment [1] from which I understand that the > recommended way to bootstrap a new node is to set AutoBootstrap=true and > remove itself from the seeds list. > Moreover, I did try to set AutoBootstrap=true and have the

Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
https://issues.apache.org/jira/browse/CASSANDRA-1676 you have to use at least 0.6.7 On Wed, Jan 5, 2011 at 4:19 PM, Edward Capriolo wrote: > On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory wrote: > > In storage-conf I see this comment [1] from which I understand that the > > recommended way to boo

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
@Thibaut wrong email? Or how's "Avoid dropping messages off the client request path" (CASSANDRA-1676) related to the bootstrap questions I had? On Wed, Jan 5, 2011 at 5:23 PM, Thibaut Britz wrote: > https://issues.apache.org/jira/browse/CASSANDRA-1676 > > you have to use at least 0.6.7 > > > > O

Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
Had the same problem a while ago. Upgrading solved the problem (don't know if you have to redeploy your cluster though). http://www.mail-archive.com/user@cassandra.apache.org/msg07106.html On Wed, Jan 5, 2011 at 4:29 PM, Ran Tavory wrote: > @Thibaut wrong email? Or how's "Avoid dropping message

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
OK, thanks, so I see we had the same problem (I too had multiple keyspaces, not that I know why it matters to the problem at hand) and I see that by upgrading to 0.6.7 you solved your problem (I didn't try it, had a different workaround) but frankly, I don't understand how https://issues.apache.org/

Re: Bootstrapping taking long

2011-01-05 Thread Jonathan Ellis
1676 says "Avoid dropping messages off the client request path." Bootstrap messages are "off the client request path." So, if some of the nodes involved were loaded enough that they were dropping messages older than RPC_TIMEOUT to cope, it could lose part of the bootstrap communication permanently.

Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Nate McCall
Our original intention in discussing this feature was to have back-and-forth conversion from timestamps (we were modelling similar functionality in Pycassa). Its lack of inclusion may have just been an oversight. We will add this in Hector trunk shortly - thanks for the complete code sample

Question about replication

2011-01-05 Thread Mayuresh Kulkarni
Hello, Is it possible to set the replication factor to some kind of "ALL" setting so that all data gets replicated to all nodes and if a new node is dynamically added to the cluster, the current nodes replicate their data to it? Thanks, Mayuresh

Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
> The CLI sometimes gets only 100 results (even though there are more) - and > sometimes gets all the results, even when there are more than 100! > > What is going on here? Is there some logic that says if there are too many > results return 100, even though "too many" can be more than 100? API ca

Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Peter Schuller
> 1. Some way to send requests for keys whose token fall between 0-25 to B and > never to C even though C will have the data due to it being replica of B. If your data set is large, be mindful of the fact that this will cause C to be completely cold in terms of caches. I.e., when B does go down, C

Re: Question about replication

2011-01-05 Thread Jonathan Ellis
No. On Wed, Jan 5, 2011 at 10:38 AM, Mayuresh Kulkarni wrote: > > Hello, > > Is it possible to set the replication factor to some kind of "ALL" setting > so that all data gets replicated to all nodes and if a new node is > dynamically added to the cluster, the current nodes replicate their data t

Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread David Boxenhorn
I know that there's a limit, and I just assumed that the CLI set it to 100, until I saw more than 100 results. On Wed, Jan 5, 2011 at 6:56 PM, Peter Schuller wrote: > > The CLI sometimes gets only 100 results (even though there are more) - > and > > sometimes gets all the results, even when there

Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
> I know that there's a limit, and I just assumed that the CLI set it to 100, > until I saw more than 100 results. Ooh, sorry. Didn't read carefully enough. Not sure why you see that behavior. Sounds strange; should not be supported at the thrift level AFAIK. -- / Peter Schuller

Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Patricio Echagüe
Roshan, just a comment on your solution. The time returned is not a simple long. It also contains some bits indicating the version. On the other hand, you are assuming that the same machine is processing your request and recreating a UUID based on a long you provide. The clockseqAndNode id will vary

Cassandra Meetup in San Francisco Bay Area

2011-01-05 Thread Mubarak Seyed
We are hosting a Cassandra meetup in the Bay Area. Jonathan will give a talk on Cassandra 0.7. The link to the meetup page is http://www.meetup.com/Cassandra-User-Group-Meeting/ Thanks, Mubarak

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I see. Thanks for clarifying, Jonathan. On Wednesday, January 5, 2011, Jonathan Ellis wrote: > 1676 says "Avoid dropping messages off the client request path." > Bootstrap messages are "off the client request path." So, if some of > the nodes involved were loaded enough that they were dropping mess

pig cassandra contribution

2011-01-05 Thread felix gao
I am having a problem running the cassandra_loadfunc.jar on my build of cassandra. PIG_CLASSPATH=:bin/../build/cassandra_loadfunc.jar::bin/../../..//lib/antlr-3.1.3.jar:bin/../../..//lib/avro-1.2.0-dev.jar:bin/../../..//lib/clhm-production.jar:bin/../../..//lib/commons-cli-1.1.jar:bin/../../..//lib/c

Re: Reclaim deleted rows space

2011-01-05 Thread shimi
How is minor compaction triggered? Is it triggered only when a new SSTable is added? I was wondering if triggering a compaction with minimumCompactionThreshold set to 1 would be useful. If this can happen I assume it will do compaction on files with similar size and remove deleted rows on the

Re: Cassandra Meetup in San Francisco Bay Area

2011-01-05 Thread Jonathan Ellis
Thanks for organizing this, Mubarak! A little more detail -- I'll explain the new features in Cassandra 0.7 including column time-to-live, columnfamily truncation, and secondary indexes, as well as some of the features that have been backported to recent 0.6 releases (aka Why You Should Upgrade Ye

Re: Reclaim deleted rows space

2011-01-05 Thread Jonathan Ellis
Pretty sure there's logic in there that says "don't bother compacting a single sstable." On Wed, Jan 5, 2011 at 2:26 PM, shimi wrote: > How is minor compaction triggered? Is it triggered only when a new > SSTable is added? > > I was wondering if triggering a compaction with minimumCompaction

Re: pig cassandra contribution

2011-01-05 Thread felix gao
Ignore the above error, I somehow passed that stage. However, I am still having a problem with it. grunt> register /home/felix/pig-0.7.0/pig-0.7.1-dev.jar; register /home/felix/cassandra/lib/libthrift.jar; grunt> rows = LOAD 'cassandra://test/data' USING CassandraStorage(); grunt> cols = FOREACH row

Re: Reclaim deleted rows space

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis wrote: > Pretty sure there's logic in there that says "don't bother compacting > a single sstable." > > On Wed, Jan 5, 2011 at 2:26 PM, shimi wrote: >> How is minor compaction triggered? Is it triggered only when a new >> SSTable is added? >> >>

Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Roshan Dawrani
Hi Patricio, Thanks for your comment. Replying inline. 2011/1/5 Patricio Echagüe > Roshan, just a comment on your solution. The time returned is not a simple > long. It also contains some bits indicating the version. I don't think so. The version bits from the most significant 64 bits of the

Re: Reclaim deleted rows space

2011-01-05 Thread Tyler Hobbs
Although it's not exactly the ability to list specific SSTables, the ability to only compact specific CFs will be in upcoming releases: https://issues.apache.org/jira/browse/CASSANDRA-1812 - Tyler On Wed, Jan 5, 2011 at 7:46 PM, Edward Capriolo wrote: > On Wed, Jan 5, 2011 at 4:31 PM, Jonathan

Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Patricio Echagüe
Roshan, the first 64 bits do contain the version. The method UUID.timestamp() indeed takes it out before returning. You are right on that point. I based my comment on the UUID spec. What I am not convinced of is that the framework should provide support to create an almost identical UUID where only
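The two points debated in this thread (version bits stored next to the timestamp, and the clock sequence and node making a rebuilt UUID differ) can be illustrated with a minimal sketch. It uses Python's standard `uuid` module rather than Java's `UUID` class or Hector, so it is an analogy and not the code under discussion. The function names `timeuuid_to_unix_ms` and `unix_ms_to_timeuuid` are hypothetical, and the `clock_seq` and `node` defaults are arbitrary placeholder values.

```python
import uuid

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_unix_ms(u: uuid.UUID) -> int:
    """Extract the timestamp of a version 1 UUID as Unix milliseconds."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # u.time is the 60-bit timestamp with the version bits already masked out.
    return (u.time - UUID_EPOCH_OFFSET) // 10_000


def unix_ms_to_timeuuid(ms: int, clock_seq: int = 0, node: int = 0) -> uuid.UUID:
    """Build a version 1 UUID for a Unix-millisecond timestamp (placeholder clock_seq/node)."""
    ts = ms * 10_000 + UUID_EPOCH_OFFSET
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000  # top 4 bits hold version 1
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant bits
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


ms = 1294185600000  # 2011-01-05 00:00:00 UTC
a = unix_ms_to_timeuuid(ms, clock_seq=1, node=0x001122334455)
b = unix_ms_to_timeuuid(ms, clock_seq=2, node=0x66778899AABB)
print(timeuuid_to_unix_ms(a) == ms, a == b)
```

Both UUIDs decode to the same millisecond, yet they are not equal, because the clock sequence and node fields differ. A UUID rebuilt from a timestamp alone therefore matches the original only if those fields are reproduced as well.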

Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Roshan Dawrani
Hi Patricio, Some thoughts inline. 2011/1/6 Patricio Echagüe > Roshan, the first 64 bits do contain the version. The method > UUID.timestamp() indeed takes it out before returning. You are right on that > point. I based my comment on the UUID spec. > I know 64 bits have the version, but time

Riptano Cassandra trainings in Baltimore and Santa Clara

2011-01-05 Thread Jonathan Ellis
Riptano has two Apache Cassandra training days coming up: Baltimore on Jan 19 and Santa Clara on Feb 4. The Baltimore training will be taught by Jake Luciani, author of Lucandra/Solandra. The Santa Clara training will be taught by Ben Coverston, Riptano's director of operations. These are both f