Re: Heap sudden jump during import

2010-04-03 Thread Benoit Perroud
There are tools other than jhat for browsing a heap dump; they stream the dump instead of loading it fully into memory as jhat does. Kind regards, Benoit. 2010/4/3 Weijun Li : > I'm running a test to write 30 million columns (700 bytes each) to Cassandra: > the process ran smoothly for about 20mi

multinode cluster wiki page

2010-04-03 Thread Benjamin Black
Just added this to the wiki as it seemed a very frequent request on irc: http://wiki.apache.org/cassandra/MultinodeCluster Would very much appreciate feedback and edits to improve it. b

Re: Heap sudden jump during import

2010-04-03 Thread Weijun Li
Thank you, Benoit. I did a search but couldn't find any of the tools you mentioned. Both jhat and NetBeans load the entire map file into memory. Do you know the names of the tools that require less memory to view the map file? Thanks, -Weijun On Sat, Apr 3, 2010 at 12:55 AM, Benoit Perroud wrote: > It exists othe

Re: Heap sudden jump during import

2010-04-03 Thread Benoit Perroud
Have a look at either Eclipse Memory Analyser (there is a standalone version of the memory analyser) or YourKit Java Profiler (commercial, but with an evaluation license). I have successfully loaded and browsed heap dumps bigger than the available memory on the system. Regards, Benoit 2010/4/3 Weijun Li : > Tha

Re: Heap sudden jump during import

2010-04-03 Thread Weijun Li
Eclipse Memory Analyser rocks! Thanks a lot!! -Weijun On Sat, Apr 3, 2010 at 2:25 AM, Benoit Perroud wrote: > Have a look at either Eclipse Memory Analyser (they have a standalone > version of the memory analyser) or YourKit Java Profiler (commercial, > but with evaluation license). I successfu

Re: multinode cluster wiki page

2010-04-03 Thread Benoit Perroud
Hi, Nice work. I think there is just one small mistake: the second 192.168.1.1 should be 192.168.2.34. I would also suggest adding a short section on making the Thrift interface listen on more than localhost. Kind regards, Benoit. 2010/4/3 Benjamin Black : > Just added this to the wiki as it seemed a ver

Bug in Cassandra that occurs when removing a supercolumn.

2010-04-03 Thread Arash Bazrafshan
Hello. A bug occurs for me when working with Cassandra. With this e-mail I intend to show what I do to recreate it, and then perhaps you can try it out too. SUMMARY OF THE BUG: (1) insert a row with a supercolumn that contains a subcolumn. (2) remove the supercolumn. (3) reinsert the sa

Re: 0.5.1 exception: java.io.IOException: Reached an EOL or something bizzare occured

2010-04-03 Thread Anty
Has anyone solved this problem? I am encountering the same error too. On Mon, Mar 29, 2010 at 12:12 AM, Benoit Perroud wrote: > I got the same error when the nodes are using lot of I/O, i.e during > compaction. > > 2010/3/28 Eric Yu : > > I have not restart my nodes. > > OK, may be I should give 0.

Re: 0.5.1 exception: java.io.IOException: Reached an EOL or something bizzare occured

2010-04-03 Thread Benoit Perroud
My guess is that the servers I use have neither enough I/O nor enough CPU power. I run in a virtualized environment, and even the vmstat command lags a lot. But it does not appear that the overall application behavior is degraded by this error; only the "eventually" takes a little longer. -- Kind regards, Benoit. 20

Re: multinode cluster wiki page

2010-04-03 Thread Benjamin Black
Thank you! Updated. On Sat, Apr 3, 2010 at 2:57 AM, Benoit Perroud wrote: > Hi, > > Nice work. > I guess just a small mistake : > > the second 192.168.1.1 should be > 192.168.2.34 > > And I would suggest to add a small part on making the thrift interface > listening on more than localhost. > > K

Re: multinode cluster wiki page

2010-04-03 Thread Joseph Ruscio
Ben, Great, was looking for something like this just the other day. One question I'm still unclear on: when setting up multiple nodes, say 4-8 (or more), what's the suggested ratio of seed to non-seed nodes? Thanks, Joe On Apr 3, 2010, at 1:14 AM, Benjamin Black wrote: > Just added this to th

Re: multinode cluster wiki page

2010-04-03 Thread Benjamin Black
Seeds are used for ring discovery, so there really isn't a load concern for them, afaict. Have enough to meet your availability needs, including placement, and rock out. On Sat, Apr 3, 2010 at 9:01 AM, Joseph Ruscio wrote: > Ben, > > Great, was looking for something like this just the other day.

Re: multinode cluster wiki page

2010-04-03 Thread Avinash Lakshman
We use anywhere from 3-5 seeds for clusters that have over 150 nodes. That should suffice for larger sizes too since they are only for initial discovery. Avinash On Sat, Apr 3, 2010 at 9:19 AM, Benjamin Black wrote: > Seeds are used for ring discovery, so there really isn't a load > concern for

Re: Deployment on AWS

2010-04-03 Thread Joe Stump
On Apr 2, 2010, at 4:49 PM, Masood Mortazavi wrote: > Is there a ready recipe for deploying a Cassandra cluster in AWS? ... (Seeds > need some "fixed" IP addresses.) We have a lot of code around this that we're trying to get released. We have a rack aware strategy for cross-AZ clusters. We als

cascal - high level scala cassandra client (yes - another one)

2010-04-03 Thread Chris Shorrock
For the past week or so I've been developing (another) Scala-based high-level Cassandra client, Cascal. While I know there are several other (good quality) clients, I thought developing my own would be a great way to familiarize myself with Cassandra as part of my analysis at work (which it was!). W

Re: Deployment on AWS

2010-04-03 Thread Peter Chang
Woot. Very much looking forward to this stuff Joe. On Sat, Apr 3, 2010 at 10:14 AM, Joe Stump wrote: > > On Apr 2, 2010, at 4:49 PM, Masood Mortazavi wrote: > > > Is there a ready recipe for deploying a Cassandra cluster in AWS? ... > (Seeds need some "fixed" IP addresses.) > > We have a lot of c

Re: multinode cluster wiki page

2010-04-03 Thread Jonathan Ellis
IMO the "right" way to do it is to configure your machines so that autodetecting listenaddress Just Works, so you can deploy exactly the same config to all nodes. On Sat, Apr 3, 2010 at 3:14 AM, Benjamin Black wrote: > Just added this to the wiki as it seemed a very frequent request on > irc: htt
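A quick way to check whether autodetection would "Just Work" on a given machine can be sketched in Python. This assumes that autodetection amounts to resolving the machine's own hostname, which the thread does not confirm; `autodetected_address` and `is_usable_listen_address` are hypothetical helper names, not part of Cassandra.

```python
import ipaddress
import socket


def is_usable_listen_address(address: str) -> bool:
    """Return True if other nodes could plausibly reach this address.

    Loopback (127.0.0.0/8, ::1) and unspecified (0.0.0.0, ::) addresses are
    rejected, because a node advertising one of them is unreachable by peers.
    """
    ip = ipaddress.ip_address(address)
    return not (ip.is_loopback or ip.is_unspecified)


def autodetected_address() -> str:
    """Resolve this machine's own hostname (assumed autodetection behaviour)."""
    return socket.gethostbyname(socket.gethostname())
```

Running `is_usable_listen_address(autodetected_address())` on each node before deploying a shared config would flag machines whose hostname still maps to a loopback entry in the hosts file.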

Re: multinode cluster wiki page

2010-04-03 Thread Benjamin Black
I do not claim it is the best/right way, just the one least likely to go wrong. On Sat, Apr 3, 2010 at 12:10 PM, Jonathan Ellis wrote: > IMO the "right" way to do it is to configure your machines so that > autodetecting listenaddress Just Works, so you can deploy exactly the > same config to all

RE: cascal - high level scala cassandra client (yes - another one)

2010-04-03 Thread Matthew Chambers
Your git page looks great; I like your Cassandra explanation and graphic. Is that the 3rd Scala library now? Scala must be growing. Too much strange punctuation for me, but it's good to have a viable functional language for the JVM. From: chris.shorr..

Re: multinode cluster wiki page

2010-04-03 Thread gabriele renzi
On Sat, Apr 3, 2010 at 6:40 PM, Avinash Lakshman wrote: > We use anywhere from 3-5 seeds for clusters that have over 150 nodes. That > should suffice for larger sizes too since they are only for initial > discovery. Would it make sense to just use a round-robin DNS on the available nodes and use

Re: multinode cluster wiki page

2010-04-03 Thread Benjamin Black
Seems like a lot of complexity for a very small win (how often do you bootstrap new nodes? if you only need a handful of seeds, what's all that hard about listing them all on all nodes?). I prefer simple and predictable, and trying to do this with round robin DNS seems to be neither, to me. b

Re: Deployment on AWS

2010-04-03 Thread Lenin Gali
We are looking to take advantage of this as well. Please let us know when it is ready. Lenin On Sat, Apr 3, 2010 at 11:32 AM, Peter Chang wrote: > Woot. Ver much looking forward to this stuff Joe. > > > On Sat, Apr 3, 2010 at 10:14 AM, Joe Stump wrote: > >> >> On Apr 2, 2010, at 4:49 PM, Masoo

Re: multinode cluster wiki page

2010-04-03 Thread banks
I can see the logic of having an internal DNS entry 'seednodes.internaldomain.com'. It might only have 3 defined seed nodes out of 100, but the benefit is single-point configuration: no need to edit configs across 100 machines, and easily redefinable on the fly as needed... On Sat, Apr 3, 2010 at 1

Re: multinode cluster wiki page

2010-04-03 Thread Benjamin Black
What happens if the IP I get back is for a seed that happens to be down right then? And then that IP is cached locally by my resolver? There is certainly a tempting conceptual simplicity to using DNS; I just don't think the reality is that simple, nor is it worth the trade in predictability, for me.
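The DNS-based seed lookup debated in this thread can be sketched in Python. The hostname `seednodes.internaldomain.com` comes from the thread; `resolve_seeds` and its injectable `resolver` parameter are hypothetical. The sketch collects every address behind the name rather than a single round-robin answer, but it does nothing about resolver caching or seeds that are down, which are the concerns raised above.

```python
import socket
from typing import Callable, List


def resolve_seeds(hostname: str,
                  resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return every distinct IPv4 address a seed hostname resolves to.

    Taking all records, rather than the first one, avoids depending on which
    single address round-robin DNS happens to hand back.
    """
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    seeds: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in seeds:  # keep first-seen order, drop duplicates
            seeds.append(ip)
    return seeds
```

Passing the resolver in as a parameter keeps the function testable without a network; a static list of seed addresses in each node's config remains the more predictable alternative argued for in this message.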

Re: Deployment on AWS

2010-04-03 Thread Benjamin Black
What specific features are you looking for to operate on EC2? b On Sat, Apr 3, 2010 at 1:37 PM, Lenin Gali wrote: > We are looking to take advantage of this as Well. Please let us know when it > is ready. > > Lenin > > On Sat, Apr 3, 2010 at 11:32 AM, Peter Chang wrote: >> >> Woot. Ver much lo

Re: LazyBoy question

2010-04-03 Thread Jonathan Ellis
I don't think Lazyboy exposes range queries [that is, iterating rows whose keys you do not know ahead of time]. Pycassa does, though. On Thu, Apr 1, 2010 at 12:05 PM, Gary wrote: > I am trying out the lazyboy library to access cassandra, I was able to get > the data in and out using Record save/

Re: Deployment on AWS

2010-04-03 Thread Joe Stump
On Apr 3, 2010, at 1:53 PM, Benjamin Black wrote: > What specific features are you looking for to operate on EC2? It seemed people weren't looking for features, but tools to help with the management. The two things we've created that people might be interested in are: 1. An EC2-specific rack-a

Re: LazyBoy question

2010-04-03 Thread Joe Stump
On Apr 3, 2010, at 2:00 PM, Jonathan Ellis wrote: > I don't think Lazyboy exposes range queries [that is, iterating rows > whose keys you do not know ahead of time]. Pycassa does, though. I think ieure's fork has itertools support that will let you do crazy iteration stuff with it. I haven't d

Re: Deployment on AWS

2010-04-03 Thread Benjamin Black
I'm pretty familiar with EC2, hence the question. I don't believe any patches are required to do these things. Regardless, as I noted in that ticket, you definitely do NOT need AWS credentials to determine your availability zone. It is available through the metadata web server for each instance

Re: Deployment on AWS

2010-04-03 Thread Joe Stump
On Apr 3, 2010, at 2:54 PM, Benjamin Black wrote: > I'm pretty familiar with EC2, hence the question. I don't believe any > patches are required to do these things. Regardless, as I noted in > that ticket, you definitely do NOT need AWS credentials to determine > your availability zone. It is

Re: Deployment on AWS and replication strategies

2010-04-03 Thread Mike Gallamore
Hi everyone, At my work we are in the early stages of moving our data, which lives on EC2 machines, from a Flare/memcache system to Cassandra, so your chat has been interesting to me. I realize that this might complicate things and make things less "simple" but would it be useful for the nodes th

Re: Deployment on AWS

2010-04-03 Thread Benjamin Black
Right, you determine AZ by looking at the metadata. us-east-1a is a different AZ from us-east-1b. You can't infer anything beyond that, either with the AWS API or guesses about IP addressing. My EC2 snitch recipe builds a config file for the property snitch that treats AZs like racks (just break
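The idea of reading the availability zone from instance metadata and treating AZs like racks can be sketched in Python. This is not the recipe mentioned in the message, which is not shown in the thread. The `ip=datacenter:rack` line format, the choice of the region as the data center, and all function names are assumptions made for illustration.

```python
import urllib.request

# EC2 instance metadata endpoint for the availability zone. Newer instances
# may require a session token (IMDSv2), which this sketch does not handle.
AZ_URL = "http://169.254.169.254/latest/meta-data/placement/availability-zone"


def fetch_availability_zone(timeout: float = 2.0) -> str:
    """Ask the metadata service for this node's AZ; no AWS credentials needed."""
    with urllib.request.urlopen(AZ_URL, timeout=timeout) as response:
        return response.read().decode("ascii").strip()


def region_of(az: str) -> str:
    """Strip the trailing zone letter: 'us-east-1a' -> 'us-east-1'."""
    if len(az) < 2 or not az[-1].isalpha():
        raise ValueError(f"unexpected availability zone: {az!r}")
    return az[:-1]


def snitch_line(ip: str, az: str) -> str:
    """Format one 'ip=datacenter:rack' line (hypothetical file format).

    The region stands in for the data center and the AZ for the rack.
    """
    return f"{ip}={region_of(az)}:{az}"
```

On an EC2 instance, `snitch_line(my_ip, fetch_availability_zone())` would produce that node's entry; off EC2 the metadata request simply times out.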

Re: Deployment on AWS and replication strategies

2010-04-03 Thread Benjamin Black
On Sat, Apr 3, 2010 at 3:41 PM, Mike Gallamore wrote: > > Useful things that nodes could advertise: > > data-centre they are in, This is what the snitches do. > performance info: mem, CPU etc (these could be used to more intelligently > decide how to partition the data that the new node gets fo

Re: Deployment on AWS and replication strategies

2010-04-03 Thread Mike Gallamore
Hi Benjamin, Thanks for the reply. On 2010-04-03, at 8:12 PM, Benjamin Black wrote: > On Sat, Apr 3, 2010 at 3:41 PM, Mike Gallamore > wrote: >> >> Useful things that nodes could advertise: >> >> data-centre they are in, > > This is what the snitches do. Cool. > >> performance info: mem, CPU

Re: Bug in Cassandra that occurs when removing a supercolumn.

2010-04-03 Thread Vijay
What version do you use? I think that bug was fixed in 0.6: https://issues.apache.org/jira/browse/CASSANDRA-703 Regards, On Sat, Apr 3, 2010 at 5:27 AM, Arash Bazrafshan wrote: > ello. > > A bug occurs for me when working with Cassandra. > > With this e-mail I intend to show what I do to