Re: ebs or ephemeral
6 nodes and RF3 will mean you can handle between 1 and 2 failed nodes. see http://thelastpickle.com/2011/06/13/Down-For-Me/ Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 7/10/2011, at 9:37 PM, Madalina Matei wrote: > Hi Aaron, > > For a 6 nodes cluster, what RF can we use in order to support 2 failed nodes? > From the article that you sent i understood "avoid EMS" and use ephemeral. am > i missing anything? > > Thank you so much for your help, > Madaina > On Fri, Oct 7, 2011 at 9:15 AM, aaron morton wrote: > Data Stax have pre build AMI's here > http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami > > > And an explanation of why we normally avoid ephemeral. > > Also, I would go with 6 nodes. You will then be able to handle up to 2 failed > nodes. > > Hope that helps. > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 7/10/2011, at 9:11 PM, Yi Yang wrote: > >> Obviously ephemeral. It has higher IO availability, will not affect your >> Ethernet IO performance, and it is free (included in instance price) >> and the redundancy is provided by cassandra itself. >> 從我的 BlackBerry® 無線裝置 >> >> From: Madalina Matei >> Date: Fri, 7 Oct 2011 09:02:06 +0100 >> To: >> ReplyTo: user@cassandra.apache.org >> Subject: ebs or ephemeral >> >> Hi, >> >> I'm looking to deploy a 5 nodes cluster in EC2 with RF3 and QUORUM CL. >> >> Could you please advice me on EBS vs ephemeral storage ? >> >> Cheers, >> Madalina > >
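For reference, a quick worked version of the "between 1 and 2 failed nodes" point (a sketch of the reasoning in the linked post, assuming RF=3 and QUORUM, i.e. 2 of 3 replicas required): with 6 nodes, any single node failure still leaves 2 live replicas for every token range, so QUORUM keeps working everywhere. With 2 failed nodes, any range whose 3 replicas included both dead nodes is left with only 1 live replica, so QUORUM fails for that range; two failures are only survivable when the dead nodes do not share a replica set.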
Re: 0.7.9 RejectedExecutionException
Have you checked /var/log/cassandra/output.txt (the packaged install pipes std out/err to there) or the system logs ? If there are no errors in the logs it may well be something external killing it. With regard to memory usage, it's hard for people to help unless you provide some numbers. What do you mean by MAX heap ? Is this the max used heap size reported by JMX or the -Xmx setting passed to the server ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 8/10/2011, at 7:02 AM, Ashley Martens wrote: > Okay, this is still a problem. This node keeps dieing at 1am every day, most > times without an error in the log. I'd appriciate any help in tracking down > why. > > Additionally, I don't understand why 0.7.x using *way* more RAM than 0.6.x > and 0.8.x, from a top or ps perspective. I'm now watching the JVM memory and > it seems to be more in line with 0.6.x but the MAX heap is crazy high (28G on > my servers).
Re: 54 memtable flushes in hour at peaktime
It's not a problem by itself; compaction will do its thing. If you are also seeing read latency increase it may be something you want to look at. What version are you using ? The tuning is different (i.e. it gets easier) between versions 0.7, 0.8 and 1.0. It's probably just the case that you are writing a lot of data. Look for log messages from ColumnFamilyStore that start with "Enqueuing flush of Memtable…" They will tell you how many serialized bytes and operations the memtable soaked up before being flushed. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 10/10/2011, at 6:56 AM, Tomer B wrote: > Hi > > at highest traffic hours i get 54 memtable flushes, this happens for a few > hours during the day and at the rest of hours its ranging from 0 to 10 . > > Should I be doing anything about it? is that number on critical level? can i > live with 54 memtable flushes per hour during peak hours? (I might expect > higher peaks coming during this year). > > (The rest of my memtables with lower traffic range at about 1-4 memtable > flushes per hour). > > thanks
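For reference, the flush messages referred to above look roughly like this (an illustrative line with a made-up CF name and numbers; the exact format varies a little between versions):

Enqueuing flush of Memtable-MyCF@123456789(52428800 bytes, 1100000 operations)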
Re: Existing column(s) not readable
What error are you seeing in the server logs ? Are the columns unreadable at all Consistency Levels ? i.e. are the columns unreadable on all nodes. What is the upgrade history of the cluster ? What version did it start at ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 10/10/2011, at 7:42 AM, Thomas Richter wrote: > Hi, > > here is some further information. Compaction did not help, but data is > still there when I dump the row with sstable2json. > > Best, > > Thomas > > On 10/08/2011 11:30 PM, Thomas Richter wrote: >> Hi, >> >> we are running a 3 node cassandra (0.7.6-2) cluster and some of our >> column families contain quite large rows (400k+ columns, 4-6GB row size). >> Replicaton factor is 3 for all keyspaces. The cluster is running fine >> for several months now and we never experienced any serious trouble. >> >> Some days ago we noticed, that some previously written columns could not >> be read. This does not always happen, and only some dozen columns out of >> 400k are affected. >> >> After ruling out application logic as a cause I dumped the row in >> question with sstable2json and the columns are there (and are not marked >> for deletion). >> >> Next thing was setting up a fresh single node cluster and copying the >> column family data to that node. Columns could not be read either. >> Right now I'm running a nodetool compact for the cf to see if data could >> be read afterwards. >> >> Is there any explanation for such behavior? Are there any suggestions >> for further investigation? >> >> TIA, >> >> Thomas >
Re: ebs or ephemeral
just catching the tail end of this discussion. aaron, in your previous email, you said "And an explanation of why we normally avoid ephemeral. " shouldn't this be, avoiding EBS? EBS was a nightmare for us in terms of performance. On Mon, Oct 10, 2011 at 9:23 AM, aaron morton wrote: > 6 nodes and RF3 will mean you can handle between 1 and 2 failed nodes. > > see http://thelastpickle.com/2011/06/13/Down-For-Me/ > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 7/10/2011, at 9:37 PM, Madalina Matei wrote: > > Hi Aaron, > > For a 6 nodes cluster, what RF can we use in order to support 2 failed > nodes? > From the article that you sent i understood "avoid EMS" and use ephemeral. > am i missing anything? > > Thank you so much for your help, > Madaina > On Fri, Oct 7, 2011 at 9:15 AM, aaron morton wrote: > >> Data Stax have pre build AMI's here >> http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami >> >> >> And an explanation of why we normally avoid ephemeral. >> >> Also, I would go with 6 nodes. You will then be able to handle up to 2 >> failed nodes. >> >> Hope that helps. >> >>
Re: ebs or ephemeral
yes, should have been And an explanation of why we normally avoid *EBS*. My bad. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 10/10/2011, at 9:03 PM, Sasha Dolgy wrote: > just catching the tail end of this discussion. aaron, in your previous > email, you said "And an explanation of why we normally avoid ephemeral. " > shouldn't this be, avoiding EBS? EBS was a nightmare for us in terms of > performance. > > On Mon, Oct 10, 2011 at 9:23 AM, aaron morton wrote: > 6 nodes and RF3 will mean you can handle between 1 and 2 failed nodes. > > see http://thelastpickle.com/2011/06/13/Down-For-Me/ > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 7/10/2011, at 9:37 PM, Madalina Matei wrote: > >> Hi Aaron, >> >> For a 6 nodes cluster, what RF can we use in order to support 2 failed nodes? >> From the article that you sent i understood "avoid EMS" and use ephemeral. >> am i missing anything? >> >> Thank you so much for your help, >> Madaina >> On Fri, Oct 7, 2011 at 9:15 AM, aaron morton wrote: >> Data Stax have pre build AMI's here >> http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami >> >> >> And an explanation of why we normally avoid ephemeral. >> >> Also, I would go with 6 nodes. You will then be able to handle up to 2 >> failed nodes. >> >> Hope that helps. >> >
Re: ebs or ephemeral
Agreed, EBS volumes are not a good fit for cassandra, and in previous conversations on this mailing list people have tended to use ephemeral storage. Sent from my BlackBerry® wireless device -Original Message- From: Sasha Dolgy Date: Mon, 10 Oct 2011 10:03:26 To: Reply-To: user@cassandra.apache.org Subject: Re: ebs or ephemeral just catching the tail end of this discussion. aaron, in your previous email, you said "And an explanation of why we normally avoid ephemeral. " shouldn't this be, avoiding EBS? EBS was a nightmare for us in terms of performance. On Mon, Oct 10, 2011 at 9:23 AM, aaron morton wrote: > 6 nodes and RF3 will mean you can handle between 1 and 2 failed nodes. > > see http://thelastpickle.com/2011/06/13/Down-For-Me/ > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 7/10/2011, at 9:37 PM, Madalina Matei wrote: > > Hi Aaron, > > For a 6 nodes cluster, what RF can we use in order to support 2 failed > nodes? > From the article that you sent i understood "avoid EMS" and use ephemeral. > am i missing anything? > > Thank you so much for your help, > Madaina > On Fri, Oct 7, 2011 at 9:15 AM, aaron morton wrote: > >> Data Stax have pre build AMI's here >> http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami >> >> >> And an explanation of why we normally avoid ephemeral. >> >> Also, I would go with 6 nodes. You will then be able to handle up to 2 >> failed nodes. >> >> Hope that helps. >> >>
"Insufficient space" on 1.0.0-rc2 when compacting compressed CFs
Hi, I couldn't find anything on this issue, but maybe my google-fu is weak. I'm running a Cassandra 1.0.0-rc2 cluster with compression enabled for both of the CFs I have right now. The load on a single node is about 32GB (disk is 80GB per node). Whenever I try to run a compaction using nodetool on one of the CFs, I get the message "insufficient space to compact all requested files" in the log (it goes on to compact some SSTables, but not all). As not even half of the disk is used, compaction should be possible, right? Or does Cassandra use the uncompressed size to check whether there is enough space or not? I estimate that the data is compressed by a factor of about 3x. Cheers, Günter -- Dipl.-Inform. Günter Ladwig Karlsruhe Institute of Technology (KIT) Institute AIFB Englerstraße 11 (Building 11.40, Room 250) 76131 Karlsruhe, Germany Phone: +49 721 608-47946 Email: guenter.lad...@kit.edu Web: www.aifb.kit.edu KIT – University of the State of Baden-Württemberg and National Large-scale Research Center of the Helmholtz Association
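(A rough worked check of that suspicion, using the numbers above: 32GB of compressed data at roughly 3x compression is about 32 x 3 = 96GB uncompressed, which is more than the 80GB disk. So a free-space check done against the uncompressed size would refuse to compact everything, even though the compressed result would fit comfortably.)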
Re: Existing column(s) not readable
Hi, no errors in the server logs. The columns are unreadable on all nodes at any consistency level (ONE, QUORUM, ALL). We started with 0.7.3 and upgraded to 0.7.6-2 two months ago. Best, Thomas On 10/10/2011 10:03 AM, aaron morton wrote: > What error are you seeing in the server logs ? Are the columns unreadable at > all Consistency Levels ? i.e. are the columns unreadable on all nodes. > > What is the upgrade history of the cluster ? What version did it start at ? > > Cheers > > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 10/10/2011, at 7:42 AM, Thomas Richter wrote: > >> Hi, >> >> here is some further information. Compaction did not help, but data is >> still there when I dump the row with sstable2json. >> >> Best, >> >> Thomas >> >> On 10/08/2011 11:30 PM, Thomas Richter wrote: >>> Hi, >>> >>> we are running a 3 node cassandra (0.7.6-2) cluster and some of our >>> column families contain quite large rows (400k+ columns, 4-6GB row size). >>> Replicaton factor is 3 for all keyspaces. The cluster is running fine >>> for several months now and we never experienced any serious trouble. >>> >>> Some days ago we noticed, that some previously written columns could not >>> be read. This does not always happen, and only some dozen columns out of >>> 400k are affected. >>> >>> After ruling out application logic as a cause I dumped the row in >>> question with sstable2json and the columns are there (and are not marked >>> for deletion). >>> >>> Next thing was setting up a fresh single node cluster and copying the >>> column family data to that node. Columns could not be read either. >>> Right now I'm running a nodetool compact for the cf to see if data could >>> be read afterwards. >>> >>> Is there any explanation for such behavior? Are there any suggestions >>> for further investigation? >>> >>> TIA, >>> >>> Thomas >> >
Re: "Insufficient space" on 1.0.0-rc2 when compacting compressed CFs
On Mon, Oct 10, 2011 at 10:08 AM, Günter Ladwig wrote: > Hi, > > I couldn't find anything on this issue, but maybe my google-fu is weak. > > I'm running a Cassandra 1.0.0-rc2 cluster with compression enabled for all of > the two CFs I have right now. The load on a single node is about 32GB (disk > is 80GB per node). > > Whenever I try to run a compaction using nodetool on one of the CFs, I get > the message "insufficient space to compact all requested files" in the log > (it goes on to compact some SSTables, but not all). As not even half of the disk is used, compaction should be possible, right? Or does Cassandra use the uncompressed size to check whether there is enough space or not? I estimate that the data is compressed by a factor of about 3x. We do use the uncompressed size to check if there is enough room to compact, which is a bug. I've created https://issues.apache.org/jira/browse/CASSANDRA-3338 to fix it. Thanks for the report. -- Sylvain > > Cheers, > Günter > > -- > Dipl.-Inform. Günter Ladwig > > Karlsruhe Institute of Technology (KIT) > Institute AIFB > > Englerstraße 11 (Building 11.40, Room 250) > 76131 Karlsruhe, Germany > Phone: +49 721 608-47946 > Email: guenter.lad...@kit.edu > Web: www.aifb.kit.edu > > KIT – University of the State of Baden-Württemberg and National Large-scale > Research Center of the Helmholtz Association > >
[RELEASE] Apache Cassandra 0.8.7 released
The Cassandra team is pleased to announce the release of Apache Cassandra version 0.8.7. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here: http://cassandra.apache.org/ Downloads of source and binary distributions are listed in our download section: http://cassandra.apache.org/download/ This version is a maintenance/bug fix release[1]. Please pay attention to the release notes[2] before upgrading and let us know[3] if you were to encounter any problem. Have fun! [1]: http://goo.gl/8bCMG (CHANGES.txt) [2]: http://goo.gl/nOkhy (NEWS.txt) [3]: https://issues.apache.org/jira/browse/CASSANDRA
Volunteers needed - Wiki
Hi there, The dev's have been very busy and Cassandra 1.0 is just around the corner and full of new features. To celebrate I'm trying to give the wiki some loving to make things a little more welcoming for new users. To keep things manageable I'd like to focus on completeness and correctness for now, and worry about being super awesome later. For example the nodetool page is incomplete http://wiki.apache.org/cassandra/NodeTool , we do not have anything about CQL and config page is from 0.7 http://wiki.apache.org/cassandra/StorageConfiguration As a starting point I've created a draft home page http://wiki.apache.org/cassandra/FrontPage_draft_aaron/ . I also hope to use this as a planning tool where we can mark off what's in progress or has been completed. The guidelines I think we should follow are: * ensure coverage of 1.0, a best effort for 0.8 and leave any content from previous versions. * where appropriate include examples from CQL and RPC as both are still supported. If you would like to contribute to this effort please let me know via the email list. It's a great way to contribute to the project and learn how Cassandra works, and I'll do my best to help with any questions you may have. Or if you have something you've already written that you feel may be of use let me know, and we'll see about linking to it. Thanks. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com
A good key for data distribution over nodes
Hi, I am planning to run tests on Cassandra with a few nodes. I want to create a column family where the key will be the date down to the second (like 2011/10/10-16:07:53). Doing so, my keys will be very similar to each other. Is it ok to use such keys if I want my data to be evenly distributed across my nodes, or do I have to "do something"? Thanks in advance. L. Aufrechter
Re: A good key for data distribution over nodes
You should be ok, depending on the partitioner you use. The keys end up hashed (which is why, when you're setting up your nodes, you can give each one a specific initial token). Whatever your key is will be used to create an MD5 hash, and that hash will then determine what node your data will live on. So while your distribution won't necessarily be completely balanced, it should at least be in the right ballpark. To give you an idea of this in practice, we've got consecutive integer values as our keys and we're using the random partitioner...we have VERY close to the same number of keys on each of our nodes. The bigger question about balancing your load is how big each record is... whether your records are consistent in size or vary widely is just as likely to impact how balanced your loads are. On Mon, Oct 10, 2011 at 9:09 AM, Laurent Aufrechter < laurent.aufrech...@yahoo.fr> wrote: > Hi, > > I am planing to make tests on Cassandra with a few nodes. I want to create > a column family where the key will be the date down to the second (like > 2011/10/10-16:07:53). Doing so, my keys will be very similar from each > others. Is it ok to use such keys if I want my data to be evenly distributed > across my nodes or do I have to "do something" ? > > Thanks in advance. > > L. Aufrechter > -- *David McNelis* Lead Software Engineer Agentis Energy www.agentisenergy.com o: 630.359.6395 c: 219.384.5143 *A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.*
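To make the hashing concrete, below is a minimal sketch of how a row key becomes a ring token under the random partitioner. It is illustrative only (the real logic lives in org.apache.cassandra.dht.RandomPartitioner); the keys are just the example timestamps from the question.

import java.math.BigInteger;
import java.security.MessageDigest;

public class TokenSketch {
    // Roughly what RandomPartitioner does: MD5 the key bytes and treat the
    // digest as a non-negative integer; that number decides which node's
    // token range the row falls into.
    static BigInteger tokenFor(String key) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        return new BigInteger(md5.digest(key.getBytes("UTF-8"))).abs();
    }

    public static void main(String[] args) throws Exception {
        // Two nearly identical timestamp keys hash to very different tokens,
        // so they land on different parts of the ring.
        System.out.println(tokenFor("2011/10/10-16:07:53"));
        System.out.println(tokenFor("2011/10/10-16:07:54"));
    }
}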
Re: 0.7.9 RejectedExecutionException
I have check both the output file and the system log, neither have errors in them. I don't believe anything external is killing the process, I could be wrong but this node's setup is the same as all my other nodes (including hardware) so it doesn't make much sense. jsvc.exec -user cassandra -home /usr/lib/jvm/java-6-openjdk/jre/bin/../ -pidfile /var/run/cassandra.pid -errfile &1 -outfile /var/log/cassandra/output.log -cp /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-0.7.8.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/avro-1.4.0-fixes.jar:/usr/share/cassandra/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-codec-1.2.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/concurrentlinkedhashmap-lru-1.1.jar:/usr/share/casandra/guava-r05.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/jetty-6.1.21.jar:/usr/share/cassandra/jetty-util-6.1.21.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json-simple-1.1.jar:/usr/share/cassandra/jug-2.0.0.jar:/usr/share/cassandra/libthrift-0.5.jar:/usr/share/cassandra/log4j-1.2.16.jar:/usr/share/cassandra/servlet-api-2.5-20081211.jar:/usr/share/cassandra/slf4j-api-1.6.1.jar:/usr/share/cassandra/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/snakeyaml-1.6.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar -Dlog4j.configuration=log4j-server.properties -XX:HeapDumpPath=/var/lib/cassandra/java_1318260751.hprof -XX:ErrorFile=/var/lib/casandra/hs_err_1318260751.log -ea -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms24196M -Xmx24196M -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=8080 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false org.apache.cassandra.thrift.CassandraDaemon I have munin monitoring of JMX so when I talk about heap max then I'm referring to: jmxObjectName java.lang:type=Memory jmxAttributeName HeapMemoryUsage jmxAttributeKey max The other crazy thing is the heap used is no where close to heap max. On Mon, Oct 10, 2011 at 12:40 AM, aaron morton wrote: > Have you checked /var/log/cassandra/output.txt (the packaged install pipes > std out/err to there) or the system logs ? If there are no errors in the > logs it may well be something external killing it. > > With regard to memory usage, it's hard for people to help unless you > provide some numbers. What do you mean by MAX heap ? Is this the max used > heap size reported by JMX or the -Xmx setting passed to the server ? > >
factors on the effectiveness of bloom filter?
I noticed that 2 of my CFs are showing very different bloom filter false ratios: one is close to 1.0; the other one is only 0.3. They have roughly the same sizes in SSTables and counts; the difference is key construction: the one with the 0.3 false ratio has a shorter key. Assuming the key cannot be changed (or the only possibility to change the key is simply to juggle the byte order), is there any measure to increase the effectiveness of bloom filters? thanks Yang
Re: factors on the effectiveness of bloom filter?
On 10.10.2011 18:31, Yang wrote: I noticed that 2 of my CFs are showing very different bloom filter false ratios, one is close to 1.0; the other one is only 0.3 Cassandra bloom filters are computed for a 1% false positive ratio. is there any measure to increase the effectiveness of bloom filters? thanks Yang Try Hadoop HBase. You can configure it there.
MapReduce with two ethernet cards
Hi all, This may be a silly question, but I'm at a bit of a loss, and was hoping for some help. I have a Cassandra cluster set up with two NICs--one for internal communication between cassandra machines (10.1.1.*), and one to respond to Thrift RPC (172.28.*.*). I also have a Hadoop cluster set up, which, for unrelated reasons, has to remain separate from Cassandra, so I've written a little MapReduce job to copy data from Cassandra to Hadoop. However, when I try to run my job, I get java.io.IOException: failed connecting to all endpoints 10.1.1.24,10.1.1.17,10.1.1.16 which is puzzling to me. It seems like the MR job is attempting to connect to the internal communication IPs instead of the external Thrift IPs. Since I set up a firewall to block external access to the internal IPs of Cassandra, this is obviously going to fail. So my question is: why does Cassandra MR seem to be grabbing the listen_address instead of the Thrift one? Presuming it's not a funky configuration error or something on my part, is that strictly necessary? All told, I'd prefer if it was connecting to the Thrift IPs, but if it can't, should I open up port 7000 or port 9160 between Hadoop and Cassandra? Thanks for your help, Scott
Re: Volunteers needed - Wiki
Hi Aaron, I can help with the documentation... I grabbed tons of screenshots as I was installing Cassandra source trunk(1.0.0.rc2?) on my Mac OS X Snow leopard on Eclipse Galileo and later Eclipse Indigo, I will be installing it on Eclipse for Ubuntu 10.04 soon. I took the sceenshots after I noticed the missing picts in here: http://wiki.apache.org/cassandra/RunningCassandraInEclipse so I did plan on helping with the update... I am glad you sent your email though to get me going. I am just not sure of the logistics, how to do it, and if I needed to be granted some write access to the wiki. Please educate... I can definitely help on the NodeTool and StorageConfiguration as soon as I can grok them myself, or any other documentation. Also you draft front page and focusing first on 1.0 first match my thinking. Hani Elabed On Mon, Oct 10, 2011 at 4:10 AM, aaron morton wrote: > Hi there, > The dev's have been very busy and Cassandra 1.0 is just around the corner > and full of new features. To celebrate I'm trying to give the wiki some > loving to make things a little more welcoming for new users. > > To keep things manageable I'd like to focus on completeness and correctness > for now, and worry about being super awesome later. For example the nodetool > page is incomplete http://wiki.apache.org/cassandra/NodeTool , we do not > have anything about CQL and config page is from 0.7 > http://wiki.apache.org/cassandra/StorageConfiguration > > As a starting point I've created a draft home page > http://wiki.apache.org/cassandra/FrontPage_draft_aaron/ . I also hope to > use this as a planning tool where we can mark off what's in progress or has > been completed. > > The guidelines I think we should follow are: > * ensure coverage of 1.0, a best effort for 0.8 and leave any content from > previous versions. > * where appropriate include examples from CQL and RPC as both are still > supported. > > If you would like to contribute to this effort please let me know via the > email list. It's a great way to contribute to the project and learn how > Cassandra works, and I'll do my best to help with any questions you may > have. Or if you have something you've already written that you feel may be > of use let me know, and we'll see about linking to it. > > Thanks. > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > >
Re: how to reduce disk read? (and bloom filter performance)
Does it mean you are not updating a row or deleting them? Can you look at JMX values of BloomFilter* ? I don't believe bloom filter false positive % value is configurable. Someone else might be able to throw more light on this. I believe if you want to keep disk seeks to 1 ssTable you will need to compact more often. On Sun, Oct 9, 2011 at 7:09 AM, Radim Kolar wrote: > Dne 7.10.2011 23:16, Mohit Anchlia napsal(a): >> >> You'll see output like: >> >> Offset SSTables >> 1 8021 >> 2 783 >> >> Which means 783 read operations accessed 2 SSTables > > thank you for explaining it to me. I see this: > > Offset SSTables > 1 59323 > 2 857 > 3 56 > > it means bloom filter failure ratio over 1%. Cassandra in unit tests expects > bloom filter false positive less than 1.05%. HBase has configurable bloom > filters. You can choose 1% or 0.5% - it can make difference for large cache. > > But result is that my poor read performance should not be caused by bloom > filters. >
Re: Volunteers needed - Wiki
On Mon, Oct 10, 2011 at 11:51 AM, hani elabed wrote: > Hi Aaron, > I can help with the documentation... I grabbed tons of screenshots as I was > installing Cassandra source trunk(1.0.0.rc2?) on my Mac OS X Snow leopard on > Eclipse Galileo and later Eclipse Indigo, I will be installing it on Eclipse > for Ubuntu 10.04 soon. I took the sceenshots after I noticed the missing > picts in here: > http://wiki.apache.org/cassandra/RunningCassandraInEclipse Unfortunately, the ASF no longer allows attachments on the wiki. -Brandon
Re: MapReduce with two ethernet cards
On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines wrote: > Hi all, > This may be a silly question, but I'm at a bit of a loss, and was hoping for > some help. > I have a Cassandra cluster set up with two NICs--one for internel > communication between cassandra machines (10.1.1.*), and one to respond to > Thrift RPC (172.28.*.*). > I also have a Hadoop cluster set up, which, for unrelated reasons, has to > remain separate from Cassandra, so I've written a little MapReduce job to > copy data from Cassandra to Hadoop. However, when I try to run my job, I > get > java.io.IOException: failed connecting to all endpoints > 10.1.1.24,10.1.1.17,10.1.1.16 > which is puzzling to me. It seems like the MR is attempting to connect to > the internal communication IPs instead of the external Thrift IPs. Since I > set up a firewall to block external access to the internal IPs of Cassandra, > this is obviously going to fail. > So my question is: why does Cassandra MR seem to be grabbing the > listen_address instead of the Thrift one. Presuming it's not a funky > configuration error or something on my part, is that strictly necessary? All > told, I'd prefer if it was connecting to the Thrift IPs, but if it can't, > should I open up port 7000 or port 9160 between Hadoop and Cassandra? > Thanks for your help, > Scott Your cassandra is old, upgrade to the latest version. -Brandon
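For reference, after upgrading, the job-side wiring typically looks something like the sketch below. This assumes a Cassandra version whose ConfigHelper exposes the setInput* helpers (method names differ in older 0.7-era releases), and the address, keyspace and column family names are placeholders rather than values from Scott's setup.

import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CassandraToHadoopCopy {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "cassandra-to-hadoop-copy");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        Configuration conf = job.getConfiguration();

        // Point the input format at the Thrift-facing address, not the
        // internal listen_address (placeholder IP, keyspace and CF below).
        ConfigHelper.setInputInitialAddress(conf, "172.28.0.10");
        ConfigHelper.setInputRpcPort(conf, "9160");
        ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyColumnFamily");

        // ... mapper, slice predicate, partitioner and output setup go here ...
        job.waitForCompletion(true);
    }
}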
AUTO: Manoj Chaudhary is out of the office (returning 10/14/2011)
I am out of the office until 10/14/2011. I am attending a conference in Europe and meeting customers and partners from 10/10/2011 to 10/15/2011. There might be a delay in responding to emails. I will try to respond to email periodically between meetings and some evenings in the local time zone. Note: This is an automated response to your message "A good key for data distribution over nodes" sent on 10/10/11 8:09:31. This is the only notification you will receive while this person is away.
Re: Existing column(s) not readable
How are they unreadable ? You need to go into some details about what is going wrong. What sort of read ? What client ? What is in the logging on client and server side ? Try turning the logging up to DEBUG on the server to watch what happens. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 10/10/2011, at 9:23 PM, Thomas Richter wrote: > Hi, > > no errors in the server logs. The columns are unreadable on all nodes at > any consistency level (ONE, QUORUM, ALL). We started with 0.7.3 and > upgraded to 0.7.6-2 two months ago. > > Best, > > Thomas > > On 10/10/2011 10:03 AM, aaron morton wrote: >> What error are you seeing in the server logs ? Are the columns unreadable >> at all Consistency Levels ? i.e. are the columns unreadable on all nodes. >> >> What is the upgrade history of the cluster ? What version did it start at ? >> >> Cheers >> >> >> - >> Aaron Morton >> Freelance Cassandra Developer >> @aaronmorton >> http://www.thelastpickle.com >> >> On 10/10/2011, at 7:42 AM, Thomas Richter wrote: >> >>> Hi, >>> >>> here is some further information. Compaction did not help, but data is >>> still there when I dump the row with sstable2json. >>> >>> Best, >>> >>> Thomas >>> >>> On 10/08/2011 11:30 PM, Thomas Richter wrote: Hi, we are running a 3 node cassandra (0.7.6-2) cluster and some of our column families contain quite large rows (400k+ columns, 4-6GB row size). Replicaton factor is 3 for all keyspaces. The cluster is running fine for several months now and we never experienced any serious trouble. Some days ago we noticed, that some previously written columns could not be read. This does not always happen, and only some dozen columns out of 400k are affected. After ruling out application logic as a cause I dumped the row in question with sstable2json and the columns are there (and are not marked for deletion). Next thing was setting up a fresh single node cluster and copying the column family data to that node. Columns could not be read either. Right now I'm running a nodetool compact for the cf to see if data could be read afterwards. Is there any explanation for such behavior? Are there any suggestions for further investigation? TIA, Thomas >>> >> >
Re: 0.7.9 RejectedExecutionException
The service keeps dieing at the same time every day and there is nothing in the app logs, it's going to be something external. Sorry but I'm not sure what the problem with the memory usage is. Is the server running out of memory, or is it experiencing a lot of GC ? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 11/10/2011, at 5:00 AM, Ashley Martens wrote: > I have check both the output file and the system log, neither have errors in > them. I don't believe anything external is killing the process, I could be > wrong but this node's setup is the same as all my other nodes (including > hardware) so it doesn't make much sense. > > > jsvc.exec -user cassandra -home /usr/lib/jvm/java-6-openjdk/jre/bin/../ > -pidfile /var/run/cassandra.pid -errfile &1 -outfile > /var/log/cassandra/output.log -cp > /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-0.7.8.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/avro-1.4.0-fixes.jar:/usr/share/cassandra/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-codec-1.2.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/concurrentlinkedhashmap-lru-1.1.jar:/usr/share/casandra/guava-r05.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/jetty-6.1.21.jar:/usr/share/cassandra/jetty-util-6.1.21.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json-simple-1.1.jar:/usr/share/cassandra/jug-2.0.0.jar:/usr/share/cassandra/libthrift-0.5.jar:/usr/share/cassandra/log4j-1.2.16.jar:/usr/share/cassandra/servlet-api-2.5-20081211.jar:/usr/share/cassandra/slf4j-api-1.6.1.jar:/usr/share/cassandra/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/snakeyaml-1.6.jar:/usr/share/java/jna.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar > -Dlog4j.configuration=log4j-server.properties > -XX:HeapDumpPath=/var/lib/cassandra/java_1318260751.hprof > -XX:ErrorFile=/var/lib/casandra/hs_err_1318260751.log -ea > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms24196M -Xmx24196M > -Xmn1600M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC > -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 > -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 > -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true > -Dcom.sun.management.jmxremote.port=8080 > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.authenticate=false > org.apache.cassandra.thrift.CassandraDaemon > > I have munin monitoring of JMX so when I talk about heap max then I'm > referring to: > > jmxObjectName java.lang:type=Memory > jmxAttributeName HeapMemoryUsage > jmxAttributeKey max > > The other crazy thing is the heap used is no where close to heap max. > > On Mon, Oct 10, 2011 at 12:40 AM, aaron morton > wrote: > Have you checked /var/log/cassandra/output.txt (the packaged install pipes > std out/err to there) or the system logs ? If there are no errors in the logs > it may well be something external killing it. > > With regard to memory usage, it's hard for people to help unless you provide > some numbers. What do you mean by MAX heap ? Is this the max used heap size > reported by JMX or the -Xmx setting passed to the server ? >
Re: 0.7.9 RejectedExecutionException
It is actually not at the exact same time of day. It varies but happens within certain blocks of time, like between 00hr and 02hr. The node could be up for hours or it could crash again in 15 minutes. The memory is fine, just using a larger footprint than 0.6 in all ways. On Mon, Oct 10, 2011 at 1:18 PM, aaron morton wrote: > The service keeps dieing at the same time every day and there is nothing in > the app logs, it's going to be something external. > > Sorry but I'm not sure what the problem with the memory usage is. Is the > server running out of memory, or is it experiencing a lot of GC ? > >
cassandra on laptop
I'm running an underpowered laptop (ubuntu) for development work. Installing Cassandra was easy, and getting the twissandra example app up and working was also easy. Here's the problem: after about a day of letting it run (with no load generated to webapp or db), my laptop now becomes unresponsive. If I'm patient, I can shutdown the cassandra service and return things to normal. In each of these cases, the cassandra process is eating up almost all memory, and everything goes to swap. I can't develop against Cassandra in this environment. I know it isn't set up by default to work efficiently on a meager laptop, but are there some common setting somewhere that I can just tweak to make life not be so miserable? I just want to play with it and try it out for this project I'm working on, but that's impractical with default settings. I'm going to have to flee to mongodb or something not as good... I'm also a little nervous about this running on a server now -- I've read enough to understand that by default it's set up to eat lots of memory, and I'm fine with that... but it just lends itself to all the java bigotry that some of us accumulate over the years. Anyway, if someone can give me a pointer on how to set up to run on a laptop in a development setting, big thanks. Thanks! Gary
seeking contractor to assist with upgrade/expansion
hope this is not off topic? we've been struggling to follow the ostensible procedures for a while now, and are ready to pony up for some pro help (but not quite ready to pony up for datastax). please contact me at svd at mylife dot com if you are interested. -scott
Re: Volunteers needed - Wiki
Thanks, Hani. If you would like to update the storage config page that would be handy. Just update http://wiki.apache.org/cassandra/FrontPage_draft_aaron/ to say you are working on it. Just click the login link at the top to setup an account. wrt setting up eclipse, perhaps you could post your instructions on a blog somewhere and we can link to it. cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 11/10/2011, at 5:51 AM, hani elabed wrote: > Hi Aaron, > > I can help with the documentation... I grabbed tons of screenshots as I was > installing Cassandra source trunk(1.0.0.rc2?) on my Mac OS X Snow leopard on > Eclipse Galileo and later Eclipse Indigo, I will be installing it on Eclipse > for Ubuntu 10.04 soon. I took the sceenshots after I noticed the missing > picts in here: > > http://wiki.apache.org/cassandra/RunningCassandraInEclipse > > so I did plan on helping with the update... I am glad you sent your email > though to get me going. > > I am just not sure of the logistics, how to do it, and if I needed to be > granted some write access to the wiki. Please educate... > > I can definitely help on the NodeTool and StorageConfiguration as soon as I > can grok them myself, or any other documentation. > > Also you draft front page and focusing first on 1.0 first match my thinking. > > Hani Elabed > > > On Mon, Oct 10, 2011 at 4:10 AM, aaron morton wrote: > Hi there, > The dev's have been very busy and Cassandra 1.0 is just around the > corner and full of new features. To celebrate I'm trying to give the wiki > some loving to make things a little more welcoming for new users. > > To keep things manageable I'd like to focus on completeness and > correctness for now, and worry about being super awesome later. For example > the nodetool page is incomplete http://wiki.apache.org/cassandra/NodeTool , > we do not have anything about CQL and config page is from 0.7 > http://wiki.apache.org/cassandra/StorageConfiguration > > As a starting point I've created a draft home page > http://wiki.apache.org/cassandra/FrontPage_draft_aaron/ . I also hope to use > this as a planning tool where we can mark off what's in progress or has been > completed. > > The guidelines I think we should follow are: > * ensure coverage of 1.0, a best effort for 0.8 and leave any content > from previous versions. > * where appropriate include examples from CQL and RPC as both are still > supported. > > If you would like to contribute to this effort please let me know via > the email list. It's a great way to contribute to the project and learn how > Cassandra works, and I'll do my best to help with any questions you may have. > Or if you have something you've already written that you feel may be of use > let me know, and we'll see about linking to it. > > Thanks. > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > >
Re: Existing column(s) not readable
Hi Aaron, normally we use hector to access cassandra, but for debugging I switched to cassandra-cli. Column can not be read by a simple get CFName['rowkey']['colname']; Response is "Value was not found" if i query another column, everything is just fine. Serverlog for unsuccessful read (keyspace and CF names replaced): DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,739 CassandraServer.java (line 280) get DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,744 StorageProxy.java (line 320) Command/ConsistencyLevel is SliceByNamesReadCommand(table='Keyspace', key=61636162626139322d396638312d343562382d396637352d393162303337383030393762, columnParent='QueryPath(columnFamilyName='ColumnFamily', superColumnName='null', columnName='null')', columns=[574c303030375030,])/ONE DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 ReadCallback.java (line 86) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1 DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 StorageProxy.java (line 343) reading data locally DEBUG [ReadStage:33] 2011-10-10 23:15:29,751 StorageProxy.java (line 448) LocalReadRunnable reading SliceByNamesReadCommand(table='Keyspace', key=61636162626139322d396638312d343562382d396637352d393162303337383030393762, columnParent='QueryPath(columnFamilyName='ColumnFamily', superColumnName='null', columnName='null')', columns=[574c303030375030,]) DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,818 StorageProxy.java (line 393) Read: 67 ms. Log looks fine to me, but no result is returned. Best, Thomas On 10/10/2011 10:00 PM, aaron morton wrote: > How are they unreadable ? You need to go into some details about what is > going wrong. > > What sort of read ? > What client ? > What is in the logging on client and server side ? > > > Try turning the logging up to DEBUG on the server to watch what happens. > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 10/10/2011, at 9:23 PM, Thomas Richter wrote: > >> Hi, >> >> no errors in the server logs. The columns are unreadable on all nodes at >> any consistency level (ONE, QUORUM, ALL). We started with 0.7.3 and >> upgraded to 0.7.6-2 two months ago. >> >> Best, >> >> Thomas >> >> On 10/10/2011 10:03 AM, aaron morton wrote: >>> What error are you seeing in the server logs ? Are the columns unreadable >>> at all Consistency Levels ? i.e. are the columns unreadable on all nodes. >>> >>> What is the upgrade history of the cluster ? What version did it start at ? >>> >>> Cheers >>> >>> >>> - >>> Aaron Morton >>> Freelance Cassandra Developer >>> @aaronmorton >>> http://www.thelastpickle.com >>> >>> On 10/10/2011, at 7:42 AM, Thomas Richter wrote: >>> Hi, here is some further information. Compaction did not help, but data is still there when I dump the row with sstable2json. Best, Thomas On 10/08/2011 11:30 PM, Thomas Richter wrote: > Hi, > > we are running a 3 node cassandra (0.7.6-2) cluster and some of our > column families contain quite large rows (400k+ columns, 4-6GB row size). > Replicaton factor is 3 for all keyspaces. The cluster is running fine > for several months now and we never experienced any serious trouble. > > Some days ago we noticed, that some previously written columns could not > be read. This does not always happen, and only some dozen columns out of > 400k are affected. > > After ruling out application logic as a cause I dumped the row in > question with sstable2json and the columns are there (and are not marked > for deletion). 
> > Next thing was setting up a fresh single node cluster and copying the > column family data to that node. Columns could not be read either. > Right now I'm running a nodetool compact for the cf to see if data could > be read afterwards. > > Is there any explanation for such behavior? Are there any suggestions > for further investigation? > > TIA, > > Thomas >>> >> >
Efficiency of hector's setRowCount
Hector's IndexedSlicesQuery has a setRowCount method that you can use to page through the results, as described in https://github.com/rantav/hector/wiki/User-Guide . rangeSlicesQuery.setRowCount(1001); . rangeSlicesQuery.setKeys(lastRow.getKey(), ""); Is it efficient? Specifically, suppose my query returns 100,000 results and I page through batches of 1000 at a time (making 100 executes of the query). Will it internally retrieve all the results each time (but pass only the desired set of 1000 or so to me)? Or will it optimize queries to avoid the duplication? I presume the latter. :) Can IndexedSlicesQuery's setStartKey method be used for the same effect? Thanks, Don
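For reference, the usual key-based paging pattern with RangeSlicesQuery looks roughly like the sketch below (written against the Hector 0.8-era API, so exact class and method names may vary; the column family name and page size are made up). Each page re-issues the query starting from the last key seen, so the first row of every page after the first duplicates the previous page's last row and has to be skipped.

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public class PagingSketch {
    static final int PAGE_SIZE = 1000;

    static void pageThrough(Keyspace keyspace) {
        StringSerializer ss = StringSerializer.get();
        String startKey = "";
        boolean firstPage = true;
        while (true) {
            RangeSlicesQuery<String, String, String> query =
                    HFactory.createRangeSlicesQuery(keyspace, ss, ss, ss);
            query.setColumnFamily("MyColumnFamily");   // placeholder CF name
            query.setRange("", "", false, 100);        // first 100 columns of each row
            query.setKeys(startKey, "");
            query.setRowCount(PAGE_SIZE + 1);          // +1 row to detect whether another page exists
            OrderedRows<String, String, String> rows = query.execute().get();

            int skip = firstPage ? 0 : 1;              // drop the duplicated boundary row
            for (Row<String, String, String> row : rows.getList().subList(skip, rows.getCount())) {
                // process(row);
            }
            if (rows.getCount() <= PAGE_SIZE) {
                break;                                 // last page reached
            }
            startKey = rows.peekLast().getKey();       // next page starts at the last key seen
            firstPage = false;
        }
    }
}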
Re: cassandra on laptop
By default, Cassandra is configured to use half the ram of your system. That's way overkill for playing around with it on a laptop. Edit /etc/cassandra/cassandra-env.sh and set max_heap_size_in_mb to something more suited for your environment. I have it set to 256M for my laptop (with 4G of ram). This works just fine for light development tasks and for running our test suite. -psanford On Mon, Oct 10, 2011 at 1:44 PM, Gary Jefferson wrote: > I'm running an underpowered laptop (ubuntu) for development work. Installing > Cassandra was easy, and getting the twissandra example app up and working was > also easy. > > Here's the problem: after about a day of letting it run (with no load > generated to webapp or db), my laptop now becomes unresponsive. If I'm > patient, I can shutdown the cassandra service and return things to normal. In > each of these cases, the cassandra process is eating up almost all memory, > and everything goes to swap. > > I can't develop against Cassandra in this environment. I know it isn't set up > by default to work efficiently on a meager laptop, but are there some common > setting somewhere that I can just tweak to make life not be so miserable? I > just want to play with it and try it out for this project I'm working on, but > that's impractical with default settings. I'm going to have to flee to > mongodb or something not as good... > > I'm also a little nervous about this running on a server now -- I've read > enough to understand that by default it's set up to eat lots of memory, and > I'm fine with that... but it just lends itself to all the java bigotry that > some of us accumulate over the years. > > Anyway, if someone can give me a pointer on how to set up to run on a laptop > in a development setting, big thanks. > > Thanks! > Gary >
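For reference, in the packaged conf/cassandra-env.sh the knobs are normally the MAX_HEAP_SIZE and HEAP_NEWSIZE variables (names and defaults can differ a little between versions and packages); a minimal laptop-sized setting might look like:

MAX_HEAP_SIZE="256M"
HEAP_NEWSIZE="64M"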
Re: anyway to throttle nodetool repair?
so how about disk io? is there anyway to use ionice to control it? I have tried to adjust the priority by "ionice -c3 -p [cassandra pid]. seems not working... On Wed, Sep 28, 2011 at 12:02 AM, Peter Schuller < peter.schul...@infidyne.com> wrote: > > I saw the ticket about compaction throttling, just wonder is that > necessary > > to add an option or is there anyway to do repair throttling? > > every time I run nodetool repair, it uses all disk io and the server load > > goes up quickly, just wonder is there anyway to make it smoother. > > The validating compaction that is part of repair is subject to > compaction throttling. > > The streaming of sstables afterwards is not however. In 1.0 there is > thottling of streaming: > https://issues.apache.org/jira/browse/CASSANDRA-3080 > > -- > / Peter Schuller (@scode on twitter) >
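For reference, the compaction throttling Peter mentions is exposed as a cassandra.yaml setting in 0.8 and later (the value below is an arbitrary example, not a recommendation; streaming only gets its own throttle in 1.0 via CASSANDRA-3080):

compaction_throughput_mb_per_sec: 8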
Re: anyway to throttle nodetool repair?
I am using commodity hardware, so even a minor compaction makes disk io go to 100% and the server load gets very high. On Tue, Oct 11, 2011 at 11:19 AM, Yan Chunlu wrote: > so how about disk io? is there anyway to use ionice to control it? > > I have tried to adjust the priority by "ionice -c3 -p [cassandra pid]. > seems not working... > > > On Wed, Sep 28, 2011 at 12:02 AM, Peter Schuller < > peter.schul...@infidyne.com> wrote: > >> > I saw the ticket about compaction throttling, just wonder is that >> necessary >> > to add an option or is there anyway to do repair throttling? >> > every time I run nodetool repair, it uses all disk io and the server >> load >> > goes up quickly, just wonder is there anyway to make it smoother. >> >> The validating compaction that is part of repair is subject to >> compaction throttling. >> >> The streaming of sstables afterwards is not however. In 1.0 there is >> thottling of streaming: >> https://issues.apache.org/jira/browse/CASSANDRA-3080 >> >> -- >> / Peter Schuller (@scode on twitter) >> > >
Multi DC setup
I am trying to understand a multi-DC setup for cassandra. As I understand it, in this setup replicas exist in the same cluster ring, but the nodes are physically distributed across DCs. Is this correct? I have two different cluster rings in two DCs, and want to replicate data bidirectionally. They both have the same keyspace. They take data traffic from different sources, but we want to make sure the data exists in both rings. What could be the way to achieve this? Thanks, L.
Re: Multi DC setup
Why have two rings? Cassandra manages the replication for you... one ring with physical nodes in two DCs might be a better option. Of course, depending on the inter-DC failure characteristics, you might need to endure split-brain for a while. /*** sent from my android...please pardon occasional typos as I respond @ the speed of thought / On Oct 10, 2011 10:09 PM, "Cassa L" wrote: I am trying to understand multi DC setup for cassandra. As I understand, in this setup, replicas exists in same cluster ring, but physically nodes are distributed across DCs. Is this correct? I have two different cluster rings in two DCs, and want to replicate data bidirectionally. They both have same keyspace. They take data traffic from different sources, but we want to make sure, data exists in both the rings. What could be the way to achieve this? Thanks, L.
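To make the single-ring option concrete, the keyspace would typically use NetworkTopologyStrategy with a replica count per DC, together with a snitch that knows which DC each node is in. A cassandra-cli sketch (DC names and counts are placeholders, and the exact strategy_options syntax varies a little between 0.7, 0.8 and 1.0):

create keyspace MyKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = [{DC1:2, DC2:2}];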
Re: Volunteers needed - Wiki
Hello aaron, I raise my hand too. If you have to-do list about the wiki, please let us know. maki 2011/10/10 aaron morton : > Hi there, > The dev's have been very busy and Cassandra 1.0 is just around the corner > and full of new features. To celebrate I'm trying to give the wiki some > loving to make things a little more welcoming for new users. > To keep things manageable I'd like to focus on completeness and correctness > for now, and worry about being super awesome later. For example the nodetool > page is incomplete http://wiki.apache.org/cassandra/NodeTool , we do not > have anything about CQL and config page is from > 0.7 http://wiki.apache.org/cassandra/StorageConfiguration > As a starting point I've created a draft home > page http://wiki.apache.org/cassandra/FrontPage_draft_aaron/ . I also hope > to use this as a planning tool where we can mark off what's in progress or > has been completed. > The guidelines I think we should follow are: > * ensure coverage of 1.0, a best effort for 0.8 and leave any content from > previous versions. > * where appropriate include examples from CQL and RPC as both are still > supported. > If you would like to contribute to this effort please let me know via the > email list. It's a great way to contribute to the project and learn how > Cassandra works, and I'll do my best to help with any questions you may > have. Or if you have something you've already written that you feel may be > of use let me know, and we'll see about linking to it. > Thanks. > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > -- w3m
Re: Multi DC setup
We already have two separate rings. The idea of bidirectional sync is that if one ring is down, we can still send the traffic to the other ring. When the original cluster comes back, it will pick up the data from the available cluster. I'm not sure if it makes sense to have separate rings or to combine these two rings into one. On Mon, Oct 10, 2011 at 10:17 PM, Milind Parikh wrote: > Why have two rings? Cassandra manages the replication for youone ring > with physical nodes in two dc might be a better option. Of course, depending > on the inter-dc failure characteristics, might need to endure split-brain > for a while. > > /*** > sent from my android...please pardon occasional typos as I respond @ the > speed of thought > / > > On Oct 10, 2011 10:09 PM, "Cassa L" wrote: > > I am trying to understand multi DC setup for cassandra. As I understand, in > this setup, replicas exists in same cluster ring, but physically nodes are > distributed across DCs. Is this correct? > I have two different cluster rings in two DCs, and want to replicate data > bidirectionally. They both have same keyspace. They take data traffic from > different sources, but we want to make sure, data exists in both the rings. > What could be the way to achieve this? > > Thanks, > L. > >
Re: Volunteers needed - Wiki
maybe that should be the first wiki update: the TODO list. On Tue, Oct 11, 2011 at 7:21 AM, Maki Watanabe wrote: > Hello aaron, > I raise my hand too. > If you have to-do list about the wiki, please let us know. > > maki >