Re: how to solve one node is in heavy load in unbalanced cluster
move first removes the node from the cluster, then adds it back
http://wiki.apache.org/cassandra/Operations#Moving_nodes

If you have 3 nodes and rf 3, removing the node will result in the error you are seeing. There is not enough nodes in the cluster to implement the replication factor.

You can drop the RF down to 2 temporarily and then put it back to 3 later, see http://wiki.apache.org/cassandra/Operations#Replication

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5 Aug 2011, at 03:39, Yan Chunlu wrote:

> hi, any help? thanks!
>
> On Thu, Aug 4, 2011 at 5:02 AM, Yan Chunlu wrote:
> forgot to mention I am using cassandra 0.7.4
>
> On Thu, Aug 4, 2011 at 5:00 PM, Yan Chunlu wrote:
> also nothing happens about the streaming:
>
> nodetool -h node3 netstats
> Mode: Normal
> Not sending any streams.
> Nothing streaming from /10.28.53.11
> Pool Name    Active   Pending   Completed
> Commands     n/a      0         165086750
> Responses    n/a      0         99372520
>
> On Thu, Aug 4, 2011 at 4:56 PM, Yan Chunlu wrote:
> sorry the ring info should be this:
>
> nodetool -h node3 ring
> Address  Status  State    Load      Owns    Token
>                                             84944475733633104818662955375549269696
> node1    Up      Normal   13.18 GB  81.09%  52773518586096316348543097376923124102
> node2    Up      Normal   22.85 GB  10.48%  70597222385644499881390884416714081360
> node3    Up      Leaving  25.44 GB  8.43%   84944475733633104818662955375549269696
>
> On Thu, Aug 4, 2011 at 4:55 PM, Yan Chunlu wrote:
> I have tried the nodetool move but get the following error
>
> node3:~# nodetool -h node3 move 0
> Exception in thread "main" java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (2)
>     at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
>     at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:930)
>     at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:896)
>     at org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1596)
>     at org.apache.cassandra.service.StorageService.move(StorageService.java:1734)
>     at org.apache.cassandra.service.StorageService.move(StorageService.java:1709)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>     at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
>     at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
>     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
>     at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
>     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
>     at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>     at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
>     at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTas
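For reference, the ring in this thread is unbalanced (81.09% / 10.48% / 8.43%) because the tokens are not evenly spaced. Below is a minimal sketch of how evenly spaced tokens are commonly computed, assuming RandomPartitioner (token range 0 to 2**127) and one token per node. The function name is made up for illustration and is not part of nodetool.

```python
def balanced_tokens(node_count):
    """Return evenly spaced tokens, assuming RandomPartitioner (0 .. 2**127)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Node i gets the token at i/node_count of the way around the ring.
    return [i * (2 ** 127) // node_count for i in range(node_count)]


# Candidate targets for "nodetool move" on a 3-node ring; which node takes
# which token is left to the operator.
for token in balanced_tokens(3):
    print(token)
```

The first token is 0, which matches the "move 0" attempted in the quoted message. The RF problem described above still has to be dealt with before the move can run.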
Re: Fewer wide rows vs. more smaller rows
Wider rows may need to run through the slower 2-phase compaction process, see in_memory_compaction_limit_in_mb in the yaml file. They can also result in more GC, depending on work load etc.

Some testing I did on query performance http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/

There is no magic number. The best advice is to follow Jonathan's advice.

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5 Aug 2011, at 08:22, Benoit Perroud wrote:

> Thanks for your advise. Make sense.
>
> And without sticking to my dummy example, conceptually, what has a smaller
> memory footprint : 1M rows of 1 column or 1 row with 1M columns ?
>
> And if the row key and column name are known, is there any performance
> difference between both scenarios ?
>
> Thanks
>
> Benoit.
>
> On 04. 08. 11 18:24, Jonathan Ellis wrote:
>> "keep data you retrieve at the same time, in the same row."
Re: CF index or Solr
It will depend on your query needs. If you only need pattern matching on strings and single (or a few) term queries, I would start with a custom index.

If you need more features, try https://github.com/tjake/Solandra

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5 Aug 2011, at 09:06, mcasandra wrote:

> We have requirements to be able to search on lot of column values or even
> names. In such scenario does it make sence to use Solr instead for indexed
> data and leave data unindexed in Cassandra for most part?
>
> Is someone using it this way?
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/CF-index-or-Solr-tp6654271p6654271.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> Nabble.com.
Re: How to solve this kind of schema disagreement...
did you check the logs in 1.27 for errors ? Could you be seeing this ? https://issues.apache.org/jira/browse/CASSANDRA-2867 Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 7 Aug 2011, at 16:24, Dikang Gu wrote: > I restart both nodes, and deleted the shcema* and migration* and restarted > them. > > The current cluster looks like this: > [default@unknown] describe cluster; > Cluster Information: >Snitch: org.apache.cassandra.locator.SimpleSnitch >Partitioner: org.apache.cassandra.dht.RandomPartitioner >Schema versions: > 75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, > 192.168.1.25] > 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27] > > the 1.28 looks good, and the 1.27 still can not get the schema agreement... > > I have tried several times, even delete all the data on 1.27, and rejoin it > as a new node, but it is still unhappy. > > And the ring looks like this: > > Address DC RackStatus State LoadOwns > Token > > 127605887595351923798765477786913079296 > 192.168.1.28datacenter1 rack1 Up Normal 8.38 GB 25.00% > 1 > 192.168.1.25datacenter1 rack1 Up Normal 8.55 GB 34.01% > 57856537434773737201679995572503935972 > 192.168.1.27datacenter1 rack1 Up Joining 1.81 GB 24.28% > 99165710459060760249270263771474737125 > 192.168.1.9 datacenter1 rack1 Up Normal 8.75 GB 16.72% > 127605887595351923798765477786913079296 > > The 1.27 seems can not join the cluster, and it just hangs there... > > Any suggestions? > > Thanks. > > > On Sun, Aug 7, 2011 at 10:01 AM, aaron morton wrote: > After there restart you what was in the logs for the 1.27 machine from the > Migration.java logger ? Some of the messages will start with "Applying > migration" > > You should have shut down both of the nodes, then deleted the schema* and > migration* system sstables, then restarted one of them and watched to see if > it got to schema agreement. 
> > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 6 Aug 2011, at 22:56, Dikang Gu wrote: > >> I have tried this, but the schema still does not agree in the cluster: >> >> [default@unknown] describe cluster; >> Cluster Information: >>Snitch: org.apache.cassandra.locator.SimpleSnitch >>Partitioner: org.apache.cassandra.dht.RandomPartitioner >>Schema versions: >> UNREACHABLE: [192.168.1.28] >> 75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25] >> 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27] >> >> Any other suggestions to solve this? >> >> Because I have some production data saved in the cassandra cluster, so I can >> not afford data lost... >> >> Thanks. >> >> On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud wrote: >> Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement, >> 75eece10-bf48-11e0--4d205df954a7 own the majority, so shutdown and >> remove the schema* and migration* sstables from both 192.168.1.28 and >> 192.168.1.27 >> >> >> 2011/8/5 Dikang Gu : >> > [default@unknown] describe cluster; >> > Cluster Information: >> >Snitch: org.apache.cassandra.locator.SimpleSnitch >> >Partitioner: org.apache.cassandra.dht.RandomPartitioner >> >Schema versions: >> > 743fe590-bf48-11e0--4d205df954a7: [192.168.1.28] >> > 75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25] >> > 06da9aa0-bda8-11e0--9510c23fceff: [192.168.1.27] >> > >> > three different schema versions in the cluster... >> > -- >> > Dikang Gu >> > 0086 - 18611140205 >> > >> >> >> >> -- >> Dikang Gu >> >> 0086 - 18611140205 >> > > > > > -- > Dikang Gu > > 0086 - 18611140205 >
Re: move one node for load re-balancing then it status stuck at "Leaving"
thanks for the help!

On Sun, Aug 7, 2011 at 2:10 PM, Dikang Gu wrote:

> Yes, I think you are right.
>
> The "nodetool move" will move the keys on the node to the other two nodes,
> and the required replication is 3, but you will only have 2 live nodes after
> the move, so you have the exception.
>
> On Sun, Aug 7, 2011 at 2:03 PM, Yan Chunlu wrote:
>
>> is that possible that the implements of cassandra only calculate live
>> nodes?
>>
>> for example:
>> "node move node3" cause node3 "Leaving", then cassandra iterate over the
>> endpoints and found node1 and node2. so the endpoints is 2, but RF=3,
>> Exception raised.
>>
>> is that true?
>>
>> On Fri, Aug 5, 2011 at 3:20 PM, Yan Chunlu wrote:
>>
>>> nothing...
>>>
>>> nodetool -h node3 netstats
>>> Mode: Normal
>>> Not sending any streams.
>>> Nothing streaming from /10.28.53.11
>>> Pool Name    Active   Pending   Completed
>>> Commands     n/a      0         186669475
>>> Responses    n/a      0         117986130
>>>
>>> nodetool -h node3 compactionstats
>>> compaction type: n/a
>>> column family: n/a
>>> bytes compacted: n/a
>>> bytes total in progress: n/a
>>> pending tasks: 0
>>>
>>> On Fri, Aug 5, 2011 at 1:47 PM, mcasandra wrote:
>>> > Check things like netstats, disk space etc to see why it's in Leaving
>>> > state. Anything in the logs that shows Leaving?
>>> >
>>> > --
>>> > View this message in context:
>>> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/move-one-node-for-load-re-balancing-then-it-status-stuck-at-Leaving-tp6655168p6655326.html
>>> > Sent from the cassandra-u...@incubator.apache.org mailing list archive
>>> > at Nabble.com.
>
> --
> Dikang Gu
>
> 0086 - 18611140205
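The explanation above can be illustrated with a toy model. This is a simplified sketch, not the actual SimpleStrategy code: it assumes that replicas are picked by walking the ring clockwise from a token, and that the leaving node is no longer counted as an endpoint. The function name and the use of ValueError are illustrative only; the tokens are copied from the ring output earlier in the thread.

```python
def natural_endpoints(ring, token, replication_factor):
    """Pick replica nodes by walking the ring clockwise from `token`.

    `ring` is a list of (token, node) pairs for the nodes that still count
    as endpoints. This is a toy model, not Cassandra's implementation.
    """
    if replication_factor > len(ring):
        raise ValueError(
            "replication factor (%d) exceeds number of endpoints (%d)"
            % (replication_factor, len(ring))
        )
    ordered = sorted(ring)
    # First node whose token is >= the search token, wrapping to the start.
    start = next((i for i, (t, _) in enumerate(ordered) if t >= token), 0)
    return [ordered[(start + k) % len(ordered)][1]
            for k in range(replication_factor)]


# node3 is omitted because it is in the Leaving state.
remaining = [
    (52773518586096316348543097376923124102, "node1"),
    (70597222385644499881390884416714081360, "node2"),
]

print(natural_endpoints(remaining, 0, 2))  # RF 2 fits on two endpoints
try:
    natural_endpoints(remaining, 0, 3)     # RF 3 does not
except ValueError as error:
    print(error)
```

With only two endpoints left, RF=3 cannot be satisfied, which is why temporarily lowering the RF to 2 was suggested.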
Re: how to solve one node is in heavy load in unbalanced cluster
thanks for the confirmation aaron! On Sun, Aug 7, 2011 at 4:01 PM, aaron morton wrote: > move first removes the node from the cluster, then adds it back > http://wiki.apache.org/cassandra/Operations#Moving_nodes > > If you have 3 nodes and rf 3, removing the node will result in the error > you are seeing. There is not enough nodes in the cluster to implement the > replication factor. > > You can drop the RF down to 2 temporarily and then put it back to 3 later, > see http://wiki.apache.org/cassandra/Operations#Replication > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 5 Aug 2011, at 03:39, Yan Chunlu wrote: > > hi, any help? thanks! > > On Thu, Aug 4, 2011 at 5:02 AM, Yan Chunlu wrote: > >> forgot to mention I am using cassandra 0.7.4 >> >> >> On Thu, Aug 4, 2011 at 5:00 PM, Yan Chunlu wrote: >> >>> also nothing happens about the streaming: >>> >>> nodetool -h node3 netstats >>> Mode: Normal >>> Not sending any streams. 
>>> Nothing streaming from /10.28.53.11 >>> Pool NameActive Pending Completed >>> Commandsn/a 0 165086750 >>> Responses n/a 0 99372520 >>> >>> >>> >>> On Thu, Aug 4, 2011 at 4:56 PM, Yan Chunlu wrote: >>> sorry the ring info should be this: nodetool -h node3 ring Address Status State LoadOwnsToken 84944475733633104818662955375549269696 node1 Up Normal 13.18 GB81.09% 52773518586096316348543097376923124102 node2 Up Normal 22.85 GB10.48% 70597222385644499881390884416714081360 node3 Up Leaving 25.44 GB8.43% 84944475733633104818662955375549269696 On Thu, Aug 4, 2011 at 4:55 PM, Yan Chunlu wrote: > I have tried the nodetool move but get the following error > > node3:~# nodetool -h node3 move 0 > Exception in thread "main" java.lang.IllegalStateException: replication > factor (3) exceeds number of endpoints (2) > at > org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60) > at > org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:930) > at > org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:896) > at > org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1596) > at > org.apache.cassandra.service.StorageService.move(StorageService.java:1734) > at > org.apache.cassandra.service.StorageService.move(StorageService.java:1709) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) > at 
com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) > at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) > at > javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) > at > javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) > at > javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) > at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) > at sun.rmi.transport.Transport$1.run(Transport.java:159) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Transport.java:155) > at > sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) > at > sun.rmi.transport.tcp.TCPTrans
Re: Planet Cassandra (an aggregation site for Cassandra News)
Great! If possible, please blog about full-text-search options + how to use them (Solandra, Elastic Search, Sphinx etc). Thanks! On Sun, Aug 7, 2011 at 5:58 AM, Edward Capriolo wrote: > > > On Thu, Aug 4, 2011 at 5:12 AM, Boris Yen wrote: > >> Looking forward to it. ^^ >> >> On Thu, Aug 4, 2011 at 1:56 PM, Eldad Yamin wrote: >> >>> Great! I hope it will be open soon! >>> >>> >>> On Wed, Aug 3, 2011 at 10:33 PM, Ed Anuff wrote: >>> Awesome, great news! On Wed, Aug 3, 2011 at 11:53 AM, Lynn Bender wrote: > Greetings all, > > I just wanted to send a note out to let everyone know about Planet > Cassandra -- an aggregation site for Cassandra news and blogs. Andrew > Llavore from DataStax and I built the site. > > We are currently waiting for approval from the Apache Software > Foundation before we publicly launch. However, in the meantime, we'd love > to > hear from you. If you have any favorite Cassandra-related blogs, or blogs > that frequently contain quality Cassandra content, please send us the URL, > so that we can contact the author about including a site feed. > > If you have any questions or comments, please send them to > pla...@geekaustin.org. > > -Lynn Bender > > -- > -Lynn Bender > http://geekaustin.org > http://linuxagainstpoverty.org > http://twitter.com/linearb > http://twitter.com/geekaustin > > > > >>> >> > I have started a blog to support the High Performance Cassandra Cookbook: > > http://www.jointhegrid.com/highperfcassandra/ > > I am going to use blog to continue writing about features and tips for > Cassandra in the writing style used for the book. > > Lynn, please consider it for syndication. All others, please enjoy. > >
Re: Fewer wide rows vs. more smaller rows
Great ! Thanks for the link. On 07. 08. 11 10:10, aaron morton wrote: Wider rows may need to run through the slower 2-phase compaction process, see in_memory_compaction_limit_in_mb in the yaml file. They can also result in more GC, depending on work load etc. Some testing I did on query performance http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ There is no magic number. The best advice is to follow Jonathan's advice. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 5 Aug 2011, at 08:22, Benoit Perroud wrote: Thanks for your advise. Make sense. And without sticking to my dummy example, conceptually, what has a smaller memory footprint : 1M rows of 1 column or 1 row with 1M columns ? And if the row key and column name are known, is there any performance difference between both scenarios ? Thanks Benoit. On 04. 08. 11 18:24, Jonathan Ellis wrote: "keep data you retrieve at the same time, in the same row."
High perf. Book
Read On Aug 6, 2011 7:58 PM, "Edward Capriolo" wrote: > On Thu, Aug 4, 2011 at 5:12 AM, Boris Yen wrote: > >> Looking forward to it. ^^ >> >> On Thu, Aug 4, 2011 at 1:56 PM, Eldad Yamin wrote: >> >>> Great! I hope it will be open soon! >>> >>> >>> On Wed, Aug 3, 2011 at 10:33 PM, Ed Anuff wrote: >>> Awesome, great news! On Wed, Aug 3, 2011 at 11:53 AM, Lynn Bender wrote: > Greetings all, > > I just wanted to send a note out to let everyone know about Planet > Cassandra -- an aggregation site for Cassandra news and blogs. Andrew > Llavore from DataStax and I built the site. > > We are currently waiting for approval from the Apache Software > Foundation before we publicly launch. However, in the meantime, we'd love to > hear from you. If you have any favorite Cassandra-related blogs, or blogs > that frequently contain quality Cassandra content, please send us the URL, > so that we can contact the author about including a site feed. > > If you have any questions or comments, please send them to > pla...@geekaustin.org. > > -Lynn Bender > > -- > -Lynn Bender > http://geekaustin.org > http://linuxagainstpoverty.org > http://twitter.com/linearb > http://twitter.com/geekaustin > > > > >>> >> > I have started a blog to support the High Performance Cassandra Cookbook: > > http://www.jointhegrid.com/highperfcassandra/ > > I am going to use blog to continue writing about features and tips for > Cassandra in the writing style used for the book. > > Lynn, please consider it for syndication. All others, please enjoy.
Re: Dropped messages
Thanks Aaron. The first paragraph is very clear however the 2nd paragraph leaves me wondering regarding counter columns in my setup.

I am writing at CL.ALL and reading at CL.ONE so if I get dropped messages, it will show up as Timeouts on the client side so possibly the mutation was not run on all nodes. Then when I come in reading at CL.ONE on any of the nodes (before the daily nodetool repair), I will get out of date data if the read_repair chance is < 1 right ? That's what I made out of the thread from my previous question.

So basically, I am wondering if it makes any sense at all to have a CL.ALL/CL.ONE setup in the hope of maximizing read performance since
i) I have to have the read repair chance percentage at 100% if I want consistency and I still take the hypothetical performance hit on write (of having to wait for every replica to respond)
ii) if I set the read repair chance to <100% then I may get inconsistent data

2011/8/7 aaron morton

> Just added this to the wiki
>
> http://wiki.apache.org/cassandra/FAQ#dropped_messages
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6 Aug 2011, at 10:53, Philippe wrote:
>
> Hi,
> I see lines like this in my log file
> INFO [ScheduledTasks:1] 2011-08-06 00:51:57,650 MessagingService.java (line 586) 358 MUTATION messages dropped in server lifetime
> INFO [ScheduledTasks:1] 2011-08-06 00:51:57,658 MessagingService.java (line 586) 297 READ messages dropped in server lifetime
> INFO [ScheduledTasks:1] 2011-08-06 00:51:57,658 MessagingService.java (line 586) 4696 RANGE_SLICE messages dropped in server lifetime
>
> How worried should I be ?
> Are they reported as timeout exceptions on the client ?
>
> Thanks
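The trade-off described above follows from the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the write when (replicas written) + (replicas read) > RF. Here is a minimal sketch of that arithmetic. The helper names are made up for illustration. It only models writes that succeeded at the requested level, and it says nothing specific about counters.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unknown consistency level: %s" % level)


def read_overlaps_write(write_level, read_level, replication_factor):
    """True when every read must hit at least one replica the write reached."""
    written = replicas_required(write_level, replication_factor)
    read = replicas_required(read_level, replication_factor)
    return written + read > replication_factor


for write_level, read_level in [("ALL", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ONE")]:
    print(write_level, read_level, read_overlaps_write(write_level, read_level, 3))
```

A write at ALL that times out may have reached fewer than RF replicas, so the guarantee no longer holds for it. That is the case being asked about here.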
batch mutates & throughput
A question regarding batch mutates and how others might be throttling the system to prevent timeouts.

My 3-node, RF=3 cluster has been performing ok while bulk loading data (applying counter updates). I've been able to run 16 threads in parallel that each perform about 400 mutates/s on a loaded cluster.

Then I thought, hey, let's get rid of the network round trip and batch this thing... So I converted my code to use a mutator and addCounter instead of insertCounter (on Hector). However, when I do, the results are always bad.

When I execute()
- every 5000 lines, I get wonderful performance but I constantly get Timeouts
- every 500, same thing
- every 10, the timeouts take longer to appear but they're still there
- every 1, it works just like before batching

And this happens even with a single thread running

So my question is not about the absolute performance of my cluster but about how I'm supposed to use batch updates : it doesn't look like the execute() call blocks until it's performed the mutation and tpstats has showed up to 200.000 mutations pending.

Any ideas ?
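To make the "execute() every N lines" pattern above concrete, here is a small language-neutral sketch in Python rather than Hector's Java API. `send_batch` is a hypothetical callable standing in for "fill a mutator and call execute()". Nothing here is taken from Hector itself.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_batches(mutations, batch_size, send_batch):
    """Send `mutations` in fixed-size batches, returning how many were sent.

    `send_batch` is a hypothetical stand-in for one batched execute() call.
    """
    sent = 0
    for batch in chunked(mutations, batch_size):
        send_batch(batch)
        sent += len(batch)
    return sent


# Example with a stand-in sender that only records the batch sizes.
sizes = []
total = send_in_batches(list(range(25)), 10, lambda batch: sizes.append(len(batch)))
print(total, sizes)
```

One caveat: counter increments are not idempotent, so blindly retrying a batch after a timeout can double count. That is why the sketch leaves retries out.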
Re: How to release a customised Cassandra from Eclipse?
Thanks guys. The problem is solved. I copied cassandra and cassandra.in to my bin folder. Then used "ant release " to generate my customized cassandra.jar in dist folder. it worked. To Aaron: I tried "ant artefacts", but it failed. is it because I am using Cassandra 0.7? What's the difference between "ant artefacts" and "ant release"? 2011/8/6 aaron morton > Have a look at this file in the source repo > https://github.com/apache/cassandra/blob/trunk/bin/cassandra > > try using "ant artefacts" and look in the build/dist dir. > > cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 7 Aug 2011, at 03:58, Alvin UW wrote: > > > Thanks. > > I am a beginner. > I checked bin folder under myCassandra. There are only some classes without > executable file. > after "ant release", I got the jar file from build folder. > > > > > 2011/8/6 Jonathan Ellis > >> look at bin/cassandra, you can't just run it with "java -jar" >> >> On Sat, Aug 6, 2011 at 10:43 AM, Alvin UW wrote: >> > Hello, >> > >> > I set up a Cassandra project in Eclipse following >> > http://wiki.apache.org/cassandra/RunningCassandraInEclipse >> > Then, I made a few modifications on it to form a customised Cassandra. >> > But I don't know how can I release this new Cassandra from Eclipse as a >> jar >> > file to use in EC2. >> > >> > I tried "ant release" command in command line. It can successful build >> .jar >> > file. >> > Then I typed java -jar apache-cassandra-0.7.0-beta1-SNAPSHOT.jar >> > >> > "Error: Failed to load Main-Class manifest attribute from " >> > >> > I edited a MANIFEST.MF like: >> > Manifest-Version: 1.0 >> > Ant-Version: Apache Ant 1.7.1 >> > Created-By: 16.3-b01 (Sun Microsystems Inc.) >> > Implementation-Title: Cassandra >> > Implementation-Version: 0.7.0-beta1-SNAPSHOT >> > Implementation-Vendor: Apache >> > Main-Class: org.apache.cassandra.thrift.CassandraDaemon >> > >> > and tried again. 
the error is like below: >> > >> > Exception in thread "main" java.lang.NoClassDefFoundError: >> > org/apache/thrift/transport/TTransportException >> > Caused by: java.lang.ClassNotFoundException: >> > org.apache.thrift.transport.TTransportException >> > at java.net.URLClassLoader$1.run(URLClassLoader.java:217) >> > at java.security.AccessController.doPrivileged(Native Method) >> > at java.net.URLClassLoader.findClass(URLClassLoader.java:205) >> > at java.lang.ClassLoader.loadClass(ClassLoader.java:319) >> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) >> > at java.lang.ClassLoader.loadClass(ClassLoader.java:264) >> > at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:332) >> > Could not find the main class: >> org.apache.cassandra.thrift.CassandraDaemon. >> > Program will exit. >> > >> > So what's the problem? >> > >> > >> > Thanks. >> > Alvin >> > >> > >> > >> > >> > >> > >> >> >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cassandra >> co-founder of DataStax, the source for professional Cassandra support >> http://www.datastax.com >> > > >
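A note on why "java -jar" failed in the quoted messages: with -jar, the JVM takes its classpath from the jar's manifest, so the Thrift jar and the other dependencies are never found. The start script builds a classpath instead. Below is a rough sketch of that idea, assuming a distribution-style layout with a conf directory and a lib directory holding the Cassandra jar and its dependencies. The function name and the example path are hypothetical, and the real bin/cassandra script sets more than this (JVM options and so on).

```python
import glob
import os


def cassandra_command(home, main_class="org.apache.cassandra.thrift.CassandraDaemon"):
    """Build a java command line with conf/ and every jar under lib/ on the classpath.

    The conf/ and lib/ layout under `home` is an assumption for this sketch.
    """
    jars = sorted(glob.glob(os.path.join(home, "lib", "*.jar")))
    classpath = os.pathsep.join([os.path.join(home, "conf")] + jars)
    return ["java", "-cp", classpath, main_class]


# Hypothetical install location, used only to show the shape of the command.
print(" ".join(cassandra_command("/opt/my-cassandra")))
```

The main class name is the one from the MANIFEST.MF quoted above.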
Replicate on write stage errors
hello, I've got new errors showing up in my cassandra log file since I starting testing batch mutates (and it failed). I have done a rolling restart and they are not disappearing. How can I fix this ? What is this really saying about my data and my cluster ? Thanks ERROR [ReplicateOnWriteStage:35] 2011-08-07 23:22:39,147 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[ReplicateOnWriteStage:35,5,main] java.lang.RuntimeException: java.lang.IllegalArgumentException: ColumnFamily ColumnFamily(PUBLIC_MONTHLY_11 [SuperColumn(310260 [000200:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, 6}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, 6}]@1312752159034!-9223372036854775808,000201:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, -378}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, -378}]@1312752159034!-9223372036854775808,000300:false:[{223a4de0-b5fb-11e0--826f85850cbd, 17, 12124}*,{224ceb80-b5fb-11e0--848783ceb9bf, 28, 14097},{e0db20e0-b5ff-11e0--01687c4831ff, 15, 3737}]@1312752159034!-9223372036854775808,000301:false:[{223a4de0-b5fb-11e0--826f85850cbd, 17, -917234}*,{224ceb80-b5fb-11e0--848783ceb9bf, 28, -1148333},{e0db20e0-b5ff-11e0--01687c4831ff, 15, -287859}]@1312752159034!-9223372036854775808,]),]) already has modifications in this mutation: ColumnFamily(PUBLIC_MONTHLY_11 [SuperColumn(31026 [000200:false:[{223a4de0-b5fb-11e0--826f85850cbd, 13, 387}*,{224ceb80-b5fb-11e0--848783ceb9bf, 8, 1991},{e0db20e0-b5ff-11e0--01687c4831ff, 6, 237}]@1312752159034!-9223372036854775808,000201:false:[{223a4de0-b5fb-11e0--826f85850cbd, 13, -31485}*,{224ceb80-b5fb-11e0--848783ceb9bf, 8, -138769},{e0db20e0-b5ff-11e0--01687c4831ff, 6, -18209}]@1312752159034!-9223372036854775808,000300:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, 8}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, 8}]@1312752159034!-9223372036854775808,000301:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, -708}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, -708}]@1312752159034!-9223372036854775808,]),]) at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.IllegalArgumentException: ColumnFamily ColumnFamily(PUBLIC_MONTHLY_11 [SuperColumn(310260 [000200:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, 6}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, 6}]@1312752159034!-9223372036854775808,000201:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, -378}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, -378}]@1312752159034!-9223372036854775808,000300:false:[{223a4de0-b5fb-11e0--826f85850cbd, 17, 12124}*,{224ceb80-b5fb-11e0--848783ceb9bf, 28, 14097},{e0db20e0-b5ff-11e0--01687c4831ff, 15, 3737}]@1312752159034!-9223372036854775808,000301:false:[{223a4de0-b5fb-11e0--826f85850cbd, 17, -917234}*,{224ceb80-b5fb-11e0--848783ceb9bf, 28, -1148333},{e0db20e0-b5ff-11e0--01687c4831ff, 15, -287859}]@1312752159034!-9223372036854775808,]),]) already has modifications in this mutation: ColumnFamily(PUBLIC_MONTHLY_11 [SuperColumn(31026 [000200:false:[{223a4de0-b5fb-11e0--826f85850cbd, 13, 387}*,{224ceb80-b5fb-11e0--848783ceb9bf, 8, 1991},{e0db20e0-b5ff-11e0--01687c4831ff, 6, 237}]@1312752159034!-9223372036854775808,000201:false:[{223a4de0-b5fb-11e0--826f85850cbd, 13, -31485}*,{224ceb80-b5fb-11e0--848783ceb9bf, 8, -138769},{e0db20e0-b5ff-11e0--01687c4831ff, 6, -18209}]@1312752159034!-9223372036854775808,000300:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, 8}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, 8}]@1312752159034!-9223372036854775808,000301:false:[{223a4de0-b5fb-11e0--826f85850cbd, 2, -708}*,{e0db20e0-b5ff-11e0--01687c4831ff, 2, -708}]@1312752159034!-9223372036854775808,]),]) at org.apache.cassandra.db.RowMutation.add(RowMutation.java:123) at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:120) at 
org.apache.cassandra.service.StorageProxy$5$1.runMayThrow(StorageProxy.java:455) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 3 more
Setup Cassandra0.8 in Eclipse
Hello,

I am trying to Setup Cassandra0.8 in Eclipse following http://wiki.apache.org/cassandra/RunningCassandraInEclipse

After right clicking on the build.xml -> "Run As" -> "Ant Build". Error appeared as follows:

Buildfile: /workspace/Cassandra0.8/build.xml

maven-ant-tasks-localrepo:

maven-ant-tasks-download:
     [echo] Downloading Maven ANT Tasks...
    [mkdir] Created dir: /workspace/Cassandra0.8/build
      [get] Getting: http://repo2.maven.org/maven2/org/apache/maven/maven-ant-tasks/2.1.3/maven-ant-tasks-2.1.3.jar
      [get] To: /workspace/Cassandra0.8/build/maven-ant-tasks-2.1.3.jar

maven-ant-tasks-init:
    [mkdir] Created dir: /workspace/Cassandra0.8/build/lib

scm-svn-info:

BUILD FAILED
/workspace/Cassandra0.83/build.xml:235: Execute failed: java.io.IOException: Cannot run program "svn": java.io.IOException: error=2, No such file or directory

Total time: 992 milliseconds

It seems svn wasn't installed, but i did install it.

Thanks.
Alvin
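The "error=2, No such file or directory" part means the process running Ant could not find an executable named svn on its PATH. One possible cause, offered only as an assumption, is that Eclipse was started with a different PATH than the shell where svn works. Here is a small sketch for checking what a given environment can see, using only the Python standard library. The helper name is made up.

```python
import os
import shutil


def find_tool(name, path=None):
    """Return the full path of `name` if it is on the search path, else None."""
    return shutil.which(name, path=path)


location = find_tool("svn")
if location is None:
    # Show the PATH that was searched so it can be compared with the shell's.
    print("svn not found on PATH:", os.environ.get("PATH", ""))
else:
    print("svn found at", location)
```

Running this from the same environment that launches Eclipse, and again from a terminal, would show whether the two see different search paths.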
Re: batch mutates & throughput
Quick followup.

I have pushed the RPC timeout to 30s. Using Hector, I'm doing 1 thread doing batches of 10 mutates at a time so that's even slower than when I was doing 16 threads in parallel doing non-batched mutations.

After a couple hundred execute() calls, I get a timeout for every node; I have a 15 second grace period between retries. tpstats indicate no pendings on any of the nodes. I never recover from that.

I then set the batch size to one and it seems to work a lot better. The only difference I note is that the Mutator.execute() method returns a result that sometimes has a null host and 0 microsecond time in the batch sizes of ten but never in batch sizes of 1.

I'm stumped ! Any ideas ?

Thanks
phpcassa occasional Not Found Exception
Cassandra: 0.7.8
phpcassa: 0.7.a.4

The front page of a site I manage displays a list of products available. If a product is available, a visitor can order, otherwise "not in stock" is displayed. Whether the product is in stock or not is determined from the cassandra cluster.

Occasionally, but not frequently (maybe once or twice per day out of a hundred), the code throws a Not Found Exception. The read consistency level is explicitly set to quorum.

Here is an approximation of what the exception looks like:

exception 'cassandra_NotFoundException' in /home/www/phpcassa-0.7.a.4/columnfamily.php:251
Stack trace:
#0 /home/www/foo.php(54): ColumnFamily->get('stock', Array)
#1 {main}

Repairs and compactions are performed regularly. I also did a complete json rebuild of the cluster last night, pulling data into a single file, and reloading the cluster from scratch. The NFE error persisted even after doing that.

What could be the cause of this, and how can I prevent it from occurring?

Thanks for the help
Re: batch mutates & throughput
Maybe you could try adjusting Hector's "cassandraThriftSocketTimeout" setting. https://github.com/rantav/hector/wiki/User-Guide

On Mon, Aug 8, 2011 at 6:54 AM, Philippe wrote:
> Quick followup.
> I have pushed the RPC timeout to 30s. Using Hector, I'm doing 1 thread
> doing batches of 10 mutates at a time so that's even slower than when I was
> doing 16 threads in parallel doing non-batched mutations.
> After a couple hundred execute() calls, I get a timeout for every node; I
> have a 15 second grace period between retries. tpstats indicate no pendings
> on any of the nodes. I never recover from that
>
> I then set the batch size to one and it seems to work a lot better. The
> only difference I note is that the Mutator.execute() method returns a result
> than sometimes has a null host and 0 microsecond time in the batch sizes of
> ten but never in batch sizes of 1.
>
>
> I'm stumped ! Any ideas ?
>
> Thanks
>
Re: How to solve this kind of schema disagreement...
Hi Aaron,

I repeated the whole procedure:

1. kill the cassandra instance on 1.27.
2. rm the data/system/Migrations-g-*
3. rm the data/system/Schema-g-*
4. bin/cassandra to start the cassandra.

Now the migration seems to have stopped, and I have not found any errors in the system.log yet.

The ring looks good:

[root@yun-phy2 apache-cassandra-0.8.1]# bin/nodetool -h192.168.1.27 -p8090 ring
Address       DC          Rack   Status State   Load     Owns    Token
                                                                 127605887595351923798765477786913079296
192.168.1.28  datacenter1 rack1  Up     Normal  8.38 GB  25.00%  1
192.168.1.25  datacenter1 rack1  Up     Normal  8.54 GB  34.01%  57856537434773737201679995572503935972
192.168.1.27  datacenter1 rack1  Up     Normal  1.78 GB  24.28%  99165710459060760249270263771474737125
192.168.1.9   datacenter1 rack1  Up     Normal  8.75 GB  16.72%  127605887595351923798765477786913079296

But the schema is still not correct:

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
      75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, 192.168.1.25]
      5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27]

The version 5a54ebd0-bd90-11e0--9510c23fceff is the same as last time...

And in the log, the last Migration.java entry is:

INFO [MigrationStage:1] 2011-08-08 11:41:30,293 Migration.java (line 116) Applying migration 5a54ebd0-bd90-11e0--9510c23fceff Add keyspace: SimpleDB_4E38DAA64894A9146105rep strategy:SimpleStrategy{}durable_writes: true

Could you explain this?

If I change the token assigned to 1.27 to another one, will it help?

Thanks.

--
Dikang Gu
0086 - 18611140205

On Sunday, August 7, 2011 at 4:14 PM, aaron morton wrote:
> did you check the logs in 1.27 for errors ?
>
> Could you be seeing this ?
> https://issues.apache.org/jira/browse/CASSANDRA-2867 > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > > > > > On 7 Aug 2011, at 16:24, Dikang Gu wrote: > > I restart both nodes, and deleted the shcema* and migration* and restarted > > them. > > > > The current cluster looks like this: > > [default@unknown] describe cluster; > > Cluster Information: > > Snitch: org.apache.cassandra.locator.SimpleSnitch > > Partitioner: org.apache.cassandra.dht.RandomPartitioner > > Schema versions: > > 75eece10-bf48-11e0--4d205df954a7: [192.168.1.28, 192.168.1.9, > > 192.168.1.25] > > 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27] > > > > > > the 1.28 looks good, and the 1.27 still can not get the schema agreement... > > > > I have tried several times, even delete all the data on 1.27, and rejoin it > > as a new node, but it is still unhappy. > > > > And the ring looks like this: > > > > Address DC Rack Status State Load Owns Token > > 127605887595351923798765477786913079296 > > 192.168.1.28 datacenter1 rack1 Up Normal 8.38 GB 25.00% 1 > > 192.168.1.25 datacenter1 rack1 Up Normal 8.55 GB 34.01% > > 57856537434773737201679995572503935972 > > 192.168.1.27 datacenter1 rack1 Up Joining 1.81 GB 24.28% > > 99165710459060760249270263771474737125 > > 192.168.1.9 datacenter1 rack1 Up Normal 8.75 GB 16.72% > > 127605887595351923798765477786913079296 > > > > > > The 1.27 seems can not join the cluster, and it just hangs there... > > > > Any suggestions? > > > > Thanks. > > > > > > On Sun, Aug 7, 2011 at 10:01 AM, aaron morton > > wrote: > > > After there restart you what was in the logs for the 1.27 machine from > > > the Migration.java logger ? Some of the messages will start with > > > "Applying migration" > > > > > > You should have shut down both of the nodes, then deleted the schema* and > > > migration* system sstables, then restarted one of them and watched to see > > > if it got to schema agreement. 
> > > > > > Cheers > > > > > > - > > > Aaron Morton > > > Freelance Cassandra Developer > > > @aaronmorton > > > http://www.thelastpickle.com > > > > > > > > > > > > > > > > > > On 6 Aug 2011, at 22:56, Dikang Gu wrote: > > > > I have tried this, but the schema still does not agree in the cluster: > > > > > > > > [default@unknown] describe cluster; > > > > Cluster Information: > > > > Snitch: org.apache.cassandra.locator.SimpleSnitch > > > > Partitioner: org.apache.cassandra.dht.RandomPartitioner > > > > Schema versions: > > > > UNREACHABLE: [192.168.1.28] > > > > 75eece10-bf48-11e0--4d205df954a7: [192.168.1.9, 192.168.1.25] > > > > 5a54ebd0-bd90-11e0--9510c23fceff: [192.168.1.27] > > > > > > > > Any other suggestions to solve this? > > > > > > > > Because I have some production data saved in the cassandra cluster, so > > > > I can not afford data lost... > > > > > > > > Thanks. > > > > On Fri, Aug 5, 2011 at 8:55 PM, Benoit Perroud > > > > wrote: > > > > > Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement, > > > > > 75eece10-bf48-11e0--4d205df954a7 own the majority, so shutdown
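The advice quoted in this thread (find the schema version held by the majority, then shut down the nodes on any other version, delete their schema* and migration* system sstables, and restart them) starts with working out which nodes are in the minority. The following is a minimal sketch of that first step only, assuming the "Schema versions" section of `describe cluster` has already been parsed into a dict; the function name and the dict shape are hypothetical, not part of any Cassandra tool.

```python
from typing import Dict, List, Tuple


def split_by_schema_majority(versions: Dict[str, List[str]]) -> Tuple[str, List[str]]:
    """Return (majority_version, nodes_on_any_other_version).

    `versions` maps a schema version id to the hosts reporting it, as listed
    under "Schema versions" in `describe cluster`. Hosts listed under
    UNREACHABLE are ignored because their version is unknown.
    """
    reachable = {v: hosts for v, hosts in versions.items() if v != "UNREACHABLE"}
    if not reachable:
        raise ValueError("no reachable schema versions")
    # On a tie, max() keeps the first version seen; a tie needs a human decision.
    majority = max(reachable, key=lambda v: len(reachable[v]))
    minority_nodes = sorted(
        host
        for v, hosts in reachable.items()
        if v != majority
        for host in hosts
    )
    return majority, minority_nodes


# Values copied from the describe cluster output in this thread.
cluster = {
    "75eece10-bf48-11e0--4d205df954a7": ["192.168.1.28", "192.168.1.9", "192.168.1.25"],
    "5a54ebd0-bd90-11e0--9510c23fceff": ["192.168.1.27"],
}
majority_version, nodes_to_reset = split_by_schema_majority(cluster)
print(majority_version, nodes_to_reset)
```

With the output posted in this thread, the only node in the minority is 192.168.1.27, which matches the node that was being reset.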
Re: batch mutates & throughput
Hi Boris,

Thanks for the suggestion, I didn't know there was one.

I believe I have finally figured it out, and it turns out my last two questions are related.
First, my batch loading was ignoring a bunch of rows when reading the first file (so it took hundreds of potential mutations for the problem to show up). Second, the ReplicateOnWriteStage error was generated by the batch mutations themselves and explains the TimedOutException: I was doing multiple mutations on the same key in one batch.

2011/8/8 Boris Yen

> Maybe you could try to adjust the setting "cassandraThriftSocketTimeout"
> of hector. https://github.com/rantav/hector/wiki/User-Guide
>
>
> On Mon, Aug 8, 2011 at 6:54 AM, Philippe wrote:
>
>> Quick followup.
>> I have pushed the RPC timeout to 30s. Using Hector, I'm doing 1 thread
>> doing batches of 10 mutates at a time so that's even slower than when I was
>> doing 16 threads in parallel doing non-batched mutations.
>> After a couple hundred execute() calls, I get a timeout for every node; I
>> have a 15 second grace period between retries. tpstats indicate no pendings
>> on any of the nodes. I never recover from that
>>
>> I then set the batch size to one and it seems to work a lot better. The
>> only difference I note is that the Mutator.execute() method returns a result
>> than sometimes has a null host and 0 microsecond time in the batch sizes of
>> ten but never in batch sizes of 1.
>>
>>
>> I'm stumped ! Any ideas ?
>>
>> Thanks
>>
>
>
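One way to avoid sending several mutations for the same key in one batch is to split the pending mutations so that each row key appears at most once per batch. The following is a minimal client-side sketch, assuming a mutation can be represented as a (row_key, payload) pair; it does not use Hector's API, and the function name is hypothetical.

```python
from typing import Hashable, Iterable, List, Set, Tuple

# (row_key, payload); the payload is opaque to this sketch.
Mutation = Tuple[Hashable, object]


def batches_with_unique_keys(
    mutations: Iterable[Mutation], batch_size: int
) -> List[List[Mutation]]:
    """Split mutations into batches of at most batch_size entries in which
    every row key appears at most once."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches: List[List[Mutation]] = []
    keys_in_batch: List[Set[Hashable]] = []
    for row_key, payload in mutations:
        # Put the mutation in the first batch that has room and does not
        # already hold this row key.
        for batch, keys in zip(batches, keys_in_batch):
            if len(batch) < batch_size and row_key not in keys:
                batch.append((row_key, payload))
                keys.add(row_key)
                break
        else:
            # No existing batch qualifies, so open a new one.
            batches.append([(row_key, payload)])
            keys_in_batch.append({row_key})
    return batches


pending = [("a", 1), ("a", 2), ("b", 3), ("a", 4), ("c", 5)]
for batch in batches_with_unique_keys(pending, batch_size=2):
    print(batch)
```

Mutations for the same key end up in successively later batches, so if the batches are executed in order, the per-key ordering of the input is kept. The inner scan over existing batches makes this quadratic in the worst case, which is acceptable for a sketch but not tuned for large loads.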
Re: Setup Cassandra0.8 in Eclipse
Make sure svn is on the PATH. If you open a terminal (or cmd), running the svn command should work.

On 07. 08. 11 23:39, Alvin UW wrote:
> It seems svn wasn't installed, but i did install it.
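A quick way to check the PATH the way a launching process sees it is sketched below, using only the Python standard library. The helper name is made up for illustration. Note that an IDE started from a desktop launcher may inherit a different PATH than a terminal, so a check that passes in a shell does not guarantee the Ant build inside Eclipse sees the same value.

```python
import os
import shutil
from typing import Optional


def find_on_path(program: str, path: Optional[str] = None) -> Optional[str]:
    """Return the full path of `program` if it is found on PATH, else None.

    `path` overrides the PATH environment variable when given.
    """
    return shutil.which(program, path=path)


if __name__ == "__main__":
    location = find_on_path("svn")
    if location is None:
        # Print each PATH entry so a missing directory is easy to spot.
        print("svn not found; PATH entries are:")
        for entry in os.environ.get("PATH", "").split(os.pathsep):
            print("  " + entry)
    else:
        print("svn found at " + location)
```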
Re: How to release a customised Cassandra from Eclipse?
The target is "ant artifacts" (not "artefacts").

Bye
Norman

2011/8/7, Alvin UW : > Thanks guys. > > The problem is solved. I copied cassandra and cassandra.in to my bin folder. > Then used "ant release " to generate my customized cassandra.jar in dist > folder. > it worked. > > To Aaron: I tried "ant artefacts", but it failed. is it because I am using > Cassandra 0.7? > What's the difference between "ant artefacts" and "ant release"? > > 2011/8/6 aaron morton > >> Have a look at this file in the source repo >> https://github.com/apache/cassandra/blob/trunk/bin/cassandra >> >> try using "ant artefacts" and look in the build/dist dir. >> >> cheers >> >> - >> Aaron Morton >> Freelance Cassandra Developer >> @aaronmorton >> http://www.thelastpickle.com >> >> On 7 Aug 2011, at 03:58, Alvin UW wrote: >> >> >> Thanks. >> >> I am a beginner. >> I checked bin folder under myCassandra. There are only some classes >> without >> executable file. >> after "ant release", I got the jar file from build folder. >> >> >> >> >> 2011/8/6 Jonathan Ellis >> >>> look at bin/cassandra, you can't just run it with "java -jar" >>> >>> On Sat, Aug 6, 2011 at 10:43 AM, Alvin UW wrote: >>> > Hello, >>> > >>> > I set up a Cassandra project in Eclipse following >>> > http://wiki.apache.org/cassandra/RunningCassandraInEclipse >>> > Then, I made a few modifications on it to form a customised Cassandra. >>> > But I don't know how can I release this new Cassandra from Eclipse as a >>> jar >>> > file to use in EC2. >>> > >>> > I tried "ant release" command in command line. It can successful build >>> .jar >>> > file. >>> > Then I typed java -jar apache-cassandra-0.7.0-beta1-SNAPSHOT.jar >>> > >>> > "Error: Failed to load Main-Class manifest attribute from " >>> > >>> > I edited a MANIFEST.MF like: >>> > Manifest-Version: 1.0 >>> > Ant-Version: Apache Ant 1.7.1 >>> > Created-By: 16.3-b01 (Sun Microsystems Inc.) 
>>> > Implementation-Title: Cassandra >>> > Implementation-Version: 0.7.0-beta1-SNAPSHOT >>> > Implementation-Vendor: Apache >>> > Main-Class: org.apache.cassandra.thrift.CassandraDaemon >>> > >>> > and tried again. the error is like below: >>> > >>> > Exception in thread "main" java.lang.NoClassDefFoundError: >>> > org/apache/thrift/transport/TTransportException >>> > Caused by: java.lang.ClassNotFoundException: >>> > org.apache.thrift.transport.TTransportException >>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:217) >>> > at java.security.AccessController.doPrivileged(Native Method) >>> > at java.net.URLClassLoader.findClass(URLClassLoader.java:205) >>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:319) >>> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) >>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:264) >>> > at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:332) >>> > Could not find the main class: >>> org.apache.cassandra.thrift.CassandraDaemon. >>> > Program will exit. >>> > >>> > So what's the problem? >>> > >>> > >>> > Thanks. >>> > Alvin >>> > >>> > >>> > >>> > >>> > >>> > >>> >>> >>> >>> -- >>> Jonathan Ellis >>> Project Chair, Apache Cassandra >>> co-founder of DataStax, the source for professional Cassandra support >>> http://www.datastax.com >>> >> >> >> >
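The NoClassDefFoundError for org.apache.thrift.transport.TTransportException quoted above means the Thrift jar was not on the classpath, which is why running the jar alone with "java -jar" fails. The following is a rough sketch of assembling a classpath by hand, assuming a directory layout with conf/ and lib/ under the Cassandra home and assuming the built Cassandra jar and its dependencies all sit in lib/. The helper name is hypothetical, and the real bin/cassandra script also sets JVM options that this sketch leaves out, so the script remains the safer way to start a node.

```python
import glob
import os
from typing import List


def build_java_command(cassandra_home: str, main_class: str) -> List[str]:
    """Build a `java -cp ...` command with conf/ and every jar in lib/
    on the classpath (assumed layout, see the note above)."""
    jars = sorted(glob.glob(os.path.join(cassandra_home, "lib", "*.jar")))
    classpath = os.pathsep.join([os.path.join(cassandra_home, "conf")] + jars)
    return ["java", "-cp", classpath, main_class]


if __name__ == "__main__":
    # The main class name is the one given in the MANIFEST.MF quoted above.
    command = build_java_command(".", "org.apache.cassandra.thrift.CassandraDaemon")
    print(" ".join(command))
```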