Key_Cache @ Row_Cache
Hi All, Can you give me an idea of how key_cache and row_cache affect the performance of Cassandra? How do these work in different scenarios depending on the data size? Thank You Nilabja Banerjee
Re: Key_Cache @ Row_Cache
The row_cache caches a whole row; the key_cache caches the key and the row location. Thus, if the request hits the row_cache, the result can be returned without a disk seek. If it hits the key_cache, the result can be obtained after one disk seek. Without a key_cache or row_cache hit, Cassandra checks the index file for the record location. Your caching strategy should therefore be tuned in accordance with a few factors:
• Consider your queries, and use the cache type that best fits your queries.
• Consider the ratio of your heap size to your cache size, and do not allow the cache to overwhelm your heap.
• Consider the size of your rows against the size of your keys. Typically keys will be much smaller than entire rows.
If your column family gets far more reads than writes, then setting this number very high will needlessly consume considerable server resources. If your column family has a lower ratio of reads to writes, but has rows with lots of data in them (hundreds of columns), then you'll need to do some math before setting this number very high. And unless you have certain rows that get hit a lot and others that get hit very little, you're not going to see much of a boost here.
At 2011-07-13 15:16:10, "Nilabja Banerjee" wrote: Hi All, Can you give me an idea of how key_cache and row_cache affect the performance of Cassandra? How do these work in different scenarios depending on the data size? Thank You Nilabja Banerjee
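For readers who want the lookup order spelled out, here is a minimal, purely illustrative Java sketch of the decision described above (row cache, then key cache, then the index file). The class and method names are invented for the example; this is not Cassandra's actual read path.

import java.util.HashMap;
import java.util.Map;

public class CacheLookupSketch {
    private final Map<String, String> rowCache = new HashMap<String, String>(); // key -> whole row
    private final Map<String, Long> keyCache = new HashMap<String, Long>();     // key -> data file offset

    public String read(String key) {
        String row = rowCache.get(key);
        if (row != null)
            return row;                        // row cache hit: no disk seek at all

        Long position = keyCache.get(key);
        if (position == null)
            position = lookupInIndexFile(key); // cache miss: consult the index file first

        return readDataFileAt(position);       // one seek into the data file
    }

    private Long lookupInIndexFile(String key) { return 0L; }    // stand-in for the index lookup
    private String readDataFileAt(Long position) { return ""; }  // stand-in for the sstable read
}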
insert a super column
insert(key, column_path, column, consistency_level) can only insert a standard column. Is batch_mutate the only API to insert a super column? And also, can someone tell me why batch_insert and multi_get were removed in version 0.7.4?
R: Re: Re: Re: AntiEntropy?
Thanks for the confirmation, Peter. In the company I work for I suggested many times to run repair at least once every 10 days (GCGraceSeconds is set to approximately 10 days in our config) -- but this book has been used against me :-) I will ask to run repair asap >Original Message >From: peter.schul...@infidyne.com >Date: 13/07/2011 5.07 >To: , "cbert...@libero.it" >Subject: Re: Re: Re: AntiEntropy? > >> To be sure that I didn't misunderstand (English is not my mother tongue) here >> is what the entire "repair paragraph" says ... > >Read it, I maintain my position - the book is wrong or at the very >least strongly misleading. > >You *definitely* need to run nodetool repair periodically for the >reasons documented in the link I sent before, unless you have specific >reasons not to and know what you're doing. > >-- >/ Peter Schuller >
Re: insert a super column
For batch_insert, I think you could use batch_mutate instead. For multi_get, I think you could use multiget_slice instead. Boris On , 魏金仙 wrote: insert(key, column_path, column, consistency_level) can only insert a standard column. Is batch_mutate the only API to insert a super column? And also, can someone tell me why batch_insert and multi_get were removed in version 0.7.4?
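For illustration, here is a rough batch_mutate sketch that inserts a super column using the 0.7/0.8-era Thrift Java bindings. The host, keyspace, column family, and key names are placeholders, and the exact class and method names are recalled from memory rather than taken from this thread, so treat it as a sketch rather than a definitive example.

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.Mutation;
import org.apache.cassandra.thrift.SuperColumn;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class SuperColumnBatchMutate {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1");                       // placeholder keyspace name

        // The sub-column to write inside the super column.
        Column col = new Column();
        col.setName(bytes("subcolumn"));
        col.setValue(bytes("value"));
        col.setTimestamp(System.currentTimeMillis() * 1000);

        // Wrap it in a SuperColumn, then in a ColumnOrSuperColumn, then in a Mutation.
        SuperColumn sc = new SuperColumn(bytes("supercolumn"), Arrays.asList(col));
        ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
        cosc.setSuper_column(sc);
        Mutation mutation = new Mutation();
        mutation.setColumn_or_supercolumn(cosc);

        // row key -> (column family -> list of mutations)
        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
        Map<String, List<Mutation>> byCf = new HashMap<String, List<Mutation>>();
        byCf.put("MySuperCF", new ArrayList<Mutation>(Arrays.asList(mutation)));
        mutationMap.put(bytes("rowkey"), byCf);

        client.batch_mutate(mutationMap, ConsistencyLevel.ONE);
        transport.close();
    }

    private static ByteBuffer bytes(String s) {
        return ByteBuffer.wrap(s.getBytes());
    }
}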
Re: Re: Re: AntiEntropy?
I'll write a FAQ for this topic :-) maki 2011/7/13 Peter Schuller : >> To be sure that I didn't misunderstand (English is not my mother tongue) here >> is what the entire "repair paragraph" says ... > > Read it, I maintain my position - the book is wrong or at the very > least strongly misleading. > > You *definitely* need to run nodetool repair periodically for the > reasons documented in the link I sent before, unless you have specific > reasons not to and know what you're doing. > > -- > / Peter Schuller > -- w3m
Re: Key_Cache @ Row_Cache
> Can you give me an idea of how key_cache and row_cache affect the performance of Cassandra? How do these work in different scenarios depending on the data size?

While reading, if the row cache is enabled, Cassandra checks the row cache first, then the key cache, the memtable, and disk. The row cache stores whole rows in memory and needs tuning; a lower value is generally preferred. The key cache stores only the key and the on-disk location of the row, so a higher value is preferred. If a row is frequently read it is good to cache it, but row size matters: large rows can eat too much memory. Also this may help: http://www.datastax.com/docs/0.8/operations/cache_tuning#configuring-key-and-row-caches /Samal
Why do Digest Queries return hash instead of timestamp?
I just saw this http://wiki.apache.org/cassandra/DigestQueries and I was wondering why it returns a hash of the data. Wouldn't it be better and easier to return the timestamp? You don't really care what the data is, you only care whether it is more or less recent than another piece of data.
Re: Why do Digest Queries return hash instead of timestamp?
I guess it is because the timestamp does not guarantee data consistency, but hash does. Boris On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote: > I just saw this > > http://wiki.apache.org/cassandra/DigestQueries > > and I was wondering why it returns a hash of the data. Wouldn't it be > better and easier to return the timestamp? You don't really care what the > data is, you only care whether it is more or less recent than another piece > of data. >
Re: Why do Digest Queries return hash instead of timestamp?
If you have two pieces of data that are different but have the same timestamp, how can you resolve consistency? This is a pathological situation to begin with; why should you waste effort to (not) solve it? On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote: > I guess it is because the timestamp does not guarantee data consistency, > but hash does. > > Boris > > > On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote: > >> I just saw this >> >> http://wiki.apache.org/cassandra/DigestQueries >> >> and I was wondering why it returns a hash of the data. Wouldn't it be >> better and easier to return the timestamp? You don't really care what the >> data is, you only care whether it is more or less recent than another piece >> of data. >> > >
Re: Why do Digest Queries return hash instead of timestamp?
I can only say that the "data" does matter; that is why the developers use a hash instead of a timestamp. If the hash value that comes from the other node is not a match, a read repair is performed so that the correct data can be returned. On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote: > If you have to pieces of data that are different but have the same > timestamp, how can you resolve consistency? > > This is a pathological situation to begin with, why should you waste effort > to (not) solve it? > > On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote: > >> I guess it is because the timestamp does not guarantee data consistency, >> but hash does. >> >> Boris >> >> >> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote: >> >>> I just saw this >>> >>> http://wiki.apache.org/cassandra/DigestQueries >>> >>> and I was wondering why it returns a hash of the data. Wouldn't it be >>> better and easier to return the timestamp? You don't really care what the >>> data is, you only care whether it is more or less recent than another piece >>> of data. >>> >> >> >
Re: Why do Digest Queries return hash instead of timestamp?
How would you know which data is correct, if they both have the same timestamp? On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen wrote: > I can only say, "data" does matter, that is why the developers use hash > instead of timestamp. If hash value comes from other node is not a match, a > read repair would perform. so that correct data can be returned. > > > On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote: > >> If you have to pieces of data that are different but have the same >> timestamp, how can you resolve consistency? >> >> This is a pathological situation to begin with, why should you waste >> effort to (not) solve it? >> >> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote: >> >>> I guess it is because the timestamp does not guarantee data consistency, >>> but hash does. >>> >>> Boris >>> >>> >>> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote: >>> I just saw this http://wiki.apache.org/cassandra/DigestQueries and I was wondering why it returns a hash of the data. Wouldn't it be better and easier to return the timestamp? You don't really care what the data is, you only care whether it is more or less recent than another piece of data. >>> >>> >> >
Re: Why do Digest Queries return hash instead of timestamp?
For a specific column, If there are two versions with the same timestamp, the value of the column is used to break the tie. if v1.value().compareTo(v2.value()) < 0, it means that v2 wins. On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn wrote: > How would you know which data is correct, if they both have the same > timestamp? > > On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen wrote: > >> I can only say, "data" does matter, that is why the developers use hash >> instead of timestamp. If hash value comes from other node is not a match, a >> read repair would perform. so that correct data can be returned. >> >> >> On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote: >> >>> If you have to pieces of data that are different but have the same >>> timestamp, how can you resolve consistency? >>> >>> This is a pathological situation to begin with, why should you waste >>> effort to (not) solve it? >>> >>> On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote: >>> I guess it is because the timestamp does not guarantee data consistency, but hash does. Boris On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote: > I just saw this > > http://wiki.apache.org/cassandra/DigestQueries > > and I was wondering why it returns a hash of the data. Wouldn't it be > better and easier to return the timestamp? You don't really care what the > data is, you only care whether it is more or less recent than another > piece > of data. > >>> >> >
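As a small illustration of the tie-break rule just described (with invented names, not the actual Cassandra reconcile code): the higher timestamp wins, and on a timestamp tie the lexically greater value wins, so every replica resolves the conflict the same way.

import java.nio.ByteBuffer;

public class TieBreakSketch {
    static final class Version {
        final long timestamp;
        final ByteBuffer value;
        Version(long timestamp, ByteBuffer value) { this.timestamp = timestamp; this.value = value; }
    }

    static Version reconcile(Version v1, Version v2) {
        if (v1.timestamp != v2.timestamp)
            return v1.timestamp > v2.timestamp ? v1 : v2;   // normal case: newest timestamp wins
        return v1.value.compareTo(v2.value) < 0 ? v2 : v1;  // tie: the larger value wins deterministically
    }
}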
RE: sstabletojson
Perfect, thanks! -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Tuesday, July 12, 2011 5:53 PM To: user@cassandra.apache.org Subject: Re: sstabletojson You can upgrade to 0.8.1 to fix this. :) On Tue, Jul 12, 2011 at 1:03 PM, Stephen Pope wrote: > Hey there. I'm trying to convert one of my sstables to json, but it doesn't > appear to be escaping quotes. As a result, I've got a line in my resulting > json like this: > > "3230303930373139313734303236efbfbf3331313733": [["6d6573736167655f6964", > ""<66AA9165386616028BD3FECF893BBAC204347F3BAF@CONFLICT,6.HUSHEDFIRE.COM>"", > 634447747524175316]], > > Attempting to convert this json back into an sstable results in: > > C:\cassandra\apache-cassandra-0.8.0\bin>json2sstable.bat -K BIM -c > TransactionLogs json.dat out.db > > org.codehaus.jackson.JsonParseException: Unexpected character ('<' (code > 60)): w > as expecting comma to separate ARRAY entries > at [Source: json.dat; line: 31175, column: 299] > at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:929) > at > org.codehaus.jackson.impl.JsonParserBase._reportError(JsonParserBase. > java:632) > at > org.codehaus.jackson.impl.JsonParserBase._reportUnexpectedChar(JsonPa > rserBase.java:565) > at > org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser > .java:128) > at > org.codehaus.jackson.map.deser.UntypedObjectDeserializer.mapArray(Unt > ypedObjectDeserializer.java:81) > at > org.codehaus.jackson.map.deser.UntypedObjectDeserializer.deserialize( > UntypedObjectDeserializer.java:62) > at > org.codehaus.jackson.map.deser.UntypedObjectDeserializer.mapArray(Unt > ypedObjectDeserializer.java:82) > at > org.codehaus.jackson.map.deser.UntypedObjectDeserializer.deserialize( > UntypedObjectDeserializer.java:62) > at > org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeseri > alizer.java:197) > at > org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeseria > lizer.java:145) > at > org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeseria > lizer.java:23) > at > org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:12 > 61) > at > org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:517 > ) > at org.codehaus.jackson.JsonParser.readValueAs(JsonParser.java:897) > at > org.apache.cassandra.tools.SSTableImport.importUnsorted(SSTableImport > .java:263) > at > org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.jav > a:252) > at > org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:476) > > > Is there anything I can do with my data to fix this? > > Cheers, > Steve > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
BulkLoader
I'm trying to figure out how to use the BulkLoader, and it looks like there's no way to run it against a local machine, because of this:

Set<InetAddress> hosts = Gossiper.instance.getLiveMembers();
hosts.remove(FBUtilities.getLocalAddress());
if (hosts.isEmpty())
    throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster");

Is this intended behavior? May I ask why? We'd like to be able to run it against the local machine. Cheers, Steve
Re: Survey: Cassandra/JVM Resident Set Size increase
Do you mean that it is using all of the available heap? That is the expected behavior of most long running Java applications. The JVM will not GC until it needs memory (or you explicitly ask it to) and will only free up a bit of memory at a time. That is very good behavior from a performance standpoint since frequent, large GCs would make your application very unresponsive. It also makes Java applications take up all the memory you give them. - Original Message - From: "Sasha Dolgy" To: user@cassandra.apache.org Sent: Tuesday, July 12, 2011 10:23:02 PM Subject: Re: Survey: Cassandra/JVM Resident Set Size increase I'll post more tomorrow ... However, we set up one node in a single node cluster and have left it with no data. Reviewing memory consumption graphs, it increased daily until it gobbled (highly technical term) all memory. The system is now running just below 100% memory usage, which I find peculiar seeing that it is doing nothing, with no data and no peers. On Jul 12, 2011 3:29 PM, "Chris Burroughs" wrote: > ### Preamble > > There have been several reports on the mailing list of the JVM running > Cassandra using "too much" memory. That is, the resident set size is >>>(max java heap size + mmaped segments) and continues to grow until the > process swaps, kernel oom killer comes along, or performance just > degrades too far due to the lack of space for the page cache. It has > been unclear from these reports if there is a pattern. My hope here is > that by comparing JVM versions, OS versions, JVM configuration etc., we > will find something. Thank you everyone for your time. > > > Some example reports: > - http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html > - > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html > - https://issues.apache.org/jira/browse/CASSANDRA-2868 > - > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html > - > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-memory-problem-td6545642.html > > For reference theories include (in no particular order): > - memory fragmentation > - JVM bug > - OS/glibc bug > - direct memory > - swap induced fragmentation > - some other bad interaction of cassandra/jdk/jvm/os/nio-insanity. > > ### Survey > > 1. Do you think you are experiencing this problem? > > 2. Why? (This is a good time to share a graph like > http://www.twitpic.com/5fdabn or > http://img24.imageshack.us/img24/1754/cassandrarss.png) > > 2. Are you using mmap? (If yes be sure to have read > http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have > used pmap [or another tool] to rule out mmap and top deceiving you.) > > 3. Are you using JNA? Was mlockall successful (it's in the logs on startup)? > > 4. Is swap enabled? Are you swapping? > > 5. What version of Apache Cassandra are you using? > > 6. What is the earliest version of Apache Cassandra you recall seeing > this problem with? > > 7. Have you tried the patch from CASSANDRA-2654 ? > > 8. What jvm and version are you using? > > 9. What OS and version are you using? > > 10. What are your jvm flags? > > 11. Have you tried limiting direct memory (-XX:MaxDirectMemorySize) > > 12. Can you characterise how much GC your cluster is doing? > > 13. Approximately how many read/writes per unit time is your cluster > doing (per node or the whole cluster)? > > 14. How are your column families configured (key cache size, row cache size, etc.)?
Re: insert a super column
A ColumnPath can contain a super column, so you should be fine inserting a super column family (in fact I do that). Quoting cassandra.thrift: struct ColumnPath { 3: required string column_family, 4: optional binary super_column, 5: optional binary column, } - Original Message - From: "魏金仙" To: "user" Sent: Wednesday, July 13, 2011 7:43:15 AM Subject: insert a super column insert(key, column_path, column, consistency_level) can only insert a standard column. Is batch_mutate the only API to insert a super column? and also can someone tell why batch_insert,multi_get is removed in version 0.7.4?
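A minimal sketch of the insert-based route, assuming the 0.7/0.8-era Thrift Java bindings (where, if memory serves, insert() takes a ColumnParent rather than a ColumnPath): setting the parent's super_column field is what places the column inside a super column. The column family, key, and column names below are placeholders.

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class SuperColumnInsert {
    // Assumes an already-opened Thrift client with the keyspace set.
    static void insertIntoSuperColumn(Cassandra.Client client) throws Exception {
        ColumnParent parent = new ColumnParent("MySuperCF");            // super column family
        parent.setSuper_column(ByteBuffer.wrap("supercolumn".getBytes()));

        Column col = new Column();
        col.setName(ByteBuffer.wrap("subcolumn".getBytes()));
        col.setValue(ByteBuffer.wrap("value".getBytes()));
        col.setTimestamp(System.currentTimeMillis() * 1000);

        client.insert(ByteBuffer.wrap("rowkey".getBytes()), parent, col, ConsistencyLevel.ONE);
    }
}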
Re: Why do Digest Queries return hash instead of timestamp?
Is that the actual reason? This seems like a big inefficiency to me. For those of us who don't worry about this extreme edge case (that probably will NEVER happen in real life, for most applications), is there a way to turn this off? Or am I wrong about this making the operation MUCH more expensive? On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen wrote: > For a specific column, If there are two versions with the same timestamp, > the value of the column is used to break the tie. > > if v1.value().compareTo(v2.value()) < 0, it means that v2 wins. > > On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn wrote: > >> How would you know which data is correct, if they both have the same >> timestamp? >> >> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen wrote: >> >>> I can only say, "data" does matter, that is why the developers use hash >>> instead of timestamp. If hash value comes from other node is not a match, a >>> read repair would perform. so that correct data can be returned. >>> >>> >>> On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote: >>> If you have to pieces of data that are different but have the same timestamp, how can you resolve consistency? This is a pathological situation to begin with, why should you waste effort to (not) solve it? On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote: > I guess it is because the timestamp does not guarantee data > consistency, but hash does. > > Boris > > > On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn > wrote: > >> I just saw this >> >> http://wiki.apache.org/cassandra/DigestQueries >> >> and I was wondering why it returns a hash of the data. Wouldn't it be >> better and easier to return the timestamp? You don't really care what the >> data is, you only care whether it is more or less recent than another >> piece >> of data. >> > > >>> >> >
How to remove/add node
Hi, I have deleted the data, commitlog and saved cache directories. I have removed one of the nodes from the seeds in cassandra.yaml. When I tried to use nodetool, it is showing the removed node as up. Thanks, Abdul
RE: BulkLoader
I think I've solved my own problem here. After generating the sstable using json2sstable it looks like I can simply copy the created sstable into my data directory. Can anyone think of any potential problems with doing it this way? -Original Message- From: Stephen Pope [mailto:stephen.p...@quest.com] Sent: Wednesday, July 13, 2011 9:32 AM To: user@cassandra.apache.org Subject: BulkLoader I'm trying to figure out how to use the BulkLoader, and it looks like there's no way to run it against a local machine, because of this: Set hosts = Gossiper.instance.getLiveMembers(); hosts.remove(FBUtilities.getLocalAddress()); if (hosts.isEmpty()) throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster"); Is this intended behavior? May I ask why? We'd like to be able to run it against the local machine. Cheers, Steve
Re: AssertionError: No data found for NamesQueryFilter
This (https://issues.apache.org/jira/browse/CASSANDRA-2653) is fixed in 0.7.7, which will be out soon. On Tue, Jul 12, 2011 at 9:13 PM, Kyle Gibson wrote: > Running version 0.7.6-2, recently upgraded from 0.7.3. > > I am get a time out exception when I run a particular > get_indexed_slices, which results in the following error showing up on > a few nodes: > > ERROR [ReadStage:16] 2011-07-12 23:01:31,424 > AbstractCassandraDaemon.java (line 114) Fatal exception in thread > Thread[ReadStage:16,5,main] > java.lang.AssertionError: No data found for > NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=18 > cap=29],java.nio.HeapByteBuffer[pos=22 lim=28 cap=29]) in > DecoratedKey(39222808797828327646767854834585383073, > 464f2d47584f4c4833454a46384e4c54543341544c4339):QueryPath(columnFamilyName='subscriptions', > superColumnName='null', columnName='null') (original filter > NamesQueryFilter(columns=java.nio.HeapByteBuffer[pos=12 lim=18 > cap=29],java.nio.HeapByteBuffer[pos=22 lim=28 cap=29])) from > expression 'XXX EQ YYY' > at > org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1603) > at > org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42) > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72) > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown > Source) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > at java.lang.Thread.run(Unknown Source) > > > Thanks > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Why do Digest Queries return hash instead of timestamp?
(1) the hash calculation is a small amount of CPU -- MD5 is specifically designed to be efficient in this kind of situation (2) we compute one hash per query, so for multiple columns the advantage over timestamp-per-column gets large quickly. On Wed, Jul 13, 2011 at 7:31 AM, David Boxenhorn wrote: > Is that the actual reason? > > This seems like a big inefficiency to me. For those of us who don't worry > about this extreme edge case (that probably will NEVER happen in real life, > for most applications), is there a way to turn this off? > > Or am I wrong about this making the operation MUCH more expensive? > > > On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen wrote: >> >> For a specific column, If there are two versions with the same timestamp, >> the value of the column is used to break the tie. >> if v1.value().compareTo(v2.value()) < 0, it means that v2 wins. >> On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn >> wrote: >>> >>> How would you know which data is correct, if they both have the same >>> timestamp? >>> >>> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen wrote: I can only say, "data" does matter, that is why the developers use hash instead of timestamp. If hash value comes from other node is not a match, a read repair would perform. so that correct data can be returned. On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote: > > If you have to pieces of data that are different but have the same > timestamp, how can you resolve consistency? > > This is a pathological situation to begin with, why should you waste > effort to (not) solve it? > > On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote: >> >> I guess it is because the timestamp does not guarantee data >> consistency, but hash does. >> Boris >> >> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn >> wrote: >>> >>> I just saw this >>> >>> http://wiki.apache.org/cassandra/DigestQueries >>> >>> and I was wondering why it returns a hash of the data. Wouldn't it be >>> better and easier to return the timestamp? You don't really care what >>> the >>> data is, you only care whether it is more or less recent than another >>> piece >>> of data. >> > >>> >> > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
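To make point (2) concrete, here is an illustrative sketch of the idea using only the standard Java library (this is not the actual Cassandra digest code, and the column representation is invented for the example): however many columns a query returns, each replica can be compared using a single fixed-size MD5 digest of the result.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

public class DigestSketch {
    // One digest per query result, regardless of how many columns it contains.
    static byte[] digestOf(Map<String, String> queryResult) throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        // Iterate in a deterministic order so identical results digest identically.
        for (Map.Entry<String, String> column : new TreeMap<String, String>(queryResult).entrySet()) {
            md5.update(column.getKey().getBytes());    // column name
            md5.update(column.getValue().getBytes());  // column value (timestamps would be folded in too)
        }
        return md5.digest();
    }

    // A mismatch here is what would trigger a read repair.
    static boolean replicasAgree(Map<String, String> a, Map<String, String> b) throws NoSuchAlgorithmException {
        return Arrays.equals(digestOf(a), digestOf(b));
    }
}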
Re: BulkLoader
Sure, that will work fine with a single machine. The advantage of bulkloader is it handles splitting the sstable up and sending each piece to the right place(s) when you have more than one. On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope wrote: > I think I've solved my own problem here. After generating the sstable using > json2sstable it looks like I can simply copy the created sstable into my data > directory. > > Can anyone think of any potential problems with doing it this way? > > -Original Message- > From: Stephen Pope [mailto:stephen.p...@quest.com] > Sent: Wednesday, July 13, 2011 9:32 AM > To: user@cassandra.apache.org > Subject: BulkLoader > > I'm trying to figure out how to use the BulkLoader, and it looks like > there's no way to run it against a local machine, because of this: > > Set hosts = Gossiper.instance.getLiveMembers(); > hosts.remove(FBUtilities.getLocalAddress()); > if (hosts.isEmpty()) > throw new IllegalStateException("Cannot load any sstable, > no live member found in the cluster"); > > Is this intended behavior? May I ask why? We'd like to be able to run it > against the local machine. > > Cheers, > Steve > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
One node down but it thinks its fine...
One of our nodes, which happens to be the seed, thinks it's Up and that all the other nodes are down. However, all the other nodes think the seed is down instead. The logs for the seed node show everything is running as it should be. I've tried restarting the node, turning on/off gossip and thrift, and nothing seems to get the node to see the rest of its ring as up and running. I have also tried restarting one of the other nodes, which had no effect on the situation. Below are the ring outputs for the seed and one other node in the ring, plus a ping to show that the seed can ping the other node.

# bin/nodetool -h 0.0.0.0 ring
Address        Status  State   Load      Owns    Token
                                                 141784319550391026443072753096570088105
127.0.0.1      Up      Normal  4.61 GB   16.67%  0
xx.xxx.30.210  Down    Normal  ?         16.67%  28356863910078205288614550619314017621
xx.xx.90.87    Down    Normal  ?         16.67%  56713727820156410577229101238628035242
xx.xx.22.236   Down    Normal  ?         16.67%  85070591730234615865843651857942052863
xx.xx.97.96    Down    Normal  ?         16.67%  113427455640312821154458202477256070484
xx.xxx.17.122  Down    Normal  ?         16.67%  141784319550391026443072753096570088105

# ping xx.xxx.30.210
PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data.
64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms
64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms
^C
--- xx.xxx.30.210 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms

# bin/nodetool -h xx.xxx.30.210 ring
Address        Status  State   Load      Owns    Token
                                                 141784319550391026443072753096570088105
xx.xxx.23.40   Down    Normal  ?         16.67%  0
xx.xxx.30.210  Up      Normal  10.58 GB  16.67%  28356863910078205288614550619314017621
xx.xx.90.87    Up      Normal  10.47 GB  16.67%  56713727820156410577229101238628035242
xx.xx.22.236   Up      Normal  9.63 GB   16.67%  85070591730234615865843651857942052863
xx.xx.97.96    Up      Normal  10.68 GB  16.67%  113427455640312821154458202477256070484
xx.xxx.17.122  Up      Normal  10.18 GB  16.67%  141784319550391026443072753096570088105

-- Ray Slakinski
RE: BulkLoader
Fair enough. My original question stands then. :) Why aren't you allowed to talk to a local installation using BulkLoader? -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Wednesday, July 13, 2011 11:06 AM To: user@cassandra.apache.org Subject: Re: BulkLoader Sure, that will work fine with a single machine. The advantage of bulkloader is it handles splitting the sstable up and sending each piece to the right place(s) when you have more than one. On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope wrote: > I think I've solved my own problem here. After generating the sstable using > json2sstable it looks like I can simply copy the created sstable into my data > directory. > > Can anyone think of any potential problems with doing it this way? > > -Original Message- > From: Stephen Pope [mailto:stephen.p...@quest.com] > Sent: Wednesday, July 13, 2011 9:32 AM > To: user@cassandra.apache.org > Subject: BulkLoader > > I'm trying to figure out how to use the BulkLoader, and it looks like > there's no way to run it against a local machine, because of this: > > Set hosts = Gossiper.instance.getLiveMembers(); > hosts.remove(FBUtilities.getLocalAddress()); > if (hosts.isEmpty()) > throw new IllegalStateException("Cannot load any sstable, > no live member found in the cluster"); > > Is this intended behavior? May I ask why? We'd like to be able to run it > against the local machine. > > Cheers, > Steve > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: BulkLoader
Because it's hooking directly into gossip, so the local instance it's ignoring is the bulkloader process, not Cassandra. You'd need to run the bulkloader from a different IP, than Cassandra. On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope wrote: > Fair enough. My original question stands then. :) > > Why aren't you allowed to talk to a local installation using BulkLoader? > > -Original Message- > From: Jonathan Ellis [mailto:jbel...@gmail.com] > Sent: Wednesday, July 13, 2011 11:06 AM > To: user@cassandra.apache.org > Subject: Re: BulkLoader > > Sure, that will work fine with a single machine. The advantage of > bulkloader is it handles splitting the sstable up and sending each > piece to the right place(s) when you have more than one. > > On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope wrote: >> I think I've solved my own problem here. After generating the sstable using >> json2sstable it looks like I can simply copy the created sstable into my >> data directory. >> >> Can anyone think of any potential problems with doing it this way? >> >> -Original Message- >> From: Stephen Pope [mailto:stephen.p...@quest.com] >> Sent: Wednesday, July 13, 2011 9:32 AM >> To: user@cassandra.apache.org >> Subject: BulkLoader >> >> I'm trying to figure out how to use the BulkLoader, and it looks like >> there's no way to run it against a local machine, because of this: >> >> Set hosts = Gossiper.instance.getLiveMembers(); >> hosts.remove(FBUtilities.getLocalAddress()); >> if (hosts.isEmpty()) >> throw new IllegalStateException("Cannot load any sstable, >> no live member found in the cluster"); >> >> Is this intended behavior? May I ask why? We'd like to be able to run it >> against the local machine. >> >> Cheers, >> Steve >> > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
RE: BulkLoader
Ahhh..ok. Thanks. -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Wednesday, July 13, 2011 11:35 AM To: user@cassandra.apache.org Subject: Re: BulkLoader Because it's hooking directly into gossip, so the local instance it's ignoring is the bulkloader process, not Cassandra. You'd need to run the bulkloader from a different IP, than Cassandra. On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope wrote: > Fair enough. My original question stands then. :) > > Why aren't you allowed to talk to a local installation using BulkLoader? > > -Original Message- > From: Jonathan Ellis [mailto:jbel...@gmail.com] > Sent: Wednesday, July 13, 2011 11:06 AM > To: user@cassandra.apache.org > Subject: Re: BulkLoader > > Sure, that will work fine with a single machine. The advantage of > bulkloader is it handles splitting the sstable up and sending each > piece to the right place(s) when you have more than one. > > On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope wrote: >> I think I've solved my own problem here. After generating the sstable using >> json2sstable it looks like I can simply copy the created sstable into my >> data directory. >> >> Can anyone think of any potential problems with doing it this way? >> >> -Original Message- >> From: Stephen Pope [mailto:stephen.p...@quest.com] >> Sent: Wednesday, July 13, 2011 9:32 AM >> To: user@cassandra.apache.org >> Subject: BulkLoader >> >> I'm trying to figure out how to use the BulkLoader, and it looks like >> there's no way to run it against a local machine, because of this: >> >> Set hosts = Gossiper.instance.getLiveMembers(); >> hosts.remove(FBUtilities.getLocalAddress()); >> if (hosts.isEmpty()) >> throw new IllegalStateException("Cannot load any sstable, >> no live member found in the cluster"); >> >> Is this intended behavior? May I ask why? We'd like to be able to run it >> against the local machine. >> >> Cheers, >> Steve >> > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Why do Digest Queries return hash instead of timestamp?
Got it. Thanks! On Wed, Jul 13, 2011 at 6:05 PM, Jonathan Ellis wrote: > (1) the hash calculation is a small amount of CPU -- MD5 is > specifically designed to be efficient in this kind of situation > (2) we compute one hash per query, so for multiple columns the > advantage over timestamp-per-column gets large quickly. > > On Wed, Jul 13, 2011 at 7:31 AM, David Boxenhorn > wrote: > > Is that the actual reason? > > > > This seems like a big inefficiency to me. For those of us who don't worry > > about this extreme edge case (that probably will NEVER happen in real > life, > > for most applications), is there a way to turn this off? > > > > Or am I wrong about this making the operation MUCH more expensive? > > > > > > On Wed, Jul 13, 2011 at 3:20 PM, Boris Yen wrote: > >> > >> For a specific column, If there are two versions with the same > timestamp, > >> the value of the column is used to break the tie. > >> if v1.value().compareTo(v2.value()) < 0, it means that v2 wins. > >> On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn > >> wrote: > >>> > >>> How would you know which data is correct, if they both have the same > >>> timestamp? > >>> > >>> On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen > wrote: > > I can only say, "data" does matter, that is why the developers use > hash > instead of timestamp. If hash value comes from other node is not a > match, a > read repair would perform. so that correct data can be returned. > > On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn > wrote: > > > > If you have to pieces of data that are different but have the same > > timestamp, how can you resolve consistency? > > > > This is a pathological situation to begin with, why should you waste > > effort to (not) solve it? > > > > On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen > wrote: > >> > >> I guess it is because the timestamp does not guarantee data > >> consistency, but hash does. > >> Boris > >> > >> On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn < > da...@citypath.com> > >> wrote: > >>> > >>> I just saw this > >>> > >>> http://wiki.apache.org/cassandra/DigestQueries > >>> > >>> and I was wondering why it returns a hash of the data. Wouldn't it > be > >>> better and easier to return the timestamp? You don't really care > what the > >>> data is, you only care whether it is more or less recent than > another piece > >>> of data. > >> > > > > >>> > >> > > > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com >
Re: One node down but it thinks its fine...
Check seed ip is same in all node and should not be loopback ip on cluster. On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski wrote: > One of our nodes, which happens to be the seed thinks its Up and all the > other nodes are down. However all the other nodes thinks the seed is down > instead. The logs for the seed node show everything is running as it should > be. I've tried restarting the node, turning on/off gossip and thrift and > nothing seems to get the node to see the rest of its ring as up and running. > I have also tried restarting one of the other nodes, which had no affect on > the situation. Below is the ring outputs for the seed and one other node in > the ring, plus a ping to show that the seed can ping the other node. > > # bin/nodetool -h 0.0.0.0 ring > Address Status State Load Owns Token > 141784319550391026443072753096570088105 > 127.0.0.1 Up Normal 4.61 GB 16.67% 0 > xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621 > xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242 > xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863 > xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484 > xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105 > > > # ping xx.xxx.30.210 > PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data. > 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms > 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms > ^C > --- xx.xxx.30.210 ping statistics --- > 2 packets transmitted, 2 received, 0% packet loss, time 999ms > rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms > > > # bin/nodetool -h xx.xxx.30.210 ring > Address Status State Load Owns Token > 141784319550391026443072753096570088105 > xx.xxx.23.40 Down Normal ? 16.67% 0 > xx.xxx.30.210 Up Normal 10.58 GB 16.67% > 28356863910078205288614550619314017621 > xx.xx.90.87 Up Normal 10.47 GB 16.67% > 56713727820156410577229101238628035242 > xx.xx.22.236 Up Normal 9.63 GB 16.67% > 85070591730234615865843651857942052863 > xx.xx.97.96 Up Normal 10.68 GB 16.67% > 113427455640312821154458202477256070484 > xx.xxx.17.122 Up Normal 10.18 GB 16.67% > 141784319550391026443072753096570088105 > > -- > Ray Slakinski > > >
JSR-347
Hi, I am looking to "round out" the EG membership of JSR-347 so that we can get going with discussions. It would be great if someone from the Cassandra community could join to represent the experiences of developing HBase :-) We'll be communicating using https://groups.google.com/forum/#!forum/jsr347 - so that would be a good place to start whilst we wait for the JCP to process formal nominations! Let me know any queries Best, Pete
Re: One node down but it thinks its fine...
any firewall changes? ping is fine ... but if you can't get from node(a) to nodes(n) on the specific ports... On Wed, Jul 13, 2011 at 6:47 PM, samal wrote: > Check seed ip is same in all node and should not be loopback ip on cluster. > > On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski > wrote: >> >> One of our nodes, which happens to be the seed thinks its Up and all the >> other nodes are down. However all the other nodes thinks the seed is down >> instead. The logs for the seed node show everything is running as it should >> be. I've tried restarting the node, turning on/off gossip and thrift and >> nothing seems to get the node to see the rest of its ring as up and running. >> I have also tried restarting one of the other nodes, which had no affect on >> the situation. Below is the ring outputs for the seed and one other node in >> the ring, plus a ping to show that the seed can ping the other node. >> >> # bin/nodetool -h 0.0.0.0 ring >> Address Status State Load Owns Token >> 141784319550391026443072753096570088105 >> 127.0.0.1 Up Normal 4.61 GB 16.67% 0 >> xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621 >> xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242 >> xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863 >> xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484 >> xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105 >> >> >> # ping xx.xxx.30.210 >> PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data. >> 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms >> 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms >> ^C >> --- xx.xxx.30.210 ping statistics --- >> 2 packets transmitted, 2 received, 0% packet loss, time 999ms >> rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms >> >> >> # bin/nodetool -h xx.xxx.30.210 ring >> Address Status State Load Owns Token >> 141784319550391026443072753096570088105 >> xx.xxx.23.40 Down Normal ? 16.67% 0 >> xx.xxx.30.210 Up Normal 10.58 GB 16.67% >> 28356863910078205288614550619314017621 >> xx.xx.90.87 Up Normal 10.47 GB 16.67% >> 56713727820156410577229101238628035242 >> xx.xx.22.236 Up Normal 9.63 GB 16.67% >> 85070591730234615865843651857942052863 >> xx.xx.97.96 Up Normal 10.68 GB 16.67% >> 113427455640312821154458202477256070484 >> xx.xxx.17.122 Up Normal 10.18 GB 16.67% >> 141784319550391026443072753096570088105 >> >> -- >> Ray Slakinski >> >> > > -- Sasha Dolgy sasha.do...@gmail.com
Re: JSR-347
"data grids", it seems that this really does not have much relationship to "java", since all major noSQL solutions explicitly create interfaces in almost all languages and try to be language-agnostic by using RPC like thrift,avro etc. On Wed, Jul 13, 2011 at 9:06 AM, Pete Muir wrote: > Hi, > > I am looking to "round out" the EG membership of JSR-347 so that we can get > going with discussions. It would be great if someone from the Cassandra > community could join to represent the experiences of developing HBase :-) > > We'll be communicating using https://groups.google.com/forum/#!forum/jsr347 - > so that would be a good place to start whilst we wait for the JCP to process > formal nominations! > > Let me know any queries > > Best, > > Pete
Re: BulkLoader
Also note that if you have a cassandra node running on the local node from which you want to bulk load sstables, there is a JMX (StorageService->bulkLoad) call to do just that. May be simpler than using sstableloader if that is what you want to do. -- Sylvain On Wed, Jul 13, 2011 at 3:46 PM, Stephen Pope wrote: > Ahhh..ok. Thanks. > > -Original Message- > From: Jonathan Ellis [mailto:jbel...@gmail.com] > Sent: Wednesday, July 13, 2011 11:35 AM > To: user@cassandra.apache.org > Subject: Re: BulkLoader > > Because it's hooking directly into gossip, so the local instance it's > ignoring is the bulkloader process, not Cassandra. > > You'd need to run the bulkloader from a different IP, than Cassandra. > > On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope wrote: >> Fair enough. My original question stands then. :) >> >> Why aren't you allowed to talk to a local installation using BulkLoader? >> >> -Original Message- >> From: Jonathan Ellis [mailto:jbel...@gmail.com] >> Sent: Wednesday, July 13, 2011 11:06 AM >> To: user@cassandra.apache.org >> Subject: Re: BulkLoader >> >> Sure, that will work fine with a single machine. The advantage of >> bulkloader is it handles splitting the sstable up and sending each >> piece to the right place(s) when you have more than one. >> >> On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope wrote: >>> I think I've solved my own problem here. After generating the sstable >>> using json2sstable it looks like I can simply copy the created sstable into >>> my data directory. >>> >>> Can anyone think of any potential problems with doing it this way? >>> >>> -Original Message- >>> From: Stephen Pope [mailto:stephen.p...@quest.com] >>> Sent: Wednesday, July 13, 2011 9:32 AM >>> To: user@cassandra.apache.org >>> Subject: BulkLoader >>> >>> I'm trying to figure out how to use the BulkLoader, and it looks like >>> there's no way to run it against a local machine, because of this: >>> >>> Set hosts = Gossiper.instance.getLiveMembers(); >>> hosts.remove(FBUtilities.getLocalAddress()); >>> if (hosts.isEmpty()) >>> throw new IllegalStateException("Cannot load any >>> sstable, no live member found in the cluster"); >>> >>> Is this intended behavior? May I ask why? We'd like to be able to run it >>> against the local machine. >>> >>> Cheers, >>> Steve >>> >> >> >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cassandra >> co-founder of DataStax, the source for professional Cassandra support >> http://www.datastax.com >> > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com >
Escaping characters in cqlsh
I am trying to get all the columns named "fmd:" in cqlsh. I am using: select 'fmd:'..'fmd;' from feeds where; But I am getting errors (as expected). Is there any way to escape the colon or semicolon in cqlsh? Thanks, Blake
Re: One node down but it thinks its fine...
Was all working before, but we ran out of file handles and ended up restarting the nodes. No yaml changes have occurred. Ray Slakinski On 2011-07-13, at 12:55 PM, Sasha Dolgy wrote: > any firewall changes? ping is fine ... but if you can't get from > node(a) to nodes(n) on the specific ports... > > On Wed, Jul 13, 2011 at 6:47 PM, samal wrote: >> Check seed ip is same in all node and should not be loopback ip on cluster. >> >> On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski >> wrote: >>> >>> One of our nodes, which happens to be the seed thinks its Up and all the >>> other nodes are down. However all the other nodes thinks the seed is down >>> instead. The logs for the seed node show everything is running as it should >>> be. I've tried restarting the node, turning on/off gossip and thrift and >>> nothing seems to get the node to see the rest of its ring as up and running. >>> I have also tried restarting one of the other nodes, which had no affect on >>> the situation. Below is the ring outputs for the seed and one other node in >>> the ring, plus a ping to show that the seed can ping the other node. >>> >>> # bin/nodetool -h 0.0.0.0 ring >>> Address Status State Load Owns Token >>> 141784319550391026443072753096570088105 >>> 127.0.0.1 Up Normal 4.61 GB 16.67% 0 >>> xx.xxx.30.210 Down Normal ? 16.67% 28356863910078205288614550619314017621 >>> xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242 >>> xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863 >>> xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484 >>> xx.xxx.17.122 Down Normal ? 16.67% 141784319550391026443072753096570088105 >>> >>> >>> # ping xx.xxx.30.210 >>> PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data. >>> 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms >>> 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms >>> ^C >>> --- xx.xxx.30.210 ping statistics --- >>> 2 packets transmitted, 2 received, 0% packet loss, time 999ms >>> rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms >>> >>> >>> # bin/nodetool -h xx.xxx.30.210 ring >>> Address Status State Load Owns Token >>> 141784319550391026443072753096570088105 >>> xx.xxx.23.40 Down Normal ? 16.67% 0 >>> xx.xxx.30.210 Up Normal 10.58 GB 16.67% >>> 28356863910078205288614550619314017621 >>> xx.xx.90.87 Up Normal 10.47 GB 16.67% >>> 56713727820156410577229101238628035242 >>> xx.xx.22.236 Up Normal 9.63 GB 16.67% >>> 85070591730234615865843651857942052863 >>> xx.xx.97.96 Up Normal 10.68 GB 16.67% >>> 113427455640312821154458202477256070484 >>> xx.xxx.17.122 Up Normal 10.18 GB 16.67% >>> 141784319550391026443072753096570088105 >>> >>> -- >>> Ray Slakinski >>> >>> >> >> > > > > -- > Sasha Dolgy > sasha.do...@gmail.com
Re: CQL + Counters = bad request
I've tried using the Thrift/execute_cql_query() API as well, and it doesn't work either. I've also tried using a CF where the column names are of AsciiType to see if that was the problem (quoted and unquoted column names) and I get the exact same error of: no viable alternative at character '+' Frankly, I'm about ready to open a ticket against 0.8.1 saying CQL/Counter support does not work at all. Or is there a trick which isn't documented in the ticket? I tried reading the Java code referred to in ticket #2473, but i'm over my head. On Tue, Jul 12, 2011 at 6:46 PM, Aaron Turner wrote: > Doesn't seem to help: > > cqlsh> UPDATE RouterAggWeekly SET '1310367600' = '1310367600' + 17 > WHERE KEY = '1_20110728_ifoutmulticastpkts'; > Bad Request: line 1:55 no viable alternative at character '+' > > cqlsh> UPDATE RouterAggWeekly SET 1310367600 = '1310367600' + 17 WHERE > KEY = '1_20110728_ifoutmulticastpkts'; > Bad Request: line 1:53 no viable alternative at character '+' > > cqlsh> UPDATE RouterAggWeekly SET '1310367600' = 1310367600 + 17 WHERE > KEY = '1_20110728_ifoutmulticastpkts'; > Bad Request: line 1:53 no viable alternative at character '+' > > On Tue, Jul 12, 2011 at 5:35 PM, Jonathan Ellis wrote: >> Try quoting the column name. >> >> On Tue, Jul 12, 2011 at 5:30 PM, Aaron Turner wrote: >>> Using Cassandra 0.8.1 and cql 1.0.3 and following the syntax mentioned >>> in https://issues.apache.org/jira/browse/CASSANDRA-2473 >>> >>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE >>> KEY = '1_20110728_ifoutmulticastpkts'; >>> Bad Request: line 1:51 no viable alternative at character '+' >>> >>> Column names are Long's, hence the INT = INT + INT >>> >>> Ideas? >>> >>> -- >>> Aaron Turner >>> http://synfin.net/ Twitter: @synfinatic >>> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & >>> Windows >>> Those who would give up essential Liberty, to purchase a little temporary >>> Safety, deserve neither Liberty nor Safety. >>> -- Benjamin Franklin >>> "carpe diem quam minimum credula postero" >>> >> >> >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cassandra >> co-founder of DataStax, the source for professional Cassandra support >> http://www.datastax.com >> > > > > -- > Aaron Turner > http://synfin.net/ Twitter: @synfinatic > http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & > Windows > Those who would give up essential Liberty, to purchase a little temporary > Safety, deserve neither Liberty nor Safety. > -- Benjamin Franklin > "carpe diem quam minimum credula postero" > -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin "carpe diem quam minimum credula postero"
Re: CQL + Counters = bad request
> >>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
> >>> KEY = '1_20110728_ifoutmulticastpkts';
> >>> Bad Request: line 1:51 no viable alternative at character '+'

I'm able to insert it.
___
cqlsh>
cqlsh> UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY = '1_20110728_ifoutmulticastpkts';
cqlsh> UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY = '1_20110728_ifoutmulticastpkts';
cqlsh>
_
[default@test] list counts;
Using default limit of 100
---
RowKey: 1_20110728_ifoutmulticastpkts
=> (counter=12, value=16)
=> (counter=1310367600, value=34)
---
RowKey: 1
=> (counter=1, value=10)

2 Rows Returned.
[default@test]
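If the CQL route keeps failing on 0.8.1, one hedged alternative is to increment the counter through the Thrift interface instead. The sketch below assumes the 0.8-era Thrift Java bindings (the add() call and the CounterColumn struct, recalled from memory) and reuses the column family and key from this thread, encoding the LongType column name as 8 big-endian bytes; it is a sketch of a workaround, not a fix for the CQL parse error itself.

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.CounterColumn;

public class CounterAddSketch {
    // Assumes an already-opened Thrift client with the keyspace set.
    static void increment(Cassandra.Client client) throws Exception {
        // LongType column name: 8-byte big-endian encoding of 1310367600.
        ByteBuffer name = ByteBuffer.allocate(8);
        name.putLong(1310367600L);
        name.flip();

        CounterColumn counter = new CounterColumn(name, 17L);   // column name, increment delta

        client.add(ByteBuffer.wrap("1_20110728_ifoutmulticastpkts".getBytes()),
                   new ColumnParent("RouterAggWeekly"),
                   counter,
                   ConsistencyLevel.ONE);
    }
}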
Re: One node down but it thinks its fine...
And fixed! a co-worker put in a bad host line entry last night that through it all off :( Thanks for the assist guys. -- Ray Slakinski On Wednesday, July 13, 2011 at 1:32 PM, Ray Slakinski wrote: > Was all working before, but we ran out of file handles and ended up > restarting the nodes. No yaml changes have occurred. > > Ray Slakinski > > On 2011-07-13, at 12:55 PM, Sasha Dolgy (mailto:sdo...@gmail.com)> wrote: > > > any firewall changes? ping is fine ... but if you can't get from > > node(a) to nodes(n) on the specific ports... > > > > On Wed, Jul 13, 2011 at 6:47 PM, samal > (mailto:sa...@wakya.in)> wrote: > > > Check seed ip is same in all node and should not be loopback ip on > > > cluster. > > > > > > On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski > > (mailto:ray.slakin...@gmail.com)> > > > wrote: > > > > > > > > One of our nodes, which happens to be the seed thinks its Up and all the > > > > other nodes are down. However all the other nodes thinks the seed is > > > > down > > > > instead. The logs for the seed node show everything is running as it > > > > should > > > > be. I've tried restarting the node, turning on/off gossip and thrift and > > > > nothing seems to get the node to see the rest of its ring as up and > > > > running. > > > > I have also tried restarting one of the other nodes, which had no > > > > affect on > > > > the situation. Below is the ring outputs for the seed and one other > > > > node in > > > > the ring, plus a ping to show that the seed can ping the other node. > > > > > > > > # bin/nodetool -h 0.0.0.0 ring > > > > Address Status State Load Owns Token > > > > 141784319550391026443072753096570088105 > > > > 127.0.0.1 Up Normal 4.61 GB 16.67% 0 > > > > xx.xxx.30.210 Down Normal ? 16.67% > > > > 28356863910078205288614550619314017621 > > > > xx.xx.90.87 Down Normal ? 16.67% 56713727820156410577229101238628035242 > > > > xx.xx.22.236 Down Normal ? 16.67% 85070591730234615865843651857942052863 > > > > xx.xx.97.96 Down Normal ? 16.67% 113427455640312821154458202477256070484 > > > > xx.xxx.17.122 Down Normal ? 16.67% > > > > 141784319550391026443072753096570088105 > > > > > > > > > > > > # ping xx.xxx.30.210 > > > > PING xx.xxx.30.210 (xx.xxx.30.210) 56(84) bytes of data. > > > > 64 bytes from xx.xxx.30.210: icmp_req=1 ttl=61 time=0.299 ms > > > > 64 bytes from xx.xxx.30.210: icmp_req=2 ttl=61 time=0.287 ms > > > > ^C > > > > --- xx.xxx.30.210 ping statistics --- > > > > 2 packets transmitted, 2 received, 0% packet loss, time 999ms > > > > rtt min/avg/max/mdev = 0.287/0.293/0.299/0.006 ms > > > > > > > > > > > > # bin/nodetool -h xx.xxx.30.210 ring > > > > Address Status State Load Owns Token > > > > 141784319550391026443072753096570088105 > > > > xx.xxx.23.40 Down Normal ? 16.67% 0 > > > > xx.xxx.30.210 Up Normal 10.58 GB 16.67% > > > > 28356863910078205288614550619314017621 > > > > xx.xx.90.87 Up Normal 10.47 GB 16.67% > > > > 56713727820156410577229101238628035242 > > > > xx.xx.22.236 Up Normal 9.63 GB 16.67% > > > > 85070591730234615865843651857942052863 > > > > xx.xx.97.96 Up Normal 10.68 GB 16.67% > > > > 113427455640312821154458202477256070484 > > > > xx.xxx.17.122 Up Normal 10.18 GB 16.67% > > > > 141784319550391026443072753096570088105 > > > > > > > > -- > > > > Ray Slakinski > > > > > > > > -- > > Sasha Dolgy > > sasha.do...@gmail.com (mailto:sasha.do...@gmail.com)
Re: Escaping characters in cqlsh
You can escape quotes but I don't think you can escape semicolons. Can you create a ticket for us to fix this? On Wed, Jul 13, 2011 at 10:16 AM, Blake Visin wrote: > I am trying to get all the columns named "fmd:" in cqlsh. > I am using: > select 'fmd:'..'fmd;' from feeds where; > But I am getting errors (as expected). Is there any way to escape the colon > or semicolon in cqlsh? > Thanks, > Blake > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: BulkLoader
I'll have to apologize on that one. Just saw that the JMX call I was talking about doesn't work as it should. I'll fix that for 0.8.2 but in the meantime you'll want to use sstableloader on a different IP as pointed by Jonathan. -- Sylvain On Wed, Jul 13, 2011 at 5:11 PM, Sylvain Lebresne wrote: > Also note that if you have a cassandra node running on the local node > from which you want to bulk load sstables, there is a JMX > (StorageService->bulkLoad) call to do just that. May be simpler than > using sstableloader if that is what you want to do. > > -- > Sylvain > > On Wed, Jul 13, 2011 at 3:46 PM, Stephen Pope wrote: >> Ahhh..ok. Thanks. >> >> -Original Message- >> From: Jonathan Ellis [mailto:jbel...@gmail.com] >> Sent: Wednesday, July 13, 2011 11:35 AM >> To: user@cassandra.apache.org >> Subject: Re: BulkLoader >> >> Because it's hooking directly into gossip, so the local instance it's >> ignoring is the bulkloader process, not Cassandra. >> >> You'd need to run the bulkloader from a different IP, than Cassandra. >> >> On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope wrote: >>> Fair enough. My original question stands then. :) >>> >>> Why aren't you allowed to talk to a local installation using BulkLoader? >>> >>> -Original Message- >>> From: Jonathan Ellis [mailto:jbel...@gmail.com] >>> Sent: Wednesday, July 13, 2011 11:06 AM >>> To: user@cassandra.apache.org >>> Subject: Re: BulkLoader >>> >>> Sure, that will work fine with a single machine. The advantage of >>> bulkloader is it handles splitting the sstable up and sending each >>> piece to the right place(s) when you have more than one. >>> >>> On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope >>> wrote: I think I've solved my own problem here. After generating the sstable using json2sstable it looks like I can simply copy the created sstable into my data directory. Can anyone think of any potential problems with doing it this way? -Original Message- From: Stephen Pope [mailto:stephen.p...@quest.com] Sent: Wednesday, July 13, 2011 9:32 AM To: user@cassandra.apache.org Subject: BulkLoader I'm trying to figure out how to use the BulkLoader, and it looks like there's no way to run it against a local machine, because of this: Set hosts = Gossiper.instance.getLiveMembers(); hosts.remove(FBUtilities.getLocalAddress()); if (hosts.isEmpty()) throw new IllegalStateException("Cannot load any sstable, no live member found in the cluster"); Is this intended behavior? May I ask why? We'd like to be able to run it against the local machine. Cheers, Steve >>> >>> >>> >>> -- >>> Jonathan Ellis >>> Project Chair, Apache Cassandra >>> co-founder of DataStax, the source for professional Cassandra support >>> http://www.datastax.com >>> >> >> >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cassandra >> co-founder of DataStax, the source for professional Cassandra support >> http://www.datastax.com >> >
Re: Re: Re: Re: AntiEntropy?
> In the company I work for I suggested many times to run repair at least 1 > every 10 days (gcgraceseconds is set approx to 10 days in our config) -- but > this book has been used against me :-) I will ask to run repair asap Note that if GCGraceSeconds is 10 days, you want to run repair often enough that there is never a moment where more than 10 days have passed since the last successfully completed repair *STARTED*. When scheduling repairs, factor in things like: what happens if repair fails? Who gets alerted and how, and will there be time to fix the problem? How long does repair take? So basically, leave significant margin. -- / Peter Schuller
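To make that margin concrete, one common approach is to drive repair from cron well inside the GCGraceSeconds window and alert on failures. The schedule, paths and keyspace name below are only an illustration, not a recommendation:

# crontab on each node: repair every 5 days at 02:00, leaving several days of
# slack against a ~10 day GCGraceSeconds in case a run fails and must be redone
0 2 */5 * * /opt/cassandra/bin/nodetool -h localhost repair MyKeyspace >> /var/log/cassandra/repair.log 2>&1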
Re: Off-heap Cache
How do I ensure it is indeed using the SerializingCacheProvider? Thanks -Rajesh On Tue, Jul 12, 2011 at 1:46 PM, Jonathan Ellis wrote: > You need to set row_cache_provider=SerializingCacheProvider on the > columnfamily definition (via the cli) > > On Tue, Jul 12, 2011 at 9:57 AM, Raj N wrote: > > Do we need to do anything special to turn off-heap cache on? > > https://issues.apache.org/jira/browse/CASSANDRA-1969 > > -Raj > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com >
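One way to check from cassandra-cli (keyspace and column family names here are placeholders, and the attribute name is the 0.8-era one, so verify it with "help update column family;" on your version): set the provider on the column family, then look at the per-CF settings printed by describe keyspace, which should list the row cache provider; with the serializing provider active, the row cache memory should also move off the JVM heap.

bin/cassandra-cli -h localhost
[default@unknown] use MyKeyspace;
[default@MyKeyspace] update column family MyCF with row_cache_provider = 'SerializingCacheProvider';
[default@MyKeyspace] describe keyspace;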
Re: commitlog replay missing data
Have you verified that data you expect to see is not in the server after shutdown? WRT the difference between the Memtable data size and SSTable live size, don't believe everything you read :) Memtable live size is increased by the serialised byte size of every column inserted, and is never decremented. Deletes and overwrites will inflate this value. What was your workload like? As of 0.8 we now have global memory management for cf's that tracks actual JVM bytes used by a CF. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 12/07/2011, at 3:28 PM, Jeffrey Wang wrote: > Hey all, > > Recently upgraded to 0.8.1 and noticed what seems to be missing data after a > commitlog replay on a single-node cluster. I start the node, insert a bunch > of stuff (~600MB), stop it, and restart it. There are log messages pertaining > to the commitlog replay and no errors, but some of the data is missing. If I > flush before stopping the node, everything is fine, and running cfstats in > the two cases shows different amounts of data in the SSTables. Moreover, the > amount of data that is missing is nondeterministic. Has anyone run into this? > Thanks. > > Here is the output of a side-by-side diff between cfstats outputs for a single CF before restarting (left) and after (right). Somehow a 37MB memtable became a 2.9MB SSTable (note the difference in write count as well)?
>
> Column Family: Blocks          before restart | after restart
> SSTable count:                              0 | 1
> Space used (live):                          0 | 2907637
> Space used (total):                         0 | 2907637
> Memtable Columns Count:                  8198 | 0
> Memtable Data Size:                  37550510 | 0
> Memtable Switch Count:                      0 | 1
> Read Count:                                 0 | 0
> Read Latency:                         NaN ms. | NaN ms.
> Write Count:                             8198 | 1526
> Write Latency:                      0.018 ms. | 0.011 ms.
> Pending Tasks:                              0 | 0
> Key cache capacity:                        20 | 20
> Key cache size:                             0 | 0
> Key cache hit rate:                       NaN | NaN
> Row cache:                           disabled | disabled
> Compacted row minimum size:                 0 | 1110
> Compacted row maximum size:                 0 | 2299
> Compacted row mean size:                    0 | 1960
>
> Note that I patched https://issues.apache.org/jira/browse/CASSANDRA-2317 in > my version, but there are no deletions involved so I don’t think it’s > relevant unless I messed something up while patching. > > -Jeffrey >
Re: Storing counters in the standard column families along with non-counter columns ?
If you can provide some more details on the use case, we may be able to provide some data model help. You can always use a dedicated CF for the counters, and use the same row key. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 12/07/2011, at 6:36 AM, Aditya Narayan wrote: > Oops that's really very much disheartening and it could seriously impact our > plans for going live in near future. Without this facility I guess counters > currently have very little usefulness. > > On Mon, Jul 11, 2011 at 8:16 PM, Chris Burroughs > wrote: > On 07/10/2011 01:09 PM, Aditya Narayan wrote: > > Is there any target version in near future for which this has been promised > > ? > > The ticket is problematic in that it would -- unless someone has a > clever new idea -- require breaking thrift compatibility to add it to > the api. Which is unfortunate since it would be so useful. > > If it's in the 0.8.x series it will only be through CQL. >
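A minimal cassandra-cli sketch of the dedicated-counter-CF idea (all names invented, syntax as of 0.8): the counter column family only needs CounterColumnType as its default validator, and the application reuses the same row key it already writes to the regular column family.

[default@MyKeyspace] create column family UserStats with default_validation_class = CounterColumnType and comparator = UTF8Type;
[default@MyKeyspace] incr UserStats['user123']['logins'];
[default@MyKeyspace] incr UserStats['user123']['page_views'] by 5;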
Re: CQL + Counters = bad request
Thanks. Looks like we tracked down the problem: the datastax 0.8.1 rpm is actually 0.8.0. rpm -qa | grep cassandra apache-cassandra08-0.8.1-1 grep ' Cassandra version:' /var/log/cassandra/system.log | tail -1 INFO [main] 2011-07-13 12:04:31,039 StorageService.java (line 368) Cassandra version: 0.8.0 On Wed, Jul 13, 2011 at 11:40 AM, samal wrote: >> >>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE >> >>> KEY = '1_20110728_ifoutmulticastpkts'; >> >>> Bad Request: line 1:51 no viable alternative at character '+' > > I m able to insert it. > ___ > cqlsh> > cqlsh> UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY = > '1_20110728_ifoutmulticastpkts'; > cqlsh> UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY = > '1_20110728_ifoutmulticastpkts'; > cqlsh> > _ > [default@test] list counts; > Using default limit of 100 > --- > RowKey: 1_20110728_ifoutmulticastpkts > => (counter=12, value=16) > => (counter=1310367600, value=34) > --- > RowKey: 1 > => (counter=1, value=10) > 2 Rows Returned. > [default@test] > -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin "carpe diem quam minimum credula postero"
Re: commitlog replay missing data
> Recently upgraded to 0.8.1 and noticed what seems to be missing data after a > commitlog replay on a single-node cluster. I start the node, insert a bunch > of stuff (~600MB), stop it, and restart it. There are log messages If you stop it with a kill, make sure you use batched commitlog sync mode instead of periodic if you want guarantees on individual writes. (I don't believe you'd expect a significant disk space discrepancy though since in practice the delay until write() should be small. But don't quote me on this because I'd have to check the code to make sure that commit log replay isn't dependent on some marker that isn't written until commit log sync.) -- / Peter Schuller (@scode on twitter)
Re: commitlog replay missing data
Peter Schuller wrote: > >> Recently upgraded to 0.8.1 and noticed what seems to be missing data >> after a >> commitlog replay on a single-node cluster. I start the node, insert a >> bunch >> of stuff (~600MB), stop it, and restart it. There are log messages > > If you stop by a kill, make sure you use batched commitlog synch mode > instead of periodic if you want guarantees on individual writes. > What are the other ways to stop Cassandra? What's the difference between batch vs periodic? -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/commitlog-replay-missing-data-tp6573659p6580886.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: commitlog replay missing data
> What are the other ways to stop Cassandra? nodetool disablegossip nodetool disablethrift # wait for a bit until no one is sending it writes anymore nodetool flush # only relevant if in periodic mode # then kill it > What's the difference between batch vs periodic? Search for "batch" on http://wiki.apache.org/cassandra/StorageConfiguration -- / Peter Schuller (@scode on twitter)
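For the batch vs periodic part, the difference is controlled by a couple of cassandra.yaml settings (names as of 0.7/0.8; the values below are just illustrative defaults). With periodic, writes are acknowledged right away and the commit log is fsynced once per sync period, so a hard kill can lose the last few seconds of writes; with batch, a write is not acknowledged until the log has been fsynced, with nearby writes grouped inside the batch window.

commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000

# or, for per-write durability at some latency cost:
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 50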
Re: commitlog replay missing data
> # wait for a bit until no one is sending it writes anymore More accurately, until all other nodes have realized it's down (nodetool ring on each respective host). -- / Peter Schuller (@scode on twitter)
R: Re: Re: Re: Re: AntiEntropy?
>Note that if GCGraceSeconds is 10 days, you want to run repair often >enough that there is never a moment where more than 10 days have passed >since the last successfully completed repair *STARTED*. >When scheduling repairs, factor in things like: what happens if >repair fails? Who gets alerted and how, and will there be time to fix >the problem? How long does repair take? Peter, thanks for the tip. I'm still very surprised by what I've read in the book about repair. Best Regards Carlo
Replicating to all nodes
I am wondering if the following cluster configuration is possible with cassandra, and if so, how it could be achieved. Please also feel free to point out any issues that may make this configuration undesirable that I may not have thought of. Suppose a cluster of N nodes. Each node replicates the data of all other nodes. Read and write operations should succeed even if only 1 node is online. When a read is performed, it is performed against all active nodes. When a write is performed, it is performed against all active nodes; inactive/offline nodes are updated when they come back online. Would this involve a new ConsistencyLevel, e.g. ConsistencyLevel.Active? Does a facility exist which could mimic this behavior? I don't believe it does. Currently the replication factor is hard coded based on key space, not a function of the number of nodes in the cluster. You could say, if N = 7, configure replication factor = 7, but then if only 6 nodes are online, writes would fail. Is this correct?
JDBC CQL Driver unable to locate cassandra.yaml
I am trying to integrate the Cassandra JDBC CQL driver with my company's ETL product. We have an interface that performs database queries using their respective JDBC drivers. When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace: Unable to locate cassandra.yaml I am using Cassandra 0.8.1. Is there a guide on how to utilize/setup the JDBC driver? Derek Tracy trac...@gmail.com -
Re: JDBC CQL Driver unable to locate cassandra.yaml
The current version of the driver does require having the server's cassandra.yaml on the classpath. This is a bug. On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy wrote: > I am trying to integrate the Cassandra JDBC CQL driver with my companies ETL > product. > We have an interface that performs database queries using there respective > JDBC drivers. > When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace: > > Unable to locate cassandra.yaml > > I am using Cassandra 0.8.1. Is there a guide on how to utilize/setup the > JDBC driver? > > > > Derek Tracy > trac...@gmail.com > - > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
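Until that is fixed, a workaround along these lines may help (paths and class names are invented): put the directory containing cassandra.yaml on the classpath of the JVM that loads the driver, since the config is looked up as a classpath resource. The server side also accepts a cassandra.config system property pointing at an explicit URL; whether the current driver honors that property is worth testing.

java -cp "etl-tool.jar:cassandra-jdbc.jar:lib/*:/etc/cassandra/conf" com.example.EtlTool
java -Dcassandra.config=file:///etc/cassandra/conf/cassandra.yaml -cp "etl-tool.jar:lib/*" com.example.EtlTool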
Re: Replicating to all nodes
> Read and write operations should succeed even if only 1 node is online. > > When a read is performed, it is performed against all active nodes. Using QUORUM is the closest thing you get for reads without modifying Cassandra. You can't make it wait for all nodes that happen to be up. > When a write is performed, it is performed against all active nodes, > inactive/offline nodes are updated when they come back online. Writes always go to all nodes that are up, but if you want to wait for them before returning "OK" to the client then no - except CL.ALL (which means you don't survive one being down) and CL.QUORUM (which means you don't wait for all if all are up). > I don't believe it does. Currently the replication factor is hard > coded based on key space, not a function of the number of nodes in the > cluster. You could say, if N = 7, configure replication factor = 7, > but then if only 6 nodes are online, writes would fail. Is this > correct? No. Reads/writes fail according to the consistency level. The RF + consistency level tells you how many nodes must be up and successfully service the request in order for the operation to succeed. RF just tells you the number of total nodes in the replica set for a key; whether an operation fails is up to the consistency level. I would ask: Why are you trying to do this? It really seems you're trying to do the "wrong" thing. Why would you ever want to replicate to all? If you want 3 copies in total, then do RF=3 and keep a 3 node ring. If you need more capacity, you add nodes and retain RF. If you need more redundancy, you have to increase RF. Those are two very different axes along which to scale. I cannot think of any reason why you would want to tie RF to the total number of nodes. What is the goal you're trying to achieve? -- / Peter Schuller (@scode on twitter)
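To make the RF versus consistency level split concrete: RF is fixed per keyspace when the keyspace is defined, while the consistency level is chosen by the client on every individual read or write. A rough cassandra-cli sketch (keyspace name invented, syntax roughly as of 0.8):

[default@unknown] create keyspace MyKeyspace with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = [{replication_factor:3}];

The client then passes ConsistencyLevel.ONE, QUORUM, ALL, etc. per request through whatever API it uses; nothing about the keyspace changes between a CL.ONE read and a CL.QUORUM read.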
Question about compaction
Running Cassandra 0.8.1. Ran major compaction via: sudo /home/ubuntu/brisk/resources/cassandra/bin/nodetool -h localhost compact & >From what I'd read about Cassandra, I thought that after compaction all of the different SSTables on disk for a Column Family would be merged into one new file. However, there are now a bunch of 0-sized Compacted files and a bunch of Data files. Any ideas about why there are still so many files left? Also, is a minor compaction the same thing as a read-only compaction in 0.7? ubuntu@domU-12-31-39-0E-x-x:/raiddrive/data/DemoKS$ ls -l total 270527136 -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5670-Compacted -rw-r--r-- 1 root root 89457447799 2011-07-10 00:26 DemoCF-g-5670-Data.db -rw-r--r-- 1 root root 193456 2011-07-10 00:26 DemoCF-g-5670-Filter.db -rw-r--r-- 1 root root 2081159 2011-07-10 00:26 DemoCF-g-5670-Index.db -rw-r--r-- 1 root root 4276 2011-07-10 00:26 DemoCF-g-5670-Statistics.db -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5686-Compacted -rw-r--r-- 1 root root920521489 2011-07-09 22:03 DemoCF-g-5686-Data.db -rw-r--r-- 1 root root11776 2011-07-09 22:03 DemoCF-g-5686-Filter.db -rw-r--r-- 1 root root 126725 2011-07-09 22:03 DemoCF-g-5686-Index.db -rw-r--r-- 1 root root 4276 2011-07-09 22:03 DemoCF-g-5686-Statistics.db -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5781-Compacted -rw-r--r-- 1 root root223970446 2011-07-09 22:38 DemoCF-g-5781-Data.db -rw-r--r-- 1 root root 7216 2011-07-09 22:38 DemoCF-g-5781-Filter.db -rw-r--r-- 1 root root32750 2011-07-09 22:38 DemoCF-g-5781-Index.db -rw-r--r-- 1 root root 4276 2011-07-09 22:38 DemoCF-g-5781-Statistics.db -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-5874-Compacted -rw-r--r-- 1 root root156284248 2011-07-09 23:20 DemoCF-g-5874-Data.db -rw-r--r-- 1 root root 5056 2011-07-09 23:20 DemoCF-g-5874-Filter.db -rw-r--r-- 1 root root10400 2011-07-09 23:20 DemoCF-g-5874-Index.db -rw-r--r-- 1 root root 4276 2011-07-09 23:20 DemoCF-g-5874-Statistics.db -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-6938-Compacted -rw-r--r-- 1 root root 22947541446 2011-07-10 11:43 DemoCF-g-6938-Data.db -rw-r--r-- 1 root root49936 2011-07-10 11:43 DemoCF-g-6938-Filter.db -rw-r--r-- 1 root root 563550 2011-07-10 11:43 DemoCF-g-6938-Index.db -rw-r--r-- 1 root root 4276 2011-07-10 11:43 DemoCF-g-6938-Statistics.db -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-6996-Compacted -rw-r--r-- 1 root root224253930 2011-07-10 11:28 DemoCF-g-6996-Data.db -rw-r--r-- 1 root root 7216 2011-07-10 11:27 DemoCF-g-6996-Filter.db -rw-r--r-- 1 root root26250 2011-07-10 11:28 DemoCF-g-6996-Index.db -rw-r--r-- 1 root root 4276 2011-07-10 11:28 DemoCF-g-6996-Statistics.db -rw-r--r-- 1 root root0 2011-07-13 03:07 DemoCF-g-8324-Compacted
Re: How to remove/add node
As long as you have no data in this cluster, try clearing out the /var/lib/cassandra directory from all nodes and restart Cassandra. The only way to change tokens after they've been set is using a nodetool move or clearing /var/lib/cassandra. On Wed, Jul 13, 2011 at 7:41 AM, Abdul Haq Shaik < abdulsk.cassan...@gmail.com> wrote: > Hi, > > I have deleted the data, commitlog and saved cache directories. I have > removed one of the nodes from the seeds of cassandra.yaml. When i tried to > use nodetool, itshowing the removed node as up.. > > Thanks, > > Abdul >
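A sketch of that reset on one node, assuming the default /var/lib/cassandra layout (adjust to your data_file_directories, commitlog_directory and saved_caches_directory, and only do this on a cluster whose data is disposable):

sudo /etc/init.d/cassandra stop   # or kill the java process if not running as a service
sudo rm -rf /var/lib/cassandra/data /var/lib/cassandra/commitlog /var/lib/cassandra/saved_caches
sudo /etc/init.d/cassandra start
bin/nodetool -h localhost ring    # confirm the node rejoined with the expected token and peers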
Re: Replicating to all nodes
Thanks for the reply Peter. The goal is to configure a cluster in which reads and writes can complete successfully even if only 1 node is online. For this to work, each node would need the entire dataset. Your example of a 3 node ring with RF=3 would satisfy this requirement. However, if two nodes are offline, CL.QUORUM would not work, I would need to use CL.ONE. If all 3 nodes are online, CL.ONE is undershooting, I would want to use CL.QUORUM (or maybe CL.ALL). Or does CL.ONE actually function this way, somewhat? A complication occurs when you want to add another node. Now there's a 4 node ring, but only 3 replicas, so each node isn't guaranteed to have all of the data, so the cluster can't completely function when N-1 nodes are offline. So this is why I would like to have the RF scale relative to the size of the cluster. Am I mistaken? Thanks! On Wed, Jul 13, 2011 at 6:41 PM, Peter Schuller wrote: >> Read and write operations should succeed even if only 1 node is online. >> >> When a read is performed, it is performed against all active nodes. > > Using QUORUM is the closest thing you get for reads without modifying > Cassandra. You can't make it wait for all nodes that happen to be up. > >> When a write is performed, it is performed against all active nodes, >> inactive/offline nodes are updated when they come back online. > > Writes always go to all nodes that are up, but if you want to wait for > them before returning "OK" to the client than no - except CL.ALL > (which means you don't survive one being down) and CL.QUORUM (which > means you don't wait for all if all are up). > >> I don't believe it does. Currently the replication factor is hard >> coded based on key space, not a function of the number of nodes in the >> cluster. You could say, if N = 7, configure replication factor = 7, >> but then if only 6 nodes are online, writes would fail. Is this >> correct? > > No. Reads/write fail according to the consistency level. The RF + > consistency level tells how many nodes must be up and successfully > service the request in order for the operation to succeed. RF just > tells you the number of total nodes int he replicate set for a key; > whether an operation fails is up to the consistency level. > > I would ask: Why are you trying to do this? It really seems you're > trying to do the "wrong" thing. Why would you ever want to replicate > to all? If you want 3 copies in total, then do RF=3 and keep a 3 node > ring. If you need more capacity, you add nodes and retain RF. If you > need more redundancy, you have to increase RF. Those are two very > different axis along which to scale. I cannot think of any reason why > you would want to tie RF to the total number of nodes. > > What is the goal you're trying to achieve? > > -- > / Peter Schuller (@scode on twitter) >
cassandra goes infinite loop and data lost.....
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 100zs:false:14@1310168625866434
Re: cassandra goes infinite loop and data lost.....
I gave cassandra 8GB heap size and somehow it run out of memory and crashed. after I start it, it just runs in to the following infinite loop, the last line: DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 100zs:false:14@1310168625866434 goes for ever I have 3 nodes and RF=2, so I am losing data. is that means I am screwed and can't get it back? DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) collecting 20 of 2147483647: q74k:false:14@1308886095008943 DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 10fbu:false:1@1310223075340297 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: apbg:false:13@1305641597957086 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 1 of 2147483647: auje:false:13@1305641597957075 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 2 of 2147483647: ayj8:false:13@1305641597957060 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 3 of 2147483647: b4fz:false:13@1305641597957096 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 100zs:false:14@1310168625866434 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 1 of 2147483647: 1017f:false:14@1310168680375612 DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 2 of 2147483647: 1018e:false:14@1310168759614715 DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) collecting 3 of 2147483647: 101dd:false:14@1310169260225339 On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu wrote: > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 -- 闫春路
Re: cassandra goes infinite loop and data lost.....
How much total memory does your machine have? -- Bret On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote: > I gave cassandra 8GB heap size and somehow it run out of memory and crashed. > after I start it, it just runs in to the following infinite loop, the last > line: > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > goes for ever > > I have 3 nodes and RF=2, so I am losing data. is that means I am screwed and > can't get it back? > > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 20 of 2147483647: q74k:false:14@1308886095008943 > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: apbg:false:13@1305641597957086 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: auje:false:13@1305641597957075 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: ayj8:false:13@1305641597957060 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: b4fz:false:13@1305641597957096 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: 1017f:false:14@1310168680375612 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: 1018e:false:14@1310168759614715 > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: 101dd:false:14@1310169260225339 > > > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu (mailto:springri...@gmail.com)> wrote: > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 0 of 2147483647 (tel:2147483647): > > 100zs:false:14@1310168625866434 > > > -- > 闫春路
Re: cassandra goes infinite loop and data lost.....
16GB On Thu, Jul 14, 2011 at 11:29 AM, Bret Palsson wrote: > How much total memory does your machine have? > > -- > Bret > > On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote: > > I gave cassandra 8GB heap size and somehow it run out of memory and > crashed. after I start it, it just runs in to the following infinite loop, > the last line: > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > goes for ever > > I have 3 nodes and RF=2, so I am losing data. is that means I am screwed > and can't get it back? > > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 20 of 2147483647: q74k:false:14@1308886095008943 > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: apbg:false:13@1305641597957086 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: auje:false:13@1305641597957075 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: ayj8:false:13@1305641597957060 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: b4fz:false:13@1305641597957096 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: 1017f:false:14@1310168680375612 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: 1018e:false:14@1310168759614715 > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: 101dd:false:14@1310169260225339 > > > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu wrote: > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > > > > -- > 闫春路 > > > -- Charles
Re: cassandra goes infinite loop and data lost.....
problem is I can't take cassandra back does that because not enough memory for cassandra? On Thu, Jul 14, 2011 at 11:29 AM, Bret Palsson wrote: > How much total memory does your machine have? > > -- > Bret > > On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote: > > I gave cassandra 8GB heap size and somehow it run out of memory and > crashed. after I start it, it just runs in to the following infinite loop, > the last line: > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > goes for ever > > I have 3 nodes and RF=2, so I am losing data. is that means I am screwed > and can't get it back? > > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 20 of 2147483647: q74k:false:14@1308886095008943 > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: apbg:false:13@1305641597957086 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: auje:false:13@1305641597957075 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: ayj8:false:13@1305641597957060 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: b4fz:false:13@1305641597957096 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: 1017f:false:14@1310168680375612 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: 1018e:false:14@1310168759614715 > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: 101dd:false:14@1310169260225339 > > > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu wrote: > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > > > > -- > 闫春路 > > > -- 闫春路
Re: Replicating to all nodes
Consistency and Availability are in trade-off each other. If you use RF=7 + CL=ONE, your read/write will success if you have one node alive during replicate data to 7 nodes. Of course you will have a chance to read old data in this case. If you need strong consistency, you must use CL=QUORUM. maki 2011/7/14 Kyle Gibson : > Thanks for the reply Peter. > > The goal is to configure a cluster in which reads and writes can > complete successfully even if only 1 node is online. For this to work, > each node would need the entire dataset. Your example of a 3 node ring > with RF=3 would satisfy this requirement. However, if two nodes are > offline, CL.QUORUM would not work, I would need to use CL.ONE. If all > 3 nodes are online, CL.ONE is undershooting, I would want to use > CL.QUORUM (or maybe CL.ALL). Or does CL.ONE actually function this > way, somewhat? > > A complication occurs when you want to add another node. Now there's a > 4 node ring, but only 3 replicas, so each node isn't guaranteed to > have all of the data, so the cluster can't completely function when > N-1 nodes are offline. So this is why I would like to have the RF > scale relative to the size of the cluster. Am I mistaken? > > Thanks! > > On Wed, Jul 13, 2011 at 6:41 PM, Peter Schuller > wrote: >>> Read and write operations should succeed even if only 1 node is online. >>> >>> When a read is performed, it is performed against all active nodes. >> >> Using QUORUM is the closest thing you get for reads without modifying >> Cassandra. You can't make it wait for all nodes that happen to be up. >> >>> When a write is performed, it is performed against all active nodes, >>> inactive/offline nodes are updated when they come back online. >> >> Writes always go to all nodes that are up, but if you want to wait for >> them before returning "OK" to the client than no - except CL.ALL >> (which means you don't survive one being down) and CL.QUORUM (which >> means you don't wait for all if all are up). >> >>> I don't believe it does. Currently the replication factor is hard >>> coded based on key space, not a function of the number of nodes in the >>> cluster. You could say, if N = 7, configure replication factor = 7, >>> but then if only 6 nodes are online, writes would fail. Is this >>> correct? >> >> No. Reads/write fail according to the consistency level. The RF + >> consistency level tells how many nodes must be up and successfully >> service the request in order for the operation to succeed. RF just >> tells you the number of total nodes int he replicate set for a key; >> whether an operation fails is up to the consistency level. >> >> I would ask: Why are you trying to do this? It really seems you're >> trying to do the "wrong" thing. Why would you ever want to replicate >> to all? If you want 3 copies in total, then do RF=3 and keep a 3 node >> ring. If you need more capacity, you add nodes and retain RF. If you >> need more redundancy, you have to increase RF. Those are two very >> different axis along which to scale. I cannot think of any reason why >> you would want to tie RF to the total number of nodes. >> >> What is the goal you're trying to achieve? >> >> -- >> / Peter Schuller (@scode on twitter) >> > -- w3m
Re: Survey: Cassandra/JVM Resident Set Size increase
On Wed, Jul 13, 2011 at 9:45 PM, Konstantin Naryshkin wrote: > Do you mean that it is using all of the available heap? That is the > expected behavior of most long running Java applications. The JVM will not > GC until it needs memory (or you explicitly ask it to) and will only free up > a bit of memory at a time. That is very good behavior from a performance > stand point since frequent, large GCs would make your application very > unresponsive. It also makes Java applications take up all the memory you > give them. > > - Original Message - > From: "Sasha Dolgy" > To: user@cassandra.apache.org > Sent: Tuesday, July 12, 2011 10:23:02 PM > Subject: Re: Survey: Cassandra/JVM Resident Set Size increase > > I'll post more tomorrow ... However, we set up one node in a single node > cluster and have left it with no datareviewing memory consumption > graphs...it increased daily until it gobbled (highly technical term) all > memory...the system is now running just below 100% memory usagewhich i > find peculiar seeings that it is doing nothingwith no data and > no peers. > On Jul 12, 2011 3:29 PM, "Chris Burroughs" > wrote: > > ### Preamble > > > > There have been several reports on the mailing list of the JVM running > > Cassandra using "too much" memory. That is, the resident set size is > >>>(max java heap size + mmaped segments) and continues to grow until the > > process swaps, kernel oom killer comes along, or performance just > > degrades too far due to the lack of space for the page cache. It has > > been unclear from these reports if there is a pattern. My hope here is > > that by comparing JVM versions, OS versions, JVM configuration etc., we > > will find something. Thank you everyone for your time. > > > > > > Some example reports: > > - http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html > > - > > > > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html > > - https://issues.apache.org/jira/browse/CASSANDRA-2868 > > - > > > > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html > > - > > > > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-memory-problem-td6545642.html > > > > For reference theories include (in no particular order): > > - memory fragmentation > > - JVM bug > > - OS/glibc bug > > - direct memory > > - swap induced fragmentation > > - some other bad interaction of cassandra/jdk/jvm/os/nio-insanity. > > > > ### Survey > > > > 1. Do you think you are experiencing this problem? > Yes. > > > > 2. Why? (This is a good time to share a graph like > > http://www.twitpic.com/5fdabn or > > http://img24.imageshack.us/img24/1754/cassandrarss.png) > I observe the RSS of cassandra process keeps going up to dozens of gigabytes, even if the dataset (sstables) is just hundreds of megabytes. > > > > 2. Are you using mmap? (If yes be sure to have read > > http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have > > used pmap [or another tool] to rule you mmap and top decieving you.) > Yes. pmap tells me a lot of anonymous regions are created and expanded during the life cycle of cassandra process. That is is primary reason of RSS occupy. I'm pretty these anonymous regions are not the Java heap used by JVM, as they are not continuous. > > > 3. Are you using JNA? Was mlockall succesful (it's in the logs on > startup)? > Yes. mlockall is successful either. I have not tried other settings. 
> > > > 4. Is swap enabled? Are you swapping? > No. Swap is disabled. > > > > 5. What version of Apache Cassandra are you using? > 0.6.13 > > > > 6. What is the earliest version of Apache Cassandra you recall seeing > > this problem with? > Earlier version of 0.6.x branch. > > > > 7. Have you tried the patch from CASSANDRA-2654 ? > Not yet, as I do not query large datasets. > > > > 8. What jvm and version are you using? > "java version "1.6.0_24" Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)" I also tried openJDK. > > > 9. What OS and version are you using? > The kernel version is "2.6.18-194.26.1.el5.028stab079.2", which is from CentOS 5.4 The user level environment is Ubuntu 10.04 (Lucid) server edition. This strange combination is because cassandra runs inside OpenVZ container (Ubuntu 10.04) above Cent OS host. I am afraid the old kernel caused the memory fragmentation of cassandra process. But I can not prove it as I did not try it on latest kernel. > > > 10. What are your jvm flags? > Both CMS and parallel old GC can observe the problem. These are the flags used: "-ea -Xms3G-Xmx3G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFractio
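For anyone else answering question 2, a rough way to separate mmap'd sstable segments from anonymous (heap and native) memory, since top alone cannot tell them apart; the grep pattern is just an example and pmap output format varies between procps versions:

pid=$(pgrep -f CassandraDaemon)
ps -o rss= -p $pid                   # total resident set size in KB
pmap -x $pid | grep -c Data.db       # number of mmap'd sstable mappings (usually harmless RSS)
pmap -x $pid | sort -n -k3 | tail    # largest resident mappings; 'anon' entries are heap/native memory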
Re: cassandra goes infinite loop and data lost.....
That says "I'm collecting data to answer requests." I don't see anything here that indicates an infinite loop. I do see that it's saying "N of 2147483647" which looks like you're doing slices with a much larger limit than is advisable (good way to OOM the way you already did). On Wed, Jul 13, 2011 at 8:27 PM, Yan Chunlu wrote: > I gave cassandra 8GB heap size and somehow it run out of memory and crashed. > after I start it, it just runs in to the following infinite loop, the last > line: > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > goes for ever > I have 3 nodes and RF=2, so I am losing data. is that means I am screwed and > can't get it back? > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 20 of 2147483647: q74k:false:14@1308886095008943 > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: apbg:false:13@1305641597957086 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: auje:false:13@1305641597957075 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: ayj8:false:13@1305641597957060 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: b4fz:false:13@1305641597957096 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 1 of 2147483647: 1017f:false:14@1310168680375612 > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > collecting 2 of 2147483647: 1018e:false:14@1310168759614715 > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) > collecting 3 of 2147483647: 101dd:false:14@1310169260225339 > > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu wrote: >> >> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) >> collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > > -- > 闫春路 > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
RE: JDBC CQL Driver unable to locate cassandra.yaml
setting server.config ->$SERVER_PATH/Cassandra.yaml as a system property should resolve this? -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Thursday, July 14, 2011 3:53 AM To: user@cassandra.apache.org Subject: Re: JDBC CQL Driver unable to locate cassandra.yaml The current version of the driver does require having the server's cassandra.yaml on the classpath. This is a bug. On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy wrote: > I am trying to integrate the Cassandra JDBC CQL driver with my > companies ETL product. > We have an interface that performs database queries using there > respective JDBC drivers. > When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace: > > Unable to locate cassandra.yaml > > I am using Cassandra 0.8.1. Is there a guide on how to utilize/setup > the JDBC driver? > > > > Derek Tracy > trac...@gmail.com > - > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com Register for Impetus Webinar on ‘Device Side Performance Optimization of Mobile Apps’, July 08 (10:00 am Pacific Time). Impetus is presenting a Cassandra case study on July 11 as a sponsor for Cassandra SF 2011 in San Francisco. Click http://www.impetus.com to know more. Follow us on www.twitter.com/impetuscalling NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
Re: cassandra goes infinite loop and data lost.....
okay, I am not sure if it is infinite loop, I change log4j to "DEBUG" only because cassandra never get online after run cassandra, it seems just halt. I enable debug then it start showing those message very fast and never end. I have just run nodetool cleanup, and it start reading commitlog, seems normal now. thanks for the help, I am really newbie on cassandra and has no idea how does slice works, could you give me more information? thanks alot! On Thu, Jul 14, 2011 at 1:36 PM, Jonathan Ellis wrote: > That says "I'm collecting data to answer requests." > > I don't see anything here that indicates an infinite loop. > > I do see that it's saying "N of 2147483647" which looks like you're > doing slices with a much larger limit than is advisable (good way to > OOM the way you already did). > > On Wed, Jul 13, 2011 at 8:27 PM, Yan Chunlu wrote: > > I gave cassandra 8GB heap size and somehow it run out of memory and > crashed. > > after I start it, it just runs in to the following infinite loop, the > last > > line: > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > goes for ever > > I have 3 nodes and RF=2, so I am losing data. is that means I am screwed > and > > can't get it back? > > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > > collecting 20 of 2147483647: q74k:false:14@1308886095008943 > > DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) > > collecting 0 of 2147483647: 10fbu:false:1@1310223075340297 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 0 of 2147483647: apbg:false:13@1305641597957086 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 1 of 2147483647: auje:false:13@1305641597957075 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 2 of 2147483647: ayj8:false:13@1305641597957060 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 3 of 2147483647: b4fz:false:13@1305641597957096 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 1 of 2147483647: 1017f:false:14@1310168680375612 > > DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > > collecting 2 of 2147483647: 1018e:false:14@1310168759614715 > > DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) > > collecting 3 of 2147483647: 101dd:false:14@1310169260225339 > > > > On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu > wrote: > >> > >> DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) > >> collecting 0 of 2147483647: 100zs:false:14@1310168625866434 > > > > > > -- > > 闫春路 > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com > -- 闫春路
Re: How to remove/add node
Thanks a lot dear. I will try it out and will let you know if the problem persists. On Thu, Jul 14, 2011 at 5:52 AM, Sameer Farooqui wrote: > As long as you have no data in this cluster, try clearing out the > /var/lib/cassandra directory from all nodes and restart Cassandra. > > The only way to change tokens after they've been set is using a nodetool > move or clearing /var/lib/cassandra. > > > > On Wed, Jul 13, 2011 at 7:41 AM, Abdul Haq Shaik < > abdulsk.cassan...@gmail.com> wrote: > >> Hi, >> >> I have deleted the data, commitlog and saved cache directories. I have >> removed one of the nodes from the seeds of cassandra.yaml. When i tried to >> use nodetool, itshowing the removed node as up.. >> >> Thanks, >> >> Abdul >> > >