Since the 0.7 upgrade, I've been going through and scrubbing all of
our sstables on 0.7.4. Some of the tables have completely unordered
keys, and the scrub fails to work on those tables. In those cases, I
export the sstable via sstable2json, and reimport it with
json2sstable.
Tonight I've run into
When the node starts it reads the stored token information from the
LocationInfo CF in the System KS.
It looks like the log message "is now part of the cluster" is only logged when
an endpoint is added to a node's view of the ring via gossip. It is not logged
when the endpoint is added during st
Which version are you using?
It looks like with 0.7.X (and probably 0.6) versions you can just shut down the node
and bring it back up with the new IP and It Just Works:
https://issues.apache.org/jira/browse/CASSANDRA-872
I've not done it before. Anyone else?
Aaron
On 23 Mar 2011, at 07:53, Cas
FWIW: For whatever reason jna memlockall does not work for us. jna call is
successful but cassandra process swaps anyway.
see: http://www.mail-archive.com/user@cassandra.apache.org/msg11235.html
We disabled swap entirely.
On Mar 22, 2011, at 8:56 PM, Chris Goffinet wrote:
> The easiest way to
Hello,
what kind of bug is it?
If I do nodetool host1 ring, the output is:
Address  Status  State   Load     Owns    Token
                                          141784319550391026443072753096570088105
1.174    Up      Normal  4.14 GB  16.67%  0
1.173    Down    Normal  4.07 GB  16.67%  283568639100782052886145
Hello
After having upgraded from 0.7.0 to 0.7.4, Cassandra does not start
anymore.
I have the following error stack:
INFO 11:47:04,112 Finished reading
/var/lib/cassandra/commitlog/CommitLog-1296815168287.log
ERROR 11:47:04,114 Exception encountered during startup.
java.lang.NullPointerExceptio
Did you drop a keyspace not too long before the upgrade, or did you
mess with your system tables
in the process?
--
Sylvain
On Wed, Mar 23, 2011 at 11:47 AM, Sébastien Druon wrote:
> Hello
> After having upgraded from 0.7.0 to 0.7.4, Cassandra does not start
> anymore.
> I have the followin
- Original Message -
From: buddhasystem
To: cassandra-u...@incubator.apache.org
Sent: Tue Mar 22 21:49:39 2011
Subject: Re: 0.7.2 choking on a 5 MB column
I see. I'm doing something even more drastic then, because I'm only inserting
one row in this case, and just use cf.insert(), witho
2011/3/23 aaron morton
> Which version are you using?
>
> It looks like with 0.7.X (and probably 0.6) versions you can just shut down the
> node and bring it back up with the new IP and It Just Works:
> https://issues.apache.org/jira/browse/CASSANDRA-872
>
>
>
So to replace one machine with another, the
Then that shouldn't happen. Could you open a ticket on JIRA for this?
If you could set the DEBUG log level for log4j and include the parts
relevant to the
error in the ticket description, that could help.
--
Sylvain
On Wed, Mar 23, 2011 at 2:16 PM, Sébastien Druon wrote:
> Hi,
> I just installe
Sure! :)
[]'s
FernandoVM
On Tue, Mar 22, 2011 at 5:07 PM, aaron morton wrote:
> Sounds interesting, please let the community know your findings.
>
> Aaron
>
> On 23 Mar 2011, at 01:31, FernandoVM wrote:
>
>> Hi,
>>
>>> contrib/py_stress is the easiest way to shake out any issues with your
>>> in
I'm going through the process of specing out the hardware for a
Cassandra cluster. The relevant specs:
- Support 460 operations/sec (50/50 read/write workload). Row size
ranges from 4 to 8K.
- Support 29 million objects for the first year
- Support 365 GB storage for the first year, based on Cassa
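A back-of-the-envelope check of the numbers above can be sketched as follows (the worst-case 8 KB row size comes from the stated 4-8K range; the replication factors are assumptions added for illustration, not from the post):

```python
# Rough capacity sanity check for the specs above. All constants come
# from the post except the replication factors, which are assumed.
ROWS_FIRST_YEAR = 29_000_000
ROW_SIZE_MAX = 8 * 1024          # bytes; upper end of the 4-8K range

raw_gb = ROWS_FIRST_YEAR * ROW_SIZE_MAX / 1e9
print(f"raw data, worst case: {raw_gb:.0f} GB")

# Replication multiplies on-disk usage, and compaction needs free
# headroom (roughly the size of the largest CF) on top of that.
for rf in (2, 3):
    print(f"RF={rf}: ~{raw_gb * rf:.0f} GB on disk cluster-wide")
```

This is only the data volume; the stated 365 GB first-year figure would be multiplied by whatever RF is chosen.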
Hello
Sometimes I see the following message in the GC log:
2011-03-23T14:40:56.049+0300: 14897.104: [GC 14897.104: [ParNew (promotion failed)
Desired survivor size 41943040 bytes, new threshold 2 (max 2)
- age   1:    5573024 bytes,    5573024 total
- age   2:    5064608 bytes,   10637632 total
: 672577K->
On Tue, Mar 22, 2011 at 6:49 PM, buddhasystem wrote:
> I see. I'm doing something even more drastic then, because I'm only inserting
> one row in this case, and just use cf.insert(), without batch mutator. It
> didn't occur to me that was a bad idea.
Everything in a given RPC has to complete befo
Hi maki-san
I am so sorry, this was my mistake.
I expected that when I set data at one node, the data would be copied to the
other node
in the same keyspace and same column family by replication.
The replication was working.
I just made a mistake getting data with the wrong key. (First character of the key
wa
I think it is due to fragmentation in the old gen, because of which objects in the
survivor area cannot be promoted to the old gen. A 300 MB memtable data size looks
high for a 3 GB heap. I learned that the in-memory overhead of a memtable can be as
high as 10x of the memtable data size. So either increase the heap or reduce the
me
2011/3/23 Narendra Sharma
> I think it is due to fragmentation in old gen, due to which survivor area
> cannot be moved to old gen. 300MB data size of memtable looks high for 3G
> heap. I learned that in memory overhead of memtable can be as high as 10x of
> memtable data size in memory. So eithe
I understand that. The overhead could be as high as 10x of the memtable data
size. So overall the overhead for the 16 CFs collectively in your case could be
300 MB * 10 = ~3 GB.
Thanks,
Naren
On Wed, Mar 23, 2011 at 11:18 AM, ruslan usifov wrote:
>
>
> 2011/3/23 Narendra Sharma
>
>> I think it is due to fragment
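The arithmetic behind that estimate, written out (the 10x factor is the rule of thumb quoted in the thread, not a measured value):

```python
# Memtable heap estimate from the thread: data size times the quoted
# 10x worst-case in-memory overhead factor.
memtable_data_mb = 300     # combined memtable data across the 16 CFs
overhead_factor = 10       # rule of thumb from the thread, not measured

heap_needed_gb = memtable_data_mb * overhead_factor / 1024
print(f"~{heap_needed_gb:.1f} GB of heap for memtables alone")
```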
What process are you using to confirm the data was replicated to another server?
And what makes you say the data is not replicated? I think your
understanding of the replication may be a little off; you rarely read from one
node. Have a look at
http://thelastpickle.com/2011/02/07/Introduction
First thing: check the logs on host 1. Check the view of the ring from all the
other nodes in the cluster; do they think nodes 2 and 3 are also down? Then
confirm all nodes have the same config for the listen port and that all nodes can
telnet to the listen port of the other nodes.
I'm guessing the inse
I saw this once when my servers ran out of file descriptors. This caused
totally weird problems.
Make sure all nodes in the cluster are listening on the gossip port (7000 by
default).
Also check out
http://www.datastax.com/docs/0.7/troubleshooting/index#view-of-ring-differs-between-some-nodes or
7000 and 9160 are accessible. I don't think I need other ports for a basic
setup, right?
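The "can every node reach the gossip port" check discussed above can be scripted; a minimal sketch (the host names are placeholders, and 7000 is the default storage/gossip port):

```python
# Check TCP reachability of the gossip port from this machine.
# Host names below are hypothetical; replace with your node addresses.
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for host in ("host1", "host2", "host3"):
    print(host, "gossip port reachable:", port_open(host, 7000))
```

The same function with port 9160 covers the Thrift client port.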
If anyone could get 'nodetool repair' working with this patch (across
regions), let me know. It may be that I am doing something wrong.
On Wed, Mar 23, 2011 at 1:08 AM, Milind Parikh wrote:
> @aj
> are you sure that
I'm going to take a stab at a hypothesis:
Sunday: I drain and decommission 2.3.4.193 *but* I forget to run nodetool cleanup
on the rest of the nodes. The ring looks clean but I did not see "Announcing
that ..." in the logs.
Tuesday: the ghost node reappears on the ring for all nodes.
Could this be c
Hi Aaron,
I'm using 0.7.3 (upgrading to 0.7.4). Basically I needed to swap the IPs of two
of my nodes. Unfortunately, after I did so, neither was accessible in the
original ring anymore. I saw the warning message in the logs about the
token being reassigned. Maybe I didn't give it enough time, but
2011/3/23 Narendra Sharma
> I understand that. The overhead could be as high as 10x of memtable data
> size. So overall the overhead for 16CF collectively in your case could be
> 300*10 = 3G.
>
>
> And what about G1 GC? It should prevent memory fragmentation, but some posts on
this list said that i
After an upgrade from 0.7.3 to 0.7.4, we're seeing the following on
several data files:
ERROR [main] 2011-03-23 18:58:33,137 ColumnFamilyStore.java (line 235)
Corrupt sstable
/mnt/services/cassandra/var/data/0.7.4/data/Helium/dp_idx-f-4844=[Index.db,
Statistics.db, Data.db, Filter.db]; skipped
jav
Hi Aaron
Thank you so much for your reply and advice.
I watched your presentation; it's helpful for me.
Anyway, I did the following with 2 node servers (53: 1st node, 54: 2nd node).
I started the following write/read.php program through Thrift on the 53
server.
$users->insert($Key, array('Movie'
Does anybody know if it's possible to find out what node a specific key/row
lives on?
We have a 30 node cluster and I'm curious how much faster it'll be to read
data directly from the node that stores the data.
We're using random partitioner, by the way.
*Sameer Farooqui
*Accenture Technology L
On Wed, Mar 23, 2011 at 2:23 PM, Alexis Lê-Quôc wrote:
> I'm going to take a stab at a hypothesis:
> Sunday: I drain and decommission 2.3.4.193 *but* I forget to run nodetool
> cleanup on the rest of the nodes. The ring looks clean but I did not see
> "Announcing that ..." in the logs.
>
> Tuesday: t
On Wed, Mar 23, 2011 at 4:59 PM, Erik Onnen wrote:
> After an upgrade from 0.7.3 to 0.7.4, we're seeing the following on
> several data files:
>
> ERROR [main] 2011-03-23 18:58:33,137 ColumnFamilyStore.java (line 235)
> Corrupt sstable
> /mnt/services/cassandra/var/data/0.7.4/data/Helium/dp_idx-f
Can any of the devs jump in here? What's best practice?
Aaron
On 24 Mar 2011, at 08:32, Casey Deccio wrote:
> Hi Aaron,
>
> I'm using 0.7.3 (upgrading to 0.7.4). Basically I needed to swap the IPs of two
> of my nodes. Unfortunately, after I did so, neither was accessible in the
> original r
Each row is stored on RF nodes, and your read will be sent to CL number of
nodes. Messages only take a single hop from the coordinator to each node the
read is performed on, so the networking overhead varies with the number of
nodes involved in the request. There are many factors other than netw
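The RF/CL relationship described above can be sketched numerically (using the usual quorum formula RF // 2 + 1; only three common consistency levels are shown here for illustration):

```python
# How many replicas a request waits on at each consistency level for a
# given replication factor. Quorum is floor(RF / 2) + 1.
def replicas_contacted(rf: int, cl: str) -> int:
    levels = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[cl]

for cl in ("ONE", "QUORUM", "ALL"):
    print(f"RF=3, CL={cl}: waits on {replicas_contacted(3, cl)} replica(s)")
```

So at RF=3 a QUORUM read waits on 2 replicas regardless of cluster size, which is why reading "directly from the node that stores the data" saves at most one network hop.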
I haven't used G1. I remember someone shared his experience with G1 in detail.
The bottom line is you need to test it for your deployment and, based on the
test results, conclude whether it will work for you. I believe for a small heap
G1 will do well.
-Naren
On Wed, Mar 23, 2011 at 1:47 PM, ruslan usifo
Thanks, so is it the "[Index.db, Statistics.db, Data.db, Filter.db];
skipped" that indicates it's in Statistics? Basically I need a way to
know if the same is true of all the other tables showing this issue.
-erik
No problems with read performance, just curious about what kind of overhead
was being added b/c we're doing read tests.
If it's easy to figure out where the row is stored, I'd be interested in
trying it. If not, don't worry about it.
- Sameer
On Wed, Mar 23, 2011 at 4:31 PM, aaron morton wrote:
> There are features available to determine which nodes holds replicas for a
> particular key. AFAIK they are not intended for use by clients.
Specifically :
http://wiki.apache.org/cassandra/JmxInterface#org.apache.cassandra.service.StorageSer
The logic to find the node is not complicated. You compute the MD5 hash of
the key. Create a sorted list of the tokens assigned to the nodes in the ring.
Find the first token greater than the hash. This is the first node. Next in
the list is the replica, which depends on the RF. Now this is simple becau
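The steps above can be sketched as follows (a RandomPartitioner-style illustration; the tokens and node addresses are made up, and SimpleStrategy-style placement is assumed):

```python
import bisect
import hashlib

# Hypothetical 3-node ring: (token, address) pairs, sorted by token.
ring = sorted([
    (0, "1.174"),
    (56713727820156410577229101238628035242, "1.172"),
    (113427455640312821154458202477256070485, "1.173"),
])
tokens = [t for t, _ in ring]

def token_for(key: bytes) -> int:
    """RandomPartitioner-style token: the key's MD5 digest as a
    non-negative integer in [0, 2**127)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % (2 ** 127)

def primary_node(key: bytes) -> str:
    """First node whose token is >= the key's token, wrapping around."""
    i = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[i][1]

# With SimpleStrategy, the next RF - 1 nodes clockwise on the ring hold
# the additional replicas; NetworkTopologyStrategy is rack/DC aware.
```

As the thread notes, clients aren't really intended to do this themselves; it only approximates what the coordinator does internally.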
It's been about 7 months now but at the time G1 would regularly
segfault for me under load on Linux x64. I'd advise extra precautions
in testing and make sure you test with representative load.
On Wed, Mar 23, 2011 at 6:52 PM, Erik Onnen wrote:
> Thanks, so is it the "[Index.db, Statistics.db, Data.db, Filter.db];
> skipped" that indicates it's in Statistics? Basically I need a way to
> know if the same is true of all the other tables showing this issue.
It's the at
org.apache.cassand
My nodetool repair does not hang. That's why I'm curious.
/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/
On Mar 23, 2011 2:54 PM, "A J" wrote:
7000 and 9160 are accessible. Don't think I need other por
If you are getting the "Cluster schema does not agree" message then you have a
sick cluster and replication will not be working properly.
Open the cassandra-cli client and type "describe cluster;" you should see a
single schema version such as...
[default@unknown] describe cluster;
Cluster Informatio
It really does depend on what your workload is like, and in the end will
involve a certain amount of fudge factor.
http://wiki.apache.org/cassandra/CassandraHardware provides some guidance.
http://wiki.apache.org/cassandra/MemtableThresholds can be used to get a rough
idea of the memory requ
Hey guys,
I have a one-node system (with a replication factor of 1) running Cassandra.
The machine has two disks. One is used for the commitlog and the other as
Cassandra's data directory. The node had become unresponsive and had to
be hard rebooted.
After the restart, Cassandra started off fine. But
Hi,
This problem occurs when the clause has multiple expressions and an expression
with an operator other than EQ.
Has anyone met the same problem?
I traced the code, and saw this in the ColumnFamilyStore.satisfies() method:
int v = data.getComparator().compare(column.value(),
expression.value)
Looks like this https://issues.apache.org/jira/browse/CASSANDRA-2347
From this discussion
http://www.mail-archive.com/user@cassandra.apache.org/msg11291.html
Aaron
On 24 Mar 2011, at 17:17, Wangpei (Peter) wrote:
> Hi,
>
> This problem occurs when the clause has multi expression and a expre
This looks like a bug
(https://issues.apache.org/jira/browse/CASSANDRA-2376), but not one
that would cause a crash. Actual process death is only caused by (a)
running out of memory or (b) JVM bugs.
On Wed, Mar 23, 2011 at 9:17 PM, Sanjeev Kulkarni wrote:
> Hey guys,
> I have a one node system(wit
We support changing IPs to _new_ IPs but I'd be surprised if it can
handle changing it to the IP of another node. I'd try using a 3rd
"temporary" IP if I wanted to swap two.
On Wed, Mar 23, 2011 at 6:18 PM, aaron morton wrote:
> Can any of the dev's jump in here ? Whats best practice ?
> Aaron
>
Hi,
I'm trying to run Cassandra 0.7 on my Windows machine,
and I don't seem to be able to get beyond the warning message:
C:\Program Files\Apache Software
Foundation\apache-cassandra-0.7.4\bin>cassandra -f
Starting Cassandra Server
log4j:WARN No appenders could be found for logger
(org.apache.cassa