SV: SV: What is consuming the heap?

2010-07-20 Thread Thorvaldsson Justus
There is some more information here about memory usage. http://wiki.apache.org/cassandra/StorageConfiguration /J From: 王一锋 [mailto:wangyif...@aspire-tech.com] Sent: 20 July 2010 08:56 To: user Subject: Re: SV: What is consuming the heap? No, I don't think so. Because I'm not using supercol

Re: Data from multiple tables (Join Data)

2010-07-20 Thread bujji
Hi Aaron, Thanks for your reply. I can integrate some transaction mechanism with Cassandra so that I can do the transactions, but is it possible to get data from more than one table without much overload and in an efficient way? Give me a good example if possible... Thanks, Visu On Tue, Jul

Re: Define keyspaces in cassandra 0.7

2010-07-20 Thread GH
Here is a snippet out of some of my test code to try. I cut out most of the irrelevant bits; hope it works for you... (the original code worked here :-) TSocket socket = new TSocket("localhost", 9160); TTransport transport; transport = socket; TBinaryProto

Re: Data from multiple tables (Join Data)

2010-07-20 Thread aaron morton
I'm not sure what your overload concern is. You either need to make multiple requests or de-normalise so that your query can be resolved from one CF. There are no joins. Try it and see would be the best advice. You can always add more nodes to the cluster. Aaron On 20 Jul 2010, at 19:19,
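The two options named in this reply can be sketched as follows. This is only an illustration under stated assumptions: plain in-memory maps stand in for column families, and the names `JoinSketch`, `user_id`, `name` and `user_name` are hypothetical, not part of any Cassandra API or of the poster's schema.

```java
import java.util.Map;

final class JoinSketch {
    /**
     * Option 1: two requests. Read the order row, then read the user row it points to.
     * Each outer map is a stand-in for one column family: row key -> (column -> value).
     */
    static String customerNameTwoRequests(Map<String, Map<String, String>> orders,
                                          Map<String, Map<String, String>> users,
                                          String orderId) {
        Map<String, String> order = orders.get(orderId);     // first lookup
        Map<String, String> user = users.get(order.get("user_id")); // second lookup
        return user.get("name");
    }

    /**
     * Option 2: de-normalise. The user's name is copied into the order row at write
     * time, so the query is resolved from a single column family.
     */
    static String customerNameDenormalised(Map<String, Map<String, String>> ordersWithUser,
                                           String orderId) {
        return ordersWithUser.get(orderId).get("user_name");
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> users = Map.of("u1", Map.of("name", "Visu"));
        Map<String, Map<String, String>> orders = Map.of("o1", Map.of("user_id", "u1"));
        Map<String, Map<String, String>> denorm =
                Map.of("o1", Map.of("user_id", "u1", "user_name", "Visu"));

        System.out.println(customerNameTwoRequests(orders, users, "o1"));
        System.out.println(customerNameDenormalised(denorm, "o1"));
    }
}
```

The trade-off in this sketch is the usual one: the first form costs an extra round trip per read, the second costs extra writes to keep the copied column current.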

SV: UnavailableException on QUORUM write

2010-07-20 Thread Per Olesen
Hi, Think I might have found out the problem. I had only one seed node, and when that node is down, they all give UnavailableException. Guess at least one seed needs to be up then? Sounds fair. /Per From: Per Olesen [...@trifork.com] Sent: 9 July 2010 1

Re: How to stop Cassandra running in embeded mode

2010-07-20 Thread Bjorn Borud
Jonathan Ellis writes: > there's some support for this in 0.7 (see > http://issues.apache.org/jira/browse/CASSANDRA-1018) but fundamentally > it's not really designed to be started and stopped multiple times > within the same process. I am currently struggling with some of the same issues (on 0.

Re: How to get the 'system' keyspace info?

2010-07-20 Thread Jonathan Ellis
internal error means "check the cassandra log for the stacktrace" On Mon, Jul 19, 2010 at 10:36 PM, ChingShen wrote: > cassandra> get system.LocationInfo['L'] > Exception Internal error processing get_slice > > What's wrong? > > Thanks. > > Shen > -- Jonathan Ellis Project Chair, Apache Cassa

Re: What is consuming the heap?

2010-07-20 Thread Jonathan Ellis
you should post the full stack trace. 2010/7/20 王一锋 : > In my cluster, I have set both KeysCached and RowsCached of my column family > on all nodes to "0", > but it still happened that a few nodes crashed because of OutOfMemory > (from the gc.log, a full gc wasn't able to free up any memory space)

Re: UnavailableException on QUORUM write

2010-07-20 Thread Jonathan Ellis
Seed should only be important when joining the cluster. You're using the Thrift API, right? On Tue, Jul 20, 2010 at 5:34 AM, Per Olesen wrote: > Hi, > > Think I might have found out the problem. > I had only one seed node, and when that node is down, they all give > UnavailableException. Guess

SV: UnavailableException on QUORUM write

2010-07-20 Thread Per Olesen
>Seed should only be important when joining the cluster. You're using >the Thrift API, right? Yep! And when one of my non-seed nodes in my 3 node cluster is down, I do NOT get the exception. Anyway, guess I need to try and reproduce it in small scale. On Tue, Jul 20, 2010 at 5:34 AM, Per Oles
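The observation that one non-seed node can be down without an exception fits the usual quorum arithmetic. A minimal sketch, assuming a replication factor of 3 (the post only says the cluster has 3 nodes) and the common definition quorum = floor(RF / 2) + 1; `QuorumSketch` and its methods are illustrative names, not Cassandra APIs.

```java
final class QuorumSketch {
    /** Replicas that must respond for a QUORUM operation: floor(rf / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True if enough replicas are alive to satisfy QUORUM. */
    static boolean quorumAvailable(int replicationFactor, int liveReplicas) {
        return liveReplicas >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, not stated in the thread
        for (int down = 0; down <= rf; down++) {
            System.out.println(down + " replica(s) down -> QUORUM available: "
                    + quorumAvailable(rf, rf - down));
        }
    }
}
```

Under these assumptions a single node being down should not by itself make QUORUM unavailable, whichever node it is, which is why the seed-only failure looked worth reproducing.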

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-20 Thread Juho Mäkinen
I managed to run a few benchmarks.
Servers   r/s
  1        64.5k
  2        59.5k
The configuration: Client: Machine with four Quad Core Intel Xeon CPU E5520 @ 2.27Ghz cpus (total 16 cores), 4530 bogomips per core. 12 GB ECC corrected memory. Supermicro mainboard (not sure about exact type).

Re: How to stop Cassandra running in embeded mode

2010-07-20 Thread Jesse McConnell
A separate JVM is the only mechanism to 'shutdown' in a test scenario right now, and it's unlikely to change in the short term, so designing around forking is your best bet. cheers, jesse -- jesse mcconnell jesse.mcconn...@gmail.com On Tue, Jul 20, 2010 at 05:47, Bjorn Borud wrote: > Jonathan El

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-20 Thread Peter Schuller
> But what's then the point with adding nodes into the ring? Disk speed! Well, it may also be cheaper to service an RPC request than service a full read or write, even in terms of CPU. But: Even taking into account that requests are distributed randomly, the cluster should still scale. You will a

Re: What is consuming the heap?

2010-07-20 Thread Peter Schuller
> heap size is 10G and the load of data per node was around 300G, 16-core CPU, Are the 300 GB made up of *really* small values? Per SS table bloom filters do consume memory, but you'd have to have a *lot* of *really* small values for a 300 GB database to cause bloom filters to be a significant par

Script 'hangs' when i stop 1 cassandra node (of 4 nodes)

2010-07-20 Thread Pieter Maes
Hi, I'm currently using Cassandra 0.6.3 with php thrift (svn r959516) in the phpcassa wrapper (last git + a fix of mine that fixes strange timeouts..). (yeah i use php, don't shoot me for it) (i also mailed that mailing list, but no answer yet from there) When i was running my migration script 1

does Net::Cassandra work for 0.6.3?

2010-07-20 Thread Alexander Rothenberg
Hi, we consider using cassandra to replace a lot of old logging-mechanisms to record pageviews/userdata/searchparams/hits etc from websites. (later, we want to monitor those data). Looking at the API and ways to communicate to the cassandra-server, i would like to use the perl-client Net::Cas

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-20 Thread Ryan King
On Tue, Jul 20, 2010 at 6:20 AM, Juho Mäkinen wrote: > I managed to run a few benchmarks. > > Servers   r/s >   1        64.5k >   2        59.5k > > The configuration: > Client: Machine with four Quad Core Intel Xeon CPU E5520 @ 2.27Ghz > cpus (total 16 cores), 4530 bogomips per core. 12 GB ECC c

Re: UnavailableException on QUORUM write

2010-07-20 Thread Jonathan Ellis
On Tue, Jul 20, 2010 at 6:40 AM, Per Olesen wrote: >>Seed should only be important when joining the cluster.  You're using >>the Thrift API, right? > > Yep! > > And when one of my non-seed nodes in my 3 node cluster is down, I do NOT get > the exception. > Anyway, guess I need to try and reproduc

Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Dathan Pattishall
 INFO [HINTED-HANDOFF-POOL:1] 2010-07-20 15:10:43,721 HintedHandOffManager.java (line 210) Finished hinted handoff of 0 rows to endpoint /10.129.28.23 ERROR [pool-1-thread-37895] 2010-07-20 15:10:51,622 CassandraDaemon.java (line 83) Uncaught exception in thread Thread[pool-1-thread-37895,5,main] j

Re: Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Peter Schuller
> CassandraDaemon.java (line 83) Uncaught exception in thread > Thread[pool-1-thread-37895,5,main] > java.lang.OutOfMemoryError: Java heap space >     at > org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:296) >     at > org.apache.thrift.protocol.TBinaryProt

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-20 Thread Peter Schuller
> (I'm hoping to have time to run my test on EC2 tonight; will see.) Well, I needed three c1.xlarge EC2 instances running py_stress to even saturate more than one core on the c1.xlarge instance running a single cassandra node (at roughly 21k reqs/second)... Depending on how reliable vmstat/top is

more questions on Cassandra ACID properties

2010-07-20 Thread Alex Yiu
Hi, I have more questions on Cassandra ACID properties. Say, I have a row that has 3 columns already: colA, colB and colC And, if two *concurrent* clients perform a different insert(...) into the same row, one insert is for colD and the other insert is for colE. Then, Cassandra would guarantee bo

testing please ignore

2010-07-20 Thread Alex Yiu
testing please ignore

Re: Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Tristan Seligmann
On Tue, Jul 20, 2010 at 9:09 PM, Peter Schuller wrote: >> CassandraDaemon.java (line 83) Uncaught exception in thread >> Thread[pool-1-thread-37895,5,main] >> java.lang.OutOfMemoryError: Java heap space >>     at >> org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.jav

Re: Understanding atomicity in Cassandra

2010-07-20 Thread Patricio Echagüe
Hi, regarding the retrying strategy, I understand that it might make sense assuming that the client can actually perform a retry. We are trying to build a fault tolerance solution based on Cassandra. In some scenarios, the client machine can go down during a transaction. Would it be bad design to

Re: Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Dathan Pattishall
The storage structure is rather simple. For every 1 key there is 1 column and a timestamp for that column. We don't enable pulling a huge amount of data and all other nodes are up servicing the same request. I suspect there may be another problem with Memory management inside Cassandra. Attac

Re: Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Peter Schuller
> Attaching Jconsole shows that there is a growth of memory and weird > spikes. Unfortunately I did not take a screen shot of the growth of > the spike over time. I'll do that when it occurs again. Note that expected behavior for CMS is to have lots of small ups and downs as a result of young gene

Re: Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Ryan King
On Tue, Jul 20, 2010 at 1:28 PM, Peter Schuller wrote: >> Attaching Jconsole shows that there is a growth of memory and weird >> spikes. Unfortunately I did not take a screen shot of the growth of >> the spike over time. I'll do that when it occurs again. > > Note that expected behavior for CMS is

Estimated release for Cassandra 0.6.4

2010-07-20 Thread CassUser CassUser
Hey, is there a release date (or approximate date) for Cassandra 0.6.4? We are mainly concerned about the Cassandra-1042 patch. The reason we don't simply apply the patch is that, since we are shipping a product which interacts with the cassandra server (and the patch is server side), the custo

Re: more questions on Cassandra ACID properties

2010-07-20 Thread Jonathan Ellis
On Tue, Jul 20, 2010 at 2:58 PM, Alex Yiu wrote: > Say, I have a row that has 3 columns already: colA, colB and colC > And, if two *concurrent* clients perform a different insert(...) into the > same row, > one insert is for colD and the other insert is for colE. > Then, Cassandra would guarantee

Re: Understanding atomicity in Cassandra

2010-07-20 Thread Jonathan Ellis
2010/7/20 Patricio Echagüe : > Would it be bad design to store all the data that need to be > consistent under one big key? That really depends how unnatural it is from a query perspective. :) -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Ca

Re: more questions on Cassandra ACID properties

2010-07-20 Thread Jonathan Shook
You are correct. In this case, Cassandra would journal two writes to the same logical row, but they would be 2 independent writes. Writes do not depend on reads, so they are self-contained. If either column exists already, it will be overwritten. These journaled actions would then be applied to th

Re: more questions on Cassandra ACID properties

2010-07-20 Thread Aaron Morton
Yes, both inserts (colD and colE) will succeed if you send insert() or batch_mutation()s from the client. It's also correct to think of them as insert-or-update calls. Aaron On 21 Jul, 2010, at 07:58 AM, Alex Yiu wrote: Hi, I have more questions on Cassandra ACID properties. Say, I have a row that has
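The insert-or-update behaviour described in this thread can be modelled roughly as below. This is a simplified in-memory sketch, not Cassandra's implementation: `RowSketch` is a hypothetical class, and resolving conflicting writes by keeping the value with the newer client-supplied timestamp is an assumption added for illustration (tie handling here is arbitrary).

```java
import java.util.HashMap;
import java.util.Map;

final class RowSketch {
    /** A column value together with the timestamp supplied by the writer. */
    private static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Cell> columns = new HashMap<>();

    /** Insert-or-update: creates the column, or overwrites it unless the write is older. */
    synchronized void insert(String column, String value, long timestamp) {
        Cell existing = columns.get(column);
        if (existing == null || timestamp >= existing.timestamp) {
            columns.put(column, new Cell(value, timestamp));
        }
    }

    /** Returns the current value of a column, or null if it was never written. */
    synchronized String get(String column) {
        Cell cell = columns.get(column);
        return cell == null ? null : cell.value;
    }

    synchronized int size() {
        return columns.size();
    }

    public static void main(String[] args) throws InterruptedException {
        RowSketch row = new RowSketch();
        row.insert("colA", "a", 1);
        row.insert("colB", "b", 1);
        row.insert("colC", "c", 1);

        // Two concurrent clients write different columns of the same row.
        Thread first = new Thread(() -> row.insert("colD", "d", 2));
        Thread second = new Thread(() -> row.insert("colE", "e", 2));
        first.start();
        second.start();
        first.join();
        second.join();

        System.out.println("columns in row: " + row.size());
    }
}
```

Because the two writes touch different columns and neither depends on a prior read, the order in which they are applied does not matter in this model; both columns end up in the row.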

Re: Bootstrap question

2010-07-20 Thread Anthony Molinaro
I see this in the old nodes DEBUG [WRITE-/10.220.198.15] 2010-07-20 21:15:50,366 OutboundTcpConnection.java (line 142) attempting to connect to /10.220.198.15 INFO [GMFD:1] 2010-07-20 21:15:50,391 Gossiper.java (line 586) Node /10.220.198.15 is now part of the cluster INFO [GMFD:1] 2010-07-20 21

Re: more questions on Cassandra ACID properties

2010-07-20 Thread Alex Yiu
Hi, all, (Jonathan Ellis, Jonathan Shook, Aaron Morton) Thanks for the confirmation. JonE, the "update" wording has been added to the wiki page w.r.t. the insert and mutation API. Regards, Alex Yiu On Tue, Jul 20, 2010 at 2:02 PM, Jonathan Ellis wrote: > On Tue, Jul 20, 2010 at 2:58 PM, Alex Yi

Re: Understanding atomicity in Cassandra

2010-07-20 Thread Alex Yiu
Hi, Patricio, It's hard to comment on your original questions without knowing details of your own domain specific data model and data processing expectation. W.R.T. lumping things into one big row, there is a limitation on data model in Cassandra. You got CF and SCF. That is, you have only 2 leve

how come some nodes will drop nodes from the ring and not others?

2010-07-20 Thread Dathan Pattishall
dsh is a distributed shell that basically runs the same command on multiple servers. Notice that cass03 sees all 4 servers, yet the other 3 only see three servers? Storage-conf.xml is the same among all nodes, i.e. 10.129.28.14 10.129.28.20 10.129.28.22 10.129.28.23 c

RE: how come some nodes will drop nodes from the ring and not others?

2010-07-20 Thread Stu Hood
Did you copy the data directories from one node to the others? http://wiki.apache.org/cassandra/FAQ#cloned -Original Message- From: "Dathan Pattishall" Sent: Tuesday, July 20, 2010 6:09pm To: user@cassandra.apache.org Subject: how come some nodes will drop nodes from the ring and not oth

Re: how come some nodes will drop nodes from the ring and not others?

2010-07-20 Thread Dathan Pattishall
No, I did not copy the data directories from one node to another. This data is new data, newly created from scratch. On Tue, Jul 20, 2010 at 4:17 PM, Stu Hood wrote: > Did you copy the data directories from one node to the others? > http://wiki.apache.org/cassandra/FAQ#cloned > > -Original Mes

Re: How to get the 'system' keyspace info?

2010-07-20 Thread ChingShen
Thanks Jonathan Ellis, I got an error message as below: ERROR [pool-1-thread-1] 2010-07-21 08:51:46,582 Cassandra.java (line 1242) Internal error processing get_slice java.lang.RuntimeException: No replica strategy configured for system Because the "system" keyspace is for Cassandra internals,

Re: Estimated release for Cassandra 0.6.4

2010-07-20 Thread Eric Evans
On Tue, 2010-07-20 at 13:53 -0700, CassUser CassUser wrote: > Is there a release date (or approximate date) for cassandra 0.6.4. We > are mainly concerned about the Cassandra-1042 patch. The reason we > don't simply apply the patch is because since we are shipping a > product which interacts with

Re: How to get the 'system' keyspace info?

2010-07-20 Thread Jonathan Ellis
That is correct. We should make that possible, can you open a ticket for it? On Tue, Jul 20, 2010 at 6:08 PM, ChingShen wrote: > Thanks Jonathan Ellis, > > I got an error message as below: > ERROR [pool-1-thread-1] 2010-07-21 08:51:46,582 Cassandra.java (line 1242) > Internal error processing ge

what causes a cassandra to block and throw a null exception

2010-07-20 Thread Dathan Pattishall
Type 'help' or '?' for help. Type 'quit' or 'exit' to quit. cassandra> connect cass01/9160 cassandra> get TimeFrameClicks.Standard2['test_cassandra_alive'] Exception null The data exists and I can grab the data after I restart all the nodes, but once the cluster runs for a few minutes I cannot gra

Re: what causes a cassandra to block and throw a null exception

2010-07-20 Thread Chris Goffinet
Can you provide the output from `nodetool tpstats`. -Chris On Jul 20, 2010, at 8:59 PM, Dathan Pattishall wrote: > Type 'help' or '?' for help. Type 'quit' or 'exit' to quit. > cassandra> connect cass01/9160 > cassandra> get TimeFrameClicks.Standard2['test_cassandra_alive'] > Exception null > >

Re: what causes a cassandra to block and throw a null exception

2010-07-20 Thread Dathan Pattishall
Just sent one of the nodes back.
Pool Name        Active  Pending  Completed
STREAM-STAGE          0        0          0
RESPONSE-STAGE        0        0     151071
ROW-READ-STAGE        0        0     100398
LB-OPERATIONS

Re: Estimated release for Cassandra 0.6.4

2010-07-20 Thread CassUser CassUser
Thanks Eric. On Tue, Jul 20, 2010 at 8:14 PM, Eric Evans wrote: > On Tue, 2010-07-20 at 13:53 -0700, CassUser CassUser wrote: > > Is there a release date (or approximate date) for cassandra 0.6.4. We > > are mainly concerned about the Cassandra-1042 patch. The reason we > > don't simply apply

Re: Re: What is consuming the heap?

2010-07-20 Thread 王一锋
I can only find these in the system.log INFO [GC inspection] 2010-07-21 01:01:49,661 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 11748 ms, 413673472 reclaimed leaving 9779542600 used; max is 10873667584 ERROR [Thread-35] 2010-07-21 01:02:10,941 CassandraDaemon.java (line 78) Fatal
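The GCInspector line quoted above already says how full the heap was after the collection. A minimal sketch of extracting that figure, assuming the log line keeps the exact wording shown in this message; `GcLineSketch` and the regular expression are illustrative and not part of Cassandra.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcLineSketch {
    // Matches the wording of the GCInspector line quoted in the message above.
    private static final Pattern GC_LINE = Pattern.compile(
            "GC for (\\w+): (\\d+) ms, (\\d+) reclaimed leaving (\\d+) used; max is (\\d+)");

    /** Heap still in use after the collection as a fraction of max, or -1 if the line does not match. */
    static double usedFraction(String logLine) {
        Matcher m = GC_LINE.matcher(logLine);
        if (!m.find()) {
            return -1.0;
        }
        long used = Long.parseLong(m.group(4));
        long max = Long.parseLong(m.group(5));
        return (double) used / max;
    }

    public static void main(String[] args) {
        String line = "INFO [GC inspection] 2010-07-21 01:01:49,661 GCInspector.java (line 110) "
                + "GC for ConcurrentMarkSweep: 11748 ms, 413673472 reclaimed leaving "
                + "9779542600 used; max is 10873667584";
        System.out.printf("heap in use after GC: %.1f%%%n", usedFraction(line) * 100);
    }
}
```

For the quoted line this works out to roughly 90% of the heap still in use after an 11.7-second ConcurrentMarkSweep cycle, which matches the description of a full GC that frees almost nothing.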

Re: Re: What is consuming the heap?

2010-07-20 Thread 王一锋
So the bloom filters reside in memory completely? We do have a lot of small values, hundreds of millions of columns in a columnfamily. I counted the total size of the *-Filter.db files in my keyspace; it's 436,747,815 bytes. I guess this means it won't consume a major part of the 10 GB heap space. 2010-07
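The estimate in this message can be checked with a line of arithmetic. A small sketch, assuming the on-disk size of the *-Filter.db files is a fair approximation of what the filters occupy in memory, and taking the maximum heap (10873667584 bytes) from the GCInspector line posted earlier in the thread; `FilterShareSketch` is a hypothetical helper.

```java
final class FilterShareSketch {
    /** Fraction of the maximum heap that the bloom filters would occupy. */
    static double heapShare(long filterBytes, long maxHeapBytes) {
        if (maxHeapBytes <= 0) {
            throw new IllegalArgumentException("maxHeapBytes must be positive");
        }
        return (double) filterBytes / maxHeapBytes;
    }

    public static void main(String[] args) {
        long filterBytes = 436_747_815L;     // total size of *-Filter.db files, from the message
        long maxHeapBytes = 10_873_667_584L; // "max is" value from the GCInspector log line
        System.out.printf("bloom filters: %.1f%% of max heap%n",
                heapShare(filterBytes, maxHeapBytes) * 100);
    }
}
```

Under those assumptions the filters come to about 4% of the heap, which supports the poster's guess that they are not the main consumer.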

Re: Re: What is consuming the heap?

2010-07-20 Thread Dathan Pattishall
By any chance, are you using ConsistencyLevel::ZERO on writes? On Tue, Jul 20, 2010 at 9:41 PM, 王一锋 wrote: > So the bloom filters reside in memory completely? > > We do have a lot of small values, hundreds of millions of columns in a > columnfamily. > > I count the total size of *-Filter.db f

Re: Re: Re: What is consuming the heap?

2010-07-20 Thread 王一锋
No, I'm using QUORUM for both writes and reads. Replication factor is 3. 2010-07-21 From: Dathan Pattishall Sent: 2010-07-21 12:51:32 To: user Cc: Subject: Re: Re: What is consuming the heap? By off chance on writes are you using ConsistencyLevel::ZERO? On Tue, Jul 20, 2010 at 9:41

get the latest column fails in cassandra 7

2010-07-20 Thread Bujji4Tech
Hi all, I am trying Cassandra 0.7 (using the latest build) and have a problem getting the latest column in a row. My code is here: SlicePredicate predicate = new SlicePredicate(); predicate.slice_range = new SliceRange(new byte[0], new byte[0], true,1); ColumnParent column_parent =

Re: Re: What is consuming the heap?

2010-07-20 Thread Peter Schuller
>  INFO [GC inspection] 2010-07-21 01:01:49,661 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 11748 ms, 413673472 reclaimed leaving 9779542600 used; max is 10873667584 > ERROR [Thread-35] 2010-07-21 01:02:10,941 CassandraDaemon.java (line 78) Fatal exception in thread Thread[Thread-35,5,m

Re: Re: What is consuming the heap?

2010-07-20 Thread Peter Schuller
> So the bloom filters reside in memory completely? Yes. The point of bloom filters in cassandra is to act as a fast way to determine whether sstables need to be consulted. This check involves random access into the bloom filter. It needs to be in memory for this to be effective. But due to the n