Re: Cassandra yaml configuration

2011-09-22 Thread Maki Watanabe
You have a chance to write it on your own. I'll buy one :-) maki 2011/9/22 Sajith Kariyawasam : > Thanks Maki. > If you come across any other book supporting the latest Cassandra > versions, please let me know. > > On Thu, Sep 22, 2011 at 12:03 PM, Maki Watanabe > wrote: >> >> The book is a bit o

Re: Moving to a new cluster

2011-09-22 Thread Philippe
Hi Aaron Thanks for the reply I should have mentioned that all current nodes are running 0.8.4. All current and future services have 2TB disks of which I have allocated only half. I don't expect any issues here. Should I? On 22 Sept. 2011 01:26, "aaron morton" wrote: > How much data is on th

Re: benefits of off-heap (serializing) row cache?

2011-09-22 Thread Boris Yen
I think the Cassandra team did not re-implement their own GC. I guess what they meant is that the less heap is used, the better the GC performance. AFAIK, only data that is not updated frequently can benefit from the off-heap row cache, because when a row is modified, the row inside the cache needs to be inval

Re: Unable to create compaction marker

2011-09-22 Thread aaron morton
It's in the yaml file… # directories where Cassandra should store data on disk. data_file_directories: - /var/lib/cassandra/data The permissions are normally cassandra:cassandra Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On

Re: Possibility of going OOM using get_count

2011-09-22 Thread Boris Yen
I was wondering whether it is possible to use an approach similar to CASSANDRA-2894 to have the slice_predict support an offset? With an offset, it would be much easier to implement paging on the client side. Boris On Mon, Sep 19, 2011 at 9

Re: Thrift 7

2011-09-22 Thread Sylvain Lebresne
Cassandra uses Thrift 6 for now (a ticket is open for upgrading to Thrift 7, https://issues.apache.org/jira/browse/CASSANDRA-3213; I refer you to the discussion there). That being said, I don't know how well Thrift copes with a client and server of different versions, but I suspect this

Re: Moving to a new cluster

2011-09-22 Thread Jonas Borgström
On 09/22/2011 01:25 AM, aaron morton wrote: *snip* > When you start a repair it will repair with the other nodes it > replicates data with. So you only need to run it every RF nodes. Start > it on one, watch the logs to see who it talks to and then start it on > the first node it does not talk to.

Re: Unable to create compaction marker

2011-09-22 Thread Radim Kolar
On 21.9.2011 20:01, Jonathan Ellis wrote: Means Cassandra couldn't create an empty file in the data directory designating a sstable as compacted. I'd look for permissions problems. Short term there is no dire consequence, although it will keep re-compacting that sstable. Longer term you'l

Tool for SQL -> Cassandra data movement

2011-09-22 Thread Radim Kolar
I need a tool that can dump tables via JDBC into JSON format for Cassandra import. I am pretty sure that somebody has already written one. Are there tools that can do a direct JDBC -> Cassandra import?

Re: Cassandra yaml configuration

2011-09-22 Thread aaron morton
This is pretty up to date http://www.packtpub.com/cassandra-apache-high-performance-cookbook/book A - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 22/09/2011, at 7:07 PM, Maki Watanabe wrote: > You have a chance to write it on your own.

Re: Thrift 7

2011-09-22 Thread aaron morton
I'm a bit confused about what you are trying to do. Are you trying to install Thrift on Windows? The best I can do to help is point you to the readme file, or this http://wiki.apache.org/thrift/ThriftInstallationWin32 You may get more help on the Thrift email list. You do not need to install thrif

Re: Moving to a new cluster

2011-09-22 Thread aaron morton
The new nodes will have 1TB of data disk, but how much data will you put on them? A - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 22/09/2011, at 7:23 PM, Philippe wrote: > Hi Aaron > Thanks for the reply > > I should have mentioned

Re: Moving to a new cluster

2011-09-22 Thread Sylvain Lebresne
2011/9/22 Jonas Borgström : > On 09/22/2011 01:25 AM, aaron morton wrote: > *snip* >> When you start a repair it will repair with the other nodes it >> replicates data with. So you only need to run it every RF nodes. Start >> it on one, watch the logs to see who it talks to and then start it on >>

Re: Moving to a new cluster

2011-09-22 Thread Philippe
The current load on the nodes is around 300g. On 22 Sept. 2011 11:08, "aaron morton" wrote: > the new nodes will have 1TB of data disk, but how much data will you put on them? > > A > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.co

Re: Cassandra yaml configuration

2011-09-22 Thread Sajith Kariyawasam
Thanks Aaron On Thu, Sep 22, 2011 at 2:37 PM, aaron morton wrote: > This is pretty up to date > http://www.packtpub.com/cassandra-apache-high-performance-cookbook/book > > A > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 2

How to enable JNA for Cassandra on Windows?

2011-09-22 Thread Viktor Jevdokimov
Hi, I'm trying without success to enable JNA for Cassandra on Windows. I tried placing the JNA 3.3.0 libs jna.jar and platform.jar into the Cassandra 0.8.6 lib dir, but I get this in the log: Unable to link C library. Native methods will be disabled. What is missing or what is wrong? One thing I've found on ine

RE: Cassandra reconfiguration

2011-09-22 Thread Hiren Shah
Our RF is 2 and we are planning to keep that. Thanks for addressing the decommissioning puzzle. That is where I could not find any doc. Hiren From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Wednesday, September 21, 2011 6:01 PM To: user@cassandra.apache.org Subject: Re: Cassandra recon

Performance degradation observed through embedded cassandra server - pointers needed

2011-09-22 Thread Roshan Dawrani
Hi, We recently switched from Cassandra 0.7.2 to 0.8.5 and are observing considerable performance degradation in the response times of the embedded server that we use in integration tests. One thing that we do is that we truncate our app column families after each integration test so that the next one gets a

Re: Performance degradation observed through embedded cassandra server - pointers needed

2011-09-22 Thread Edward Capriolo
On Thu, Sep 22, 2011 at 9:27 AM, Roshan Dawrani wrote: > Hi, > > We recently switched from Cassandra 0.7.2 to 0.8.5 and observing > considerable performance degradation in embedded server's response times > that we use in integration tests. > > One thing that we do is that we truncate our app colu

progress of sstableloader keeps 0?

2011-09-22 Thread Yan Chunlu
I took a snapshot of one of my nodes in a 0.7.4 cluster (N=RF=3) and used sstableloader to load the snapshot data into another single-node cluster (N=RF=1). After executing "bin/sstableloader /disk2/mykeyspace/" it says "Starting client (and waiting 30 seconds for gossip) ..." "Streaming revelant part of c

Re: Moving to a new cluster

2011-09-22 Thread Yan Chunlu
Hi Aaron: could you explain more about the issue of repair making space usage go crazy? I am planning to upgrade my cluster from 0.7.4 to 0.8.6, because repair never works on 0.7.4 for me. More specifically, CASSANDRA-2280 an

Re: Unable to create compaction marker

2011-09-22 Thread Daning
Thank you guys. I don't think we have a permission issue or are out of space. It might be that ulimit is set too low (1024); we will change ulimit -Hn and ulimit -Sn to see if that solves the problem. Daning On 09/22/2011 01:12 AM, aaron morton wrote: It's in the yaml file… # directories where Cassandra sho

Re: Thrift 7

2011-09-22 Thread Suman Ghosh
Thanks Sylvain / Aaron! Can you tell me how to join the "Thrift mailing list"? Thanks, Suman. On Thu, Sep 22, 2011 at 2:37 PM, aaron morton wrote: > I'm a bit confused about what you are trying to do. > > Are you trying to install Thrift on Windows? The best I can do to help is > point you to the

Lots of GC in log

2011-09-22 Thread Daning
We are testing Cassandra with a pretty big load and I saw frequent GCs in the log. Do you have any suggestions about how to reduce them? INFO [ScheduledTasks:1] 2011-09-22 09:38:41,080 GCInspector.java (line 122) GC for ParNew: 297 ms for 1 collections, 2503106624 used; max is 8015314944 INFO [Schedule

Re: Tool for SQL -> Cassandra data movement

2011-09-22 Thread Nehal Mehta
We are trying to do the same thing, but instead of migrating into JSON, we are exporting into CSV and then importing the CSV into Cassandra. Which DB are you currently using? Thanks, Nehal Mehta. 2011/9/22 Radim Kolar > I need a tool that can dump tables via JDBC into JSON format for > ca

Re: Tool for SQL -> Cassandra data movement

2011-09-22 Thread Jeremy Hanna
Take a look at http://www.datastax.com/dev/blog/bulk-loading I'm sure there is a way to make it more seamless for what you want to do and it could be built on, but the recent bulk loading additions will provide the best foundation. On Sep 22, 2011, at 12:25 PM, Nehal Mehta wrote: > We are tryi

RE: Cassandra reconfiguration

2011-09-22 Thread Hiren Shah
I just started decommissioning one of the nodes. It is streaming data to other nodes in the data center, though, which seems a waste of time because all the nodes will be going away. Wouldn't it be better to just bring Cassandra down on these nodes? And then run removetoken or change the topology

Re: Tool for SQL -> Cassandra data movement

2011-09-22 Thread Radim Kolar
On 22.9.2011 19:25, Nehal Mehta wrote: We are trying to do the same thing, but instead of migrating into JSON, we are exporting into CSV and then importing the CSV into Cassandra. You are right, CSV seems to be more portable. Which DB are you currently using? PostgreSQL and Apache Derby.

Re: progress of sstableloader keeps 0?

2011-09-22 Thread Jonathan Ellis
Did you check for errors in the logs on both loader + target? On Thu, Sep 22, 2011 at 10:52 AM, Yan Chunlu wrote: > I took a snapshot of one of my nodes in a 0.7.4 cluster (N=RF=3) and used > sstableloader to load the snapshot data into another single-node cluster (N=RF=1). > > After executing "bin/sstableloader

Storing (python) objects

2011-09-22 Thread Ian Danforth
All, I find myself considering storing serialized python dicts in Cassandra. I'd like to store fairly complex, nested dicts, and it's just easier to do this rather than work out a lot of super columns / columns etc. Do others find themselves storing serialized data structures in Cassandra or is
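For readers weighing the same question, here is a minimal sketch of what storing a serialized dict as a single column value could look like. It uses only Python's standard `json` module; the `serialize` and `deserialize` helpers and the sample `profile` data are hypothetical, and no Cassandra client call is shown. JSON is assumed here instead of pickle so that non-Python clients can read the value.

```python
import json


def serialize(obj):
    """Encode a nested dict as compact UTF-8 JSON bytes for one column value."""
    # sort_keys gives a stable byte representation for equal dicts.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def deserialize(blob):
    """Decode bytes produced by serialize() back into Python objects."""
    return json.loads(blob.decode("utf-8"))


# Hypothetical sample data: a nested dict with string keys only.
profile = {
    "name": "ian",
    "settings": {"theme": "dark", "tags": ["a", "b"]},
    "logins": 3,
}

blob = serialize(profile)        # bytes that would be written as the column value
restored = deserialize(blob)     # what a reader would get back
```

The trade-off is that the whole value has to be read and rewritten to change one field, and JSON turns non-string dict keys into strings, so the round trip is only exact for JSON-compatible data.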

Re: Lots of GC in log

2011-09-22 Thread Peter Schuller
> We are testing Cassandra with a pretty big load and I saw frequent GCs in the > log. Do you have any suggestions about how to reduce them? Do you have any actual problem that you are observing? Frequent young-generation GCs (ParNew) are expected. If you want to cut down on the length of them you may wan

Re: Storing (python) objects

2011-09-22 Thread Aaron Turner
On Thu, Sep 22, 2011 at 11:28 AM, Ian Danforth wrote: > All, >  I find myself considering storing serialized python dicts in Cassandra. I'd > like to store fairly complex, nested dicts, and it's just easier to do this > rather than work out a lot of super columns / columns etc. >  Do others find t

Re: [VOTE] Release Mojo's Cassandra Maven Plugin 0.8.6-1

2011-09-22 Thread Nate McCall
+1 On Wed, Sep 21, 2011 at 4:39 PM, Colin Taylor wrote: > +1 (non binding but lgtm) > > On Wed, Sep 21, 2011 at 2:27 AM, Stephen Connolly > wrote: >> Hi, >> >> I'd like to release version 0.8.6-1 of Mojo's Cassandra Maven Plugin >> to sync up with the recent 0.8.6 release of Apache Cassandra. >>

Search over composite Column and Super Column name

2011-09-22 Thread Renato Costa
Hi, I have started modeling my application on the Cassandra data model. I will have to use composite Super Column names, e.g. "username:userid". I know there are a lot of different ways to deal with this case, but once I have modeled with composite Super Column names, is there any way to make a search ove

Re: Lots of GC in log

2011-09-22 Thread Daning
Thanks Peter. I saw CPU usage shooting much higher. I am not sure whether the frequent GCs are caused by improperly sized generations. I'd like to get some tuning tips, or a good document about Cassandra tuning. Daning On 09/22/2011 12:23 PM, Peter Schuller wrote: We are testing Cassandra with a pretty big

Re: Search over composite Column and Super Column name

2011-09-22 Thread Konstantin Naryshkin
One thing you can do is search over the range from "username:" to "username;". "username:" is the first possible string starting with "username:". "username;" is the first possible string after all of the strings that start with "username:". This works because ; is the character right after : in
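The range trick described above can be sketched in plain Python. This is an illustration only: `prefix_range` is a hypothetical helper, the column names are made up, and a sorted Python list stands in for column names ordered by an ASCII/UTF-8 comparator. The sketch treats the end bound as exclusive; if the slice end is inclusive in your client, a column named exactly "username;" would also be returned.

```python
def prefix_range(prefix):
    """Return (start, end) so that start <= s < end holds exactly for
    the strings s that begin with prefix."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Bump the last character by one code point: ':' (58) becomes ';' (59).
    # Assumes the last character is not the highest possible code point.
    end = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, end


# Hypothetical composite names, sorted the way a string comparator would sort them.
names = sorted([
    "user:1",
    "username",
    "username:1",
    "username:42",
    "username;",
    "usernamf",
])

start, end = prefix_range("username:")
matches = [n for n in names if start <= n < end]
```

In a real query, `start` and `end` would be passed as the start and finish of the slice range instead of filtering a list on the client.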

Re: Storing (python) objects

2011-09-22 Thread Alexis Lê-Quôc
On Thu, Sep 22, 2011 at 3:50 PM, Aaron Turner wrote: > On Thu, Sep 22, 2011 at 11:28 AM, Ian Danforth > wrote: > > All, > > I find myself considering storing serialized python dicts in Cassandra. > I'd > > like to store fairly complex, nested dicts, and it's just easier to do > this > > rather

Re: LevelDB type compaction

2011-09-22 Thread mcasandra
Can someone please help me understand this a little bit?

Re: Moving to a new cluster

2011-09-22 Thread Zhu Han
On Thu, Sep 22, 2011 at 11:04 PM, Yan Chunlu wrote: > > Hi Aaron: > > could you explain more about the issue of repair making space usage go > crazy? > I guess repair defers the compaction progress as it brings a lot of load. For an update-heavy workload, the space usage goes higher and higher

is it possible for light-traffic CF to hold down many commit logs?

2011-09-22 Thread Yang
in 1.0.0 we don't have memtable_throughput for each individual CF , and instead which memtable/CF to flush is determined by "largest getTotalMemtableLiveSize() ". (MeteredFlusher.java line 81) what would happen in the following case ? : I have only 2 CF, the traffic for one CF is 1000 times that

Re: Tool for SQL -> Cassandra data movement

2011-09-22 Thread Nehal Mehta
Hi Radim, I have cleaned up my code that imports CSV into Cassandra and published it at https://github.com/nehalmehta/CSV2Cassandra. Have a look to see if it is useful to you. I have used Hector instead of sstableloader. For me it was necessary to have a consistency level of EACH_QUORUM. Thanks, N

MessagingService.sendOneWay sending blank bytes?

2011-09-22 Thread Greg Hinkle
I noticed that on the 0.8 branch the implementation of MessagingService.sendOneWay is building up a DataOutputBuffer with a default size of 128 bytes, but then sending it as the full buffer no matter how many bytes the data takes. I believe it should be calling DataOutputBuffer.asByteArray(

Re: Possibility of going OOM using get_count

2011-09-22 Thread aaron morton
Offsets have been discussed previously. IIRC the main concerns were either: There is no way to reliably count to start the offset, i.e. we do not lock the row Or performance related, as there is no reliable way to skip 10,000 columns other than counting 10,000 columns. With a start col
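The message above is cut off, so the following is an assumption about the usual alternative to offsets and not a quote of what was recommended: page by start column, using the last column name of one page as the start of the next. In this sketch `get_slice` is a hypothetical stand-in for a slice query (it is not a client API) that works on a sorted in-memory list, and `iter_pages` is the client-side loop.

```python
import bisect


def get_slice(columns, start, count):
    """Stand-in for a slice query: up to `count` names >= start,
    taken from a sorted list of column names."""
    i = bisect.bisect_left(columns, start)
    return columns[i:i + count]


def iter_pages(columns, page_size):
    """Yield pages of column names without ever counting past an offset."""
    last = None
    while True:
        if last is None:
            batch = get_slice(columns, "", page_size)
        else:
            # The start bound is inclusive, so ask for one extra column
            # and drop the one already returned on the previous page.
            batch = get_slice(columns, last, page_size + 1)
            if batch and batch[0] == last:
                batch = batch[1:]
            else:
                # The previous last column vanished (e.g. it was deleted).
                batch = batch[:page_size]
        if not batch:
            return
        yield batch
        last = batch[-1]


# Hypothetical row with seven columns, read three at a time.
row = ["c%03d" % i for i in range(7)]
pages = list(iter_pages(row, 3))
```

Each request starts at a known column name, so the cost of a page does not grow with how far into the row the client has read.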

Re: Thrift 7

2011-09-22 Thread aaron morton
http://thrift.apache.org/mailing/ Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 23/09/2011, at 4:50 AM, Suman Ghosh wrote: > Thanks Sylvain / Aaron! > > Can you tell me how to join the "Thrift mailing list"? > > Thanks, > Suma

Re: Cassandra reconfiguration

2011-09-22 Thread aaron morton
> Our RF is 2 Are you using the SimpleStrategy or the NetworkTopologyStrategy? I assumed NTS in a multi DC setup. I think I sent that before my brain caught up. I was thinking that if you had an NTS setup like [ {DC1 : 3}, {DC2: 3}, {DC3:3}] you would change it to [ {DC1 :3}, {DC2:3}, {D

Re: progress of sstableloader keeps 0?

2011-09-22 Thread Yan Chunlu
Sorry, I had not looked into it. After checking, I found a version mismatch exception in the log: ERROR [Thread-17] 2011-09-22 08:24:24,248 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-17,5,main] java.lang.RuntimeException: Cannot recover SSTable /disk2/cassandra

Re: is it possible for light-traffic CF to hold down many commit logs?

2011-09-22 Thread Philippe
It sure looks like what I'm seeing on my cluster, where a 100G commit log partition fills up in 12 hours (0.8.x). On 23 Sept. 2011 03:45, "Yang" wrote: > in 1.0.0 we don't have memtable_throughput for each individual CF , > and instead > which memtable/CF to flush is determined by "largest > getT

Re: is it possible for light-traffic CF to hold down many commit logs?

2011-09-22 Thread Yang
Thanks for the input. If that's the case, I think the solution would be to sort the CFs to flush by more complex criteria than just size. For example, the number of dirty commit logs that contain this CF should be considered as a score. Yang On Thu, Sep 22, 2011 at 10:40 PM, Philippe wrote: >

Re: Possibility of going OOM using get_count

2011-09-22 Thread Boris Yen
On Fri, Sep 23, 2011 at 12:28 PM, aaron morton wrote: > Offsets have been discussed previously. IIRC the main concerns were > either: > > There is no way to reliably count to start the offset, i.e. we do not lock > the row > In the new get_count function, Cassandra does the internal paging in