750Gb compaction task

2014-03-12 Thread Plotnik, Alexey
After rebalance and cleanup I have a leveled CF (SSTable size = 100MB) and a compaction task that is going to process ~750GB: > root@da1-node1:~# nodetool compactionstats pending tasks: 10556 compaction type keyspace column family completed total unit progr

Re: Opscenter help?

2014-03-12 Thread Jack Krupansky
Please do use Stack Overflow - that is the appropriate forum for OpsCenter support (unless you are a DataStax customer). Use the OpsCenter tag: http://stackoverflow.com/tags/opscenter/info -- Jack Krupansky -Original Message- From: Drew from Zhrodague Sent: Wednesday, March 12, 2014

Re: Driver documentation questions

2014-03-12 Thread Alex Popescu
While this is a question that would fit better on the Java driver group [1], I'll try to provide a very short answer: 1. Cluster is a long-lived object and the application should have only 1 instance 2. Session is also a long-lived object and you should try to have 1 Session per keyspace. A
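The "one long-lived instance, one Session per keyspace" advice above can be sketched in a language-neutral way. This is only an illustration in Python, not the Java driver's API: `SessionRegistry`, `session_for`, and the `connect` callable are hypothetical names, and `connect` stands in for whatever call actually opens a session for a keyspace.

```python
from typing import Callable, Dict, Generic, TypeVar

S = TypeVar("S")


class SessionRegistry(Generic[S]):
    """Hypothetical helper: caches one long-lived session per keyspace.

    `connect` is an assumed stand-in for the real call that opens a
    session; it is invoked at most once per keyspace name.
    """

    def __init__(self, connect: Callable[[str], S]) -> None:
        self._connect = connect
        self._sessions: Dict[str, S] = {}

    def session_for(self, keyspace: str) -> S:
        # Reuse the existing session instead of opening a new one per request.
        if keyspace not in self._sessions:
            self._sessions[keyspace] = self._connect(keyspace)
        return self._sessions[keyspace]


# Create the registry once at application startup and share it.
registry = SessionRegistry(lambda keyspace: object())
first = registry.session_for("demo_keyspace")
second = registry.session_for("demo_keyspace")
print(first is second)
```

The sketch is not thread-safe; a real application would guard the cache with a lock or build all sessions at startup.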

Re:

2014-03-12 Thread David McNelis
Not knowing anything about your data structure (to expand on what Edward said), you could be running into something where you've got some hot keys that are getting the majority of writes during those heavy loads. More specifically, I might look for a single key that you're writing, since you're

Re:

2014-03-12 Thread Russ Bradberry
I wouldn't go above 8G unless you have a very powerful machine that can keep the GC pauses low. Sent from my iPhone > On Mar 12, 2014, at 7:11 PM, Edward Capriolo wrote: > > That is too much RAM for Cassandra; make that 6g to 10g. > > The uneven perf could be because your requests do not shar

Re:

2014-03-12 Thread Edward Capriolo
That is too much RAM for Cassandra; make that 6g to 10g. The uneven perf could be because your requests do not shard evenly. On Wednesday, March 12, 2014, Batranut Bogdan wrote: > Hello all, > > The environment: > > I have a 6 node Cassandra cluster. On each node I have: > - 32 G RAM > - 24 G RAM

Re: Dead node seen as UP by replacement node

2014-03-12 Thread Paulo Ricardo Motta Gomes
Some further info: I'm not using Vnodes, so I'm using the 1.1 replace node trick of setting the initial_token in the cassandra.yaml file to the value of the dead node's token -1, and autobootstrap=true. However, according to the Apache wiki ( https://wiki.apache.org/cassandra/Operations#For_versio
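The token arithmetic described above (the dead node's token minus 1) can be sketched as follows. This is a minimal illustration only: `replacement_token` is a hypothetical helper, and the example token is an assumed value, not one taken from this thread.

```python
def replacement_token(dead_node_token: int) -> int:
    """Return the dead node's token minus 1, as described in the message."""
    return dead_node_token - 1


# Hypothetical token for the dead node (assumed value, for illustration only).
dead_node_token = 2**126

# Value to place in cassandra.yaml as initial_token on the replacement node.
print(f"initial_token: {replacement_token(dead_node_token)}")
```

Python integers are arbitrary precision, so large token values are handled without overflow.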

Dead node seen as UP by replacement node

2014-03-12 Thread Paulo Ricardo Motta Gomes
Hello, I'm trying to replace a dead node using the procedure in [1], but the replacement node initially sees the dead node as UP, and after a few minutes the node is marked as DOWN again, failing the streaming/bootstrap procedure of the replacement node. This dead node is always seen as DOWN by th

[no subject]

2014-03-12 Thread Batranut Bogdan
Hello all, The environment: I have a 6 node Cassandra cluster. On each node I have: - 32 G RAM - 24 G RAM for cassa - ~150 - 200 MB/s disk speed - tomcat 6 with axis2 webservice that uses the datastax java driver to make asynch reads / writes - replication factor for the keyspace is 3 All nodes

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Edward Capriolo
This brainstorming idea has already been -1 ed in jira. ROFL. On Wed, Mar 12, 2014 at 12:26 PM, Tupshin Harper wrote: > OK, so I'm greatly encouraged by the level of interest in this. I went > ahead and created https://issues.apache.org/jira/browse/CASSANDRA-6846, > and will be starting to look

Re: Java heap size does not change on Windows

2014-03-12 Thread Tyler Hobbs
cassandra-env.sh is only used on *nix systems. You'll need to change bin/cassandra.bat. Interestingly, that's hardcoded to use a 1G heap, which seems like a bug. On Wed, Mar 12, 2014 at 2:40 PM, Lukas Steiblys wrote: > I am running Windows Server 2008 R2 Enterprise on a 2 Core Intel Xeon > w

Java heap size does not change on Windows

2014-03-12 Thread Lukas Steiblys
I am running Windows Server 2008 R2 Enterprise on a 2 Core Intel Xeon with 16GB of RAM and I want to change the max heap size. I set MAX_HEAP_SIZE in cassandra-env.sh, but when I start Cassandra, it’s still reporting: INFO 12:37:36,221 Global memtable threshold is enabled at 247MB INFO 12:37:36,

Opscenter help?

2014-03-12 Thread Drew from Zhrodague
I am having a hard time installing the Datastax Opscenter agents on EL6 and EL5 hosts. Where is an appropriate place to ask for help? Datastax has moved their forums to Stack Exchange, which seems to be a waste of time, as I don't have enough reputation points to properly tag my questions.

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Robert Coli
On Wed, Mar 12, 2014 at 9:10 AM, Edward Capriolo wrote: > Again, I am glad that the project has officially ended support for thrift > with this clear decree. For years the project kept saying "Thrift is not > going anywhere". It was obviously meant literally like the project would do > the absolut

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Edward Capriolo
@Tupshin I like that approach, right now I think of that piece as the "StorageProxy". I agree, over the years people have taken that approach. Solandra is a good example and I am guessing DSE SOLR works this way. This says something about the entire "thrift vs cql" thing as there are clearly po

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Nate McCall
Awesome! Thanks Tupshin (and everyone else). I'll put some of my thoughts up there shortly. On Wed, Mar 12, 2014 at 11:26 AM, Tupshin Harper wrote: > OK, so I'm greatly encouraged by the level of interest in this. I went > ahead and created https://issues.apache.org/jira/browse/CASSANDRA-6846, >

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Tupshin Harper
OK, so I'm greatly encouraged by the level of interest in this. I went ahead and created https://issues.apache.org/jira/browse/CASSANDRA-6846, and will be starting to look into what the interface would have to look like. Anybody feel free to continue the discussion here, email me privately, or comm

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
@Tupshin LOL, there's always enough rope to hang oneself. I agree it's badly needed for folks that really do need more "messy" queries. I was just discussing a similar concept with a co-worker and going over the pros/cons of various approaches to realizing the goal. I'm still digging into Presto. I

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
@Nate & Tupshin, glad to help where I can On Wed, Mar 12, 2014 at 12:14 PM, Russell Bradberry wrote: > @Nate, @Tupshin, this is pretty close to what I had in mind. I would be > open to helping out with a formal proposal. > > > > On March 12, 2014 at 12:11:41 PM, Tupshin Harper (tups...@tupshin.c

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Tupshin Harper
Peter, I didn't specifically call it out, but the interface I just proposed in my last email would be very much with the goal of "make writing complex queries less painful and more efficient." by providing a deep integration mechanism to host that code. It's very much an "enough rope to hang ourse

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Russell Bradberry
@Nate, @Tupshin, this is pretty close to what I had in mind. I would be open to helping out with a formal proposal. On March 12, 2014 at 12:11:41 PM, Tupshin Harper (tups...@tupshin.com) wrote: I agree that we are way off the initial topic, but I think we are spot on the most important topic.

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
@Nate I don't want to change the separation of components in Cassandra. My ultimate goal is "make writing complex queries less painful and more efficient." How that becomes reality is anyone's guess. There are different ways to get there. I also like having a pluggable transport layer, which is why I

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Tupshin Harper
I agree that we are way off the initial topic, but I think we are spot on the most important topic. As seen in various tickets, including #6704 (wide row scanners), #6167 (end-slice termination predicate), the existence of intravert-ug (Cassandra interface to intravert), and a number of others, the

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Edward Capriolo
Great points about the CQL driver and the supposed spec. It shows how a driver living outside the project poses a problem to open source development. How could custom types have been implemented without a spec? In the apache world the saying is "If it did not happen on the list, it did not happen."

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Nate McCall
IME/O one of the best things about Cassandra was the separation of (and I'm over-simplifying a bit, but still): - The transport/API layer - The Datacenter layer - The Storage layer > I don't think we're well-served by the "construction kit" approach. > It's difficult enough to evaluate NoSQL wit

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
@Theo I totally understand that. Spending time to maintain support for 2 different protocols is a significant overhead. From my own experience contributing to open source projects, time is the biggest limiting factor. From my biased perspective, CQL can be extended with additional features so that query v

Re: NetworkTopologyStrategy ring distribution across 2 DC

2014-03-12 Thread Ramesh Natarajan
Thanks. The error is gone if I specify the keyspace name. However the replicas in the ring output is not correct. Shouldn't it say 3 because I have DC1:3, DC2:3 in my schema? thanks Ramesh Datacenter: DC1 == Replicas: 2 Address Rack Status State Load Owns T

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Theo Hultberg
Speaking as a CQL driver maintainer (Ruby) I'm +1 for end-of-lining Thrift. I agree with Edward that it's unfortunate that there are no official drivers being maintained by the Cassandra maintainers -- even though the current state with the Datastax drivers is in practice very close (it is not the

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
I'm enjoying the discussion also. @Brian I've been looking at spark/shark along with other recent developments the last few years. Berkeley has been doing some interesting stuff. One reason I like Thrift is for type safety and the benefits for query validation and query optimization. One could do

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Russell Bradberry
I would love to help with the REST interface, however my point was not to add REST into Cassandra. My point was that if we had an abstract interface that even CQL used to access data, and this interface was made available for other drop in modules to access, then the project becomes extensible

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Brian O'Neill
just when you thought the thread died… First, let me say we are *WAY* off topic. But that is a good thing. I love this community because there are a ton of passionate, smart people. (often with differing perspectives ;) RE: Reporting against C* (@Peter Lin) We've had the same experience. Pig

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
yes, I was looking at intravert last night. For the kinds of reports my customers ask us to do, joins and subqueries are important. Having tried to do a simple join in PIG, the level of pain is high. I'm a masochist, so I don't mind breaking a simple join into multiple MR tasks, though I do find m

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread DuyHai Doan
"I would love to see Cassandra get to the point where users can define complex queries with subqueries, like, group by and joins" --> Did you have a look at Intravert ? I think it does union & intersection on server side for you. Not sure about join though.. On Wed, Mar 12, 2014 at 12:44 PM, Pete

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
Hi Ed, I agree Solr is deeply integrated into DSE. I've looked at Solandra in the past and studied the code. My understanding is DSE uses Cassandra for storage and the user has both APIs available. I do think it can be integrated further to make moderate to complex queries easier and probably fast