Re: Can SSTables overlap with SizeTieredCompactionStrategy?

2014-05-28 Thread Aaron Morton
cold_reads_to_omit defaults to 0.0, which disables the feature, so it may not have been responsible in this case. There are a couple of things that could explain the difference: * after nodetool compaction there was one SSTable, so one -Filter.db file rather than 8 that each had 700 entries. Ho

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread user 01
Personally I like thrift based APIs more than the CQL based ones, as they are more intuitive & easier to understand in relation to cassandra's internal storage design. Using CQL to deal with dynamic columns/wide rows is really not intuitive or easily graspable as of now. On Thu, May 29, 2014 at 12:32 AM, Pe

Re: Tombstones on secondary indexes

2014-05-28 Thread Robert Coli
On Thu, May 15, 2014 at 1:43 AM, Joel Samuelsson wrote: > https://issues.apache.org/jira/browse/CASSANDRA-4314 seems to say that > tombstones on secondary indexes are not removed by a compaction. Do I need > to do it manually? > The ticket you have pasted says : "It's not exposed through nodetoo

Re: Changing default_time_to_live

2014-05-28 Thread Robert Coli
On Mon, May 19, 2014 at 5:51 AM, Keith Wright wrote: > Hi all, we are using C* 2.0.6 and have set the default_time_to_live > parameter on a number of our LCS column families. I was wondering what > would happen if we were to decrease this value via a table alter. Would > subsequent compactions

Re: Cassandra CSV & JSON uploader

2014-05-28 Thread Robert Coli
On Tue, May 27, 2014 at 8:33 PM, Samir Faci wrote: > http://www.datastax.com/docs/1.0/references/sstable2json might be what > you're looking for. It's in the bin folder of your cassandra installation. > > Though I really doubt you'd want to just drop what is in Oracle into > cassandra. SQL to

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Peter Lin
I don't think anyone can predict the future. CQL is nice, but there's still lots of room for improvement. There's a reason why projects like spark, shark, impala and presto exist. I would expect something to replace CQL in the future as things evolve. Plus, the type safety that thrift clients shou

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Robert Coli
On Wed, May 28, 2014 at 7:19 AM, user 01 wrote: > 2. *What is the future of thrift based APIs *(& specifically Astyanax) ? > Logic suggests that the thrift based API will ultimately be removed, at the time in the future when the cost of working around its moribund area of the code exceeds the c

alternative vnode upgrade strategy?

2014-05-28 Thread William Oberman
I'm concerned about the bad reports of using shuffle to do a vnode upgrade (and I did a "smoke test" trying shuffle on a test cluster, and had out of disk space issues). I then started to plan out the "dual DC" upgrade path, but I wonder if this option is easier: Starting point: N node cluster, no v

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Peter Reilly
According to the Astyanax blog, the project will support the java-driver - https://github.com/Netflix/astyanax/wiki/Astyanax-over-Java-Driver On Wed, May 28, 2014 at 8:36 AM, Andrew wrote: > 1. Astyanax does not *officially* support 2.0, and I’m not sure what the > future plans are for them. > 2.

RE: Performance migrating from MySQL to C*

2014-05-28 Thread moshe.kranc
Just looking at the data modeling issue: Your queries seem to always be for a single dataName. So, that should be the main part of the row key. Within that, it seems you need to be able to select a range based on time. So, time should be the primary sort key for the column name. Based on those

Re: Avoiding High Cell Tombstone Count

2014-05-28 Thread Charlie Mason
Hi all, Thanks to all for the info. I think Nate's suggestion was what I was trying to articulate in my question. Just to confirm: if I add a timeuuid as a row level primary key and reverse the clustering, so it's stored newest first, I can query it by just the partition key with a limit of 1.
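
The pattern Charlie describes can be illustrated with a minimal in-memory sketch. This is not Cassandra or driver code: the `Ledger` class, its method names, and the sample keys are all hypothetical, and Python's `uuid.uuid1()` merely stands in for a CQL timeuuid. It only shows why append-only rows kept newest first let a "partition key only, limit 1" read return the latest value without any updates or deletes.

```python
import itertools
import uuid
from collections import defaultdict


class Ledger:
    """Hypothetical in-memory stand-in for a table clustered by timeuuid, newest first."""

    def __init__(self):
        # partition key -> list of (timeuuid, sequence number, value), kept newest first
        self._partitions = defaultdict(list)
        # Insertion counter, used only to break ties between equal timestamps.
        self._seq = itertools.count()

    def append(self, partition_key, value):
        # Every write is a new row; nothing is overwritten or deleted,
        # so this access pattern does not itself produce tombstones.
        event_id = uuid.uuid1()  # version 1 UUID, which embeds a timestamp
        rows = self._partitions[partition_key]
        rows.append((event_id, next(self._seq), value))
        # Emulate reversed clustering order: embedded timestamp, descending.
        rows.sort(key=lambda row: (row[0].time, row[1]), reverse=True)
        return event_id

    def latest(self, partition_key, limit=1):
        # Analogous in spirit to querying by partition key only with a limit.
        return [value for _, _, value in self._partitions[partition_key][:limit]]


ledger = Ledger()
ledger.append("account-42", "balance=10")
ledger.append("account-42", "balance=25")
print(ledger.latest("account-42"))
```

Re-sorting on every append is only for brevity in the sketch; in the real schema the ordering would come from the clustering order of the table rather than from client code.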

Re: Performance migrating from MySQL to C*

2014-05-28 Thread DuyHai Doan
Hello Simon, There is definitely a data modeling issue there. Your first data model was quite good, except for the use of a map. Collections in C* are not meant to be used to store lots of values because they are loaded entirely in memory server side every time you access them. An alternative for

Re: Avoiding High Cell Tombstone Count

2014-05-28 Thread Nate McCall
You could turn gc_grace_seconds down to zero and tune compaction options for this CF to keep the tombstone count down. But... This query looks a lot like a ledger. If that is so, treat it as such and skip the updates by: - modifying the schema to include a timeuuid as part of a compound key (and

Performance migrating from MySQL to C*

2014-05-28 Thread Simon Chemouil
Hi, First, sorry for the length of this mail. TL;DR: DataModeling timeseries with an extra dimension, and C* not handling stress well; MySQL doesn't scale as well but handles the queries way better on similar hardware. == Context: We've been evaluating Cassandra for a while now (~1 m

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Andrew
1. Astyanax does not *officially* support 2.0, and I’m not sure what the future plans are for them. 2. Thrift is deprecated but not removed.  However,  3. It’s an open-source project based on Netflix’ internal usage of it (they wrote Astyanax).  There is a mailing list available for questions (h

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Peter Lin
I contribute to Hector. It is still being maintained. I still see benefits in using thrift over CQL. On Wed, May 28, 2014 at 10:19 AM, user 01 wrote: > Currently I am using Hector which is no longer maintained by its > developers. So, for the past few days I have been looking at Astyanax & to > be

Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread user 01
Currently I am using Hector which is no longer maintained by its developers. So, for the past few days I have been looking at Astyanax & to be fair, I think I'm just loving its API. For some time now I have also been looking at the CQL Java driver maintained by Datastax but right now, I don't very much love/u

Re: Multi-DC Repairs and Token Questions

2014-05-28 Thread Rameez Thonnakkal
As Chovatia mentioned, the keyspaces seem to be different. Try "Describe keyspace SN_KEYSPACE" and "describe keyspace MY_KEYSPACE" from CQL. This will give you an idea about how many replicas there are for these keyspaces. On Wed, May 28, 2014 at 11:49 AM, chovatia jaydeep < chovatia_jayd...@ya

What are the advantages of static column family over a dynamic column family?

2014-05-28 Thread user 01
What are the advantages of static column family over a dynamic column family? Otherwise why shouldn't I make all my column families dynamic for the sake of ease? Do static column families save disk space or offer better read/write performance?

Re: Cassandra CSV & JSON uploader

2014-05-28 Thread Peter Lin
I think it's important to remember that distributed caches are different from NoSQL databases. As much as people like to think both of them are hammers, they're not. The kinds of workloads each is good at are different, so let's not recommend people misuse and abuse cassandra, dse or coherence. On T