Thanks Jim. I think you understand the pain of migrating TBs of data to new 
tables. There is no command to change from compact to non-compact storage, and 
the fastest way to migrate the data, using Spark, is too slow for production. 
And the pain gets bigger when your performance dips after moving to a 
non-compact storage table. That's because non-compact storage is quite an 
inefficient storage format until 3.x, and it incurs a heavy penalty on row scan 
performance in analytics workloads. Please go through the link to understand 
how the old compact storage gives much better performance than non-compact 
storage as far as row scans are concerned:
The flexibility of CQL comes at a heavy cost until 3.x.

Thanks
Anuj
Sent from Yahoo Mail on Android 
  On Mon, 11 Apr, 2016 at 10:35 PM, Jim Ancona<> wrote:   
Jack, the Datastax link he posted says that for column families 
with mixed dynamic and static columns: "The only solution to be able to access 
the column family fully is to remove the declared columns from the thrift 
schema altogether..." I think that page describes the problem and the potential 
solutions well. I haven't seen an answer to Anuj's question about why the 
native CQL solution using collections doesn't perform as well.
Keep in mind that some of us understand CQL just fine but have working pre-CQL 
Thrift-based systems storing hundreds of terabytes of data and with 
requirements that mean that saying "bite the bullet and re-model your data" is 
not really helpful. Another quote from that Datastax link: "Thrift isn't going 
anywhere." Granted that that link is three-plus years old, but Thrift *is* 
now going away, so it's not unexpected that people will be trying to figure out 
how to deal with that. It's bad enough that we need to rewrite our clients to 
use CQL instead of Thrift. It's not helpful to say that we should also re-model 
and migrate all our data.
On Mon, Apr 11, 2016 at 11:29 AM, Jack Krupansky <> wrote:

Sorry, but your message is too confusing - you say "reading dynamic columns in 
CQL" and "make the table schema less", but neither has any relevance to CQL! 1. 
CQL tables always have schemas. 2. All columns in CQL are statically declared 
(even maps/collections are statically declared columns.) Granted, it is a 
challenge for Thrift users to get used to the terminology of CQL, but it is 
required. If necessary, review some of the free online training videos for data 
Unless your data model is very simple and does directly translate into CQL, you 
probably do need to bite the bullet and re-model your data to exploit the 
features of CQL rather than fight CQL trying to mimic Thrift per se.
In any case, take another shot at framing the problem and then maybe people 
here can help you out.
-- Jack Krupansky
On Mon, Apr 11, 2016 at 10:39 AM, Anuj Wadehra <> wrote:

Any comments or suggestions on this one? 

 On Sun, 10 Apr, 2016 at 11:39 PM, Anuj Wadehra<> wrote:  
We are on 2.0.14 and Thrift. We are planning to migrate to CQL soon but are 
facing some challenges.
We have a cf with a mix of statically defined columns and dynamic columns 
(created at run time). For reading dynamic columns in CQL, we have two options:
1. Drop all columns and make the table schema less. This way, we will get a CQL 
row for each column defined for a row key, as mentioned here:
2. Migrate the entire data to a new non-compact storage table and create 
collections for dynamic columns in the new table.
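
To make the two shapes concrete, here is a minimal Python sketch of how the same Thrift row might look under each option. It only models the row layout in memory and does not talk to Cassandra. The names `key`, `column1` and `value` are, to my understanding, the default CQL names for a compact table with no declared columns; `dynamic_cols`, the function names and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def as_compact_rows(row_key: str, columns: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Option 1: with no declared columns, every Thrift column of a row key
    is seen from CQL as its own row of the form (key, column1, value)."""
    # sorted() stands in for the comparator ordering of column names;
    # this assumes a plain string comparator.
    return [(row_key, name, value) for name, value in sorted(columns.items())]


def as_collection_row(row_key: str, columns: Dict[str, str],
                      static_names: Set[str]) -> Dict[str, object]:
    """Option 2: one CQL row per row key. Declared (static) columns stay
    regular columns; everything else goes into one map collection."""
    row: Dict[str, object] = {"key": row_key}
    dynamic: Dict[str, str] = {}
    for name, value in columns.items():
        if name in static_names:
            row[name] = value
        else:
            dynamic[name] = value
    row["dynamic_cols"] = dynamic  # hypothetical map<text, text> column
    return row


# Hypothetical Thrift row: two static columns plus two run-time columns.
thrift_columns = {
    "name": "alice",
    "email": "a@example.com",
    "attr:2016-04-10": "x",
    "attr:2016-04-11": "y",
}

compact_rows = as_compact_rows("user1", thrift_columns)
collection_row = as_collection_row("user1", thrift_columns, {"name", "email"})
```

In this sketch option 1 yields four CQL rows for `user1`, one per Thrift column, while option 2 yields a single row whose `dynamic_cols` map holds the two run-time columns.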
In our case, we have observed that approach 2 makes the range scan queries used 
by Spark 3 times slower. This is not acceptable. Cassandra 3 has an optimized 
storage engine, but we are not comfortable moving to 3.x in production.
Moreover, data migration to a new table using Spark takes hours. 

Any suggestions for the two issues?



