I work with Amir, and after further experimentation I can shed a little more light on what exactly is going on under the hood. For background, our goal is to take data that is currently read and written via thrift, switch reads to CQL, and then switch writes to CQL. This is an alternative to deleting all of our data and starting over, or being forever stuck on super old thrift clients (both of those options obviously suck). The data models involved are absurdly simple (a single key with a handful of static columns).
TL;DR: Metadata is complicated. What is the least dangerous way to make direct changes to system.schema_columnfamilies and system.schema_columns?

Anyway, given some super simple Foo and Bar column families:

```
create keyspace Test
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = {replication_factor:1};
use Test;
create column family Foo
  with comparator = UTF8Type
  and key_validation_class = UTF8Type
  and column_metadata = [{column_name: title, validation_class: UTF8Type}];
create column family Bar
  with comparator = UTF8Type
  and key_validation_class = UTF8Type;
update column family Bar
  with column_metadata = [{column_name: title, validation_class: UTF8Type}];
```

(The salient difference, as described by Amir, is when the column_metadata is set: at creation time, or afterwards.)

Now we can inject a little data and see that from thrift everything looks fine:

```
[default@Test] set Foo['testkey']['title']='mytitle';
Value inserted.
Elapsed time: 19 msec(s).
[default@Test] set Bar['testkey']['title']='mytitle';
Value inserted.
Elapsed time: 4.47 msec(s).
[default@Test] list Foo;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: testkey
=> (name=title, value=mytitle, timestamp=1452108082972000)

1 Row Returned.
Elapsed time: 268 msec(s).
[default@Test] list Bar;
Using default limit of 100
Using default cell limit of 100
-------------------
RowKey: testkey
=> (name=title, value=mytitle, timestamp=1452108093739000)

1 Row Returned.
Elapsed time: 9.3 msec(s).
```

But from CQL, Bar does not look like the data we wrote:

```
cqlsh> select * from "Test"."Foo";

 key     | title
---------+---------
 testkey | mytitle

(1 rows)

cqlsh> select * from "Test"."Bar";

 key     | column1 | value            | title
---------+---------+------------------+---------
 testkey | title   | 0x6d797469746c65 | mytitle
```

It's not just that these phantom columns are ugly; CQL thinks column1 is part of a composite primary key. Since there **is no column1**, that renders the data un-query-able with WHERE clauses.

Just to make sure it's not thrift that is doing something unexpected, the sstables show the expected structure:

```
$ ./tools/bin/sstable2json /data/sstables/data/Test/Foo-d3348860b4af11e5b456639406f48f1b/Test-Foo-ka-1-Data.db
[
{"key": "testkey",
 "cells": [["title","mytitle",1452110466924000]]}
]
```

So what appeared to be an innocent variation made years ago, when the thrift schema was written, produces very different results in CQL.
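If I understand the dense interpretation correctly, the cell name is exposed through the phantom clustering column, so a fully keyed read against Bar would have to look roughly like the following. This is just a sketch of the workaround query to make the problem concrete, not how we want application reads to look:

```
-- Sketch: addressing the thrift cell through Bar's phantom clustering column.
-- "column1" carries the cell name and "value" the raw cell bytes, which is why
-- the select * output above shows both alongside the declared "title" column.
SELECT key, column1, value, title
FROM "Test"."Bar"
WHERE key = 'testkey' AND column1 = 'title';
```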
Digging into the schema tables shows what is going on in more detail:

```
cqlsh> select keyspace_name, columnfamily_name, column_aliases, comparator, is_dense, key_aliases, value_alias
       from system.schema_columnfamilies where keyspace_name='Test';

 keyspace_name | columnfamily_name | column_aliases | comparator                                | is_dense | key_aliases | value_alias
---------------+-------------------+----------------+-------------------------------------------+----------+-------------+-------------
 Test          | Bar               | ["column1"]    | org.apache.cassandra.db.marshal.UTF8Type  | True     | ["key"]     | value
 Test          | Foo               | []             | org.apache.cassandra.db.marshal.UTF8Type  | False    | ["key"]     | null

cqlsh> select keyspace_name, columnfamily_name, column_name, validator
       from system.schema_columns where keyspace_name='Test';

 keyspace_name | columnfamily_name | column_name | validator
---------------+-------------------+-------------+--------------------------------------------
 Test          | Bar               | column1     | org.apache.cassandra.db.marshal.UTF8Type
 Test          | Bar               | key         | org.apache.cassandra.db.marshal.UTF8Type
 Test          | Bar               | title       | org.apache.cassandra.db.marshal.UTF8Type
 Test          | Bar               | value       | org.apache.cassandra.db.marshal.BytesType
 Test          | Foo               | key         | org.apache.cassandra.db.marshal.UTF8Type
 Test          | Foo               | title       | org.apache.cassandra.db.marshal.UTF8Type
```

Now the interesting bit is that the metadata can be manually "fixed":

```
UPDATE system.schema_columnfamilies
  SET column_aliases='[]', is_dense=False, value_alias=null
  WHERE keyspace_name='Test' AND columnfamily_name='Bar';
DELETE FROM system.schema_columns
  WHERE keyspace_name='Test' AND columnfamily_name='Bar' AND column_name='column1';
DELETE FROM system.schema_columns
  WHERE keyspace_name='Test' AND columnfamily_name='Bar' AND column_name='value';
```

(drain and restart)

```
cqlsh> select * from "Test"."Bar";

 key     | title
---------+---------
 testkey | mytitle
```

The problem with that approach is that manually editing the local schema tables on a live cluster is wildly dangerous. I *think* this would work (sketched after the questions below):

* Make triple sure no schema changes are happening on the cluster.
* On each node: update the schema tables --> drain --> restart.

Questions:

* Is there a better/safer way to correct the problem without some sort of bulk import/export?
* Is it possible to choose which node resetlocalschema pulls from?
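For concreteness, here is roughly the per-node procedure I have in mind. It is only a sketch: the hostnames, the ssh access, and the service name are placeholders, and it assumes the UPDATE/DELETE statements above are saved in a fix.cql file and that cqlsh's writes to the system keyspace stay on the node it connects to.

```
# Rough per-node sketch, one node at a time, with no concurrent schema changes
# happening anywhere on the cluster.
for node in cass-node1 cass-node2 cass-node3; do   # placeholder hostnames
    cqlsh "$node" -f fix.cql                       # patch that node's local schema tables
    ssh "$node" nodetool drain                     # flush and stop accepting writes
    ssh "$node" sudo service cassandra restart     # reload schema from the edited tables
done
```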