Hi Vladimir,
Thanks for the response. I assume then that it is safe to remove the 
directories whose IDs are not current according to the system_schema.tables 
table. I have dozens of directories for the same table, and I haven't dropped 
and re-created it nearly that many times. 
Do any of the nodetool or other commands clean up these unused directories?

Thanks,
Jason Kania

From: Vladimir Yudovin <vla...@winguzone.com>
To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com>
Sent: Saturday, October 8, 2016 2:05 PM
Subject: Re: Understanding cassandra data directory contents
   
Each table has a unique ID (the directory name suffix). If you drop a table and 
then recreate it with the same name, it gets a new ID.

Try
SELECT keyspace_name, table_name, id FROM system_schema.tables;
to determine the actual ID.

You can limit the request to a specific keyspace or table.
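
As a rough sketch of how the comparison could be done once you have the query 
results, the Python below sorts directory names into current, stale and 
unknown. It assumes the directory suffix is the table id written without 
dashes, as in the directory names quoted in this thread. The helper names 
(dir_suffix, classify_dirs) and the choice of which id is the live one are 
made up for illustration, and nothing here talks to Cassandra or touches the 
disk.

```python
import uuid


def dir_suffix(table_id):
    """Return the table id as 32 hex digits, i.e. the UUID without dashes."""
    return uuid.UUID(str(table_id)).hex


def classify_dirs(dir_names, current_ids):
    """Sort data directory names into current, stale and unknown.

    dir_names:   directory names of the form <table_name>-<32 hex digits>
    current_ids: dict of table_name -> id, taken from system_schema.tables
    """
    result = {"current": [], "stale": [], "unknown": []}
    for name in dir_names:
        table, sep, suffix = name.rpartition("-")
        if not sep or table not in current_ids:
            # Not in table-suffix form, or a table we have no id for.
            result["unknown"].append(name)
        elif suffix.lower() == dir_suffix(current_ids[table]):
            result["current"].append(name)
        else:
            result["stale"].append(name)
    return result


# Hypothetical input: which id is the live one here is an assumption.
ids = {"periodicReading": "76eb7510-0968-11e6-8a74-21c8b9466352"}
dirs = [
    "periodicReading-76eb7510096811e68a7421c8b9466352",
    "periodicReading-453d55a0501d11e68623a9d2b6f96e86",
]
print(classify_dirs(dirs, ids))
```

This only reports. Before deleting a directory listed as stale, it would be 
wise to look inside it and confirm it holds nothing you still want to keep.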


Best regards, Vladimir Yudovin, 
Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.



---- On Sat, 08 Oct 2016 13:42:19 -0400 Jason Kania<jason.ka...@ymail.com> 
wrote ---- 

Hello,
I am using Cassandra 3.0.9 and I have encountered an issue where the nodes in 
my 3-node cluster hold vastly different amounts of data even though they should 
be roughly the same. When I looked through the data directory for my database 
on two of the nodes, I saw a number of directories with the same prefix, e.g.:
periodicReading-76eb7510096811e68a7421c8b9466352, periodicReading-453d55a0501d11e68623a9d2b6f96e86...

For a given table name prefix, only one directory has a current date and the 
rest are older.
In contrast, on the node with the least space used, each directory has a unique 
prefix (not shared).
I am wondering what the contents of a Cassandra database directory should look 
like. Are there supposed to be multiple entries for a given table or just one?
If just one, what would be a procedure to determine whether the other 
directories for the same table are junk that can be removed?

Thanks,
Jason





   
