I believe Aegisthus is open sourced.

Mohammed

From: Jan [mailto:cne...@yahoo.com]
Sent: Monday, January 26, 2015 11:20 AM
To: user@cassandra.apache.org
Subject: Re: Controlling the MAX SIZE of sstables after compaction

Parth et al.,

the folks at Netflix seem to have built a solution for your problem.
The Netflix Tech Blog: Aegisthus - A Bulk Data Pipeline out of 
Cassandra<http://techblog.netflix.com/2012/02/aegisthus-bulk-data-pipeline-out-of.html>
By Charles Smith and Jeff Magnusson

You may want to chase Jeff Magnusson and check whether the solution is open sourced.
Please report back to this forum if you get an answer to the problem.

hope this helps.
Jan

C* Architect

On Monday, January 26, 2015 11:25 AM, Robert Coli <rc...@eventbrite.com> wrote:

On Sun, Jan 25, 2015 at 10:40 PM, Parth Setya <setya.pa...@gmail.com> wrote:
1. Is there a way to configure the size of sstables created after compaction?

No, won't fix: https://issues.apache.org/jira/browse/CASSANDRA-4897

You could use the "sstablesplit" utility on your One Big SSTable to split it 
into files of your preferred size.
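
As a rough illustration of the sstablesplit suggestion above (a sketch, not part of the original reply): the snippet below only builds the command line and estimates how many output files to expect. The file path and sizes are hypothetical; the `--size` option takes a value in MB, and the tool is meant to be run while Cassandra is stopped on that node. The file count is an estimate, since actual output sizes can vary.

```python
import math

def build_sstablesplit_command(data_files, max_size_mb=50):
    """Return the argv list for splitting the given SSTable data files.

    max_size_mb is passed to sstablesplit's --size option (in MB).
    """
    if max_size_mb <= 0:
        raise ValueError("max_size_mb must be positive")
    return ["sstablesplit", "--size", str(max_size_mb), *data_files]

def estimate_output_count(sstable_size_mb, max_size_mb=50):
    """Estimate how many SSTables a split will produce (rounded up)."""
    return math.ceil(sstable_size_mb / max_size_mb)

# Hypothetical path and sizes, for illustration only.
cmd = build_sstablesplit_command(
    ["/var/lib/cassandra/data/ks/cf/ks-cf-jb-1-Data.db"], max_size_mb=256
)
print(" ".join(cmd))
print(estimate_output_count(10240, 256))  # a 10 GB SSTable split into 256 MB files
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a node where Cassandra is stopped.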

2. Is there a better approach to generate the report?

The major compaction isn't too bad, but something that understands SSTables as 
an input format would be preferable to sstable2json.

3. What are the flaws with this approach?

sstable2json is slow and transforms your data to JSON.

=Rob
