Samal, that's pretty smart stuff.

From: samal [mailto:samalgo...@gmail.com]
Sent: Friday, June 01, 2012 11:24 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra Data Archiving

I believe you are talking about "HDD space" consumed by user-generated data 
which is no longer required after 15 days, or which may be required later.
The first option is TTL, which you don't want to use. The second, as Aaron 
pointed out, is snapshotting the data, but the data still exists in the 
cluster; the snapshot only serves as a backup.

I am thinking of using column family buckets: 15 days per bucket, 2 buckets a 
month.

Create a new cf every 15th day, named with a timestamp marker, 
trip_offer_cf_[ts - ts%(86400*15)], and cache the cf name in the app for 15 
days. After the 15th day the old cf bucket becomes read-only and no writes go 
into it. Snapshot the data of that old cf bucket and delete the cf a few days 
later. This keeps the cf count fixed.
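
A minimal sketch of the bucket naming above, in Python. It assumes ts is a Unix timestamp in seconds; the function names and the idea of looking up the previous bucket are illustrative only, not something from an existing client library.

```python
BUCKET_SECONDS = 86400 * 15  # 15 days, as in the formula above


def bucket_start(ts: int, width: int = BUCKET_SECONDS) -> int:
    """Round a Unix timestamp (seconds) down to the start of its bucket."""
    return ts - ts % width


def bucket_cf_name(ts: int, prefix: str = "trip_offer_cf_") -> str:
    """Name of the cf that receives writes at time ts."""
    return f"{prefix}{bucket_start(ts)}"


def previous_bucket_cf_name(ts: int, prefix: str = "trip_offer_cf_") -> str:
    """Name of the bucket before the current one (the read-only candidate)."""
    return f"{prefix}{bucket_start(ts) - BUCKET_SECONDS}"
```

Buckets computed this way are aligned to the Unix epoch rather than to calendar months, so "2 buckets a month" holds only approximately. The app would call bucket_cf_name(int(time.time())) once per bucket and cache the result.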

current cf count=n,
bucket cf count= b*n

Use a separate cluster for analytics on the old data.

/Samal

On Fri, Jun 1, 2012 at 9:58 AM, Harshvardhan Ojha 
<harshvardhan.o...@makemytrip.com> wrote:
Problem statement:
We are keeping daily generated data (user-generated content) in Cassandra, but 
our application uses only the last 15 days of data. How can we archive data 
older than 15 days so that we can reduce the load on the Cassandra ring?

Note: we can't apply TTL, as this data may be needed in the future.


From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Friday, June 01, 2012 6:57 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra Data Archiving

I'm not sure of your needs, but the simplest thing to consider is snapshotting 
and copying the snapshots off the node.
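
A rough sketch of what that could look like when scripted in Python. The helper names and all paths are hypothetical; the `nodetool snapshot -t <tag> <keyspace>` form should be checked against the nodetool help for your Cassandra version, and the snapshot directory layout differs between versions, so it is passed in rather than guessed.

```python
import tarfile
from pathlib import Path
from typing import List


def snapshot_command(keyspace: str, tag: str) -> List[str]:
    """Build the nodetool call that snapshots one keyspace under a tag.

    Assumed CLI form; run it with subprocess.run(cmd, check=True) on the node.
    """
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def archive_snapshot(snapshot_dir, dest_dir) -> Path:
    """Pack one snapshot directory into a .tar.gz under dest_dir.

    dest_dir stands in for off-node storage (for example a mounted share).
    """
    snapshot_dir = Path(snapshot_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{snapshot_dir.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store files under the snapshot's own directory name.
        tar.add(snapshot_dir, arcname=snapshot_dir.name)
    return archive
```

Once the archive has been copied and verified, the snapshot on the node can be cleared to free the disk space it holds.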

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/06/2012, at 12:23 AM, Shubham Srivastava wrote:


I need to archive my Cassandra data into another permanent storage.

Two intents:

1. To shed the unused data from the live data.

2. To use the archived data for analytics, or as a potential source for a 
data warehouse.

Any recommendations in terms of strategies or tools to use?

Regards,
Shubham Srivastava | Technical Lead - Technology Development

+91 124 4910 548 | MakeMyTrip.com, 243 SP Infocity, 
Udyog Vihar Phase 1, Gurgaon, Haryana - 122 016, India

