We use Spark to do the same, because each of our partitions contains data for a 
whole year and we delete one day at a time. C* does not allow us to delete 
without specifying the partition key. I know it's the wrong data model, but we 
can't change it, for the obvious reason that it would mean redesigning the 
whole application.
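Since CQL cannot delete by a non-key column, a purge like the one described above is typically done in two steps from Spark: scan for the primary keys of the day being purged, then issue keyed deletes. A minimal sketch with the spark-cassandra-connector (2.0+), assuming a hypothetical schema of PRIMARY KEY ((account_id, year), event_date, event_id) in keyspace "ks":

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object PurgeOneDay {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("purge-one-day"))
    val purgeDate = "2018-03-01" // illustrative value

    // 1. Read only the primary-key columns for the day being purged.
    val keysForDay = sc.cassandraTable("ks", "events")
      .select("account_id", "year", "event_date", "event_id")
      .where("event_date = ?", purgeDate)

    // 2. Issue row-level deletes keyed by the full primary key; these
    //    write ordinary tombstones, exactly as a CQL DELETE would.
    keysForDay.deleteFromCassandra("ks", "events",
      keyColumns = SomeColumns("account_id", "year", "event_date", "event_id"))
  }
}
```

The deletes still carry the full partition key per row, so they satisfy the CQL constraint; Spark's role is only to enumerate which keys fall on the purge day.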

Sent from my iPhone

> On Mar 23, 2018, at 2:10 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> 
> I'm confused as to what the difference is between deleting with prepared 
> statements and deleting through Spark. To the best of my knowledge it's the 
> same thing either way - a normal deletion, with tombstones replicated. Is it 
> that you're doing the deletes in the analytics DC instead of your real-time one? 
> 
>> On Fri, Mar 23, 2018 at 11:38 AM Charulata Sharma (charshar) 
>> <chars...@cisco.com> wrote:
>> Hi Rahul,
>> 
>> Thanks for your answer. Why do you say that deleting from Spark is not 
>> elegant? This is exactly the feedback I want - basically, why is it not 
>> elegant?
>> 
>> I can delete either with delete prepared statements or through Spark. The 
>> TTL approach doesn't work for us because, first of all, TTL is set at the 
>> column level, and second, we have business rules for purging that make a TTL 
>> solution not very clean in our case.
>> 
>>  
>> 
>> Thanks,
>> 
>> Charu
>> 
>>  
>> 
>> From: Rahul Singh <rahul.xavier.si...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Thursday, March 22, 2018 at 5:08 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>, 
>> "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: Using Spark to delete from Transactional Cluster
>> 
>>  
>> 
>> Short answer: it works. You can even run "delete" statements from within 
>> Spark once you know which keys to delete. Not elegant, but it works.
>> 
>> It will create a bunch of tombstones, and you may need to spread your deletes 
>> over several days. Another thing to consider: instead of deleting, set a TTL, 
>> which will eventually get the data cleaned up.
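The TTL alternative can also be applied from Spark by (re)writing the affected rows with a TTL rather than deleting them. A sketch using the connector's WriteConf, where the keyspace, table, predicate, and 30-day TTL are all illustrative assumptions:

```scala
import com.datastax.spark.connector._
import com.datastax.spark.connector.writer.{TTLOption, WriteConf}
import org.apache.spark.{SparkConf, SparkContext}

object ExpireInsteadOfDelete {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("expire-rows"))

    // Rows to expire (hypothetical table/columns). Re-writing a row with a
    // TTL lets Cassandra drop it at expiry instead of us issuing deletes now.
    val rowsToExpire = sc.cassandraTable("ks", "events")
      .where("event_date = ?", "2018-03-01")

    rowsToExpire.saveToCassandra("ks", "events",
      writeConf = WriteConf(ttl = TTLOption.constant(30 * 24 * 3600)))
  }
}
```

Note that expired cells still become tombstones at expiry; the gain is that expiry and cleanup happen on Cassandra's schedule rather than in a bulk delete job.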
>> 
>> 
>> --
>> Rahul Singh
>> rahul.si...@anant.us
>> 
>> Anant Corporation
>> 
>> 
>> On Mar 22, 2018, 2:19 PM -0500, Charulata Sharma (charshar) 
>> <chars...@cisco.com>, wrote:
>> 
>> 
>> Hi,
>> 
>> I wanted to know the community's experiences with, and feedback on, using 
>> Apache Spark to delete data from a C* transactional cluster.
>> 
>> We have Spark installed in our analytical C* cluster, and so far we have been 
>> using it only for analytics purposes.
>> 
>>  
>> 
>> However, now with the advanced features of Spark 2.0, I am considering using 
>> the spark-cassandra-connector for deletes instead of a series of delete 
>> prepared statements.
>> 
>> So essentially the deletes will happen on the analytical cluster, and they 
>> will be replicated over to the transactional cluster by means of our keyspace 
>> replication strategy.
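For comparison, the prepared-statement path can itself be driven from inside Spark, using the connector's session management on the executors. A hedged sketch, where the table, key columns, and the keysToPurge RDD of a hypothetical case class are all assumptions:

```scala
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical key type matching PRIMARY KEY ((account_id, year), event_date, event_id).
case class EventKey(accountId: String, year: Int, eventDate: String, eventId: String)

object PreparedDeletes {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("prepared-deletes")
    val sc = new SparkContext(conf)
    val connector = CassandraConnector(conf)
    val keysToPurge = sc.parallelize(Seq.empty[EventKey]) // stand-in for the real key scan

    // The deletes run against the cluster Spark is pointed at (here, the
    // analytical DC) and replicate to the transactional DC through normal
    // keyspace replication.
    keysToPurge.foreachPartition { rows =>
      connector.withSessionDo { session =>
        val stmt = session.prepare(
          "DELETE FROM ks.events WHERE account_id = ? AND year = ? AND event_date = ? AND event_id = ?")
        rows.foreach { k =>
          session.execute(stmt.bind(k.accountId, Int.box(k.year), k.eventDate, k.eventId))
        }
      }
    }
  }
}
```

Either way the write path is identical - keyed CQL deletes producing tombstones - so the replication behavior should be the same as deleting directly on the transactional cluster.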
>> 
>>  
>> 
>> Are there any risks involved in this?
>> 
>>  
>> 
>> Thanks,
>> 
>> Charu
>> 
>>           
