Hi,

If your external API returns only the active records, then I am guessing you
need to do a select * on the active table and compare the results to figure
out which records in the table are no longer active.
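
As a rough sketch of that comparison (in Python; the function name and the
record ids are made up for illustration, and it assumes the ids from the
active table fit in memory):

```python
def find_records_to_archive(stored_active_ids, api_active_ids):
    """Return ids held in the active table but missing from the latest API pull."""
    return set(stored_active_ids) - set(api_active_ids)


# Hypothetical ids: what the active table holds vs. what the API returned this hour.
stored = ["a1", "b2", "c3", "d4"]
from_api = ["a1", "c3", "e5"]

# Records to move from 'active' to 'archive'.
to_archive = find_records_to_archive(stored, from_api)

# Records that are new in this pull and need to be inserted into 'active'.
to_insert = set(from_api) - set(stored)
```

The set difference itself is cheap; the expensive part is likely to be reading
every id out of the active table each hour.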

As you may know, range selects on the partition key tend to time out in
Cassandra. Range queries can, however, be made to work by using a clustering
column.
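
To illustrate why, here is a toy in-memory model in Python. It is not
Cassandra code or any driver API, and the table contents and function names
are invented. It assumes rows are grouped by partition key and kept sorted by
clustering key inside each partition:

```python
import bisect

# Toy model: partition key -> rows kept sorted by clustering key.
table = {
    "source-1": [(1, "r1"), (5, "r2"), (9, "r3")],
    "source-2": [(2, "r4"), (7, "r5")],
}


def slice_partition(table, partition_key, low, high):
    """Rows of one partition with low <= clustering key <= high."""
    rows = table.get(partition_key, [])
    keys = [ck for ck, _ in rows]
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return rows[start:end]


def scan_all_partitions(table, low, high):
    """Same range without a partition key: every partition has to be visited."""
    return [row for pk in table for row in slice_partition(table, pk, low, high)]


in_one_partition = slice_partition(table, "source-1", 4, 9)
across_all = scan_all_partitions(table, 4, 9)
```

The first call touches a single sorted partition. The second has to walk every
partition, which is the pattern that gets slow as the table grows.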

To comment further, we would need to see your proposed Cassandra tables and
the queries that you need to run.

regards




On Thu, Apr 23, 2015 at 9:45 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:

> That's returned by the external API we're querying. We query them for
> active records. If a previously active record isn't included in the
> results, that means it's time to archive that record.
>
> On Thu, Apr 23, 2015 at 9:20 PM, Manoj Khangaonkar <khangaon...@gmail.com>
> wrote:
>
>> Hi,
>>
>> How do you determine if the record is no longer active? Is it a
>> periodic process that goes through every record and checks when the last
>> update happened?
>>
>> regards
>>
>> On Thu, Apr 23, 2015 at 8:09 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>>
>>> Hey all,
>>>
>>> We are working on moving a MySQL-based application to Cassandra.
>>>
>>> The workflow in MySQL is this: we have two tables, active and archive.
>>> Every hour, we pull in data from an external API. The records which are
>>> active are kept in the 'active' table. Once a record is no longer active,
>>> it's deleted from 'active' and re-inserted into 'archive'.
>>>
>>> The reason for that is that most of the time, queries are only run
>>> against the active records rather than the archived ones. Keeping the
>>> active table small may therefore help with faster queries, if it only
>>> has to search 200k records vs 3 million or more.
>>>
>>> Is it advisable to keep the same data model in Cassandra? I'm concerned
>>> about tombstone issues when records are deleted from active.
>>>
>>> Thanks.
>>>
>>
>>
>>
>> --
>> http://khangaonkar.blogspot.com/
>>
>
>


-- 
http://khangaonkar.blogspot.com/
