Re: Data tombstoned during bulk loading 1.2.10 -> 2.0.3

Yuki Morishita Mon, 03 Feb 2014 08:09:15 -0800

If you are using a version < 2.0.4, then you are hitting
https://issues.apache.org/jira/browse/CASSANDRA-6527



On Mon, Feb 3, 2014 at 2:51 AM, olek.stas...@gmail.com
<olek.stas...@gmail.com> wrote:
> Hi All,
> We've seen a very similar effect after upgrading from 1.1.7 to 2.0 (via
> 1.2.10). Probably after upgradesstables (but that's only a guess,
> because we noticed the problem a few weeks later), some rows became
> tombstoned. They just disappear from query results. After
> investigation I've noticed that they are still reachable via sstable2json.
> Example output for a "non-existent" row:
>
> {"key": "6e6e37716c6d665f6f61695f6463","metadata": {"deletionInfo":
> {"markedForDeleteAt":2201170739199,"localDeletionTime":0}},"columns":
> [["DATA","3c6f61695f64633a64(...)",1357677928108]]}
> ]
>
> If I understand correctly, the row is marked as deleted with a timestamp in
> the far future, but it's still on disk. Also, localDeletionTime is
> set to 0, which may mean that it's some kind of internal bug, not the effect
> of a client error. So my question is: is it true that upgradesstables
> may do something like that? How can we find the reason for such strange
> Cassandra behaviour? Is there any way to recover such strangely
> marked nodes?
> This problem affects about 500K of the 14M rows in our database, so
> the percentage is quite big.
> best regards
> Aleksander
>
> 2013-12-12 Robert Coli <rc...@eventbrite.com>:
>> On Wed, Dec 11, 2013 at 6:27 AM, Mathijs Vogelzang <math...@apptornado.com>
>> wrote:
>>>
>>> When I use sstable2json on the sstable on the destination cluster, it has
>>> "metadata": {"deletionInfo":
>>> {"markedForDeleteAt":1796952039620607,"localDeletionTime":0}}, whereas
>>> it doesn't have that in the source sstable.
>>> (Yes, this is a timestamp far into the future. All our hosts are
>>> properly synced through ntp).
>>
>>
>> This seems like a bug in sstableloader, I would report it on JIRA.
>>
>>>
>>> Naturally, copying the data again doesn't work to fix it, as the
>>> tombstone is far in the future. Apart from not having this happen at
>>> all, how can it be fixed?
>>
>>
>> Briefly, you'll want to purge that tombstone and then reload the data with a
>> reasonable timestamp.
>>
>> Dealing with rows with data (and tombstones) in the far future is described
>> in detail here :
>>
>> http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html
>>
>> =Rob
>>



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)
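
To see how many rows are affected, the sstable2json output quoted above can be scanned for row tombstones that shadow their own columns. The following is only a minimal sketch, not part of Cassandra: the function name `shadowed_rows` is hypothetical, and it assumes the JSON layout shown in the quoted example (each column is a list whose third element is the write timestamp). A column is treated as shadowed when its timestamp is not newer than `markedForDeleteAt`, so no assumption about timestamp units is needed.

```python
import json


def shadowed_rows(rows):
    """Yield (key, markedForDeleteAt, localDeletionTime, shadowed_count)
    for every row whose row-level tombstone covers at least one column.

    Hypothetical helper; assumes the sstable2json layout quoted in the thread.
    """
    for row in rows:
        info = row.get("metadata", {}).get("deletionInfo")
        if not info:
            continue  # no row-level tombstone on this row
        marked = info["markedForDeleteAt"]
        # Assumed column layout: [name, value, timestamp, ...]
        shadowed = [c for c in row.get("columns", []) if c[2] <= marked]
        if shadowed:
            yield row["key"], marked, info.get("localDeletionTime"), len(shadowed)


# First row is the one quoted in the thread; the second is a made-up
# healthy row (hypothetical key and value) added for contrast.
sample = json.loads("""[
{"key": "6e6e37716c6d665f6f61695f6463","metadata": {"deletionInfo":
{"markedForDeleteAt":2201170739199,"localDeletionTime":0}},"columns":
[["DATA","3c6f61695f64633a64(...)",1357677928108]]},
{"key": "00", "columns": [["DATA", "00", 1357677928108]]}
]""")

found = list(shadowed_rows(sample))
for key, marked, local, count in found:
    print(key, marked, local, count)
```

For a real sstable, the same function could be fed the parsed contents of a file produced by sstable2json; rows reported with a `localDeletionTime` of 0 match the pattern described in this thread.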
