Thanks. Do you have plans to improve this? I think tombstones should be
separated from live data, since they serve a different purpose: built
into a separate SSTable, or indexed differently. It is quite costly to
filter them out while reading.
Daning
On 10/04/2011 01:34 PM, aaron morton wrote:
I would not set gc_grace_seconds to 0; set it to something small.
gc_grace_seconds (or the TTL) is only the minimum amount of time the
column will stay in the data files. The columns are only purged when
compaction runs, some time after that timespan has ended.
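For example, something like this in cassandra-cli (a sketch only; MyKS
and MyCF are placeholder names, and gc_grace is in seconds):

    $ cassandra-cli -h localhost
    [default@unknown] use MyKS;
    [default@MyKS] update column family MyCF with gc_grace = 3600;

A small non-zero value still leaves a window to repair a briefly down
node before its tombstones are purged.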
If you are seeing issues where a heavy delete workload is having a
noticeably adverse effect on read performance, then you should look at
the data model. Consider ways to spread the write / read / delete
workload over multiple rows.
If you cannot get away from it, then experiment with reducing the
min_compaction_threshold of the CFs so that compaction kicks in
sooner and (potentially) tombstones are purged faster.
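For example (a sketch, with the same placeholder names; 2 is the
lowest value the threshold accepts):

    [default@MyKS] update column family MyCF with min_compaction_threshold = 2;

The trade-off is more frequent minor compactions and the I/O that
comes with them.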
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 5/10/2011, at 6:03 AM, Daning wrote:
Thanks Aaron. How about I set gc_grace_seconds to 0, or to something
like 2 hours? I would like to clean up tombstones sooner; I don't mind
losing some data, and all my columns have a TTL.
If one node is down for longer than gc_grace_seconds and the
tombstones have been removed, then once the node is back up, my
understanding is that the deleted data will be synced back. In that
case my data will be processed twice, which is not a big deal for me.
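(For reference, a TTL is attached per column at write time; a
hypothetical cassandra-cli example, with made-up names and assuming
utf8-friendly keys and values:

    [default@MyKS] set MyCF['row1']['col1'] = 'value1' with ttl = 7200;

The column expires 7200 seconds after the write and is then treated
like a tombstone until compaction removes it.)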
Thanks,
Daning
On 10/04/2011 01:27 AM, aaron morton wrote:
Yes, that's the slice query skipping past the tombstone columns.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 4/10/2011, at 4:24 PM, Daning Wang wrote:
Lots of SliceQueryFilter entries in the log; is that handling tombstones?
DEBUG [ReadStage:49] 2011-10-03 20:15:07,942 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317582939743663:true:4@1317582939933000
DEBUG [ReadStage:50] 2011-10-03 20:15:07,942 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317573253148778:true:4@1317573253354000
DEBUG [ReadStage:43] 2011-10-03 20:15:07,942 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317669552951428:true:4@1317669553018000
DEBUG [ReadStage:33] 2011-10-03 20:15:07,942 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317581886709261:true:4@1317581886957000
DEBUG [ReadStage:52] 2011-10-03 20:15:07,942 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317568165152246:true:4@1317568165482000
DEBUG [ReadStage:36] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317567265089211:true:4@1317567265405000
DEBUG [ReadStage:53] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317674324843122:true:4@1317674324946000
DEBUG [ReadStage:38] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317571990078721:true:4@1317571990141000
DEBUG [ReadStage:57] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317671855234221:true:4@1317671855239000
DEBUG [ReadStage:54] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317558305262954:true:4@1317558305337000
DEBUG [RequestResponseStage:11] 2011-10-03 20:15:07,941
ResponseVerbHandler.java (line 48) Processing response on a
callback from 12347@/10.210.101.104
DEBUG [RequestResponseStage:9] 2011-10-03 20:15:07,941
AbstractRowResolver.java (line 66) Preprocessed data response
DEBUG [RequestResponseStage:13] 2011-10-03 20:15:07,941
AbstractRowResolver.java (line 66) Preprocessed digest response
DEBUG [ReadStage:58] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317581337972739:true:4@1317581338044000
DEBUG [ReadStage:64] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317582656796332:true:4@1317582656970000
DEBUG [ReadStage:55] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317569432886284:true:4@1317569432984000
DEBUG [ReadStage:45] 2011-10-03 20:15:07,941 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317572658687019:true:4@1317572658718000
DEBUG [ReadStage:47] 2011-10-03 20:15:07,940 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317582281617755:true:4@1317582281717000
DEBUG [ReadStage:48] 2011-10-03 20:15:07,940 SliceQueryFilter.java
(line 123) collecting 0 of 1: 1317549607869226:true:4@1317549608118000
DEBUG [ReadStage:34] 2011-10-03 20:15:07,940 SliceQueryFilter.java
(line 123) collecting 0 of 1:
On Thu, Sep 29, 2011 at 2:17 PM, aaron morton
<aa...@thelastpickle.com> wrote:
As with any situation involving the un-dead, it really is the
number of Zombies, Mummies or Vampires that is the concern.
If you delete data there will always be tombstones. If you have
a delete-heavy workload there will be more tombstones. This is
why implementing a queue with cassandra is a bad idea.
gc_grace_seconds (and column TTL) are the *minimum* amount of
time the tombstones will stay in the data files; there is no
maximum.
Your read performance also depends on the number of SSTables
the row is spread over, see
http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/
If you really wanted to purge them, then yes, a repair followed
by a major compaction would be the way to go. Also consider
whether it's possible to design the data model around the
problem, e.g. partitioning rows by date. IMHO I would look to
make data model changes before implementing a compaction policy,
or consider whether cassandra is the right store if you have a
delete-heavy workload.
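For example, with date-partitioned row keys, readers only touch
the current row and whole old rows age out together (a sketch;
the CF name and key format are made up):

    [default@MyKS] set Queue['q1_2011-10-04']['item-001'] = 'payload';
    [default@MyKS] set Queue['q1_2011-10-05']['item-001'] = 'payload';

Deletes then land in rows the readers have already moved past,
so slice queries do not have to wade through the tombstones.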
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 30/09/2011, at 3:27 AM, Daning Wang wrote:
Jonathan/Aaron,
Thank you guys for the replies. I will change GCGracePeriod to
1 day to see what happens.
Is there a way to purge tombstones at any time? Because if
tombstones affect performance, we want them purged right away,
not after GCGracePeriod. We know all the nodes are up, and we
can run a repair first to make sure of consistency before
purging.
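Something like this, I assume (a sketch; MyKS and MyCF are
placeholder names):

    $ nodetool -h localhost repair MyKS MyCF
    $ nodetool -h localhost compact MyKS MyCF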
Thanks,
Daning
On Wed, Sep 28, 2011 at 5:22 PM, aaron morton
<aa...@thelastpickle.com> wrote:
If I had to guess, I would say it was spending time
handling tombstones. If you see it happen again, and are
interested, turn the logging up to DEBUG and look for
messages from something starting with "Slice".
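For example, in conf/log4j-server.properties (a sketch; the
logger name assumes the 0.8 package layout):

    # Log every column the slice query path collects,
    # including tombstones (the column strings with :true:)
    log4j.logger.org.apache.cassandra.db.filter.SliceQueryFilter=DEBUG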
Minor (automatic) compaction will, over time, purge the
tombstones. Until then, reads must read and then discard the
data deleted by the tombstones. If you perform a big (i.e.
100k's) delete, this can reduce performance until
compaction does its thing.
My second guess would be read repair (or the simple
consistency checks on read) kicking in. That would show up
in the "ReadRepairStage" in TPSTATS.
It may have been neither of those two things; these are just
guesses. If you have more issues, let us know and provide
some more info.
Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 29/09/2011, at 6:35 AM, Daning wrote:
> I have an app polling a few CFs (select first N * from
CF). There was data in the CFs, but it was later deleted, so
the CFs were empty for a long time. I found Cassandra CPU
usage climbing as high as 80%, when normally it uses less
than 30%. I issued the select query manually and the
response felt slow. I tried nodetool compact/repair for
those CFs, but that did not help. Later I issued 'truncate'
for all the CFs and CPU usage dropped to 1%.
>
> Can somebody explain why I need to truncate an
empty CF, and what else I could do to bring the CPU usage
down?
>
> I am running 0.8.6.
>
> Thanks,
>
> Daning
>