Re: Cassandra Counters

2012-09-25 Thread Robin Verlangen
From my point of view, another problem with using the "standard column
family" for counting is transactions. Cassandra lacks them, so if you're
updating counters from multiple threads, how will you keep track of that?
Yes, I'm aware of software like Zookeeper to do that, however I'm not sure
whether that's the best option.
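
(To make the race concrete, here is a minimal plain-Java sketch of the
lost-update problem with read-modify-write counting in a standard column
family; the in-memory map below just stands in for Cassandra, it is not a
real client.)

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class LostUpdateDemo {
    // Stands in for a "count" column in a standard column family.
    static final ConcurrentHashMap<String, Long> store =
            new ConcurrentHashMap<String, Long>();

    public static void main(String[] args) throws InterruptedException {
        store.put("page_views", 5L);
        final CountDownLatch start = new CountDownLatch(1);

        Runnable client = new Runnable() {
            public void run() {
                try { start.await(); } catch (InterruptedException ignored) { }
                long current = store.get("page_views"); // both clients may read 5
                store.put("page_views", current + 1);   // both then write back 6
            }
        };

        Thread a = new Thread(client), b = new Thread(client);
        a.start(); b.start();
        start.countDown();
        a.join(); b.join();

        // Expected 7, but may print 6: one increment silently lost.
        System.out.println("final count = " + store.get("page_views"));
    }
}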

I think you should stick with Cassandra counter column families.

Best regards,

Robin Verlangen
*Software engineer*
*
*
W http://www.robinverlangen.nl
E ro...@us2.nl



Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.



2012/9/25 Roshni Rajagopal 

>  Thanks for the reply and sorry for being bull - headed.
>
> Once  you're past the stage where you've decided its distributed, and
> NoSQL and cassandra out of all the NoSQL options,
> Now to count something, you can do it in different ways in cassandra.
> In all the ways you want to use cassandra's best features of availability,
> tunable consistency , partition tolerance etc.
>
> Given this, what are the performance tradeoffs of using counters vs a
> standard column family for counting. Because as I see if the counter number
> in a counter column family becomes wrong, it will not be 'eventually
> consistent' - you will need intervention to correct it. So the key aspect
> is how much faster would be a counter column family, and at what numbers do
> we start seing a difference.
>
>
>
>
>
> --
> Date: Tue, 25 Sep 2012 07:57:08 +0200
> Subject: Re: Cassandra Counters
> From: oleksandr.pet...@gmail.com
> To: user@cassandra.apache.org
>
>
> Maybe I'm missing the point, but counting in a standard column family
> would be a little overkill.
>
> I assume that "distributed counting" here was more of a map/reduce
> approach, where Hadoop (+ Cascading, Pig, Hive, Cascalog) would help you a
> lot. We're doing some more complex counting (e.q. based on sets of rules)
> like that. Of course, that would perform _way_ slower than counting
> beforehand. On the other side, you will always have a consistent result for
> a consistent dataset.
>
> On the other hand, if you use things like AMQP or Storm (sorry to put up
> my sentence together like that, as tools are mostly either orthogonal or
> complementary, but I hope you get my point), you could build a topology
> that makes fault-tolerant writes independently of your original write. Of
> course, it would still have a consistency tradeoff, mostly because of race
> conditions and different network latencies etc.
>
> So I would say that building a data model in a distributed system often
> depends more on your problem than on the common patterns, because
> everything has a tradeoff.
>
> Want to have an immediate result? Modify your counter while writing the
> row.
> Can sacrifice speed, but have more counting opportunities? Go with offline
> distributed counting.
> Want to have kind of both, dispatch a message and react upon it, having
> the processing logic and writes decoupled from main application, allowing
> you to care less about speed.
>
> However, I may have missed the point somewhere (early morning, you know),
> so I may be wrong in any given statement.
> Cheers
>
>
> On Tue, Sep 25, 2012 at 6:53 AM, Roshni Rajagopal <
> roshni_rajago...@hotmail.com> wrote:
>
>  Thanks Milind,
>
> Has anyone implemented counting in a standard col family in cassandra,
> when you can have increments and decrements to the count.
> Any comparisons in performance to using counter column families?
>
> Regards,
> Roshni
>
>
> --
> Date: Mon, 24 Sep 2012 11:02:51 -0700
> Subject: RE: Cassandra Counters
> From: milindpar...@gmail.com
> To: user@cassandra.apache.org
>
>
> IMO
> You would use Cassandra Counters (or other variation of distributed
> counting) in case of having determined that a centralized version of
> counting is not going to work.
> You'd determine the non_feasibility of centralized counting by figuring
> the speed at which you need to sustain writes and reads and reconcile that
> with your hard disk seek times (essentially).
> Once you have "proved" that you can't do centralized counting, the second
> layer of arsenal comes into play; which is distributed counting.
> In distributed counting , the CAP theorem comes into life. & in Cassandra,
> Availability and Network Partitioning trumps over Consistency.
>
> So yes, you sacrifice strong consistency for availability and partion
> tolerance; for eventual consistency.
> On Sep 24, 2012 10:28 AM, "Roshni Rajagopal" 
> wrote:
>
>  Hi folks,
>
>I looked at my ma

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Віталій Тимчишин
See my comments inline

2012/9/25 Aaron Turner 

> On Mon, Sep 24, 2012 at 10:02 AM, Віталій Тимчишин 
> wrote:
> > Why so?
> > What are pluses and minuses?
> > As for me, I am looking for number of files in directory.
> > 700GB/512MB*5(files per SST) = 7000 files, that is OK from my view.
> 700GB/5MB*5 = 700000 files, that is too much for a single directory, too
> much
> > memory used for SST data, too huge compaction queue (that leads to
> strange
> > pauses, I suppose because of compactor thinking what to compact next),...
>
>
> Not sure why a lot of files is a problem... modern filesystems deal
> with that pretty well.
>

Maybe. Maybe it's not the filesystem, but cassandra. I've seen slowdowns of
compaction when the compaction queue is too large. And it can be too large
if you have a lot of SSTables. Note that each SSTable costs both FS metadata
(and the FS metadata cache can be limited) and cassandra in-memory data.
Anyway, a performance test would be great in this area; otherwise it's all
speculation.
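
Just to put rough numbers on it, here is a back-of-the-envelope sketch in
Java using the figures from this thread (the 5-files-per-SSTable count and
the 10x leveled-compaction headroom factor are the ones quoted above, not
measurements):

public class SstableSizeMath {
    public static void main(String[] args) {
        long dataPerNodeMb = 700L * 1000;   // ~700GB per node, as discussed here
        int filesPerSstable = 5;            // rough count of files per SSTable

        long[] sstableSizesMb = { 5, 512 };
        for (long sizeMb : sstableSizesMb) {
            long sstables = dataPerNodeMb / sizeMb;
            long files = sstables * filesPerSstable;
            long headroomMb = 10 * sizeMb;  // free space to promote one sstable up a level
            System.out.println(sizeMb + "MB sstables: ~" + sstables + " sstables, ~"
                + files + " files, ~" + headroomMb + "MB compaction headroom");
        }
    }
}

That prints roughly 140000 sstables / 700000 files / 50MB headroom for 5MB
sstables, versus roughly 1400 sstables / 7000 files / 5GB headroom for 512MB
sstables, which are the numbers being argued about in this thread.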



> Really large sstables mean that compactions now are taking a lot more
> disk IO and time to complete.


As for me, this point is valid only when your flushes are small. Otherwise
you still need to compact the whole key range the flushes cover, no matter
whether that is one large file or multiple small ones. One large file can
even be cheaper to compact.


> Remember, Leveled Compaction is more
> disk IO intensive, so using large sstables makes that even worse.
> This is a big reason why the default is 5MB. Also, each level is 10x
> the size as the previous level.  Also, for level compaction, you need
> 10x the sstable size worth of free space to do compactions.  So now
> you need 5GB of free disk, vs 50MB of free disk.
>

I really don't think 5GB of free space is too much :)


>
> Also, if you're doing deletes in those CF's, that old, deleted data is
> going to stick around a LOT longer with 512MB files, because it can't
> get deleted until you have 10x512MB files to compact to level 2.
> Heaven forbid it doesn't get deleted then because each level is 10x
> bigger so you end up waiting a LOT longer to actually delete that data
> from disk.
>

But if I have small SSTables, all my data goes to high levels (the 4th for me
when I had the 128MB setting). And it also takes time for updates to reach
that level. I am not sure which way is faster.


>
> Now, if you're using SSD's then larger sstables is probably doable,
> but even then I'd guesstimate 50MB is far more reasonable then 512MB.
>

I don't think SSDs are great for writes/compaction. Cassandra does this in a
streaming fashion, and regular HDDs are faster than SSDs for linear
read/write. SSDs are good for random access, which for cassandra means reads.

P.S. I still think my way is better, yet it would be great to perform some
real tests.


> -Aaron
>
>
> > 2012/9/23 Aaron Turner 
> >>
> >> On Sun, Sep 23, 2012 at 8:18 PM, Віталій Тимчишин 
> >> wrote:
> >> > If you think about space, use Leveled compaction! This won't only
> allow
> >> > you
> >> > to fill more space, but also will shrink you data much faster in case
> of
> >> > updates. Size compaction can give you 3x-4x more space used than there
> >> > are
> >> > live data. Consider the following (our simplified) scenario:
> >> > 1) The data is updated weekly
> >> > 2) Each week a large SSTable is written (say, 300GB) after full update
> >> > processing.
> >> > 3) In 3 weeks you will have 1.2TB of data in 3 large SSTables.
> >> > 4) Only after 4th week they all will be compacted into one 300GB
> >> > SSTable.
> >> >
> >> > Leveled compaction've tamed space for us. Note that you should set
> >> > sstable_size_in_mb to reasonably high value (it is 512 for us with
> >> > ~700GB
> >> > per node) to prevent creating a lot of small files.
> >>
> >> 512MB per sstable?  Wow, that's freaking huge.  From my conversations
> >> with various developers 5-10MB seems far more reasonable.   I guess it
> >> really depends on your usage patterns, but that seems excessive to me-
> >> especially as sstables are promoted.
> >>
> >
> > --
> > Best regards,
> >  Vitalii Tymchyshyn
>
>
>
> --
> Aaron Turner
> http://synfin.net/ Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
> Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
> -- Benjamin Franklin
> "carpe diem quam minimum credula postero"
>



-- 
Best regards,
 Vitalii Tymchyshyn


a node stays in joining

2012-09-25 Thread Satoshi Yamada
hi,
One node in my cluster stays in "Joining". I found a JIRA about this, which is
fixed, but I still see the same behavior. This is a node whose token I removed
first, because it did not boot correctly, and it then re-joined the cluster
without any pre-set token (should I set the previous token?).
As you can see below, the node's state is Joining and its Effective-Ownership
has been 0.00% for more than 10 hours, but the Load keeps increasing.
I also noticed in the gossipinfo that the status of the node is BOOT while the
other node's is NORMAL.
So, how can I get the status of the node to NORMAL?
$ nodetool -h `hostname` ring
Address   DC           Rack    Status  State    Load       Effective-Ownership
          datacenter1  rack1   Up      Normal   122.41 MB  4.27 %
          datacenter1  rack1   Up      Joining  371.33 MB  0.00 %

$ nodetool -h `hostname` gossipinfo
/192.0.1.111
  RELEASE_VERSION:1.1.4
  LOAD:3.89343423E8
  STATUS:BOOT,1231231231312
  SCHEMA:a442323-..

thanks,
satoshi

Re: Cassandra Counters

2012-09-25 Thread Edward Kibardin
I've recently noticed several threads about Cassandra
Counters inconsistencies and have started seriously thinking about possible
workarounds, like storing realtime counters in Redis and dumping them daily to
Cassandra.
So, general question: should I rely on Counters if I want 100% accuracy?

Thanks, Ed

On Tue, Sep 25, 2012 at 8:15 AM, Robin Verlangen  wrote:

> From my point of view an other problem with using the "standard column
> family" for counting is transactions. Cassandra lacks of them, so if you're
> multithreaded updating counters, how will you keep track of that? Yes, I'm
> aware of software like Zookeeper to do that, however I'm not sure whether
> that's the best option.
>
> I think you should stick with Cassandra counter column families.
>
> Best regards,
>
> Robin Verlangen
> *Software engineer*
> *
> *
> W http://www.robinverlangen.nl
> E ro...@us2.nl
>
> 
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
>
>
> 2012/9/25 Roshni Rajagopal 
>
>>  Thanks for the reply and sorry for being bull - headed.
>>
>> Once  you're past the stage where you've decided its distributed, and
>> NoSQL and cassandra out of all the NoSQL options,
>> Now to count something, you can do it in different ways in cassandra.
>> In all the ways you want to use cassandra's best features of
>> availability, tunable consistency , partition tolerance etc.
>>
>> Given this, what are the performance tradeoffs of using counters vs a
>> standard column family for counting. Because as I see if the counter number
>> in a counter column family becomes wrong, it will not be 'eventually
>> consistent' - you will need intervention to correct it. So the key aspect
>> is how much faster would be a counter column family, and at what numbers do
>> we start seing a difference.
>>
>>
>>
>>
>>
>> --
>> Date: Tue, 25 Sep 2012 07:57:08 +0200
>> Subject: Re: Cassandra Counters
>> From: oleksandr.pet...@gmail.com
>> To: user@cassandra.apache.org
>>
>>
>> Maybe I'm missing the point, but counting in a standard column family
>> would be a little overkill.
>>
>> I assume that "distributed counting" here was more of a map/reduce
>> approach, where Hadoop (+ Cascading, Pig, Hive, Cascalog) would help you a
>> lot. We're doing some more complex counting (e.q. based on sets of rules)
>> like that. Of course, that would perform _way_ slower than counting
>> beforehand. On the other side, you will always have a consistent result for
>> a consistent dataset.
>>
>> On the other hand, if you use things like AMQP or Storm (sorry to put up
>> my sentence together like that, as tools are mostly either orthogonal or
>> complementary, but I hope you get my point), you could build a topology
>> that makes fault-tolerant writes independently of your original write. Of
>> course, it would still have a consistency tradeoff, mostly because of race
>> conditions and different network latencies etc.
>>
>> So I would say that building a data model in a distributed system often
>> depends more on your problem than on the common patterns, because
>> everything has a tradeoff.
>>
>> Want to have an immediate result? Modify your counter while writing the
>> row.
>> Can sacrifice speed, but have more counting opportunities? Go with
>> offline distributed counting.
>> Want to have kind of both, dispatch a message and react upon it, having
>> the processing logic and writes decoupled from main application, allowing
>> you to care less about speed.
>>
>> However, I may have missed the point somewhere (early morning, you know),
>> so I may be wrong in any given statement.
>> Cheers
>>
>>
>> On Tue, Sep 25, 2012 at 6:53 AM, Roshni Rajagopal <
>> roshni_rajago...@hotmail.com> wrote:
>>
>>  Thanks Milind,
>>
>> Has anyone implemented counting in a standard col family in cassandra,
>> when you can have increments and decrements to the count.
>> Any comparisons in performance to using counter column families?
>>
>> Regards,
>> Roshni
>>
>>
>> --
>> Date: Mon, 24 Sep 2012 11:02:51 -0700
>> Subject: RE: Cassandra Counters
>> From: milindpar...@gmail.com
>> To: user@cassandra.apache.org
>>
>>
>> IMO
>> You would use Cassandra Counters (or other variation of distributed
>> counting) in case of having determined that a centralized version of
>> counting is not going to work.
>> You'd determine the non_feasibility of centralized counting by figuring
>> the speed at which you need to sustain writes and reads and reconcile that
>> with your hard disk seek times (essentially).
>> Onc

Re: [problem with OOM in nodes]

2012-09-25 Thread Denis Gabaydulin
Thanks a lot for helping. We came to the same decision: splitting one
report across multiple cassandra rows (sorted buckets of report rows) and
managing the buckets on the client side.
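
Roughly, the bucketing looks something like this on the client side (sketch
only; the bucket size of 100 report rows per Cassandra row follows the
example in Aaron's outline below and is just an assumption):

public class ReportRowBucketing {
    static final int BUCKET_SIZE = 100;  // example value; tune to keep rows small

    // Row key for the "Report Rows" CF: <reportId>:<bucketNumber>
    static String bucketRowKey(long reportId, long reportRowNumber) {
        return reportId + ":" + (reportRowNumber / BUCKET_SIZE);
    }

    // Column name inside the bucket row is the report row number, so columns
    // stay sorted and range queries within a bucket still work.
    static long columnName(long reportRowNumber) {
        return reportRowNumber;
    }

    public static void main(String[] args) {
        System.out.println(bucketRowKey(42L, 0L));    // 42:0
        System.out.println(bucketRowKey(42L, 99L));   // 42:0
        System.out.println(bucketRowKey(42L, 100L));  // 42:1
        System.out.println(bucketRowKey(42L, 2500L)); // 42:25
    }
}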

On Tue, Sep 25, 2012 at 5:28 AM, aaron morton  wrote:
> What exactly is the problem with big rows?
>
> During compaction the row will be passed through a slower two pass
> processing, this add's to IO pressure.
> Counting big rows requires that the entire row be read.
> Repairing big rows requires that the entire row be repaired.
>
> I generally avoid rows above a few 10's of MB as they result in more memory
> churn and create admin problems as above.
>
>
> What exactly is the problem with big rows?
>
> And, how can we should place our data in this case (see the schema in
> the previous replies)? Splitting one report to multiple rows is
> uncomfortably :-(
>
>
> Looking at your row sizes below, the question is "How do I store an object
> which may be up to 3.5GB in size."
>
> AFAIK there are no hard limits that would prevent you putting that in one
> row. And avoiding super columns may save some space. You could have a Simple
> CF, where the each report is one row, each report row is one column and the
> report row is serialised (with JSON or protobufs etc) and stored in the
> column value.
>
> But i would recommend creating a model where row size is constrained in
> space. E.g.
>
> Report CF:
> * one report per row.
> * one column per report row
> * column value is empty.
>
> Report Rows CF:
> * one row per 100 report rows, e.g. 
> * column name is  report row number.
> * column value is report data
> (Or use composite column names, e.g. 
>
> You can still do ranges, buy you have to do some client side work to work it
> out.
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/09/2012, at 5:14 PM, Denis Gabaydulin  wrote:
>
> On Sun, Sep 23, 2012 at 10:41 PM, aaron morton 
> wrote:
>
> /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E
> "[0-9]+ bytes" -o | cut -d " " -f 1 |  awk '{ foo = $1 / 1024 / 1024 ;
> print foo "MB" }'  | sort -nr | head -n 50
>
>
> Is it bad signal?
>
> Sorry, I do not know what this is outputting.
>
>
> This is outputting size of big rows which cassandra had compacted before.
>
> As I can see in cfstats, compacted row maximum size: 386857368 !
>
> Yes.
> Having rows in the 100's of MB is will cause problems. Doubly so if they are
> large super columns.
>
>
> What exactly is the problem with big rows?
> And, how can we should place our data in this case (see the schema in
> the previous replies)? Splitting one report to multiple rows is
> uncomfortably :-(
>
>
> Cheers
>
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22/09/2012, at 5:07 AM, Denis Gabaydulin  wrote:
>
> And some stuff from log:
>
>
> /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E
> "[0-9]+ bytes" -o | cut -d " " -f 1 |  awk '{ foo = $1 / 1024 / 1024 ;
> print foo "MB" }'  | sort -nr | head -n 50
> 3821.55MB
> 3337.85MB
> 1221.64MB
> 1128.67MB
> 930.666MB
> 916.4MB
> 861.114MB
> 843.325MB
> 711.813MB
> 706.992MB
> 674.282MB
> 673.861MB
> 658.305MB
> 557.756MB
> 531.577MB
> 493.112MB
> 492.513MB
> 492.291MB
> 484.484MB
> 479.908MB
> 465.742MB
> 464.015MB
> 459.95MB
> 454.472MB
> 441.248MB
> 428.763MB
> 424.028MB
> 416.663MB
> 416.191MB
> 409.341MB
> 406.895MB
> 397.314MB
> 388.27MB
> 376.714MB
> 371.298MB
> 368.819MB
> 366.92MB
> 361.371MB
> 360.509MB
> 356.168MB
> 355.012MB
> 354.897MB
> 354.759MB
> 347.986MB
> 344.109MB
> 335.546MB
> 329.529MB
> 326.857MB
> 326.252MB
> 326.237MB
>
> Is it bad signal?
>
> On Fri, Sep 21, 2012 at 8:22 PM, Denis Gabaydulin  wrote:
>
> Found one more intersting fact.
> As I can see in cfstats, compacted row maximum size: 386857368 !
>
> On Fri, Sep 21, 2012 at 12:50 PM, Denis Gabaydulin 
> wrote:
>
> Reports - is a SuperColumnFamily
>
> Each report has unique identifier (report_id). This is a key of
> SuperColumnFamily.
> And a report saved in separate row.
>
> A report is consisted of report rows (may vary between 1 and 50,
> but most are small).
>
> Each report row is saved in separate super column. Hector based code:
>
> superCfMutator.addInsertion(
> report_id,
> "Reports",
> HFactory.createSuperColumn(
>   report_row_id,
>   mapper.convertObject(object),
>   columnDefinition.getTopSerializer(),
>   columnDefinition.getSubSerializer(),
>   inferringSerializer
> )
> );
>
> We have two frequent operation:
>
> 1. count report rows by report_id (calculate number of super columns
> in the row).
> 2. get report rows by report_id and range predicate (get super columns
> from the row with range predicate).
>
> I can't see here a big super columns :-(
>
> On Fri, Sep 21, 2012 at 3:10 AM, Tyler Hobbs  wrote:
>
> I'm not 100% that I understand your data model and read patterns correctly,
> but it sounds like you have large supe

The compaction task cannot delete sstables which are used in a repair session

2012-09-25 Thread Rene Kochen
Is this a bug? I'm using Cassandra 1.0.11:

INFO 13:45:43,750 Compacting
[SSTableReader(path='d:\data\Traxis\Parameters-hd-47-Data.db'),
SSTableReader(path='d:\data\Traxis\Parameters-hd-44-Data.db'),
SSTableReader(path='d:\data\Traxis\Parameters-hd-46-Data.db'),
SSTableReader(path='d:\data\Traxis\Parameters-hd-45-Data.db')]
INFO 13:45:43,782 Compacted to [d:\data\Traxis\Parameters-hd-48-Data.db,].
2,552 to 638 (~25% of original) bytes for 1 keys at 0.019014MB/s.  Time:
32ms.
ERROR 13:45:43,782 Unable to delete d:\data\Traxis\Parameters-hd-44-Data.db
(it will be removed on server restart; we'll also retry after GC)
ERROR 13:45:43,782 Unable to delete d:\data\Traxis\Parameters-hd-45-Data.db
(it will be removed on server restart; we'll also retry after GC)
ERROR 13:45:43,797 Unable to delete d:\data\Traxis\Parameters-hd-46-Data.db
(it will be removed on server restart; we'll also retry after GC)
ERROR 13:45:43,797 Unable to delete d:\data\Traxis\Parameters-hd-47-Data.db
(it will be removed on server restart; we'll also retry after GC)
INFO 13:45:43,797 [repair #88f6f3a0-0706-11e2--aac4e84dbbbf] Sending
completed merkle tree to /10.49.94.171 for (Traxis,Parameters)

Thanks,

Rene


Re: Cassandra Counters

2012-09-25 Thread rohit bhatia
@Edward,

We use counters in production with Cassandra 1.0.5, though our application
is sensitive to write latency and we are seeing problems with frequent young
garbage collections. Also, we only do increments (decrements have caused
problems for some people).
We don't see inconsistencies in our data.
So if you want 99.99% accurate counters and can manage with eventual
consistency, Cassandra works nicely.

On Tue, Sep 25, 2012 at 4:52 PM, Edward Kibardin  wrote:

> I've recently noticed several threads about Cassandra
> Counters inconsistencies and started seriously think about possible
> workarounds like store realtime counters in Redis and dump them daily to
> Cassandra.
> So general question, should I rely on Counters if I want 100% accuracy?
>
> Thanks, Ed
>
>
> On Tue, Sep 25, 2012 at 8:15 AM, Robin Verlangen  wrote:
>
>> From my point of view an other problem with using the "standard column
>> family" for counting is transactions. Cassandra lacks of them, so if you're
>> multithreaded updating counters, how will you keep track of that? Yes, I'm
>> aware of software like Zookeeper to do that, however I'm not sure whether
>> that's the best option.
>>
>> I think you should stick with Cassandra counter column families.
>>
>> Best regards,
>>
>> Robin Verlangen
>> *Software engineer*
>> *
>> *
>> W http://www.robinverlangen.nl
>> E ro...@us2.nl
>>
>> 
>>
>> Disclaimer: The information contained in this message and attachments is
>> intended solely for the attention and use of the named addressee and may be
>> confidential. If you are not the intended recipient, you are reminded that
>> the information remains the property of the sender. You must not use,
>> disclose, distribute, copy, print or rely on this e-mail. If you have
>> received this message in error, please contact the sender immediately and
>> irrevocably delete this message and any copies.
>>
>>
>>
>> 2012/9/25 Roshni Rajagopal 
>>
>>>  Thanks for the reply and sorry for being bull - headed.
>>>
>>> Once  you're past the stage where you've decided its distributed, and
>>> NoSQL and cassandra out of all the NoSQL options,
>>> Now to count something, you can do it in different ways in cassandra.
>>> In all the ways you want to use cassandra's best features of
>>> availability, tunable consistency , partition tolerance etc.
>>>
>>> Given this, what are the performance tradeoffs of using counters vs a
>>> standard column family for counting. Because as I see if the counter number
>>> in a counter column family becomes wrong, it will not be 'eventually
>>> consistent' - you will need intervention to correct it. So the key aspect
>>> is how much faster would be a counter column family, and at what numbers do
>>> we start seing a difference.
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Date: Tue, 25 Sep 2012 07:57:08 +0200
>>> Subject: Re: Cassandra Counters
>>> From: oleksandr.pet...@gmail.com
>>> To: user@cassandra.apache.org
>>>
>>>
>>> Maybe I'm missing the point, but counting in a standard column family
>>> would be a little overkill.
>>>
>>> I assume that "distributed counting" here was more of a map/reduce
>>> approach, where Hadoop (+ Cascading, Pig, Hive, Cascalog) would help you a
>>> lot. We're doing some more complex counting (e.q. based on sets of rules)
>>> like that. Of course, that would perform _way_ slower than counting
>>> beforehand. On the other side, you will always have a consistent result for
>>> a consistent dataset.
>>>
>>> On the other hand, if you use things like AMQP or Storm (sorry to put up
>>> my sentence together like that, as tools are mostly either orthogonal or
>>> complementary, but I hope you get my point), you could build a topology
>>> that makes fault-tolerant writes independently of your original write. Of
>>> course, it would still have a consistency tradeoff, mostly because of race
>>> conditions and different network latencies etc.
>>>
>>> So I would say that building a data model in a distributed system often
>>> depends more on your problem than on the common patterns, because
>>> everything has a tradeoff.
>>>
>>> Want to have an immediate result? Modify your counter while writing the
>>> row.
>>> Can sacrifice speed, but have more counting opportunities? Go with
>>> offline distributed counting.
>>> Want to have kind of both, dispatch a message and react upon it, having
>>> the processing logic and writes decoupled from main application, allowing
>>> you to care less about speed.
>>>
>>> However, I may have missed the point somewhere (early morning, you
>>> know), so I may be wrong in any given statement.
>>> Cheers
>>>
>>>
>>> On Tue, Sep 25, 2012 at 6:53 AM, Roshni Rajagopal <
>>> roshni_rajago...@hotmail.com> wrote:
>>>
>>>  Thanks Milind,
>>>
>>> Has anyone implemented counting in a standard col family in cassandra,
>>> when you can have increments and decrements to the count.
>>> Any comparisons in performance to using counter column families?
>>>
>>> 

Re: Correct model

2012-09-25 Thread Hiller, Dean
If you need anything added/fixed, just let PlayOrm know.  PlayOrm has been able 
to add things quickly so far; that may change as more and more requests come in, 
but so far PlayOrm has managed to keep up.

We are already using it live, by the way.  It works out very well for us so far 
(we have 5000 column families, obviously created dynamically rather than by 
hand; a very interesting use case of cassandra).  In our live environment we 
configured astyanax with LOCAL_QUORUM on reads AND writes, so CP style: we can 
afford one node out of 3 to go down, but if two go down it stops working.  There 
is, however, a patch in astyanax to auto-switch from LOCAL_QUORUM to consistency 
level ONE reads/writes when two nodes go down, which we would like to pull in 
eventually so it is always live (I don't think Hector has that, and it is a 
really NICE feature: i.e. fail the LOCAL_QUORUM read/write, then try again with 
a consistency level of ONE).
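
The rough shape of that fallback, sketched in plain Java against a
hypothetical client interface (NOT the actual astyanax or Hector API):

public class QuorumFallbackSketch {

    enum Consistency { LOCAL_QUORUM, ONE }

    static class NotEnoughReplicasException extends Exception { }

    // Hypothetical client abstraction; the real astyanax calls look different.
    interface CassandraClient {
        void write(String rowKey, String column, byte[] value, Consistency cl)
                throws NotEnoughReplicasException;
    }

    // Try the stronger consistency level first; if not enough replicas answer,
    // degrade to ONE so the cluster stays writable.
    static void resilientWrite(CassandraClient client, String rowKey,
                               String column, byte[] value)
            throws NotEnoughReplicasException {
        try {
            client.write(rowKey, column, value, Consistency.LOCAL_QUORUM);
        } catch (NotEnoughReplicasException e) {
            client.write(rowKey, column, value, Consistency.ONE);
        }
    }
}

The tradeoff, of course, is that once a write has only been accepted at ONE,
a subsequent LOCAL_QUORUM read is not guaranteed to see it until the replicas
catch up.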

Later,
Dean


From: Marcelo Elias Del Valle mailto:mvall...@gmail.com>>
Reply-To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Date: Monday, September 24, 2012 1:54 PM
To: "user@cassandra.apache.org" 
mailto:user@cassandra.apache.org>>
Subject: Re: Correct model

Dean, this sounds like magic :D
I don't know details about the performance on the index implementations you 
chose, but it would pay the way to use it in my case, as I don't need the best 
performance in the world when reading, but I need to assure scalability and 
have a simple model to maintain. I liked the playOrm concept regarding this.
I have more doubts, but I will ask them at stack over flow from now on.

2012/9/24 Hiller, Dean mailto:dean.hil...@nrel.gov>>
PlayOrm will automatically create a CF to index my CF?

It creates 3 CF's for all indices, IntegerIndice, DecimalIndice, and 
StringIndice such that the ad-hoc tool that is in development can display the 
indices as it knows the prefix of the composite column name is of Integer, 
Decimal or String and it knows the postfix type as well so it can translate 
back from bytes to the types and properly display in a GUI (i.e. On top of 
SELECT, the ad-hoc tool is adding a way to view the induce rows so you can 
check if they got corrupt or not).

Will it auto-manage it, like Cassandra's secondary indexes?

YES

Further detail…

You annotated fields with @NoSqlIndexed and PlayOrm adds/removes from the index 
as you add/modify/remove the entity…..a modify does a remove old val from index 
and insert new value into index.

An example would be PlayOrm stores all long, int, short, byte in a type that 
uses the least amount of space so IF you have a long OR BigInteger between –128 
to 128 it only ends up storing 1 byte in cassandra(SAVING tons of space!!!).  
Then if you are indexing a type that is one of those, PlayOrm creates a 
IntegerIndice table.

Right now, another guy is working on playorm-server which is a webgui to allow 
ad-hoc access to all your data as well so you can ad-hoc queries to see data 
and instead of showing Hex, it shows the real values by translating the bytes 
to String for the schema portions that it is aware of that is.

Later,
Dean

From: Marcelo Elias Del Valle 
mailto:mvall...@gmail.com>>>
Reply-To: 
"user@cassandra.apache.org>"
 
mailto:user@cassandra.apache.org>>>
Date: Monday, September 24, 2012 12:09 PM
To: 
"user@cassandra.apache.org>"
 
mailto:user@cassandra.apache.org>>>
Subject: Re: Correct model

Dean,

There is one last thing I would like to ask about playOrm by this list, the 
next questiosn will come by stackOverflow. Just because of the context, I 
prefer asking this here:
 When you say playOrm indexes a table (which would be a CF behind the 
scenes), what do you mean? PlayOrm will automatically create a CF to index my 
CF? Will it auto-manage it, like Cassandra's secondary indexes?
 In Cassandra, the application is responsible for maintaining the index, 
right? I might be wrong, but unless I am using secondary indexes I need to 
update index values manually, right?
 I got confused when you said "PlayOrm indexes the columns you choose". How 
do I choose and what exactly it means?

Best regards,
Marcelo Valle.

2012/9/24 Hiller, Dean 
mailto:dean.hil...@nrel.gov>>>
Oh, ok, you were talking about the wide row pattern, right?

yes

But playORM is compatible with Aaron's model, isn't it?

Not yet, PlayOrm supports partitioning one table multiple ways as it indexes 
the columns(in your case, the userid FK column and the time c

Re: Cassandra Counters

2012-09-25 Thread Sylvain Lebresne
>
> So general question, should I rely on Counters if I want 100% accuracy?
>

No.

Even setting aside potential bugs, counters are not idempotent: if you get a
TimeoutException during a write (which can happen even in relatively normal
conditions), you won't know whether the increment went in or not (and you
have no way to know unless you have an external way to check the value).
This is probably fine if you use counters for, say, real-time analytics, but
not if you need 100% accuracy.
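
A plain-Java illustration of that point (no Cassandra client involved; the
maps just stand in for a counter CF and a standard CF): retrying a timed-out
increment can double-count, whereas retrying an idempotent overwrite cannot.

import java.util.HashMap;
import java.util.Map;

public class CounterRetrySketch {
    static Map<String, Long> counterCf = new HashMap<String, Long>();   // "counter" CF
    static Map<String, Long> standardCf = new HashMap<String, Long>();  // "standard" CF

    // The write is applied server-side, but the client only sees a timeout.
    static void incrementThenTimeout(String key, long delta) {
        Long cur = counterCf.get(key);
        counterCf.put(key, (cur == null ? 0L : cur) + delta);
        // ... TimeoutException is thrown back to the client here
    }

    static void overwriteThenTimeout(String key, long value) {
        standardCf.put(key, value);
        // ... TimeoutException is thrown back to the client here
    }

    public static void main(String[] args) {
        // Counter path: the client retries because it cannot know if the +1 landed.
        incrementThenTimeout("hits", 1);
        incrementThenTimeout("hits", 1);              // retry => double count
        System.out.println("counter value: " + counterCf.get("hits"));   // 2, not 1

        // Idempotent path: retrying the same absolute value is harmless.
        overwriteThenTimeout("hits", 1);
        overwriteThenTimeout("hits", 1);              // retry => same result
        System.out.println("standard value: " + standardCf.get("hits")); // 1
    }
}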

--
Sylvain


Re: Cassandra Counters

2012-09-25 Thread rohit bhatia
@Sylvain

In a relatively untroubled cluster, even timed out writes go through,
provided no messages are dropped. Which you can monitor on cassandra
nodes. We have 100% consistency on our production servers as we don't
see messages being dropped on our servers.
Though as you mention, there would be no way to "repair" your dropped messages .

On Tue, Sep 25, 2012 at 6:57 PM, Sylvain Lebresne  wrote:
>> So general question, should I rely on Counters if I want 100% accuracy?
>
>
> No.
>
>  Even not considering potential bugs, counters being not idempotent, if you
> get a TimeoutException during a write (which can happen even in relatively
> normal conditions), you won't know if the increment went in or not (and you
> have no way to know unless you have an external way to check the value).
> This is probably fine if you use counters for say real-time analytics, but
> not if you use 100% accuracy.
>
> --
> Sylvain


Re: Cassandra Counters

2012-09-25 Thread Edward Kibardin
@Sylvain and @Rohit: Thanks for your answers.


On Tue, Sep 25, 2012 at 2:27 PM, Sylvain Lebresne wrote:

> So general question, should I rely on Counters if I want 100% accuracy?
>>
>
> No.
>
>  Even not considering potential bugs, counters being not idempotent, if
> you get a TimeoutException during a write (which can happen even in
> relatively normal conditions), you won't know if the increment went in or
> not (and you have no way to know unless you have an external way to check
> the value). This is probably fine if you use counters for say real-time
> analytics, but not if you use 100% accuracy.
>
> --
> Sylvain
>


Re: Cassandra Counters

2012-09-25 Thread Sylvain Lebresne
> In a relatively untroubled cluster, even timed out writes go through,
> provided no messages are dropped.

This all depends on your definition of an "untroubled" cluster, but to be
clear: in a cluster where a node dies (which for Cassandra is not
considered abnormal and will happen to everyone no matter how good
your monitoring is), you have a good chance of getting TimeoutExceptions
on counter writes while the other nodes of the cluster haven't yet
detected the failure (which can take a few seconds), AND those writes
won't get through. The fact that Cassandra logs dropped messages or
not has nothing to do with that.

> We have 100% consistency on our production servers as we don't
> see messages being dropped on our servers.

Though I'm happy for you that you achieve 100% consistency, I want to
reiterate that not seeing any log of messages being dropped does not
guarantee that all counter writes went through: the ones that time out
may or may not have been persisted.

--
Sylvain


> Though as you mention, there would be no way to "repair" your dropped 
> messages .
>
> On Tue, Sep 25, 2012 at 6:57 PM, Sylvain Lebresne  
> wrote:
>>> So general question, should I rely on Counters if I want 100% accuracy?
>>
>>
>> No.
>>
>>  Even not considering potential bugs, counters being not idempotent, if you
>> get a TimeoutException during a write (which can happen even in relatively
>> normal conditions), you won't know if the increment went in or not (and you
>> have no way to know unless you have an external way to check the value).
>> This is probably fine if you use counters for say real-time analytics, but
>> not if you use 100% accuracy.
>>
>> --
>> Sylvain


Re: Correct model

2012-09-25 Thread Marcelo Elias Del Valle
Dean,

In the playOrm data modeling, if I understood it correctly, every CF
has its own id, right? For instance, User would have its own ID, Activities
would have its own id, etc. What if I have a trillion activities? Wouldn't
it be a problem to have 1 row id for each activity?
 Cassandra always indexes by row id, right? If I have too many row ids
without using composite keys, will it scale the same way? Wouldn't the time
to insert an activity be longer each time because I have too many
activities?

Best regards,
Marcelo Valle.

2012/9/25 Hiller, Dean 

> If you need anything added/fixed, just let PlayOrm know.  PlayOrm has been
> able to quickly add so far…that may change as more and more requests come
> but so far PlayOrm seems to have managed to keep up.
>
> We are using it live by the way already.  It works out very well so far
> for us (We have 5000 column families, obviously dynamically created instead
> of by hand…a very interesting use case of cassandra).  In our live
> environment we configured astyanax with LocalQUOROM on reads AND writes so
> CP style so we can afford one node out of 3 to go down but if two go down
> it stops working THOUGH there is a patch in astyanax to auto switch from
> LocalQUOROM to ONE NODE read/write when two nodes go down that we would
> like to suck in eventually so it is always live(I don't think Hector has
> that and it is a really NICE feature….ie fail localquorm read/write and
> then try again with consistency level of one).
>
> Later,
> Dean
>
>
> From: Marcelo Elias Del Valle  mvall...@gmail.com>>
> Reply-To: "user@cassandra.apache.org" <
> user@cassandra.apache.org>
> Date: Monday, September 24, 2012 1:54 PM
> To: "user@cassandra.apache.org" <
> user@cassandra.apache.org>
> Subject: Re: Correct model
>
> Dean, this sounds like magic :D
> I don't know details about the performance on the index implementations
> you chose, but it would pay the way to use it in my case, as I don't need
> the best performance in the world when reading, but I need to assure
> scalability and have a simple model to maintain. I liked the playOrm
> concept regarding this.
> I have more doubts, but I will ask them at stack over flow from now on.
>
> 2012/9/24 Hiller, Dean mailto:dean.hil...@nrel.gov>>
> PlayOrm will automatically create a CF to index my CF?
>
> It creates 3 CF's for all indices, IntegerIndice, DecimalIndice, and
> StringIndice such that the ad-hoc tool that is in development can display
> the indices as it knows the prefix of the composite column name is of
> Integer, Decimal or String and it knows the postfix type as well so it can
> translate back from bytes to the types and properly display in a GUI (i.e.
> On top of SELECT, the ad-hoc tool is adding a way to view the induce rows
> so you can check if they got corrupt or not).
>
> Will it auto-manage it, like Cassandra's secondary indexes?
>
> YES
>
> Further detail…
>
> You annotated fields with @NoSqlIndexed and PlayOrm adds/removes from the
> index as you add/modify/remove the entity…..a modify does a remove old val
> from index and insert new value into index.
>
> An example would be PlayOrm stores all long, int, short, byte in a type
> that uses the least amount of space so IF you have a long OR BigInteger
> between –128 to 128 it only ends up storing 1 byte in cassandra(SAVING tons
> of space!!!).  Then if you are indexing a type that is one of those,
> PlayOrm creates a IntegerIndice table.
>
> Right now, another guy is working on playorm-server which is a webgui to
> allow ad-hoc access to all your data as well so you can ad-hoc queries to
> see data and instead of showing Hex, it shows the real values by
> translating the bytes to String for the schema portions that it is aware of
> that is.
>
> Later,
> Dean
>
> From: Marcelo Elias Del Valle  mvall...@gmail.com>>>
> Reply-To: "user@cassandra.apache.org >>" <
> user@cassandra.apache.org user@cassandra.apache.org>>
> Date: Monday, September 24, 2012 12:09 PM
> To: "user@cassandra.apache.org user@cassandra.apache.org>" <
> user@cassandra.apache.org user@cassandra.apache.org>>
> Subject: Re: Correct model
>
> Dean,
>
> There is one last thing I would like to ask about playOrm by this
> list, the next questiosn will come by stackOverflow. Just because of the
> context, I prefer asking this here:
>  When you say playOrm indexes a table (which would be a CF behind the
> scenes), what do you mean? PlayOrm will automatically create a CF to index
> my CF? Will it auto-m

Re: Correct model

2012-09-25 Thread Hiller, Dean
Just fyi that some of these are cassandra questions…

Dean,

In the playOrm data modeling, if I understood it correctly, every CF has 
its own id, right?

No, each entity has a field annotated with @NoSqlId.  That tells playOrm this 
is the row key.  Each INSTANCE of the entity is a row in cassandra (very much 
like hibernate for RDBMS).  So every instance of Activity has a different 
NoSqlId (NOTE: ids are auto generated so you don't need to deal with it though 
you can set it manually if you like)

For instance, User would have its own ID, Activities would have its own id, etc.

User has a field private String id; annotated with @NoSqlId, so each INSTANCE of 
User has its own id and each INSTANCE of Activity has its own id.

What if I have a trillion activities?

This is fine and is a normal cassandra use-case.  In fact, this is highly 
desirable in nosql stores and retrieving by key is desired when possible.

Wouldn't be a problem to have 1 row id for each activity?

Nope, no problems.

 Cassandra always indexes by row id, right?

If you do CQL and cassandra partitioning/indexing, then yes, BUT if you do 
PlayOrm partitioning, then NO.  PlayOrm indexes your columns and there is ONE 
index for EACH partition so if you have 1 trillion rows and 1 billion 
partitions, then each index on average is 1000 rows only so you can do a quick 
query into an index that only has 1000 values.

If I have too many row ids without using composite keys, will it scale the same 
way?

Yes, but partitioning is the key: you must decide your partitioning so that 
partitions (or I could say indices) do not have a very high row count.  I 
currently maintain less than 1 million, but I would say it slows down somewhere 
in the millions of rows per partition (i.e. you can get pretty big, but smaller 
can be better).
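
A rough plain-Java illustration of that sizing (this is not PlayOrm code; the
monthly, time-based partition id is just an example assumption):

public class PartitionSizingSketch {
    static final long MILLIS_PER_MONTH = 30L * 24 * 60 * 60 * 1000;  // rough month

    // Partition id for an activity, derived from its timestamp, so each
    // partition (and therefore each index) holds a bounded number of rows.
    static String partitionId(long activityTimeMillis) {
        return "activities-" + (activityTimeMillis / MILLIS_PER_MONTH);
    }

    public static void main(String[] args) {
        // 1 trillion rows spread over 1 billion partitions => ~1000 rows per
        // index on average, which is the figure quoted above.
        long rows = 1000L * 1000 * 1000 * 1000;
        long partitions = 1000L * 1000 * 1000;
        System.out.println("avg rows per partition/index: " + (rows / partitions));

        System.out.println(partitionId(System.currentTimeMillis()));
    }
}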

Wouldn't the time to insert an activity be each time longer because I have too 
many activities?

Nope, this is a cassandra question really and cassandra is optimized as all 
noSQL stores are to put and read value by key.  They all work best that way.

Behind the scenes there is a meta table that PlayOrm writes to(one row per java 
class you create that is annotated with @NoSqlEntity) and that is used to drive 
the ad-hoc tool so you can query into cassandra and not get hex out, but get 
the real values and see them.

Best regards,
Marcelo Valle.

2012/9/25 Hiller, Dean mailto:dean.hil...@nrel.gov>>
If you need anything added/fixed, just let PlayOrm know.  PlayOrm has been able 
to quickly add so far…that may change as more and more requests come but so far 
PlayOrm seems to have managed to keep up.

We are using it live by the way already.  It works out very well so far for us 
(We have 5000 column families, obviously dynamically created instead of by 
hand…a very interesting use case of cassandra).  In our live environment we 
configured astyanax with LocalQUOROM on reads AND writes so CP style so we can 
afford one node out of 3 to go down but if two go down it stops working THOUGH 
there is a patch in astyanax to auto switch from LocalQUOROM to ONE NODE 
read/write when two nodes go down that we would like to suck in eventually so 
it is always live(I don't think Hector has that and it is a really NICE 
feature….ie fail localquorm read/write and then try again with consistency 
level of one).

Later,
Dean


From: Marcelo Elias Del Valle 
mailto:mvall...@gmail.com>>>
Reply-To: 
"user@cassandra.apache.org>"
 
mailto:user@cassandra.apache.org>>>
Date: Monday, September 24, 2012 1:54 PM
To: 
"user@cassandra.apache.org>"
 
mailto:user@cassandra.apache.org>>>
Subject: Re: Correct model

Dean, this sounds like magic :D
I don't know details about the performance on the index implementations you 
chose, but it would pay the way to use it in my case, as I don't need the best 
performance in the world when reading, but I need to assure scalability and 
have a simple model to maintain. I liked the playOrm concept regarding this.
I have more doubts, but I will ask them at stack over flow from now on.

2012/9/24 Hiller, Dean 
mailto:dean.hil...@nrel.gov>>>
PlayOrm will automatically create a CF to index my CF?

It creates 3 CF's for all indices, IntegerIndice, DecimalIndice, and 
StringIndice such that the ad-hoc tool that is in development can display the 
indices as it knows the prefix of the composite column name is of Integer, 
Decimal or String and it knows the postfix type as well so it can translate 
back from bytes to the types and properly display in a GUI (i.e. On top of 
SELEC

Running repair negatively impacts read performance?

2012-09-25 Thread Charles Brophy
Hey guys,

I've begun to notice that read operations take a performance nose-dive
after a standard (full) repair of a fairly large column family: ~11 million
records. Interestingly, I've then noticed that read performance returns to
normal after a full scrub of the column family. Is it possible that the
repair operation is not correctly establishing the bloom filter afterwards?
An interesting note on the scrub operation is that it will
"rebuild sstables with correct bloom filters", which is what is leading me
to this conclusion. Does this make sense?

I'm using 1.1.3 and Oracle JDK 1.6.31
The column family is a standard type and I've noticed this exact behavior
regardless of the key/column/value serializers in use.

Charles


Re:

2012-09-25 Thread Charles Brophy
There are settings in cassandra.yaml that will _gradually_ reduce the
available cache to zero if you are under constant memory pressure:

 # Set to 1.0 to disable.  
reduce_cache_sizes_at: *
reduce_cache_capacity_to: *

My experience is that the cache size will not return to the configured size
until a service restart if you leave this enabled.  The text of this
setting is not explicit about the long-term cache shrinkage, so it's easy
to think that it will restore the cache to its configured size after the
pressures have subsided. It won't.

Charles

On Tue, Sep 25, 2012 at 8:14 AM, Manu Zhang  wrote:

> I've enabled row cache and set its capacity to 10MB but when I check its
> size in jconsole it's always 0. Isn't it that a row will be written to row
> cache if it isn't there when I read the row? I've bulk loaded the data into
> disk so row cache is crucial to the performance.


Re: Correct model

2012-09-25 Thread Hiller, Dean
Oh, and if you really want to scale very easily, just use play framework
1.2.5 ;).  We use that and, since it is stateless, to scale up you simply
add more servers.  Also, it's like coding in php or ruby etc. as far as
development speed goes (no server restarts), so it's a pretty nice framework.
We tried the 2.x version, but it is just too unproductive with server restarts.

If you do use playframework, let me know and I can send you the startup
code we use in play framework so you can simply call NoSql.em() to get
that request's NoSqlEntityManager.  A play framework plugin will be
developed as well for the 1.2.x line.

Later,
Dean

On 9/25/12 6:36 AM, "Hiller, Dean"  wrote:

>If you need anything added/fixed, just let PlayOrm know.  PlayOrm has
>been able to quickly add so far…that may change as more and more requests
>come but so far PlayOrm seems to have managed to keep up.
>
>We are using it live by the way already.  It works out very well so far
>for us (We have 5000 column families, obviously dynamically created
>instead of by hand…a very interesting use case of cassandra).  In our
>live environment we configured astyanax with LocalQUOROM on reads AND
>writes so CP style so we can afford one node out of 3 to go down but if
>two go down it stops working THOUGH there is a patch in astyanax to auto
>switch from LocalQUOROM to ONE NODE read/write when two nodes go down
>that we would like to suck in eventually so it is always live(I don't
>think Hector has that and it is a really NICE featureŠ.ie fail localquorm
>read/write and then try again with consistency level of one).
>
>Later,
>Dean
>
>
>From: Marcelo Elias Del Valle
>mailto:mvall...@gmail.com>>
>Reply-To: "user@cassandra.apache.org"
>mailto:user@cassandra.apache.org>>
>Date: Monday, September 24, 2012 1:54 PM
>To: "user@cassandra.apache.org"
>mailto:user@cassandra.apache.org>>
>Subject: Re: Correct model
>
>Dean, this sounds like magic :D
>I don't know details about the performance on the index implementations
>you chose, but it would pay the way to use it in my case, as I don't need
>the best performance in the world when reading, but I need to assure
>scalability and have a simple model to maintain. I liked the playOrm
>concept regarding this.
>I have more doubts, but I will ask them at stack over flow from now on.
>
>2012/9/24 Hiller, Dean mailto:dean.hil...@nrel.gov>>
>PlayOrm will automatically create a CF to index my CF?
>
>It creates 3 CF's for all indices, IntegerIndice, DecimalIndice, and
>StringIndice such that the ad-hoc tool that is in development can display
>the indices as it knows the prefix of the composite column name is of
>Integer, Decimal or String and it knows the postfix type as well so it
>can translate back from bytes to the types and properly display in a GUI
>(i.e. On top of SELECT, the ad-hoc tool is adding a way to view the
>induce rows so you can check if they got corrupt or not).
>
>Will it auto-manage it, like Cassandra's secondary indexes?
>
>YES
>
>Further detail…
>
>You annotated fields with @NoSqlIndexed and PlayOrm adds/removes from the
>index as you add/modify/remove the entity…..a modify does a remove old
>val from index and insert new value into index.
>
>An example would be PlayOrm stores all long, int, short, byte in a type
>that uses the least amount of space so IF you have a long OR BigInteger
>between ­128 to 128 it only ends up storing 1 byte in cassandra(SAVING
>tons of space!!!).  Then if you are indexing a type that is one of those,
>PlayOrm creates a IntegerIndice table.
>
>Right now, another guy is working on playorm-server which is a webgui to
>allow ad-hoc access to all your data as well so you can ad-hoc queries to
>see data and instead of showing Hex, it shows the real values by
>translating the bytes to String for the schema portions that it is aware
>of that is.
>
>Later,
>Dean
>
>From: Marcelo Elias Del Valle
>mailto:mvall...@gmail.com>>>
>Reply-To:
>"user@cassandra.apache.orgassandra.apache.org>"
>mailto:user@cassandra.apache.org>assandra.apache.org>>
>Date: Monday, September 24, 2012 12:09 PM
>To:
>"user@cassandra.apache.orgassandra.apache.org>"
>mailto:user@cassandra.apache.org>assandra.apache.org>>
>Subject: Re: Correct model
>
>Dean,
>
>There is one last thing I would like to ask about playOrm by this
>list, the next questiosn will come by stackOverflow. Just because of the
>context, I prefer asking this here:
> When you say playOrm indexes a table (which would be a CF behind the
>scenes), what do you mean? PlayOrm will automatically create a CF to
>index my CF? Will it auto-manage it, like Cassandra's se

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Aaron Turner
On Tue, Sep 25, 2012 at 10:36 AM, Віталій Тимчишин  wrote:
> See my comments inline
>
> 2012/9/25 Aaron Turner 
>>
>> On Mon, Sep 24, 2012 at 10:02 AM, Віталій Тимчишин 
>> wrote:
>> > Why so?
>> > What are pluses and minuses?
>> > As for me, I am looking for number of files in directory.
>> > 700GB/512MB*5(files per SST) = 7000 files, that is OK from my view.
>> > 700GB/5MB*5 = 700000 files, that is too much for a single directory, too
>> > much
>> > memory used for SST data, too huge compaction queue (that leads to
>> > strange
>> > pauses, I suppose because of compactor thinking what to compact
>> > next),...
>>
>>
>> Not sure why a lot of files is a problem... modern filesystems deal
>> with that pretty well.
>
>
> May be. May be it's not filesystem, but cassandra. I've seen slowdowns of
> compaction when the compaction queue is too large. And it can be too large
> if you have a lot of SSTables. Note that each SSTable is both FS metadata
> (and FS metadata cache can be limited) and cassandra in-memory data.
> Anyway, as for me, performance test would be great in this area. Otherwise
> it's all speculations.

Agreed... I guess my thought is that the default is 5MB and the
recommendation of the developers is to not stray too far from that.
So unless you've done the performance benchmarks to prove otherwise,
I'm not sure why you chose a value about 100x that.

Also, I notice you're talking about 700GB/node?  That's about 200%
above the recommended maximum of 300-400GB per node.  I notice a lot of
people are trying to push this number, because while disk is
relatively cheap, computers are not.


-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
-- Benjamin Franklin
"carpe diem quam minimum credula postero"


Re: unsubscribe

2012-09-25 Thread Eric Evans
On Tue, Sep 25, 2012 at 1:23 PM, puneet loya  wrote:
>

http://goo.gl/JcMcr

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


is this a cassandra bug?

2012-09-25 Thread Hiller, Dean
This is cassandra 1.1.4

Describe shows DecimalType, and I tested setting the comparator TO DecimalType 
and it fails (realize I have never touched this column family until now, except 
for posting data, which succeeded)

[default@unknown] use databus;
Authenticated to keyspace: databus
[default@databus] describe bacnet9800AnalogInput9;
ColumnFamily: bacnet9800AnalogInput9
  Key Validation Class: org.apache.cassandra.db.marshal.DecimalType
  Default column value validator: org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: []
  Compaction Strategy: 
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
[default@databus] update column family bacnet9800AnalogInput9 with comparator = 
DecimalType;
org.apache.thrift.transport.TTransportException
[default@databus]

Exception from system.log from the node in the cluster is

ERROR [MigrationStage:1] 2012-09-25 14:11:20,327 AbstractCassandraDaemon.java 
(line 134) Exception in thread Thread[MigrationStage:1,5,main]
java.lang.RuntimeException: java.io.IOException: 
org.apache.cassandra.config.ConfigurationException: comparators do not match or 
are not compatible.
at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: 
org.apache.cassandra.config.ConfigurationException: comparators do not match or 
are not compatible.
at org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:676)
at org.apache.cassandra.db.DefsTable.updateColumnFamily(DefsTable.java:463)
at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:407)
at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:271)
at org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:249)
at 
org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:48)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 6 more
Caused by: org.apache.cassandra.config.ConfigurationException: comparators do 
not match or are not compatible.
at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:705)
at org.apache.cassandra.config.CFMetaData.reload(CFMetaData.java:672)
... 12 more



Re: is this a cassandra bug?

2012-09-25 Thread Hiller, Dean
Hmmm, is row key validation asynchronous to the actual sending of the
data to cassandra?

I seem to be able to put an invalid type and GET that invalid data back
just fine, even though my type was an int and the comparator was Decimal,
BUT then in the logs I see a validation failure exception that I never saw
client side… in fact, the client READ back the data fine, so I am a bit
confused here… 1.1.4… I tested this on a single node after seeing it
in our 6-node cluster with the same results.

Thanks,
Dean

On 9/25/12 2:13 PM, "Hiller, Dean"  wrote:

>This is cassandra 1.1.4
>
>Describe shows DecimalType and I test setting comparator TO the
>DecimalType and it fails  (Realize I have never touched this column
>family until now except for posting data which succeeded)



Re: Cassandra failures while moving token

2012-09-25 Thread aaron morton
>  As per our understanding we expect that when we move token then that node 
> will first sync up the data as per the new assigned token & only after that 
> it will receive the requests for new range. 
When you use nodetool move, the node will receive write requests for the new 
range, as well as read and write requests for the old range. 


> So not sure why cluster gives a miss as soon as we move token.
Can you explain the query you ran and the consistency level you used? 

> Is there any way/utility through which we can tell that a particular “row 
> key” is fetched from which node 

Try nodetool 
  getendpoints - Print the end points that owns the key

It won't tell you how a particular query worked, but it will tell you where a row 
should be.
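
For example, a hypothetical invocation (keyspace, column family and key names 
are placeholders for your own schema):

nodetool -h 127.0.0.1 getendpoints MyKeyspace MyColumnFamily some_row_key

That prints the replica addresses that should hold the row, so you can then 
check each of those nodes directly.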

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/09/2012, at 9:39 PM, Shashilpi Krishan 
 wrote:

> Hi
>  
> Actually problem is that while we move the token in a 12 node cluster we 
> observe cassandra misses (no data as per cassandra for requested row key). As 
> per our understanding we expect that when we move token then that node will 
> first sync up the data as per the new assigned token & only after that it 
> will receive the requests for new range. So not sure why cluster gives a miss 
> as soon as we move token.
>  
> Is there any way/utility through which we can tell that a particular “row 
> key” is fetched from which node so as to ensure that token move is completed 
> fine and data is lying on correct new node and also being looked up by 
> cluster on correct node. OR
>  
> Please tell that what is the best way out to change the tokens in the cluster.
> Thanks & Regards
> 
> Shashilpi Krishan
>  
> 
> CONFIDENTIALITY NOTICE
> ==
> This email message and any attachments are for the exclusive use of the 
> intended recipient(s) and may contain confidential and privileged 
> information. Any unauthorized review, use, disclosure or distribution is 
> prohibited. If you are not the intended recipient, please contact the sender 
> by reply email and destroy all copies of the original message along with any 
> attachments, from your computer system. If you are the intended recipient, 
> please be advised that the content of this message is subject to access, 
> review and disclosure by the sender's Email System Administrator.



Integrated cassandra

2012-09-25 Thread Robin Verlangen
Hi there,

Is there a way to "embed"/package Cassandra with another Java application
and maintain control over it? Has this been done before? Are there any best
practices?

Why do I want to do this? We want to offer as little configuration as
possible to our customers, but only if that's possible without messing around
in the Cassandra core.
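
For what it's worth, the kind of thing I have in mind is roughly the sketch 
below, built on the EmbeddedCassandraService class from the Cassandra source 
tree (the config path and wrapper class name are placeholders; this is a 
sketch, not a tested recommendation):

// Start Cassandra inside the same JVM as the application.
// Assumes the Cassandra jars are on the classpath and that cassandra.yaml
// is reachable via the cassandra.config system property (path is made up).
import org.apache.cassandra.service.EmbeddedCassandraService;

public class EmbeddedCassandraExample
{
    public static void main(String[] args) throws Exception
    {
        System.setProperty("cassandra.config", "file:///etc/myapp/cassandra.yaml");

        EmbeddedCassandraService cassandra = new EmbeddedCassandraService();
        cassandra.start(); // initialises and starts the daemon, then returns

        // ... application code; talk to it over Thrift/CQL on the configured ports ...
    }
}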

Best regards,

Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E ro...@us2.nl





Re: Can't change replication factor in Cassandra 1.1.2

2012-09-25 Thread Rob Coli
On Wed, Jul 18, 2012 at 10:27 AM, Douglas Muth  wrote:
> Even though keyspace "test1" had a replication_factor of 1 to start
> with, each of the above UPDATE KEYSPACE commands caused a new UUID to
> be generated for the schema, which I assume is normal and expected.

I believe the actual issue you have is "stuck schema for this
keyspace," not anything to do with replication factor. To test this,
try adding a ColumnFamily and see if it works. I bet it won't.
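
For example, in cassandra-cli, something as simple as the following (throwaway 
CF name) inside the affected keyspace, followed by a describe on each node to 
see whether it actually shows up:

create column family schema_probe;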

There are anecdotal reports in the 1.0.8-1.1.5 timeframe of this
happening. One of the causes is the one aaron pasted, but I do not
believe that is the only cause of this edge case. As far as I know,
however, there is no JIRA ticket open for "stuck schema for keyspace"
... perhaps you might want to look for and/or open one?

=Rob

-- 
=Robert Coli
AIM>ALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: any ways to have compaction use less disk space?

2012-09-25 Thread Rob Coli
On Sun, Sep 23, 2012 at 12:24 PM, Aaron Turner  wrote:
>> Leveled compaction've tamed space for us. Note that you should set
>> sstable_size_in_mb to reasonably high value (it is 512 for us with ~700GB
>> per node) to prevent creating a lot of small files.
>
> 512MB per sstable?  Wow, that's freaking huge.  From my conversations
> with various developers 5-10MB seems far more reasonable.   I guess it
> really depends on your usage patterns, but that seems excessive to me-
> especially as sstables are promoted.

700 GB = 716,800 MB; 716,800 MB / 5 MB = 143,360 sstables

150,000 sstables seem highly unlikely to be performant. As a simple
example of why, on the read path the bloom filter for every sstable
must be consulted...
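
For reference, the per-CF knob being discussed is set roughly like this in 
cassandra-cli (CF name and size are placeholders, not a recommendation):

update column family MyColumnFamily with 
  compaction_strategy = 'LeveledCompactionStrategy' and 
  compaction_strategy_options = {sstable_size_in_mb: 128};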

=Rob

-- 
=Robert Coli
AIM>ALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: compression

2012-09-25 Thread aaron morton
Check the logs on nodes 2 and 3 to see if the scrub started. The logs on node 1 
will be a good reference for what to expect. 
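
For example, something like this on each node (the log path depends on your 
install):

grep -i scrub /var/log/cassandra/system.log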

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/09/2012, at 10:31 PM, Tamar Fraenkel  wrote:

> Hi!
> I ran 
> UPDATE COLUMN FAMILY cf_name WITH 
> compression_options={sstable_compression:SnappyCompressor, 
> chunk_length_kb:64};
> 
> I then ran on all my nodes (3)
> sudo nodetool -h localhost scrub tok cf_name
> 
> I have replication factor 3. The size of the data on disk was cut in half on 
> the first node and in JMX I can see that indeed the compression ratio is 
> 0.46. But on nodes 2 and 3 nothing happened. In JMX I can see that the 
> compression ratio is 0 and the size of the files on disk stayed the same.
> 
> In cli 
> 
> ColumnFamily: cf_name
>   Key Validation Class: org.apache.cassandra.db.marshal.UUIDType
>   Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>   Columns sorted by: 
> org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
>   Row cache size / save period in seconds / keys to save : 0.0/0/all
>   Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>   Key cache size / save period in seconds: 20.0/14400
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Bloom Filter FP chance: default
>   Built indexes: []
>   Compaction Strategy: 
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>   Compression Options:
> chunk_length_kb: 64
> sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
> 
> Can anyone help?
> Thanks
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> 
> 
> ta...@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 
> 
> 
> On Mon, Sep 24, 2012 at 8:37 AM, Tamar Fraenkel  wrote:
> Thanks all, that helps. Will start with one - two CFs and let you know the 
> effect
> 
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> 
> 
> ta...@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 
> 
> 
> On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean  wrote:
> As well, your unlimited column names may all have the same prefix, right? 
> Like "accounts".rowkey56, "accounts".rowkey78, etc. etc., so the "accounts" 
> prefix gets a ton of compression then.
> 
> Later,
> Dean
> 
> From: Tyler Hobbs <ty...@datastax.com>
> Reply-To: user@cassandra.apache.org
> Date: Sunday, September 23, 2012 11:46 AM
> To: user@cassandra.apache.org
> Subject: Re: compression
> 
>  column metadata, you're still likely to get a reasonable amount of 
> compression.  This is especially true if there is some amount of repetition 
> in the column names, values, or TTLs in wide rows.  Compression will almost 
> always be beneficial unless you're already somehow CPU bound or are using 
> large column values that are high in entropy, such as pre-compressed or 
> encrypted data.
> 
> 



Re: downgrade from 1.1.4 to 1.0.X

2012-09-25 Thread aaron morton
No. 
Versions are capable of reading previous file formats, but can only create 
files in the current format. 


File formats are listed here 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/Descriptor.java#L52

> Looking for a way to make this work.
I'd suggest contacting DS either through their support forums or directly via 
email. 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/09/2012, at 12:53 AM, Arend-Jan Wijtzes  wrote:

> On Thu, Sep 20, 2012 at 10:13:49AM +1200, aaron morton wrote:
>> No. 
>> They use different minor file versions which are not backwards compatible. 
> 
> Thanks Aaron.
> 
> Is upgradesstables capable of downgrading the files to 1.0.8?
> Looking for a way to make this work.
> 
> Regards,
> Arend-Jan
> 
> 
>> On 18/09/2012, at 11:18 PM, Arend-Jan Wijtzes  wrote:
>> 
>>> Hi,
>>> 
>>> We are running Cassandra 1.1.4 and like to experiment with
>>> Datastax Enterprise which uses 1.0.8. Can we safely downgrade
>>> a production cluster or is it incompatible? Any special steps
>>> involved?
> 
> -- 
> Arend-Jan Wijtzes -- Wiseguys -- www.wise-guys.nl



Re:

2012-09-25 Thread Manu Zhang
I wonder now whether the "get_range_slices" call will ever look for data in the row
cache. I don't see it in the codebase. Does only the "get" call check the row
cache?

On Wed, Sep 26, 2012 at 12:11 AM, Charles Brophy  wrote:

> There are settings in cassandra.yaml that will _gradually_ reduce the
> available cache to zero if you are under constant memory pressure:
>
>  # Set to 1.0 to disable.  
> reduce_cache_sizes_at: *
> reduce_cache_capacity_to: *
>
> My experience is that the cache size will not return to the configured
> size until a service restart if you leave this enabled.  The text of this
> setting is not explicit about the long-term cache shrinkage, so it's easy
> to think that it will restore the cache to its configured size after the
> pressures have subsided. It won't.
>
> Charles
>
> On Tue, Sep 25, 2012 at 8:14 AM, Manu Zhang wrote:
>
>> I've enabled row cache and set its capacity to 10MB but when I check its
>> size in jconsole it's always 0. Isn't it that a row will be written to row
>> cache if it isn't there when I read the row? I've bulk loaded the data into
>> disk so row cache is crucial to the performance.
>
>
>


Re: Understanding Thread Pools

2012-09-25 Thread aaron morton
They are thrift connection threads. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/09/2012, at 1:32 AM, rohit bhatia  wrote:

> Hi
> 
> What are "pool-2-thread-*" threads. Someone mentioned "Client
> Connection Threads".
> Does that mean Client to cassandra(thrift API) or cassandra to
> cassandra(storage API),
> 
> Thanks
> Rohit



Re: Prevent queries from OOM nodes

2012-09-25 Thread aaron morton
Can you provide some information on the queries and the size of the data they 
traversed? 

The default maximum size for a single thrift message is 16MB; was it larger 
than that? 
https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L375
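
The relevant cassandra.yaml settings look roughly like this (1.1-era defaults 
quoted from memory, so double-check your version):

# Frame size for thrift (maximum field length).
thrift_framed_transport_size_in_mb: 15

# The max length of a thrift message, including all fields and
# internal thrift overhead.
thrift_max_message_length_in_mb: 16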

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/09/2012, at 8:33 AM, Bryce Godfrey  wrote:

> Is there anything I can do on the configuration side to prevent nodes from 
> going OOM due to queries that will read large amounts of data and exceed the 
> heap available? 
>  
> For the past few days we had some nodes consistently freezing/crashing 
> with OOM.  We got a heap dump into MAT and figured out the nodes were dying 
> due to some queries for a few extremely large data sets.  Tracked it back to 
> an app that just didn’t prevent users from doing these large queries, but it 
> seems like Cassandra could be smart enough to guard against this type of 
> thing?
>  
> Basically some kind of setting like "if the data to satisfy the query > available 
> heap then throw an error to the caller and abort the query".  I would much rather 
> return errors to clients than crash a node, as the error is easier to track 
> down and resolve that way.
>  
> Thanks.



Re: performance for different kinds of row keys

2012-09-25 Thread aaron morton
> Which one will be faster to insert?
In general Composite types have the same performance; the extra work is 
insignificant. 
(Assuming you don't create a type with 100 components.)
   
> And which one will be faster to read by incremental id?
You have to specify the full key to get a row by row key, so this question 
only applies to the non-composite key. 
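
For concreteness, the two layouts in the question look roughly like this in 
cassandra-cli (CF and type names are made up for illustration):

create column family ById 
  with key_validation_class = 'LongType';

create column family ByIdAndGroup 
  with key_validation_class = 'CompositeType(LongType,UTF8Type)';

In both cases a get by key has to supply the whole key, which is why the 
"read by incremental id alone" question really only applies to the first 
layout.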

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/09/2012, at 9:34 AM, Marcelo Elias Del Valle  wrote:

> Suppose two cases:
> I have a Cassandra column family with non-composite row keys = incremental id
> I have a Cassandra column family with a composite row keys = incremental id 1 
> : group id
>  Which one will be faster to insert? And which one will be faster to read 
> by incremental id?
> 
> Best regards,
> -- 
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr



Re: Cassandra compression not working?

2012-09-25 Thread aaron morton
Nothing jumps out. Are you able to reproduce the fault on a test node?

There were some schema change problems in the early 1.1.x releases. Did you 
enable compression via a schema change?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/09/2012, at 9:14 AM, Mike  wrote:

> I forgot to mention we are running Cassandra 1.1.2.
> 
> Thanks,
> -Mike
> 
> On Sep 24, 2012, at 5:00 PM, Michael Theroux  wrote:
> 
>> Hello,
>> 
>> We are running into an unusual situation that I'm wondering if anyone has 
>> any insight on.  We've been running a Cassandra cluster for some time, with 
>> compression enabled on one column family in which text documents are stored. 
>>  We enabled compression on the column family, utilizing the SnappyCompressor 
>> and a 64k chunk length.
>> 
>> It was recently discovered that Cassandra was reporting a compression ratio 
>> of 0.  I took a snapshot of the data and started a cassandra node in 
>> isolation to investigate.
>> 
>> Running nodetool scrub, or nodetool upgradesstables had little impact on the 
>> amount of data that was being stored.
>> 
>> I then disabled compression and ran nodetool upgradesstables on the column 
>> family.  Again, no impact on the data size stored.
>> 
>> I then re-enabled compression and ran nodetool upgradesstables on the column 
>> family.  This resulted in a 60% reduction in the data size stored, with 
>> Cassandra reporting a compression ratio of about 0.38.
>> 
>> Any idea what is going on here?  Obviously I can go through this process in 
>> production to enable compression, however, any idea what is currently 
>> happening and why new data does not appear to be compressed?
>> 
>> Any insights are appreciated,
>> Thanks,
>> -Mike



Re:

2012-09-25 Thread Manu Zhang
The DEFAULT_CACHING_STRATEGY is Caching.KEYS_ONLY, but even configuring the row
cache size to be greater than zero
won't enable the row cache. Why?



1.1.5 Missing Insert! Strange Problem

2012-09-25 Thread Arya Goudarzi
Hi All,

I have a 4 node cluster setup in 2 zones with NetworkTopology strategy and
strategy options for writing a copy to each zone, so the effective load on
each machine is 50%.

Symptom:
I have a column family that has gc grace seconds of 10 days (the default).
On the 17th there was an insert done to this column family, and from our
application logs I can see that the client got a successful response back
with write consistency of ONE. I can verify the existence of the key that
was inserted in the commitlogs of both replicas, however it seems that this
record was never inserted. I used list to get all the column family rows,
which were about 800ish, and examined them to see if the row could possibly
have been deleted by our application. If this record had been deleted during
the past few days, list should still have shown it to me, since we are not
past gc grace seconds. I could not find it.

Things happened:
During the same time as this insert was happening, I was performing a
rolling upgrade of Cassandra from 1.1.3 to 1.1.5 by taking one node down at
a time, performing the package upgrade and restarting the service and going
to the next node. I could see from system.log that some mutations were
replayed during those restarts, so I suppose the memtables were not flushed
before restart.


Could this procedure cause the row insert to disappear? How can I
troubleshoot this? I am running out of ideas.

Your help is greatly appreciated.


Cheers,
=Arya


RE: 1.1.5 Missing Insert! Strange Problem

2012-09-25 Thread Roshni Rajagopal

By any chance is a TTL (time to live) set on the columns?
