Re: Cassandra Counters

2012-09-25 Thread Robin Verlangen
From my point of view, another problem with using the "standard column family" for counting is transactions. Cassandra lacks them, so if you're updating counters from multiple threads, how will you keep track of that? Yes, I'm aware of software like Zookeeper to do that, however I'm not sure whether th

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Віталій Тимчишин
See my comments inline 2012/9/25 Aaron Turner > On Mon, Sep 24, 2012 at 10:02 AM, Віталій Тимчишин > wrote: > > Why so? > > What are pluses and minuses? > > As for me, I am looking for number of files in directory. > > 700GB/512MB*5(files per SST) = 7000 files, that is OK from my view. > > 700G
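The file-count estimate quoted in this message can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, it assumes binary units (1 GB = 1024 MB), and the figure of five files per SSTable is taken from the message itself. The 10 MB comparison value is arbitrary and is there only to show how quickly the count grows with a small sstable size.

```python
def estimate_sstable_files(data_gb, sstable_size_mb, files_per_sstable):
    """Rough count of on-disk files for one node (hypothetical helper)."""
    # Assumption: binary units, 1 GB = 1024 MB.
    sstables = (data_gb * 1024) // sstable_size_mb
    return sstables * files_per_sstable


# Figures from the message: ~700 GB per node, 512 MB sstables, 5 files each.
print(estimate_sstable_files(700, 512, 5))

# Arbitrary smaller sstable size, for comparison only.
print(estimate_sstable_files(700, 10, 5))
```

With the numbers from the message, the first call reproduces the 7000 files mentioned in the thread.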

a node stays in joining

2012-09-25 Thread Satoshi Yamada
hi, One node in my cluster stays in "joining". I found a JIRA about this, which is fixed, but I still see a similar thing. This is a node whose token I removed first because it did not boot correctly, and which re-joined the cluster without any pre-set token (should I set the previous token?). As you see b

Re: Cassandra Counters

2012-09-25 Thread Edward Kibardin
I've recently noticed several threads about Cassandra Counters inconsistencies and started to seriously think about possible workarounds, like storing realtime counters in Redis and dumping them daily to Cassandra. So general question, should I rely on Counters if I want 100% accuracy? Thanks, Ed On Tue,

Re: [problem with OOM in nodes]

2012-09-25 Thread Denis Gabaydulin
Thanks a lot for helping. We came to the same decision clustering one report to multiple cassandra rows (sorted buckets of report rows) and manage clusters on client side. On Tue, Sep 25, 2012 at 5:28 AM, aaron morton wrote: > What exactly is the problem with big rows? > > During compaction the r
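The approach described in this message, splitting one report across several Cassandra rows as sorted buckets managed on the client side, might look roughly like the sketch below. Everything here is an assumption made for illustration: the function name, the `report_id:index` row-key layout, and the fixed bucket size do not come from the thread, and no Cassandra client calls are shown.

```python
def bucket_report_rows(report_id, rows, bucket_size):
    """Split sorted report rows into fixed-size buckets, one per row key."""
    ordered = sorted(rows)
    buckets = {}
    for start in range(0, len(ordered), bucket_size):
        index = start // bucket_size
        # Hypothetical key layout; the zero-padded index keeps keys in
        # bucket order when they are compared as strings.
        row_key = f"{report_id}:{index:06d}"
        buckets[row_key] = ordered[start:start + bucket_size]
    return buckets


buckets = bucket_report_rows("report-1", [5, 3, 1, 4, 2], bucket_size=2)
for key, values in buckets.items():
    print(key, values)
```

Each bucket would then be written as its own row, so no single row grows without bound; the client is responsible for knowing which bucket to read.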

The compaction task cannot delete sstables which are used in a repair session

2012-09-25 Thread Rene Kochen
Is this a bug? I'm using Cassandra 1.0.11: INFO 13:45:43,750 Compacting [SSTableReader(path='d:\data\Traxis\Parameters-hd-47-Data.db'), SSTableReader(path='d:\data\Traxis\Parameters-hd-44-Data.db'), SSTableReader(path='d:\data\Traxis\Parameters-hd-46-Data.db'), SSTableReader(path='d:\data\Traxis\P

Re: Cassandra Counters

2012-09-25 Thread rohit bhatia
@Edward, We use counters in production with Cassandra 1.0.5. Though since our application is sensitive to write latency and we are seeing problems with Frequent Young Garbage Collections, and also we just do increments (decrements have caused problems for some people) We don't see inconsistencies

Re: Correct model

2012-09-25 Thread Hiller, Dean
If you need anything added/fixed, just let PlayOrm know. PlayOrm has been able to quickly add so far…that may change as more and more requests come but so far PlayOrm seems to have managed to keep up. We are using it live by the way already. It works out very well so far for us (We have 5000

Re: Cassandra Counters

2012-09-25 Thread Sylvain Lebresne
> > So general question, should I rely on Counters if I want 100% accuracy? > No. Even not considering potential bugs, counters being not idempotent, if you get a TimeoutException during a write (which can happen even in relatively normal conditions), you won't know if the increment went in or n
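The point about counters not being idempotent can be illustrated with a toy model. This is a sketch only: `ToyStore`, `WriteTimeout`, and `with_retry` are invented names, not part of any Cassandra client, and the model covers just one failure case, where the write is applied but its acknowledgement is lost. A real client that sees a timeout cannot tell whether the write was applied or not.

```python
class WriteTimeout(Exception):
    """Stand-in for a client-side timeout; not a real driver exception."""


class ToyStore:
    """In-memory stand-in for a node whose acknowledgements can get lost."""

    def __init__(self):
        self.counter = 0
        self.value = None
        self.lose_next_ack = False

    def _maybe_timeout(self):
        if self.lose_next_ack:
            self.lose_next_ack = False
            raise WriteTimeout("write applied, but the ack was lost")

    def increment(self, delta):
        self.counter += delta  # applied before the ack is lost
        self._maybe_timeout()

    def set_value(self, value):
        self.value = value  # applied before the ack is lost
        self._maybe_timeout()


def with_retry(operation, *args):
    """Retry once on timeout, as a naive client might."""
    try:
        operation(*args)
    except WriteTimeout:
        operation(*args)


store = ToyStore()

store.lose_next_ack = True
with_retry(store.increment, 1)   # increment is applied twice

store.lose_next_ack = True
with_retry(store.set_value, 42)  # overwrite is safe to repeat

print(store.counter, store.value)
```

In this model the retried increment leaves the counter at 2 although the client intended a single increment, while the retried overwrite still ends at 42. Not retrying has the opposite risk: if the write never went in, the increment is lost.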

Re: Cassandra Counters

2012-09-25 Thread rohit bhatia
@Sylvain In a relatively untroubled cluster, even timed out writes go through, provided no messages are dropped. Which you can monitor on cassandra nodes. We have 100% consistency on our production servers as we don't see messages being dropped on our servers. Though as you mention, there would be

Re: Cassandra Counters

2012-09-25 Thread Edward Kibardin
@Sylvain and @Rohit: Thanks for your answers. On Tue, Sep 25, 2012 at 2:27 PM, Sylvain Lebresne wrote: > So general question, should I rely on Counters if I want 100% accuracy? >> > > No. > > Even not considering potential bugs, counters being not idempotent, if > you get a TimeoutException dur

Re: Cassandra Counters

2012-09-25 Thread Sylvain Lebresne
> In a relatively untroubled cluster, even timed out writes go through, > provided no messages are dropped. This all depends on your definition of an "untroubled" cluster, but to be clear, in a cluster where a node dies (which for Cassandra is not considered abnormal and will happen to everyone no ma

Re: Correct model

2012-09-25 Thread Marcelo Elias Del Valle
Dean, In the playOrm data modeling, if I understood it correctly, every CF has its own id, right? For instance, User would have its own ID, Activities would have its own id, etc. What if I have a trillion activities? Wouldn't it be a problem to have 1 row id for each activity? Cassandra alwa

Re: Correct model

2012-09-25 Thread Hiller, Dean
Just fyi that some of these are cassandra questions… Dean, In the playOrm data modeling, if I understood it correctly, every CF has its own id, right? No, each entity has a field annotated with @NoSqlId. That tells playOrm this is the row key. Each INSTANCE of the entity is a row in cass

Running repair negatively impacts read performance?

2012-09-25 Thread Charles Brophy
Hey guys, I've begun to notice that read operations take a performance nose-dive after a standard (full) repair of a fairly large column family: ~11 million records. Interestingly, I've then noticed that read performance returns to normal after a full scrub of the column family. Is it possible tha

Re:

2012-09-25 Thread Charles Brophy
There are settings in cassandra.yaml that will _gradually_ reduce the available cache to zero if you are under constant memory pressure: # Set to 1.0 to disable. reduce_cache_sizes_at: * reduce_cache_capacity_to: * My experience is that the cache size will not return to the configured size unt

Re: Correct model

2012-09-25 Thread Hiller, Dean
Oh, and if you really want to scale very easily, just use play framework 1.2.5 ;). We use that and since it is stateless, to scale up, you simply add more servers. Also, it's like coding in php or ruby, etc. etc. as far as development speed (no server restarts) so it's a pretty nice framework. We t

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Aaron Turner
On Tue, Sep 25, 2012 at 10:36 AM, Віталій Тимчишин wrote: > See my comments inline > > 2012/9/25 Aaron Turner >> >> On Mon, Sep 24, 2012 at 10:02 AM, Віталій Тимчишин >> wrote: >> > Why so? >> > What are pluses and minuses? >> > As for me, I am looking for number of files in directory. >> > 700G

Re: unsubscribe

2012-09-25 Thread Eric Evans
On Tue, Sep 25, 2012 at 1:23 PM, puneet loya wrote: > http://goo.gl/JcMcr -- Eric Evans Acunu | http://www.acunu.com | @acunu

is this a cassandra bug?

2012-09-25 Thread Hiller, Dean
This is cassandra 1.1.4 Describe shows DecimalType and I test setting comparator TO the DecimalType and it fails (Realize I have never touched this column family until now except for posting data which succeeded) [default@unknown] use databus; Authenticated to keyspace: databus [default@da

Re: is this a cassandra bug?

2012-09-25 Thread Hiller, Dean
Hmmm, is rowkey validation asynchronous to the actual sending of the data to cassandra? I seem to be able to put an invalid type and GET that invalid data back just fine even though my type was an int and the comparator was Decimal BUT then in the logs I see a validation fail exception but I ne

Re: Cassandra failures while moving token

2012-09-25 Thread aaron morton
> As per our understanding we expect that when we move token then that node > will first sync up the data as per the new assigned token & only after that > it will receive the requests for new range. When you use nodetool move the node will receive write requests for the new range. As well as

Integrated cassandra

2012-09-25 Thread Robin Verlangen
Hi there, Is there a way to "embed"/package Cassandra with another Java application and maintain control over it? Has this been done before? Are there any best practices? Why do I want to do this? We want to offer as little configuration as possible to our customers, but only if it's possible without me

Re: Can't change replication factor in Cassandra 1.1.2

2012-09-25 Thread Rob Coli
On Wed, Jul 18, 2012 at 10:27 AM, Douglas Muth wrote: > Even though keyspace "test1" had a replication_factor of 1 to start > with, each of the above UPDATE KEYSPACE commands caused a new UUID to > be generated for the schema, which I assume is normal and expected. I believe the actual issue you

Re: any ways to have compaction use less disk space?

2012-09-25 Thread Rob Coli
On Sun, Sep 23, 2012 at 12:24 PM, Aaron Turner wrote: >> Leveled compaction've tamed space for us. Note that you should set >> sstable_size_in_mb to reasonably high value (it is 512 for us with ~700GB >> per node) to prevent creating a lot of small files. > > 512MB per sstable? Wow, that's freaki

Re: compression

2012-09-25 Thread aaron morton
Check the logs on nodes 2 and 3 to see if the scrub started. The logs on 1 will be a good help with that. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 24/09/2012, at 10:31 PM, Tamar Fraenkel wrote: > Hi! > I ran > UPDATE COLUMN FAMI

Re: downgrade from 1.1.4 to 1.0.X

2012-09-25 Thread aaron morton
No. Versions are capable of reading previous file formats, but can only create files in the current format. File formats are listed here https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/Descriptor.java#L52 > Looking for a way to make this work. I'd sugg

Re:

2012-09-25 Thread Manu Zhang
I wonder now if "get_range_slices" call will ever look for data in row cache. I don't see it in the codebase. Only the "get" call will check row cache? On Wed, Sep 26, 2012 at 12:11 AM, Charles Brophy wrote: > There are settings in cassandra.yaml that will _gradually_ reduce the > available cach

Re: Understanding Thread Pools

2012-09-25 Thread aaron morton
They are Thrift connection threads. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25/09/2012, at 1:32 AM, rohit bhatia wrote: > Hi > > What are "pool-2-thread-*" threads. Someone mentioned "Client > Connection Threads". > Does that mean

Re: Prevent queries from OOM nodes

2012-09-25 Thread aaron morton
Can you provide some information on the queries and the size of the data they traversed ? The default maximum size for a single thrift message is 16MB, was it larger than that ? https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L375 Cheers - Aaron Morton Free

Re: performance for different kinds of row keys

2012-09-25 Thread aaron morton
> Which one will be faster to insert? In general Composite types have the same performance; the extra work is insignificant. (Assuming you don't create a type with 100 components.) > And which one will be faster to read by incremental id? If you have to specify the full key to get a row by ro

Re: Cassandra compression not working?

2012-09-25 Thread aaron morton
Nothing jumps out. Are you able to reproduce the fault on a test node ? There were some schema change problems in the early 1.1X releases. Did you enable compression via a schema change ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 25

Re:

2012-09-25 Thread Manu Zhang
The DEFAULT_CACHING_STRATEGY is Caching.KEYS_ONLY, but even configuring row cache size to be greater than zero won't enable row cache. Why? On Wed, Sep 26, 2012 at 9:44 AM, Manu Zhang wrote: > I wonder now if "get_range_slices" call will ever look for data in row > cache. I don't see it in the codeba

1.1.5 Missing Insert! Strange Problem

2012-09-25 Thread Arya Goudarzi
Hi All, I have a 4 node cluster setup in 2 zones with NetworkTopology strategy and strategy options for writing a copy to each zone, so the effective load on each machine is 50%. Symptom: I have a column family that has gc grace seconds of 10 days (the default). On 17th there was an insert done t

RE: 1.1.5 Missing Insert! Strange Problem

2012-09-25 Thread Roshni Rajagopal
By any chance is a TTL (time to live ) set on the columns... Date: Tue, 25 Sep 2012 19:56:19 -0700 Subject: 1.1.5 Missing Insert! Strange Problem From: gouda...@gmail.com To: user@cassandra.apache.org Hi All, I have a 4 node cluster setup in 2 zones with NetworkTopology strategy and strategy op