Re: High-read latency for non-existing rows with LCS and 1.2.5

2013-06-30 Thread aaron morton
> We are using Leveled Compaction with Cassandra 1.2.5. Our sstable size is 
> 100M. On each node,
> we have anywhere from 700+ to 800+ sstables (for all levels). The 
> bloom_filter_fp_chance is set at 0.000744.
The current default bloom_filter_fp_chance is 0.1 for levelled compaction. 
Raising yours back towards that default (and running nodetool upgradesstables so 
existing SSTables are rewritten) will reduce the bloom filter size significantly. 
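
For example, something like the following; the keyspace/table names and the 0.01 
value are only placeholders:

   -- in cqlsh
   ALTER TABLE my_keyspace.my_cf WITH bloom_filter_fp_chance = 0.01;

   # then, from the shell, rewrite the existing SSTables so the change takes effect
   nodetool upgradesstables my_keyspace my_cf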

>  the latency is running into hundreds of milliseconds and sometimes seconds.
Check the number of SSTables per read using nodetool cfhistograms. With 
levelled compaction you should not see more than 3: 
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction
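
For example (keyspace and column family names are placeholders):

   nodetool cfhistograms my_keyspace my_cf

The SSTables column of the output shows how many SSTables were touched per read.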

Cheers

 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/06/2013, at 7:44 AM, sankalp kohli  wrote:

> Try doing request tracing. 
> http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
> 
> 
> On Thu, Jun 27, 2013 at 2:40 PM, Bao Le  wrote:
> Hi,
> 
> We are using Leveled Compaction with Cassandra 1.2.5. Our sstable size is 
> 100M. On each node,
> we have anywhere from 700+ to 800+ sstables (for all levels). The 
> bloom_filter_fp_chance is set at 0.000744.
> 
>   For read requests that ask for existing row records, the latency is great, 
> mostly under 20 milliseconds with key cache and row cache set. For read 
> requests that ask for non-existing row records (not because of deletes, but 
> rather, have never been in the system to start with), the latency is running 
> into hundreds of milliseconds and sometimes seconds.
> 
>   Just wonder if anyone has come across this before and has some pointers on 
> how to reduce the latency in this case.
> 
> Thanks
> Bao
> 
> 
> 
> 



RE: Problem using sstableloader with SSTableSimpleUnsortedWriter and a composite key

2013-06-30 Thread Peer, Oded
Thank you Aaron!

Your blog post helped me understand how a row with a compound key is stored, and 
from that how to create the SSTable files.
For anyone who needs it, this is how it works:

In Cassandra-cli the row looks like this:
RowKey: 5
=> (column=10:created, value=013f84be6288, timestamp=137232163700)

From this we see that the row key is a single Long value "5", and it has one 
composite column "10:created" with a timestamp value.
Thus the code should look like this:

   import static org.apache.cassandra.utils.ByteBufferUtil.bytes;

   import java.io.File;
   import java.util.ArrayList;
   import java.util.List;

   import org.apache.cassandra.db.marshal.AbstractType;
   import org.apache.cassandra.db.marshal.CompositeType;
   import org.apache.cassandra.db.marshal.LongType;
   import org.apache.cassandra.dht.IPartitioner;
   import org.apache.cassandra.dht.Murmur3Partitioner;
   import org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter;

   File directory = new File( System.getProperty( "output" ) );
   IPartitioner partitioner = new Murmur3Partitioner();
   String keyspace = "test_keyspace";
   String columnFamily = "test_table";

   // the composite column-name comparator has two components
   List<AbstractType<?>> compositeList = new ArrayList<AbstractType<?>>();
   compositeList.add( LongType.instance );
   compositeList.add( LongType.instance );   // second component holds the literal column name ("created")
   CompositeType compositeType = CompositeType.getInstance( compositeList );

   SSTableSimpleUnsortedWriter sstableWriter = new SSTableSimpleUnsortedWriter(
      directory,
      partitioner,
      keyspace,
      columnFamily,
      compositeType,   // column name comparator
      null,            // no sub-comparator (not a super column family)
      64 );            // buffer size in MB before flushing to disk

   long timestamp = 1372321637000L;          // the column value, in milliseconds
   long microTimestamp = timestamp * 1000;   // column timestamps are conventionally microseconds
   long k1 = 5L;                             // the row (partition) key
   long k2 = 10L;                            // first component of the composite column name
   sstableWriter.newRow( bytes( k1 ) );
   sstableWriter.addColumn(
      compositeType.builder().add( bytes( k2 ) ).add( bytes( "created" ) ).build(),
      bytes( timestamp ),
      microTimestamp );
   sstableWriter.close();
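
Once the files are written they can be streamed into the cluster with 
sstableloader. The node address and output path below are placeholders; 
sstableloader works out the keyspace and column family from the last two 
directory names, so the files should sit under .../test_keyspace/test_table:

   bin/sstableloader -d <node_address> /path/to/output/test_keyspace/test_table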





Patterns for enabling Compute apps which only request Local Node's Data

2013-06-30 Thread rektide

Hello Cassandra-user ml, how is everyone?

Question: if we're co-locating our Cassandra and our compute application on the 
same nodes, are there any in-use
patterns in Cassandra user (or Cassandra dev) applications for having the 
compute application only pull data off the
localhost Cassandra process? If we have the ability to manage where we do 
compute, what options are there for keeping
compute happening on local data as much as possible? 

In the best case I can imagine:

If I have a KeyRange for a ColumnParent, there would be some way to know I'm 
going to fulfil that scan while only having each node pull its own local data 
(achieving read consistency via digests).
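
One sketch of what I'm imagining, in CQL3 terms (the table name, column names 
and token values below are just placeholders):

   -- which token ranges does the local node own?
   SELECT tokens FROM system.local;
   SELECT peer, tokens FROM system.peers;

   -- then, for each locally owned range (start, end], scan only that slice:
   SELECT key, col, value FROM my_keyspace.my_table
    WHERE token(key) > -9223372036854770000 AND token(key) <= -9100000000000000000;

(Or the Thrift equivalent: describe_ring, then a get_range_slices per local range.)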

If it helps, I'd be glad to entertain options that only worked when using 
virtual nodes.

Regards,
rektide


Re: Cassandra terminates with OutOfMemory (OOM) error

2013-06-30 Thread Mohammed Guller
Yes, it is one read request.

Since Cassandra does not support GROUP BY, I was trying to implement it in our 
application, hence the need to read a large amount of data. I guess that was a 
bad idea.

Mohammed

On Jun 27, 2013, at 9:54 PM, "aaron morton" <aa...@thelastpickle.com> wrote:

If our application tries to read 80,000 columns each from 10 or more rows at 
the same time, some of the nodes run out of heap space and terminate with OOM 
error.
Is this in one read request?

Reading 80K columns is too many, try reading a few hundred at most.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 26/06/2013, at 3:57 PM, Mohammed Guller <moham...@glassbeam.com> wrote:

Replication factor is 3 and the read consistency level is ONE. One of the 
non-coordinator nodes is crashing, so the OOM is happening before aggregation of 
the data to be returned.

Thanks for the info about the space allocated to young generation heap. That is 
helpful.

Mohammed

On Jun 25, 2013, at 1:28 PM, "sankalp kohli" <kohlisank...@gmail.com> wrote:

Your young gen is 1/4 of 1.8G, which is 450MB. Also, for slice queries the 
coordinator will get the results from the replicas as per the consistency level 
used and merge them before returning to the client.
What is the replication factor of your keyspace, and what consistency level are 
you reading with?
Also, 55MB on disk will not mean 55MB in memory. The data is compressed on disk, 
and there are other overheads.



On Mon, Jun 24, 2013 at 8:38 PM, Mohammed Guller <moham...@glassbeam.com> wrote:
No deletes. In my test, I am just writing and reading data.

There is a lot of GC, but only on the young generation. Cassandra terminates 
before the GC for the old generation kicks in.

I know that our queries are reading an unusual amount of data. However, I 
expected it to throw a timeout exception instead of crashing. Also, I don't 
understand why a 1.8 GB heap is getting full when the total data stored in the 
entire Cassandra cluster is less than 55 MB.

Mohammed

On Jun 21, 2013, at 7:30 PM, "sankalp kohli" <kohlisank...@gmail.com> wrote:

Looks like you are putting a lot of pressure on the heap by doing a slice query 
on a large row.
Do you have a lot of deletes/tombstones on the rows? That might be causing a 
problem.
Also, why are you returning so many columns at once? You can use the 
auto-paginate feature in Astyanax.
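
With auto-pagination the client fetches a bounded page of columns at a time 
instead of the whole slice. The same idea expressed in CQL3 terms looks roughly 
like this (the table, key and column names are hypothetical):

   SELECT col1, value FROM my_keyspace.wide_rows WHERE key = 'some_key' LIMIT 500;
   -- then repeat, restarting just after the last column name returned:
   SELECT col1, value FROM my_keyspace.wide_rows
    WHERE key = 'some_key' AND col1 > <last_col1_seen> LIMIT 500;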

Also, do you see a lot of GC happening?


On Fri, Jun 21, 2013 at 1:13 PM, Jabbar Azam <aja...@gmail.com> wrote:
Hello Mohammed,

You should increase the heap space. You should also tune the garbage collection 
so young generation objects are collected faster, relieving pressure on the heap. 
We have been using JDK 7 with the G1 collector; it does a better job than my 
attempts at tuning the JDK 6 collectors.

Bear in mind though that the OS will need memory, and so will the row cache and 
the file system cache, although memory usage will depend on the workload of your 
system.
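
If you do increase the heap, it is the usual pair of settings in 
conf/cassandra-env.sh. The values below are only an illustration for an 8 GB 
machine, not a recommendation; leaving them commented out keeps the 
auto-calculated defaults.

   MAX_HEAP_SIZE="4G"
   HEAP_NEWSIZE="400M"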

I'm sure you'll also get good advice from other members of the mailing list.

Thanks

Jabbar Azam


On 21 June 2013 18:49, Mohammed Guller <moham...@glassbeam.com> wrote:
We have a 3-node cassandra cluster on AWS. These nodes are running cassandra 
1.2.2 and have 8GB memory. We didn't change any of the default heap or GC 
settings. So each node is allocating 1.8GB of heap space. The rows are wide; 
each row stores around 260,000 columns. We are reading the data using Astyanax. 
If our application tries to read 80,000 columns each from 10 or more rows at 
the same time, some of the nodes run out of heap space and terminate with OOM 
error. Here is the error message:

java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:50)
at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
at org.apache.cassandra.db.marshal.AbstractCompositeType.split(AbstractCompositeType.java:126)
at org.apache.cassandra.db.filter.ColumnCounter$GroupByPrefix.count(ColumnCounter.java:96)
at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:164)
at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
at org

Re: Errors while upgrading from 1.1.10 version to 1.2.4 version

2013-06-30 Thread Ananth Gundabattula
Thanks for the pointer Fabien.


From: Fabien Rousseau <fab...@yakaz.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Friday, June 28, 2013 6:35 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Errors while upgrading from 1.1.10 version to 1.2.4 version

Hello,

Have a look at : https://issues.apache.org/jira/browse/CASSANDRA-5476


2013/6/28 Ananth Gundabattula <agundabatt...@threatmetrix.com>
Hello Everybody,

We were performing an upgrade of our cluster from version 1.1.10 to 1.2.4. We 
tested the upgrade process in a QA environment and found no issues. However, on 
the production node we faced loads of errors and had to abort the upgrade 
process.

I was wondering how we ran into such a situation. The main difference between 
the QA and production environments is the replication factor: in QA, RF=1 and in 
production RF=3.

Example stack traces, as seen on the other nodes, are here: 
http://pastebin.com/fSnMAd8q

The other observation is that the node being upgraded is a seed node in the 
1.1.10 cluster. We aborted right after the first node gave the above issues. 
Does this mean application downtime will be required if we go for a rolling 
upgrade of a live cluster from version 1.1.10 to 1.2.4?

Regards,
Ananth







--
Fabien Rousseau

www.yakaz.com


Re: CompactionExecutor holds 8000+ SSTableReader 6G+ memory

2013-06-30 Thread sulong
These two fields:
CompressedRandomAccessReader.buffer
CompressedRandomAccessReader.compressed

in the SSTableReader.dfile.pool queue consumed that memory. I think
SSTableReader.dfile is the cache for the SSTable file.
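
(For anyone following along: the larger LCS SSTable size Aaron suggests further 
down in the thread can be set with something like the following; the keyspace 
and table names are placeholders, and existing SSTables are only rewritten as 
compaction proceeds.)

   ALTER TABLE my_keyspace.my_cf
     WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 32};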


On Sat, Jun 29, 2013 at 1:09 PM, aaron morton wrote:

> Lots of memory is consumed by the SSTableReader's cache
>
>   The file cache is managed by the OS.
> However the SSTableReader will have bloom filters and compression metadata,
> both off heap in 1.2. The Key and Row caches are global, so they are not
> associated with any one SSTable.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 28/06/2013, at 6:23 PM, sulong  wrote:
>
> Total 100G data per node.
>
>
> On Fri, Jun 28, 2013 at 2:14 PM, sulong  wrote:
>
>> aaron, thanks for your reply. Yes, I do use the Leveled compaction
>> strategy, and the SSTable size is 10M. If it happens again, I will try to
>> enlarge the SSTable size.
>>
>> I just wonder why cassandra doesn't limit the SSTableReader's total
>> memory usage when compacting. Lots of memory is consumed by the
>> SSTableReader's cache. Why not clear these caches first at the beginning of
>> compaction?
>>
>>
>> On Fri, Jun 28, 2013 at 1:14 PM, aaron morton wrote:
>>
>>> Are you running the Levelled compaction strategy?
>>> If so, what is the max SSTable size and what is the total data per node?
>>>
>>> If you are, try using a larger SSTable size, like 32MB.
>>>
>>> Cheers
>>>
>>>-
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 27/06/2013, at 2:02 PM, sulong  wrote:
>>>
>>> According to the OpsCenter records, yes, compaction was running
>>> then, at 8.5 MB/s.
>>>
>>>
>>> On Thu, Jun 27, 2013 at 9:54 AM, sulong  wrote:
>>>
 version: 1.2.2
 cluster read requests 800/s, write requests 22/s
 Sorry, I don't know whether the compaction was running then.


 On Thu, Jun 27, 2013 at 1:02 AM, Robert Coli wrote:

> On Tue, Jun 25, 2013 at 10:13 PM, sulong  wrote:
> > I have a 4-node cassandra cluster. Every node has 32G memory, and the
> > cassandra JVM uses 8G. The cluster is suffering from GC. Looks like the
> > CompactionExecutor thread holds too many SSTableReaders. See the
> attachment.
>
> What version of Cassandra?
> What workload?
> Is compaction actually running?
>
> =Rob
>


>>>
>>>
>>
>
>