Re: Cassandra 1.x and proper JNA setup

2011-11-03 Thread Maciej Miklas
According to the source code, JNA is used to call malloc and free. In
this case each cached row is serialized into native (off-heap) RAM.
We must be really careful when defining the cache size - too large a size would
cause out of memory. Previous Cassandra releases had logic that would
decrease the cache size if the heap was low.
Currently each row is serialized without any memory limit checks -
assuming that I understood it right.

Those properties:
   reduce_cache_sizes_at: 0.85
   reduce_cache_capacity_to: 0.6
are not used anymore - at least not when JNA is enabled, which is the default
from Cassandra 1.0 on.
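
To make the off-heap path concrete, here is a rough sketch (illustrative only,
not the actual Cassandra cache code) of what serializing a cached value into
native memory via JNA amounts to. com.sun.jna.Memory is a malloc-backed buffer
outside the JVM heap, which is why heap settings do not bound it:

import com.sun.jna.Memory;
import java.nio.charset.Charset;

// Hedged sketch: store a serialized row off-heap and read it back.
// The real SerializingCache/FreeableMemory classes are more involved.
public class OffHeapSketch {
    public static void main(String[] args) {
        byte[] serializedRow = "row-contents".getBytes(Charset.forName("UTF-8"));

        // malloc() happens here, outside the Java heap; -Xmx does not limit it.
        Memory buffer = new Memory(serializedRow.length);
        buffer.write(0, serializedRow, 0, serializedRow.length);

        // Reading it back copies the bytes onto the heap again.
        byte[] copy = buffer.getByteArray(0, serializedRow.length);
        System.out.println(new String(copy, Charset.forName("UTF-8")));

        // free() is triggered when the Memory object is garbage collected;
        // nothing but the OS limits how much native memory gets allocated.
    }
}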


On Wed, Nov 2, 2011 at 1:53 PM, Maciej Miklas wrote:

> I've just found that JNA will not be used from the 1.1 release -
> https://issues.apache.org/jira/browse/CASSANDRA-3271
> It would also be nice to know the reason for this decision.
>
> Regards,
> Maciej
>
>
> On Wed, Nov 2, 2011 at 1:34 PM, Viktor Jevdokimov <
> viktor.jevdoki...@adform.com> wrote:
>
>> Up, also interested in answers to questions below.
>>
>>
>> Best regards/ Pagarbiai
>>
>> Viktor Jevdokimov
>> Senior Developer
>>
>> Email: viktor.jevdoki...@adform.com
>> Phone: +370 5 212 3063
>> Fax: +370 5 261 0453
>>
>> J. Jasinskio 16C,
>> LT-01112 Vilnius,
>> Lithuania
>>
>>
>>
>> Disclaimer: The information contained in this message and attachments is
>> intended solely for the attention and use of the named addressee and may be
>> confidential. If you are not the intended recipient, you are reminded that
>> the information remains the property of the sender. You must not use,
>> disclose, distribute, copy, print or rely on this e-mail. If you have
>> received this message in error, please contact the sender immediately and
>> irrevocably delete this message and any copies.-Original Message-
>> From: Maciej Miklas [mailto:mac.mik...@googlemail.com]
>> Sent: Tuesday, November 01, 2011 11:15
>> To: user@cassandra.apache.org
>> Subject: Cassandra 1.x and proper JNA setup
>>
>> Hi all,
>>
>> is there any documentation about proper JNA configuration?
>>
>> I do not understand a few things:
>>
>> 1) Does JNA use JVM heap settings?
>>
>> 2) Do I need to decrease max heap size while using JNA?
>>
>> 3) How do I limit RAM allocated by JNA?
>>
>> 4) Where can I see / monitor row cache size?
>>
>> 5) I've configured JNA just for a test on my dev computer and so far I've
>> noticed serious performance issues (high CPU usage under heavy write load), so
>> I must be doing something wrong. I've just copied the JNA jars into
>> Cassandra/lib without installing any native libs. This should not work at
>> all, right?
>>
>> Thanks,
>> Maciej
>>
>>
>


Re: Second Cassandra users survey

2011-11-03 Thread Peter Tillotson
I'm using Cassandra as a big graph database, loading large volumes of data live 
and linking on the fly. 
The number of edges grows geometrically with the data added, and the edges need 
to be read to continue linking the graph on the fly. 


Consequently, my problem is constrained by:
 * Predominantly reads - especially when the data gets large and reads are quasi 
random
 * I have lots of data to plow in, to be read
 * Although the problem could scale out and possibly fit entirely in RAM, it 
requires too much kit for that to be viable 

So, my findings with Cassandra are:
 * Compaction is expensive; I need it, but
   1) it takes away disk IO from my reads
   2) it destroys the file cache
   (I've not had a chance to do extensive tests with the leveled, LevelDB-style compaction yet)
 * Compaction has been too hard to configure historically
 * Memory hungry

So for me the biggest features would be:
 * Cheaper compaction
 * Lower memory usage
 * Indexing dynamic colnames (e.g. a Lucene TermEnum-style index against rowkey:colkey)
   I do a lot of checking against dynamic colnames
 
The great features are that redundancy and live addition of shards are 
available out of the box. 


I've also experimented with GoldenOrb and triggered updates; I think there is 
a fair bit that can be achieved in my problem with local data access. Through 
GoldenOrb and Hadoop Writables I managed to get both a BigTable and a Pregel 
access model onto my Cassandra data. It was schema specific, but provided a 
local compute model. 

p 



From: Jonathan Ellis 
To: user 
Sent: Tuesday, 1 November 2011, 22:59
Subject: Second Cassandra users survey

Hi all,

Two years ago I asked for Cassandra use cases and feature requests.
[1]  The results [2] have been extremely useful in setting and
prioritizing goals for Cassandra development.  But with the release of
1.0 we've accomplished basically everything from our original wish
list. [3]

I'd love to hear from modern Cassandra users again, especially if
you're usually a quiet lurker.  What does Cassandra do well?  What are
your pain points?  What's your feature wish list?

As before, if you're in stealth mode or don't want to say anything in
public, feel free to reply to me privately and I will keep it off the
record.

[1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
[2] 
http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
[3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Second Cassandra users survey

2011-11-03 Thread Radim Kolar

 * Compaction is expensive
Yes, it is. That's why I decided not to go with Hadoop HDFS backed by 
Cassandra.


Re: Second Cassandra users survey

2011-11-03 Thread Mohit Anchlia
On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson  wrote:
> I'm using Cassandra as a big graph database, loading large volumes of data
> live and linking on the fly.

Not sure if Cassandra is the right fit to model complex vertices and edges.

> The number of edges grow geometrically with data added, and need to be read
> to continue linking the graph on the fly.
>
> Consequently, my problem is constrained by:
>  * Predominantly read - especially when data gets large and reads are quasi
> random
>  * I have lots of data to plow in, to be read
>  * Although the problem scale out and possibly all be in RAM, it requires
> too much kit for the to be viable
> So, my findings with Cassandra are:
>  * Compaction is expensive, I need it but
>    1) It takes away disk IO from my reads
>    2) Destroys the file cache
>    I've not had chance to do extensive tests with the Level db compaction
>  * Compaction has been too hard to configure historically
>  * Memory hungry
> So for me the biggest features would be
>  * Cheaper compaction -
>  * Lower memory usage
>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>    I do a lot of checking against dynamic colnames

I agree, some kind of integration with a search engine is required to
support ad hoc queries as well as searching on column names. This would
be really helpful.

Currently, one of the options is to write to 2 places: Cassandra + a
search engine.
>
> The great features are that redundancy, and live addition of shards is
> available out of the box.
>
> I've also experimented with Golden Orb and Triggered updates, I think there
> is a fair bit that can be achieved in my problem with local data access.
> Through GoldenOrb and Hadoop writables a managed to get both a BigTable and
> Pregel access model onto my Cassandra data. It was schema specific, but
> provided a local compute model.
> p
> 
> From: Jonathan Ellis 
> To: user 
> Sent: Tuesday, 1 November 2011, 22:59
> Subject: Second Cassandra users survey
>
> Hi all,
>
> Two years ago I asked for Cassandra use cases and feature requests.
> [1]  The results [2] have been extremely useful in setting and
> prioritizing goals for Cassandra development.  But with the release of
> 1.0 we've accomplished basically everything from our original wish
> list. [3]
>
> I'd love to hear from modern Cassandra users again, especially if
> you're usually a quiet lurker.  What does Cassandra do well?  What are
> your pain points?  What's your feature wish list?
>
> As before, if you're in stealth mode or don't want to say anything in
> public, feel free to reply to me privately and I will keep it off the
> record.
>
> [1]
> http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
> [2]
> http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
>
>


Re: Second Cassandra users survey

2011-11-03 Thread Peter Tillotson
>>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>>    I do a lot of checking against dynamic colnames
>
>I agree, some kind of integration with search engine is required to
>support adhoc queries as well and searching on column names. This will
>be really helpful.
>
>Currently, one of the options is to write in 2 places. Cassandra +
>search engine.
>

I was thinking of a disk-backed skip list, with every nth rowkey:colkey dragged into 
memory per SSTable, as per Lucene's TermEnum.  
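
For what it's worth, a toy sketch of that idea (purely illustrative, names are
mine): keep every nth key of a sorted file in an in-memory map pointing at a
file offset, then seek and scan forward, much like Lucene's term index:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Map;
import java.util.TreeMap;

// Hedged sketch of a sparse index over a sorted file of "key<TAB>value" lines:
// only every nth key is held in memory, the rest is found by a short scan.
public class SparseIndexSketch {
    private final TreeMap<String, Long> sampledKeys = new TreeMap<String, Long>();
    private final RandomAccessFile data;

    public SparseIndexSketch(RandomAccessFile sortedData, int samplingInterval) throws IOException {
        this.data = sortedData;
        long offset = data.getFilePointer();
        String line;
        for (int i = 0; (line = data.readLine()) != null; i++) {
            if (i % samplingInterval == 0) {
                sampledKeys.put(line.split("\t", 2)[0], offset); // remember where this key starts
            }
            offset = data.getFilePointer();
        }
    }

    // Returns the value for key, or null; at most ~one sampling interval is scanned.
    public String get(String key) throws IOException {
        Map.Entry<String, Long> floor = sampledKeys.floorEntry(key);
        data.seek(floor == null ? 0L : floor.getValue());
        String line;
        while ((line = data.readLine()) != null) {
            String[] parts = line.split("\t", 2);
            int cmp = parts[0].compareTo(key);
            if (cmp == 0) return parts[1];
            if (cmp > 0) return null; // passed where the key would sit: not present
        }
        return null;
    }
}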



From: Mohit Anchlia 
To: user@cassandra.apache.org; Peter Tillotson 
Sent: Thursday, 3 November 2011, 14:15
Subject: Re: Second Cassandra users survey

On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson  wrote:
> I'm using Cassandra as a big graph database, loading large volumes of data
> live and linking on the fly.

Not sure if Cassandra is right fit to model complex vertexes and edges.

> The number of edges grow geometrically with data added, and need to be read
> to continue linking the graph on the fly.
>
> Consequently, my problem is constrained by:
>  * Predominantly read - especially when data gets large and reads are quasi
> random
>  * I have lots of data to plow in, to be read
>  * Although the problem scale out and possibly all be in RAM, it requires
> too much kit for the to be viable
> So, my findings with Cassandra are:
>  * Compaction is expensive, I need it but
>    1) It takes away disk IO from my reads
>    2) Destroys the file cache
>    I've not had chance to do extensive tests with the Level db compaction
>  * Compaction has been too hard to configure historically
>  * Memory hungry
> So for me the biggest features would be
>  * Cheaper compaction -
>  * Lower memory usage
>  * Indexing dynamic colnames (eg Lucene TermEnum against rowkey:colkey)
>    I do a lot of checking against dynamic colnames

I agree, some kind of integration with search engine is required to
support adhoc queries as well and searching on column names. This will
be really helpful.

Currently, one of the options is to write in 2 places. Cassandra +
search engine.
>
> The great features are that redundancy, and live addition of shards is
> available out of the box.
>
> I've also experimented with Golden Orb and Triggered updates, I think there
> is a fair bit that can be achieved in my problem with local data access.
> Through GoldenOrb and Hadoop writables a managed to get both a BigTable and
> Pregel access model onto my Cassandra data. It was schema specific, but
> provided a local compute model.
> p
> 
> From: Jonathan Ellis 
> To: user 
> Sent: Tuesday, 1 November 2011, 22:59
> Subject: Second Cassandra users survey
>
> Hi all,
>
> Two years ago I asked for Cassandra use cases and feature requests.
> [1]  The results [2] have been extremely useful in setting and
> prioritizing goals for Cassandra development.  But with the release of
> 1.0 we've accomplished basically everything from our original wish
> list. [3]
>
> I'd love to hear from modern Cassandra users again, especially if
> you're usually a quiet lurker.  What does Cassandra do well?  What are
> your pain points?  What's your feature wish list?
>
> As before, if you're in stealth mode or don't want to say anything in
> public, feel free to reply to me privately and I will keep it off the
> record.
>
> [1]
> http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
> [2]
> http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
>
>

Re: Cassandra 1.x and proper JNA setup

2011-11-03 Thread Jonathan Ellis
Relying on that was always a terrible idea because you could easily
OOM before it could help.  There's no substitute for "don't make the
caches too large" in the first place.

We're working on https://issues.apache.org/jira/browse/CASSANDRA-3143
to make cache sizing easier.

On Thu, Nov 3, 2011 at 3:16 AM, Maciej Miklas  wrote:
> According to source code, JNA is being used to call malloc and free. In this
> case each cached row will be serialized into RAM.
> We must be really careful when defining cache size - to large size would
> cause out of memory. Previous Cassandra releases has logic that would
> decrease cache size if heap is low.
> Currently each row will be serialized without any memory limit checks -
> assuming that I understood it right.
>
> Those properties:
>    reduce_cache_sizes_at: 0.85
>    reduce_cache_capacity_to: 0.6
> are not used anymore - at least not when JNA is enabled, witch is default
> from Cassandra 1.0
>
>
> On Wed, Nov 2, 2011 at 1:53 PM, Maciej Miklas 
> wrote:
>>
>> I've just found, that JNA will be not used from 1.1 release -
>> https://issues.apache.org/jira/browse/CASSANDRA-3271
>> I would be also nice to know what was the reason for this decision.
>>
>> Regards,
>> Maciej
>>
>> On Wed, Nov 2, 2011 at 1:34 PM, Viktor Jevdokimov
>>  wrote:
>>>
>>> Up, also interested in answers to questions below.
>>>
>>>
>>> Best regards/ Pagarbiai
>>>
>>> Viktor Jevdokimov
>>> Senior Developer
>>>
>>> Email: viktor.jevdoki...@adform.com
>>> Phone: +370 5 212 3063
>>> Fax: +370 5 261 0453
>>>
>>> J. Jasinskio 16C,
>>> LT-01112 Vilnius,
>>> Lithuania
>>>
>>>
>>>
>>> Disclaimer: The information contained in this message and attachments is
>>> intended solely for the attention and use of the named addressee and may be
>>> confidential. If you are not the intended recipient, you are reminded that
>>> the information remains the property of the sender. You must not use,
>>> disclose, distribute, copy, print or rely on this e-mail. If you have
>>> received this message in error, please contact the sender immediately and
>>> irrevocably delete this message and any copies.-Original Message-
>>> From: Maciej Miklas [mailto:mac.mik...@googlemail.com]
>>> Sent: Tuesday, November 01, 2011 11:15
>>> To: user@cassandra.apache.org
>>> Subject: Cassandra 1.x and proper JNA setup
>>>
>>> Hi all,
>>>
>>> is there any documentation about proper JNA configuration?
>>>
>>> I do not understand few things:
>>>
>>> 1) Does JNA use JVM heap settings?
>>>
>>> 2) Do I need to decrease max heap size while using JNA?
>>>
>>> 3) How do I limit RAM allocated by JNA?
>>>
>>> 4) Where can I see / monitor row cache size?
>>>
>>> 5) I've configured JNA just for test on my dev computer and so far I've
>>> noticed serious performance issues (high cpu usage on heavy write load), so
>>> I must be doing something wrong I've just copied JNA jars into
>>> Cassandra/lib, without installing any native libs. This should not work at
>>> all, right?
>>>
>>> Thanks,
>>> Maciej
>>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Second Cassandra users survey

2011-11-03 Thread Ertio Lew
Provide an option to sort columns by timestamp, i.e. in the order they have
been added to the row, with the facility to use any column names.

On Wed, Nov 2, 2011 at 4:29 AM, Jonathan Ellis  wrote:

> Hi all,
>
> Two years ago I asked for Cassandra use cases and feature requests.
> [1]  The results [2] have been extremely useful in setting and
> prioritizing goals for Cassandra development.  But with the release of
> 1.0 we've accomplished basically everything from our original wish
> list. [3]
>
> I'd love to hear from modern Cassandra users again, especially if
> you're usually a quiet lurker.  What does Cassandra do well?  What are
> your pain points?  What's your feature wish list?
>
> As before, if you're in stealth mode or don't want to say anything in
> public, feel free to reply to me privately and I will keep it off the
> record.
>
> [1]
> http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
> [2]
> http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>


Re: data model for unique users in a time period

2011-11-03 Thread David Jeske
On Wed, Nov 2, 2011 at 7:26 PM, David Jeske  wrote:

> - make sure the summarizer does try to do it's job for a batch of counters
> until they are fully replicated and 'static' (no new increments will appear)
>
Apologies - that should read: make sure the summarizer ( doesn't ) try to do its job...


Re: Second Cassandra users survey

2011-11-03 Thread Konstantin Naryshkin
I realize that it is not realistic to expect it, but it would be good
to have a Partitioner that supports both range slices and automatic
load balancing.

On Thu, Nov 3, 2011 at 13:57, Ertio Lew  wrote:
> Provide an option to sort columns by timestamp i.e, in the order they have
> been added to the row, with the facility to use any column names.
>
> On Wed, Nov 2, 2011 at 4:29 AM, Jonathan Ellis  wrote:
>>
>> Hi all,
>>
>> Two years ago I asked for Cassandra use cases and feature requests.
>> [1]  The results [2] have been extremely useful in setting and
>> prioritizing goals for Cassandra development.  But with the release of
>> 1.0 we've accomplished basically everything from our original wish
>> list. [3]
>>
>> I'd love to hear from modern Cassandra users again, especially if
>> you're usually a quiet lurker.  What does Cassandra do well?  What are
>> your pain points?  What's your feature wish list?
>>
>> As before, if you're in stealth mode or don't want to say anything in
>> public, feel free to reply to me privately and I will keep it off the
>> record.
>>
>> [1]
>> http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
>> [2]
>> http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
>> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
>


Problem after upgrade to 1.0.1

2011-11-03 Thread Bryce Godfrey
I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go just fine 
with the rolling upgrade.  But now I'm having extreme load growth on one of my 
nodes (and others are growing faster than usual also).  I attempted to run a 
cfstats against the extremely large node that was seeing 2x the load of others 
and I get this error below.  I also went into the 
o.a.c.db.HintedHandoffManager mbean and attempted to list pending hints to see 
if it was growing out of control for some reason, but that just times out 
eventually for any node.  I'm not sure what to do next with this issue.

   Column Family: HintsColumnFamily
SSTable count: 3
Space used (live): 12681676437
Space used (total): 10233130272
Number of Keys (estimate): 384
Memtable Columns Count: 117704
Memtable Data Size: 115107307
Memtable Switch Count: 66
Read Count: 0
Read Latency: NaN ms.
Write Count: 21203290
Write Latency: 0.014 ms.
Pending Tasks: 0
Key cache capacity: 3
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 30130993
Compacted row maximum size: 9223372036854775807
Exception in thread "main" java.lang.IllegalStateException: Unable to compute 
ceiling for max when histogram overflowed
at 
org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
at 
org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
at 
org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:293)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at 
com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
at 
com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at 
javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Bryce Godfrey | Sr. Software Engineer | Azaleos 
Corporation | T: 206.926.1978 | M: 206.849.2477



Re: Problem after upgrade to 1.0.1

2011-11-03 Thread Jonathan Ellis
Just to rule it out: you didn't do anything tricky like update
HintsColumnFamily to use compression?

On Thu, Nov 3, 2011 at 1:39 PM, Bryce Godfrey  wrote:
> I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go just
> fine with the rolling upgrade.  But now I’m having extreme load growth on
> one of my nodes (and others are growing faster than usual also).  I
> attempted to run a cfstats against the extremely large node that was seeing
> 2x the load of others and I get this error below.  I’m also went into the
> o.a.c.db.HintedHandoffManager mbean and attempted to list pending hints to
> see if it was growing out of control for some reason, but that just times
> out eventually for any node.  I’m not sure what to do next with this issue.
>
>
>
>    Column Family: HintsColumnFamily
>
>     SSTable count: 3
>
>     Space used (live): 12681676437
>
>     Space used (total): 10233130272
>
>     Number of Keys (estimate): 384
>
>     Memtable Columns Count: 117704
>
>     Memtable Data Size: 115107307
>
>     Memtable Switch Count: 66
>
>     Read Count: 0
>
>     Read Latency: NaN ms.
>
>     Write Count: 21203290
>
>     Write Latency: 0.014 ms.
>
>     Pending Tasks: 0
>
>     Key cache capacity: 3
>
>     Key cache size: 0
>
>     Key cache hit rate: NaN
>
>     Row cache: disabled
>
>     Compacted row minimum size: 30130993
>
>     Compacted row maximum size: 9223372036854775807
>
> Exception in thread "main" java.lang.IllegalStateException: Unable to
> compute ceiling for max when histogram overflowed
>
>     at
> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
>
>     at
> org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
>
>     at
> org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:293)
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>
>     at
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>
>     at
> com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
>
>     at
> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
>
>     at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
>
>     at
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
>
>     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>
>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>
>     at java.security.AccessController.doPrivileged(Native Method)
>
>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>
>     at
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>
>     at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>
>     at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>
>     at java.lang.Thread.run(Thread.java:662)
>
>
>
> Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T:
> 206.926.1978 | M: 206.849.2477
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


RE: Problem after upgrade to 1.0.1

2011-11-03 Thread Dan Hendry
Regarding load growth, presumably you are referring to the load as reported
by JMX/nodetool. Have you actually looked at the disk utilization on the
nodes themselves? Potential issue I have seen:
http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html

 

Dan

 

From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com] 
Sent: November-03-11 14:40
To: user@cassandra.apache.org
Subject: Problem after upgrade to 1.0.1

 

I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go just
fine with the rolling upgrade.  But now I'm having extreme load growth on
one of my nodes (and others are growing faster than usual also).  I
attempted to run a cfstats against the extremely large node that was seeing
2x the load of others and I get this error below.  I'm also went into the
o.a.c.db.HintedHandoffManager mbean and attempted to list pending hints to
see if it was growing out of control for some reason, but that just times
out eventually for any node.  I'm not sure what to do next with this issue.

 

   Column Family: HintsColumnFamily

SSTable count: 3

Space used (live): 12681676437

Space used (total): 10233130272

Number of Keys (estimate): 384

Memtable Columns Count: 117704

Memtable Data Size: 115107307

Memtable Switch Count: 66

Read Count: 0

Read Latency: NaN ms.

Write Count: 21203290

Write Latency: 0.014 ms.

Pending Tasks: 0

Key cache capacity: 3

Key cache size: 0

Key cache hit rate: NaN

Row cache: disabled

Compacted row minimum size: 30130993

Compacted row maximum size: 9223372036854775807

Exception in thread "main" java.lang.IllegalStateException: Unable to
compute ceiling for max when histogram overflowed

at
org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:1
70)

at
org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)

at
org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.j
ava:293)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39
)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl
.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntr
ospector.java:93)

at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntr
ospector.java:27)

at
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208
)

at
com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)

at
com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)

at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMB
eanServerInterceptor.java:666)

at
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)

at
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.
java:1404)

at
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.j
ava:72)

at
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMICon
nectionImpl.java:1265)

at
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConne
ctionImpl.java:1360)

at
javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl
.java:600)

at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl
.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at
sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)

at sun.rmi.transport.Transport$1.run(Transport.java:159)

at java.security.AccessController.doPrivileged(Native Method)

at sun.rmi.transport.Transport.serviceCall(Transport.java:155)

at
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)

at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:
790)

at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:6
49)

at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.ja
va:886)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:9
08)

at java.lang.Thread.run(Thread.java:662)

 

Bryce Godfrey | Sr. Software Engineer |   Azaleos
Corporation | T: 206.926.1978 | M: 206.849.2477

 


RE: Problem after upgrade to 1.0.1

2011-11-03 Thread Bryce Godfrey
Nope.  I did alter two of my own column families to use leveled compaction and 
then ran scrub on each node, which is the only change I have made since the upgrade.

Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T: 206.926.1978 | 
M: 206.849.2477

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Thursday, November 03, 2011 11:44 AM
To: user@cassandra.apache.org
Subject: Re: Problem after upgrade to 1.0.1

Just to rule it out: you didn't do anything tricky like update 
HintsColumnFamily to use compression?

On Thu, Nov 3, 2011 at 1:39 PM, Bryce Godfrey  wrote:
> I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go 
> just fine with the rolling upgrade.  But now I'm having extreme load 
> growth on one of my nodes (and others are growing faster than usual 
> also).  I attempted to run a cfstats against the extremely large node 
> that was seeing 2x the load of others and I get this error below.  I'm 
> also went into the o.a.c.db.HintedHandoffManager mbean and attempted 
> to list pending hints to see if it was growing out of control for some 
> reason, but that just times out eventually for any node.  I'm not sure what 
> to do next with this issue.
>
>
>
>    Column Family: HintsColumnFamily
>
>     SSTable count: 3
>
>     Space used (live): 12681676437
>
>     Space used (total): 10233130272
>
>     Number of Keys (estimate): 384
>
>     Memtable Columns Count: 117704
>
>     Memtable Data Size: 115107307
>
>     Memtable Switch Count: 66
>
>     Read Count: 0
>
>     Read Latency: NaN ms.
>
>     Write Count: 21203290
>
>     Write Latency: 0.014 ms.
>
>     Pending Tasks: 0
>
>     Key cache capacity: 3
>
>     Key cache size: 0
>
>     Key cache hit rate: NaN
>
>     Row cache: disabled
>
>     Compacted row minimum size: 30130993
>
>     Compacted row maximum size: 9223372036854775807
>
> Exception in thread "main" java.lang.IllegalStateException: Unable to 
> compute ceiling for max when histogram overflowed
>
>     at
> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.
> java:170)
>
>     at
> org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:39
> 5)
>
>     at
> org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyS
> tore.java:293)
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> ava:39)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBe
> anIntrospector.java:93)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBe
> anIntrospector.java:27)
>
>     at
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.ja
> va:208)
>
>     at
> com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65
> )
>
>     at
> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:21
> 6)
>
>     at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Def
> aultMBeanServerInterceptor.java:666)
>
>     at
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.jav
> a:638)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectio
> nImpl.java:1404)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnection
> Impl.java:72)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(
> RMIConnectionImpl.java:1265)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RM
> IConnectionImpl.java:1360)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnecti
> onImpl.java:600)
>
>     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown 
> Source)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>
>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>
>     at java.security.AccessController.doPrivileged(Native Method)
>
>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>
>     at
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:53
> 5)
>
>     at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport
> .java:790)
>
>     at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.
>

RE: Problem after upgrade to 1.0.1

2011-11-03 Thread Bryce Godfrey
Disk utilization is actually about 80% higher than what is reported by 
nodetool ring across all my nodes on the data drive.

Bryce Godfrey | Sr. Software Engineer | Azaleos 
Corporation | T: 206.926.1978 | M: 206.849.2477

From: Dan Hendry [mailto:dan.hendry.j...@gmail.com]
Sent: Thursday, November 03, 2011 11:47 AM
To: user@cassandra.apache.org
Subject: RE: Problem after upgrade to 1.0.1

Regarding load growth, presumably you are referring to the load as reported by 
JMX/nodetool. Have you actually looked at the disk utilization on the nodes 
themselves? Potential issue I have seen: 
http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html

Dan

From: Bryce Godfrey 
[mailto:bryce.godf...@azaleos.com]
Sent: November-03-11 14:40
To: user@cassandra.apache.org
Subject: Problem after upgrade to 1.0.1

I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go just fine 
with the rolling upgrade.  But now I'm having extreme load growth on one of my 
nodes (and others are growing faster than usual also).  I attempted to run a 
cfstats against the extremely large node that was seeing 2x the load of others 
and I get this error below.  I'm also went into the 
o.a.c.db.HintedHandoffManager mbean and attempted to list pending hints to see 
if it was growing out of control for some reason, but that just times out 
eventually for any node.  I'm not sure what to do next with this issue.

   Column Family: HintsColumnFamily
SSTable count: 3
Space used (live): 12681676437
Space used (total): 10233130272
Number of Keys (estimate): 384
Memtable Columns Count: 117704
Memtable Data Size: 115107307
Memtable Switch Count: 66
Read Count: 0
Read Latency: NaN ms.
Write Count: 21203290
Write Latency: 0.014 ms.
Pending Tasks: 0
Key cache capacity: 3
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 30130993
Compacted row maximum size: 9223372036854775807
Exception in thread "main" java.lang.IllegalStateException: Unable to compute 
ceiling for max when histogram overflowed
at 
org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
at 
org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
at 
org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:293)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at 
com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
at 
com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at 
javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTra

Re: Problem after upgrade to 1.0.1

2011-11-03 Thread Jonathan Ellis
Does restarting the node fix this?

On Thu, Nov 3, 2011 at 1:51 PM, Bryce Godfrey  wrote:
> Disk utilization is actually about 80% higher than what is reported for
> nodetool ring across all my nodes on the data drive
>
>
>
> Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T:
> 206.926.1978 | M: 206.849.2477
>
>
>
> From: Dan Hendry [mailto:dan.hendry.j...@gmail.com]
> Sent: Thursday, November 03, 2011 11:47 AM
> To: user@cassandra.apache.org
> Subject: RE: Problem after upgrade to 1.0.1
>
>
>
> Regarding load growth, presumably you are referring to the load as reported
> by JMX/nodetool. Have you actually looked at the disk utilization on the
> nodes themselves? Potential issue I have seen:
> http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html
>
>
>
> Dan
>
>
>
> From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
> Sent: November-03-11 14:40
> To: user@cassandra.apache.org
> Subject: Problem after upgrade to 1.0.1
>
>
>
> I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go just
> fine with the rolling upgrade.  But now I’m having extreme load growth on
> one of my nodes (and others are growing faster than usual also).  I
> attempted to run a cfstats against the extremely large node that was seeing
> 2x the load of others and I get this error below.  I’m also went into the
> o.a.c.db.HintedHandoffManager mbean and attempted to list pending hints to
> see if it was growing out of control for some reason, but that just times
> out eventually for any node.  I’m not sure what to do next with this issue.
>
>
>
>    Column Family: HintsColumnFamily
>
>     SSTable count: 3
>
>     Space used (live): 12681676437
>
>     Space used (total): 10233130272
>
>     Number of Keys (estimate): 384
>
>     Memtable Columns Count: 117704
>
>     Memtable Data Size: 115107307
>
>     Memtable Switch Count: 66
>
>     Read Count: 0
>
>     Read Latency: NaN ms.
>
>     Write Count: 21203290
>
>     Write Latency: 0.014 ms.
>
>     Pending Tasks: 0
>
>     Key cache capacity: 3
>
>     Key cache size: 0
>
>     Key cache hit rate: NaN
>
>     Row cache: disabled
>
>     Compacted row minimum size: 30130993
>
>     Compacted row maximum size: 9223372036854775807
>
> Exception in thread "main" java.lang.IllegalStateException: Unable to
> compute ceiling for max when histogram overflowed
>
>     at
> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
>
>     at
> org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
>
>     at
> org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:293)
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>
>     at
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>
>     at
> com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
>
>     at
> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
>
>     at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
>
>     at
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
>
>     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>
>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>
>     at java.security.AccessController.doPrivileged(Native Method)
>
>     at sun.rmi.transport.Transport.serviceCall(Transport.

Retrieving columns by names vs by range, which is more performant?

2011-11-03 Thread Ertio Lew
Retrieving columns by names vs by range: which is more performant, when you
have the option to do both?
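
For reference, a rough Thrift-API sketch of the two forms (illustrative; it
assumes a connected Cassandra.Client and a column family named "cf"):

import java.nio.ByteBuffer;
import java.util.Arrays;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

// Hedged sketch: the same row read either by explicit column names or by a range.
public class SliceExamples {
    static ByteBuffer bytes(String s) { return ByteBuffer.wrap(s.getBytes()); }

    // 1) By names: ask for exactly the columns you want.
    static SlicePredicate byNames() {
        SlicePredicate p = new SlicePredicate();
        p.setColumn_names(Arrays.asList(bytes("colA"), bytes("colB")));
        return p;
    }

    // 2) By range: ask for a contiguous slice (empty start/finish = whole row), capped at 100 columns.
    static SlicePredicate byRange() {
        SlicePredicate p = new SlicePredicate();
        p.setSlice_range(new SliceRange(bytes(""), bytes(""), false, 100));
        return p;
    }

    // Both predicates go through the same call; the result is a List<ColumnOrSuperColumn>.
    static void read(Cassandra.Client client, ByteBuffer rowKey, SlicePredicate predicate) throws Exception {
        client.get_slice(rowKey, new ColumnParent("cf"), predicate, ConsistencyLevel.QUORUM);
    }
}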


Re: Debian package jna bug workaround

2011-11-03 Thread paul cannon
I can't reproduce this. What version of the cassandra deb are you using,
exactly, and why are you symlinking or copying jna.jar into
/usr/share/cassandra?  The initscript should be adding
/usr/share/java/jna.jar to the classpath, and that should be all you need.

The failure you see with o.a.c.cache.FreeableMemory is not because the JRE
can't find the class; it's just that it can't initialize the class (because
it needs JNA, and it can't find JNA).
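
To illustrate the distinction with a toy example (not Cassandra code): a class
that is on the classpath can still fail with "NoClassDefFoundError: Could not
initialize class X" when its static initializer blows up, for instance because
a dependency like JNA is missing:

// First access to a class whose static initializer throws gives
// ExceptionInInitializerError; every later access gives
// "NoClassDefFoundError: Could not initialize class ...".
public class InitFailureDemo {
    static class NeedsNativeLib {
        static {
            // Stand-in for FreeableMemory's static use of JNA; here we simply fail.
            if (true) throw new RuntimeException("simulating: JNA classes not found");
        }
        static void use() { }
    }

    public static void main(String[] args) {
        try {
            NeedsNativeLib.use();
        } catch (Throwable t) {
            System.out.println("first access : " + t);
        }
        try {
            NeedsNativeLib.use();
        } catch (Throwable t) {
            System.out.println("second access: " + t);
        }
    }
}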

p

On Wed, Nov 2, 2011 at 4:42 AM, Peter Tillotson wrote:

> see below
>  * JAVA_HOME=/usr/lib/jvm/java-6-openjdk
> works
> --
> Reading the documentation over at Datastax
> “The Debian and RPM packages of Cassandra install JNA automatically”
>
> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>
> And indeed the Debian package depends on jna, and the
> /etc/init.d/cassandra looks as though it adds /usr/share/java/jna.jar to
> the classpath, and here is the but.
>
> If I copy or symlink jna.jar into:
>  * /usr/share/cassandra
>  * /usr/share/cassandra/lib
> Whenever a column family initialises I get:
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.cassandra.cache.FreeableMemory
>
> This suggests to me that:
>  1) By default, for me at least, in Debian jna.jar is not on the classpath
>  2) There is an additional classpath issue
>  jar -tf apache-cassandra.jar | grep "FreeableMemory" succeeds
>
> I'm running on:
>  * Ubuntu 10.04 x64
>  * JAVA_HOME=/usr/lib/jvm/java-6-sun
>
> Full stack traces:
> java.lang.NoClassDefFoundError: Could not initialize class
> com.sun.jna.Native
> at com.sun.jna.Pointer.(Pointer.java:42)
> at
> org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:92)
> at
> org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:154)
> at
> org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:63)
> at
> org.apache.cassandra.db.ColumnFamilyStore.cacheRow(ColumnFamilyStore.java:1150)
> at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1174)
> at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
> at org.apache.cassandra.db.Table.getRow(Table.java:378)
> at
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61)
> at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:797)
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1265)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
>  INFO [pool-1-thread-1] 2011-11-02 10:26:46,738 Memtable.java (line 177)
> CFS(Keyspace='BigSet', ColumnFamily='theKeys') liveRatio is
> 18.20062753783684 (just-counted was 16.960966424636872).  calculation took
> 408ms for 8169 columns
> ERROR [ReadStage:33] 2011-11-02 10:26:56,599 AbstractCassandraDaemon.java
> (line 133) Fatal exception in thread Thread[ReadStage:33,5,main]
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.cassandra.cache.FreeableMemory
> at
> org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:92)
> at
> org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:154)
> at
> org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:63)
> at
> org.apache.cassandra.db.ColumnFamilyStore.cacheRow(ColumnFamilyStore.java:1150)
> at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1174)
> at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
> at org.apache.cassandra.db.Table.getRow(Table.java:378)
> at
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61)
> at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:797)
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1265)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>
>


RE: Problem after upgrade to 1.0.1

2011-11-03 Thread Bryce Godfrey
A restart fixed the load numbers; they are back to where I expect them to be 
now, but disk utilization is double the load #.  I also still get the cfstats 
exception from any node.

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Thursday, November 03, 2011 11:52 AM
To: user@cassandra.apache.org
Subject: Re: Problem after upgrade to 1.0.1

Does restarting the node fix this?

On Thu, Nov 3, 2011 at 1:51 PM, Bryce Godfrey  wrote:
> Disk utilization is actually about 80% higher than what is reported 
> for nodetool ring across all my nodes on the data drive
>
>
>
> Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T:
> 206.926.1978 | M: 206.849.2477
>
>
>
> From: Dan Hendry [mailto:dan.hendry.j...@gmail.com]
> Sent: Thursday, November 03, 2011 11:47 AM
> To: user@cassandra.apache.org
> Subject: RE: Problem after upgrade to 1.0.1
>
>
>
> Regarding load growth, presumably you are referring to the load as 
> reported by JMX/nodetool. Have you actually looked at the disk 
> utilization on the nodes themselves? Potential issue I have seen:
> http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html
>
>
>
> Dan
>
>
>
> From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
> Sent: November-03-11 14:40
> To: user@cassandra.apache.org
> Subject: Problem after upgrade to 1.0.1
>
>
>
> I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go 
> just fine with the rolling upgrade.  But now I'm having extreme load 
> growth on one of my nodes (and others are growing faster than usual 
> also).  I attempted to run a cfstats against the extremely large node 
> that was seeing 2x the load of others and I get this error below.  I'm 
> also went into the o.a.c.db.HintedHandoffManager mbean and attempted 
> to list pending hints to see if it was growing out of control for some 
> reason, but that just times out eventually for any node.  I'm not sure what 
> to do next with this issue.
>
>
>
>    Column Family: HintsColumnFamily
>
>     SSTable count: 3
>
>     Space used (live): 12681676437
>
>     Space used (total): 10233130272
>
>     Number of Keys (estimate): 384
>
>     Memtable Columns Count: 117704
>
>     Memtable Data Size: 115107307
>
>     Memtable Switch Count: 66
>
>     Read Count: 0
>
>     Read Latency: NaN ms.
>
>     Write Count: 21203290
>
>     Write Latency: 0.014 ms.
>
>     Pending Tasks: 0
>
>     Key cache capacity: 3
>
>     Key cache size: 0
>
>     Key cache hit rate: NaN
>
>     Row cache: disabled
>
>     Compacted row minimum size: 30130993
>
>     Compacted row maximum size: 9223372036854775807
>
> Exception in thread "main" java.lang.IllegalStateException: Unable to 
> compute ceiling for max when histogram overflowed
>
>     at
> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.
> java:170)
>
>     at
> org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:39
> 5)
>
>     at
> org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyS
> tore.java:293)
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
> ava:39)
>
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
> orImpl.java:25)
>
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBe
> anIntrospector.java:93)
>
>     at
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBe
> anIntrospector.java:27)
>
>     at
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.ja
> va:208)
>
>     at
> com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65
> )
>
>     at
> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:21
> 6)
>
>     at
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Def
> aultMBeanServerInterceptor.java:666)
>
>     at
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.jav
> a:638)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectio
> nImpl.java:1404)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnection
> Impl.java:72)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(
> RMIConnectionImpl.java:1265)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RM
> IConnectionImpl.java:1360)
>
>     at
> javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnecti
> onImpl.java:600)
>
>     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown 
> Source)
>
>     a

Re: Retrieving columns by names vs by range, which is more performant?

2011-11-03 Thread Brandon Williams
On Thu, Nov 3, 2011 at 2:05 PM, Ertio Lew  wrote:
> Retrieving columns by names vs by range which is more performant , when you
> have the options to do both ?

Assuming the columns have never been overwritten, range has a small advantage.

However, in the face of frequently updated (overwritten) columns,
names will tear it up with
https://issues.apache.org/jira/browse/CASSANDRA-2498

-Brandon


Concatenating ids with extension to keep multiple rows related to an entity in a single CF

2011-11-03 Thread Aditya Narayan
I am concatenating  two Integer ids through bitwise operations(as described
below) to create a single primary key of type long. I wanted to know if
this is a good practice. This would help me in keeping multiple rows of an
entity in a single column family by appending different extensions to the
entityId.
Are there better ways ? My Ids are of type Integer(4 bytes).


public static final long makeCompositeKey(int k1, int k2) {
    // Mask k2 to avoid sign extension: a negative k2 would otherwise
    // overwrite the upper 32 bits set from k1.
    return ((long) k1 << 32) | (k2 & 0xFFFFFFFFL);
}
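
A companion sketch (helper names are mine) for splitting the key back into its
two ids, which also shows why the mask above matters:

public static int firstId(long compositeKey) {
    return (int) (compositeKey >>> 32);   // upper 32 bits
}

public static int secondId(long compositeKey) {
    return (int) compositeKey;            // lower 32 bits
}

// e.g. makeCompositeKey(42, -7) round-trips to (42, -7) only with the mask;
// without it, the sign extension of -7 sets the upper bits and firstId() returns -1.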


Re: Debian package jna bug workaround

2011-11-03 Thread Peter Tillotson
Cassandra 1.0.1, and it only seemed to happen with
* JAVA_HOME=/usr/lib/jvm/java-6-sun
and jna.jar copied into /usr/share/cassandra(/lib)

I then saw the detail in the init script and how it was being linked

Is there a way I can verify which provider is being used? I want to make
sure off-heap is being used in the default config.

On 03/11/11 19:06, paul cannon wrote:
> I can't reproduce this. What version of the cassandra deb are you using,
> exactly, and why are you symlinking or copying jna.jar into
> /usr/share/cassandra?  The initscript should be adding
> /usr/share/java/jna.jar to the classpath, and that should be all you need.
> 
> The failure you see with o.a.c.cache.FreeableMemory is not because the
> jre can't find the class, it's just that it can't initialize the class
> (because it needs JNA, and it can't find JNA).
> 
> p
> 
> On Wed, Nov 2, 2011 at 4:42 AM, Peter Tillotson  wrote:
> 
> see below
>  * JAVA_HOME=/usr/lib/jvm/java-6-openjdk 
> works
> --
> Reading the documentation over at Datastax
> “The Debian and RPM packages of Cassandra install JNA automatically”
> 
> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
> 
> And indeed the Debian package depends on jna, and the
> /etc/init.d/cassandra init script looks as though it adds
> /usr/share/java/jna.jar to the classpath, and here is the catch.
> 
> If I copy or symlink jna.jar into:
>  * /usr/share/cassandra
>  * /usr/share/cassandra/lib
> Whenever a column family initialises I get:
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.cassandra.cache.FreeableMemory
> 
> This suggests to me that:
>  1) By default, for me at least, in Debian jna.jar is not on the
> classpath
>  2) There is an additional classpath issue
>  jar -tf apache-cassandra.jar | grep "FreeableMemory" succeeds
> 
> I'm running on: 
>  * Ubuntu 10.04 x64
>  * JAVA_HOME=/usr/lib/jvm/java-6-sun
> 
> Full stack traces:
> java.lang.NoClassDefFoundError: Could not initialize class
> com.sun.jna.Native
> at com.sun.jna.Pointer.<clinit>(Pointer.java:42)
> at
> 
> org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:92)
> at
> org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:154)
> at
> 
> org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:63)
> at
> 
> org.apache.cassandra.db.ColumnFamilyStore.cacheRow(ColumnFamilyStore.java:1150)
> at
> 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1174)
> at
> 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
> at org.apache.cassandra.db.Table.getRow(Table.java:378)
> at
> 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61)
> at
> 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:797)
> at
> 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1265)
> at
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
>  INFO [pool-1-thread-1] 2011-11-02 10:26:46,738 Memtable.java (line
> 177) CFS(Keyspace='BigSet', ColumnFamily='theKeys') liveRatio is
> 18.20062753783684 (just-counted was 16.960966424636872).
>  calculation took 408ms for 8169 columns
> ERROR [ReadStage:33] 2011-11-02 10:26:56,599
> AbstractCassandraDaemon.java (line 133) Fatal exception in thread
> Thread[ReadStage:33,5,main]
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.cassandra.cache.FreeableMemory
> at
> 
> org.apache.cassandra.cache.SerializingCache.serialize(SerializingCache.java:92)
> at
> org.apache.cassandra.cache.SerializingCache.put(SerializingCache.java:154)
> at
> 
> org.apache.cassandra.cache.InstrumentingCache.put(InstrumentingCache.java:63)
> at
> 
> org.apache.cassandra.db.ColumnFamilyStore.cacheRow(ColumnFamilyStore.java:1150)
> at
> 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1174)
> at
> 
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
> at org.apache.cassandra.db.Table.getRow(Table.java:378)
> at
> 
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:61)
> at
> 
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(Storag

Re: Second Cassandra users survey

2011-11-03 Thread Todd Burruss
- Better performance when accessing random columns in a wide row
- caching subsets of wide rows - possibly on the same boundaries as the
index
- some sort of notification architecture when data is inserted.  This
could be co-processors, triggers, plugins, etc
- auto load balance when adding new nodes

On 11/1/11 3:59 PM, "Jonathan Ellis"  wrote:

>Hi all,
>
>Two years ago I asked for Cassandra use cases and feature requests.
>[1]  The results [2] have been extremely useful in setting and
>prioritizing goals for Cassandra development.  But with the release of
>1.0 we've accomplished basically everything from our original wish
>list. [3]
>
>I'd love to hear from modern Cassandra users again, especially if
>you're usually a quiet lurker.  What does Cassandra do well?  What are
>your pain points?  What's your feature wish list?
>
>As before, if you're in stealth mode or don't want to say anything in
>public, feel free to reply to me privately and I will keep it off the
>record.
>
>[1] 
>http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.ht
>ml
>[2] 
>http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.h
>tml
>[3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>
>-- 
>Jonathan Ellis
>Project Chair, Apache Cassandra
>co-founder of DataStax, the source for professional Cassandra support
>http://www.datastax.com



Read perf investigation

2011-11-03 Thread Ian Danforth
All,

 I've done a bit more homework, and I continue to see long 200ms to 300ms
read times for some keys.

Test Setup

EC2 M1Large sending requests to a 5 node C* cluster also in EC2, also all
M1Large. RF=3. ReadConsistency = ONE. I'm using pycassa from python for all
communication.

Data Model

One column family with tens of millions of rows. The number of columns per
row varies between 0 and 1440 (per minute records). The values are all
ints. All data stored on EBS volumes. Total load per node is ~110GB.

According to VMstat I'm not swapping at all.

Highest %Util I see
Device:     rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf          0.00  2788.00  17.00  267.50  1168.00  23020.00     85.02     32.37  107.73   1.22  34.60

A more average profile I see is:

Device:     rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf          0.00     0.00  21.00    0.00  1288.00      0.00     61.33      0.37   18.38   9.43  19.80

QUESTION

Where should I look next? I'd love to get a profile of exactly where
cassandra is spending its time on a per call basis.

Thanks in advance,

Ian


RE: Read perf investigation

2011-11-03 Thread Dan Hendry
Uh, so look at your await time of *107.73*. From the iostat man page: "await:
The average time (in milliseconds) for I/O requests issued to the device to
be served. This includes the time spent by the requests in queue and the
time spent servicing them."

If the key you are reading from is not in Cassandra's key cache or row cache,
Cassandra needs to do two disk seeks
(http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra).
This means that some of your reads *must* take on average about 215 ms (two
seeks at ~108 ms each), not even including network latency. Looks like EBS,
or more generally disk saturation, is your problem. Perhaps consider RAID0
with ephemeral drives.

 

Dan

 

From: Ian Danforth [mailto:idanfo...@numenta.com] 
Sent: November-03-11 18:34
To: user@cassandra.apache.org
Subject: Read perf investigation

 

All,

 

 I've done a bit more homework, and I continue to see long 200ms to 300ms
read times for some keys.

 

Test Setup

 

EC2 M1Large sending requests to a 5 node C* cluster also in EC2, also all
M1Large. RF=3. ReadConsistency = ONE. I'm using pycassa from python for all
communication.

 

Data Model

 

One column family with tens of millions of rows. The number of columns per
row varies between 0 and 1440 (per minute records). The values are all ints.
All data stored on EBS volumes. Total load per node is ~110GB.

 

According to VMstat I'm not swapping at all.

 

Highest %Util I see

Device:     rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf          0.00  2788.00  17.00  267.50  1168.00  23020.00     85.02     32.37  107.73   1.22  34.60

 

A more average profile I see is:

 

Device:     rrqm/s   wrqm/s    r/s     w/s   rsec/s    wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvdf          0.00     0.00  21.00    0.00  1288.00      0.00     61.33      0.37   18.38   9.43  19.80

 

QUESTION

 

Where should I look next? I'd love to get a profile of exactly where
cassandra is spending its time on a per call basis.

 

Thanks in advance,

 

Ian




Benchmarking Cassandra scalability to over 1M writes/s on AWS

2011-11-03 Thread Adrian Cockcroft
Hi folks,

we just posted a detailed Netflix technical blog entry on this
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

Hope you find it interesting/useful

Cheers Adrian


Re: Problem after upgrade to 1.0.1

2011-11-03 Thread Jonathan Ellis
I found the problem and posted a patch on
https://issues.apache.org/jira/browse/CASSANDRA-3451.  If you build
with that patch and rerun scrub the exception should go away.

On Thu, Nov 3, 2011 at 2:08 PM, Bryce Godfrey  wrote:
> A restart fixed the load numbers, they are back to where I expect them to be 
> now, but disk utilization is double the load #.  I'm also still getting the 
> cfstats exception from any node.
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Thursday, November 03, 2011 11:52 AM
> To: user@cassandra.apache.org
> Subject: Re: Problem after upgrade to 1.0.1
>
> Does restarting the node fix this?
>
> On Thu, Nov 3, 2011 at 1:51 PM, Bryce Godfrey  
> wrote:
>> Disk utilization is actually about 80% higher than what is reported
>> for nodetool ring across all my nodes on the data drive
>>
>>
>>
>> Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T:
>> 206.926.1978 | M: 206.849.2477
>>
>>
>>
>> From: Dan Hendry [mailto:dan.hendry.j...@gmail.com]
>> Sent: Thursday, November 03, 2011 11:47 AM
>> To: user@cassandra.apache.org
>> Subject: RE: Problem after upgrade to 1.0.1
>>
>>
>>
>> Regarding load growth, presumably you are referring to the load as
>> reported by JMX/nodetool. Have you actually looked at the disk
>> utilization on the nodes themselves? Potential issue I have seen:
>> http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html
>>
>>
>>
>> Dan
>>
>>
>>
>> From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
>> Sent: November-03-11 14:40
>> To: user@cassandra.apache.org
>> Subject: Problem after upgrade to 1.0.1
>>
>>
>>
>> I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go
>> just fine with the rolling upgrade.  But now I'm having extreme load
>> growth on one of my nodes (and others are growing faster than usual
>> also).  I attempted to run a cfstats against the extremely large node
>> that was seeing 2x the load of others and I get this error below.  I also
>> went into the o.a.c.db.HintedHandoffManager mbean and attempted
>> to list pending hints to see if it was growing out of control for some
>> reason, but that just times out eventually for any node.  I'm not sure what 
>> to do next with this issue.
>>
>>
>>
>>    Column Family: HintsColumnFamily
>>
>>     SSTable count: 3
>>
>>     Space used (live): 12681676437
>>
>>     Space used (total): 10233130272
>>
>>     Number of Keys (estimate): 384
>>
>>     Memtable Columns Count: 117704
>>
>>     Memtable Data Size: 115107307
>>
>>     Memtable Switch Count: 66
>>
>>     Read Count: 0
>>
>>     Read Latency: NaN ms.
>>
>>     Write Count: 21203290
>>
>>     Write Latency: 0.014 ms.
>>
>>     Pending Tasks: 0
>>
>>     Key cache capacity: 3
>>
>>     Key cache size: 0
>>
>>     Key cache hit rate: NaN
>>
>>     Row cache: disabled
>>
>>     Compacted row minimum size: 30130993
>>
>>     Compacted row maximum size: 9223372036854775807
>>
>> Exception in thread "main" java.lang.IllegalStateException: Unable to
>> compute ceiling for max when histogram overflowed
>>
>>     at
>> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.
>> java:170)
>>
>>     at
>> org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:39
>> 5)
>>
>>     at
>> org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyS
>> tore.java:293)
>>
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>
>>     at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.j
>> ava:39)
>>
>>     at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccess
>> orImpl.java:25)
>>
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>
>>     at
>> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBe
>> anIntrospector.java:93)
>>
>>     at
>> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBe
>> anIntrospector.java:27)
>>
>>     at
>> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.ja
>> va:208)
>>
>>     at
>> com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65
>> )
>>
>>     at
>> com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:21
>> 6)
>>
>>     at
>> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(Def
>> aultMBeanServerInterceptor.java:666)
>>
>>     at
>> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.jav
>> a:638)
>>
>>     at
>> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectio
>> nImpl.java:1404)
>>
>>     at
>> javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnection
>> Impl.java:72)
>>
>>     at
>> javax.management.remote.rmi

Re: Benchmarking Cassandra scalability to over 1M writes/s on AWS

2011-11-03 Thread Jonathan Ellis
<3 the straight line.  Fantastic!

On Thu, Nov 3, 2011 at 6:41 PM, Adrian Cockcroft
 wrote:
> Hi folks,
>
> we just posted a detailed Netflix technical blog entry on this
> http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
>
> Hope you find it interesting/useful
>
> Cheers Adrian
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


RE: Problem after upgrade to 1.0.1

2011-11-03 Thread Bryce Godfrey
Thanks for the help so far.  

Is there any way to find out why my HintsColumnFamily is so large now, since it 
wasn't this way before the upgrade and it seems to just keep climbing?

I've tried invoking o.a.c.db.HintedHandoffManager.countPendingHints(), thinking 
I have a bunch of stale hints from upgrade issues, but it just eventually times 
out. Plus the node it gets invoked against gets thrashed and stops responding, 
forcing me to restart Cassandra.
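
For what it's worth, the same operation can also be driven from a small
standalone JMX client instead of jconsole; a rough sketch, assuming the default
JMX port 7199 and that the bean is registered as
org.apache.cassandra.db:type=HintedHandoffManager (it exercises the same code
path, so it may still be just as slow against a huge hints CF):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CountPendingHints {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        // Default Cassandra JMX port; adjust if cassandra-env.sh overrides it.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName hhm = new ObjectName(
                    "org.apache.cassandra.db:type=HintedHandoffManager");
            // No-argument operation; print whatever it returns.
            Object pending = mbs.invoke(hhm, "countPendingHints",
                    new Object[0], new String[0]);
            System.out.println("pending hints: " + pending);
        } finally {
            connector.close();
        }
    }
}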

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Thursday, November 03, 2011 5:06 PM
To: user@cassandra.apache.org
Subject: Re: Problem after upgrade to 1.0.1

I found the problem and posted a patch on 
https://issues.apache.org/jira/browse/CASSANDRA-3451.  If you build with that 
patch and rerun scrub the exception should go away.

On Thu, Nov 3, 2011 at 2:08 PM, Bryce Godfrey  wrote:
> A restart fixed the load numbers, they are back to where I expect them to be 
> now, but disk utilization is double the load #.  I'm also still getting the 
> cfstats exception from any node.
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Thursday, November 03, 2011 11:52 AM
> To: user@cassandra.apache.org
> Subject: Re: Problem after upgrade to 1.0.1
>
> Does restarting the node fix this?
>
> On Thu, Nov 3, 2011 at 1:51 PM, Bryce Godfrey  
> wrote:
>> Disk utilization is actually about 80% higher than what is reported 
>> for nodetool ring across all my nodes on the data drive
>>
>>
>>
>> Bryce Godfrey | Sr. Software Engineer | Azaleos Corporation | T:
>> 206.926.1978 | M: 206.849.2477
>>
>>
>>
>> From: Dan Hendry [mailto:dan.hendry.j...@gmail.com]
>> Sent: Thursday, November 03, 2011 11:47 AM
>> To: user@cassandra.apache.org
>> Subject: RE: Problem after upgrade to 1.0.1
>>
>>
>>
>> Regarding load growth, presumably you are referring to the load as 
>> reported by JMX/nodetool. Have you actually looked at the disk 
>> utilization on the nodes themselves? Potential issue I have seen:
>> http://www.mail-archive.com/user@cassandra.apache.org/msg18142.html
>>
>>
>>
>> Dan
>>
>>
>>
>> From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
>> Sent: November-03-11 14:40
>> To: user@cassandra.apache.org
>> Subject: Problem after upgrade to 1.0.1
>>
>>
>>
>> I recently upgraded from 0.8.6 to 1.0.1 and everything seemed to go 
>> just fine with the rolling upgrade.  But now I'm having extreme load 
>> growth on one of my nodes (and others are growing faster than usual 
>> also).  I attempted to run a cfstats against the extremely large node 
>> that was seeing 2x the load of others and I get this error below.  
>> I also went into the o.a.c.db.HintedHandoffManager mbean and 
>> attempted to list pending hints to see if it was growing out of 
>> control for some reason, but that just times out eventually for any node.  
>> I'm not sure what to do next with this issue.
>>
>>
>>
>>    Column Family: HintsColumnFamily
>>
>>     SSTable count: 3
>>
>>     Space used (live): 12681676437
>>
>>     Space used (total): 10233130272
>>
>>     Number of Keys (estimate): 384
>>
>>     Memtable Columns Count: 117704
>>
>>     Memtable Data Size: 115107307
>>
>>     Memtable Switch Count: 66
>>
>>     Read Count: 0
>>
>>     Read Latency: NaN ms.
>>
>>     Write Count: 21203290
>>
>>     Write Latency: 0.014 ms.
>>
>>     Pending Tasks: 0
>>
>>     Key cache capacity: 3
>>
>>     Key cache size: 0
>>
>>     Key cache hit rate: NaN
>>
>>     Row cache: disabled
>>
>>     Compacted row minimum size: 30130993
>>
>>     Compacted row maximum size: 9223372036854775807
>>
>> Exception in thread "main" java.lang.IllegalStateException: Unable to 
>> compute ceiling for max when histogram overflowed
>>
>>     at
>> org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.
>> java:170)
>>
>>     at
>> org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:3
>> 9
>> 5)
>>
>>     at
>> org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamily
>> S
>> tore.java:293)
>>
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
>> Method)
>>
>>     at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
>> j
>> ava:39)
>>
>>     at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
>> s
>> orImpl.java:25)
>>
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>
>>     at
>> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMB
>> e
>> anIntrospector.java:93)
>>
>>     at
>> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMB
>> e
>> anIntrospector.java:27)
>>
>>     at
>> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.j
>> a
>> va:2

Re: Concatenating ids with extension to keep multiple rows related to an entity in a single CF

2011-11-03 Thread Tyler Hobbs
On Thu, Nov 3, 2011 at 3:48 PM, Aditya Narayan  wrote:

> I am concatenating  two Integer ids through bitwise operations(as
> described below) to create a single primary key of type long. I wanted to
> know if this is a good practice. This would help me in keeping multiple
> rows of an entity in a single column family by appending different
> extensions to the entityId.
> Are there better ways ? My Ids are of type Integer(4 bytes).
>
>
> public static final long makeCompositeKey(int k1, int k2){
> return (long)k1 << 32 | k2;
> }
>

You could use an actual CompositeType(IntegerType, IntegerType), but it
would use a little extra space and not buy you much.

It doesn't sound like this is the case for you, but if you have several
distinct types of rows, you should consider using separate column families
for them rather than putting them all into one big CF.

-- 
Tyler Hobbs
DataStax 
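
For reference, the key a CompositeType(...) validator expects is a sequence of
length-prefixed components, each followed by an end-of-component byte; a
minimal sketch of building one by hand for two fixed-width ints (this assumes
Int32Type components rather than the variable-length IntegerType, to keep the
encoding simple):

import java.nio.ByteBuffer;

public final class CompositeKeyEncoder {

    // Encode (k1, k2) in the CompositeType(Int32Type, Int32Type) layout:
    // for each component, a 2-byte big-endian length, the component bytes,
    // then a single end-of-component byte (0).
    public static ByteBuffer encode(int k1, int k2) {
        ByteBuffer bb = ByteBuffer.allocate(2 * (2 + 4 + 1));
        for (int k : new int[] { k1, k2 }) {
            bb.putShort((short) 4); // component length
            bb.putInt(k);           // 4-byte big-endian component (Int32Type)
            bb.put((byte) 0);       // end-of-component marker
        }
        bb.flip();
        return bb;
    }
}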


Upcoming Apache Cassandra trainings from DataStax

2011-11-03 Thread Nate McCall
As an FYI for folks interested in quickly gaining an in-depth
understanding of developing for and operating Apache Cassandra
clusters, DataStax has the following training courses scheduled:

Austin, TX (Nov. 14th):
http://datastaxaustin.eventbrite.com/

San Mateo, CA (Dec. 13th):
http://datastaxsf.eventbrite.com/

Thanks,
-Nate


Re: Concatenating ids with extension to keep multiple rows related to an entity in a single CF

2011-11-03 Thread Aditya Narayan
The data in the different rows of an entity is all of a similar type; it serves
different features but has almost the same storage and retrieval needs, so I
wanted to put it in one CF and reduce the number of column families.

From what I know, CompositeType existed for columns as an alternative way to
implement something similar to supercolumns. Are there any built-in Cassandra
features for designing composite keys from two provided Integer ids?

Is my approach correct and recommended if I need to keep multiple rows
related to an entity in a single CF?

On Fri, Nov 4, 2011 at 10:11 AM, Tyler Hobbs  wrote:

> On Thu, Nov 3, 2011 at 3:48 PM, Aditya Narayan  wrote:
>
>> I am concatenating  two Integer ids through bitwise operations(as
>> described below) to create a single primary key of type long. I wanted to
>> know if this is a good practice. This would help me in keeping multiple
>> rows of an entity in a single column family by appending different
>> extensions to the entityId.
>> Are there better ways ? My Ids are of type Integer(4 bytes).
>>
>>
>> public static final long makeCompositeKey(int k1, int k2){
>> return (long)k1 << 32 | k2;
>> }
>>
>
> You could use an actual CompositeType(IntegerType, IntegerType), but it
> would use a little extra space and not buy you much.
>
> It doesn't sound like this is the case for you, but if you have several
> distinct types of rows, you should consider using separate column families
> for them rather than putting them all into one big CF.
>
> --
> Tyler Hobbs
> DataStax 
>
>