Re: GCInspector info messages in cassandra log

2012-08-16 Thread Tamar Fraenkel
Thank you very much!
*Tamar Fraenkel *
Senior Software Engineer, TOK Media

[image: Inline image 1]

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Thu, Aug 16, 2012 at 12:11 AM, aaron morton wrote:

> Is there anything to do before that? like drain or flush?
>
> For a clean shutdown I do
>
> nodetool -h localhost disablethrift
> nodetool -h localhost disablegossip && sleep 10
> nodetool -h localhost drain
> then kill
>
> Would you recommend that? If I do it, how often should I do a full
> snapshot, and how often should I backup the backup directory?
>
> Sounds like you could use Priam and be happier...
> http://techblog.netflix.com/2012/02/announcing-priam.html
>
>  I just saw that there is an option global_snapshot, is it still supported?
>
> I cannot find it.
>
> Try Priam or the instructions here, which are pretty much what you have
> described http://www.datastax.com/docs/1.0/operations/backup_restore
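>
> As a rough sketch of the kind of nightly job those docs describe (the bucket
> name, the data path, the -t flag and the use of s3cmd are illustrative
> assumptions, not something from this thread), run on each node from cron:
>
> nodetool -h localhost snapshot -t nightly_$(date +%F)
> tar czf /tmp/nightly_$(date +%F).tgz /var/lib/cassandra/data/*/snapshots/nightly_$(date +%F)
> s3cmd put /tmp/nightly_$(date +%F).tgz s3://my-backup-bucket/$(hostname)/
> nodetool -h localhost clearsnapshot   # note: clears all snapshots on the node
>
> If you turn on incremental_backups: true in cassandra.yaml, the hard-linked
> SSTables that show up under each column family's backups/ directory can be
> shipped the same way between full snapshots.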
>
> Cheers
>
>   -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/08/2012, at 4:57 PM, Tamar Fraenkel  wrote:
>
> Aaron,
> Thank you very much. I will do as you suggested.
>
> One last question regarding restart:
> I assume, I should do it node by node.
> Is there anything to do before that? like drain or flush?
>
> I am also considering enabling incremental backups on my cluster.
> Currently I take a daily full snapshot of the cluster, tar it and load it
> to S3 (size now is 3.1GB). Would you recommend that? If I do it, how often
> should I do a full snapshot, and how often should I backup the backup
> directory?
>
> Another snapshot related question, currently I snapshot on each node and
> use parallel-slurp to copy the snapshot to one node where I tar them. I
> just saw that there is an option global_snapshot, is it still supported?
> Does that mean that if I run it on one node the snapshot will contain data
> from all cluster? How does it work in restore? Is it better than my current
> backup system?
>
> *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> [image: Inline image 1]
>
> ta...@tok-media.com
> Tel:   +972 2 6409736
> Mob:  +972 54 8356490
> Fax:   +972 2 5612956
>
>
>
>
>
> On Tue, Aug 14, 2012 at 11:51 PM, aaron morton wrote:
>
>>
>>1. According to cfstats there are some CFs with high Compacted row
>>maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
>>100. Are these considered to be problematic? What can I do to solve
>>that?
>>2.
>>
>> They are only 1, 4 and 25 MB. Not too big.
>>
>> What should be the values of  in_memory_compaction_limit_in_mb
>>  and concurrent_compactors and how do I change them?
>>
>> Sounds like you don't have very big CF's, so changing the
>> in_memory_compaction_limit_in_mb may not make too much difference.
>>
>> Try changing concurrent_compactors to 2 in the yaml file. This change
>> will let you know if GC and compaction are related.
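>>
>> For reference, the relevant cassandra.yaml lines would then read (64 MB is just
>> the value you already have; both are read from the yaml when the node starts):
>>
>> concurrent_compactors: 2
>> in_memory_compaction_limit_in_mb: 64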
>>
>>
>>  change yaml file and restart,
>>
>> yes
>>
>> What do I do about the long rows? What value is considered too big?
>>
>> They churn more memory during compaction. If you had a lot of rows over 32
>> MB I would think about it, but it does not look that way here.
>>
>> Cheers
>>
>>
>>   -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 15/08/2012, at 3:15 AM, Tamar Fraenkel  wrote:
>>
>> Hi!
>> It helps, but before I do more actions I want to give you some more info,
>> and ask some questions:
>>
>> *Related Info*
>>
>>1. According to my yaml file (where do I see these parameters in the
>>jmx? I couldn't find them):
>>in_memory_compaction_limit_in_mb: 64
>>concurrent_compactors: 1, but it is commented out, so I guess it is
>>the default value
>>multithreaded_compaction: false
>>compaction_throughput_mb_per_sec: 16
>>compaction_preheat_key_cache: true
>>2. According to cfstats there are some CFs with high Compacted row
>>maximum sizes (1131752, 4866323 and 25109160). Others max sizes are <
>>100. Are these considered to be problematic? What can I do to solve
>>that?
>>3. During compactions Cassandra is slower
>>4. Running Cassandra Version 1.0.8
>>
>> *Questions*
>> What should be the values of  in_memory_compaction_limit_in_mb
>>  and concurrent_compactors and how do I change them? change yaml file
>> and restart, or can it be done using jmx without restarting Cassandra?
>> What do I do about the long rows? What value is considered too big?
>>
>> I appreciate your help! Thanks,
>>
>>
>>
>> *Tamar Fraenkel *
>> Senior Software Engineer, TOK Media
>>
>> 
>>
>> ta...@tok-media.com
>> Tel:   +972 2 6409736
>> Mob:  +972 54 8356490
>> Fax:   +972 2 5612956
>>
>>
>>
>>
>>
>> On Tue, Aug 14, 2012 at 1:22 PM, aaron morton wrote:
>>
>>> There are a couple of steps you can take if compaction is causing GC.
>>>
>>> - if you have a lot of wide rows consider reducing
>>

SSTable Index and Metadata - are they cached in RAM?

2012-08-16 Thread Maciej Miklas
Hi all,

The bloom filter for row keys is always in RAM. What about the SSTable index and
metadata?

Is it cached by Cassandra, or does it rely on memory mapped files?


Thanks,
Maciej


Re: Migrating to a new cluster (using SSTableLoader or other approaches)

2012-08-16 Thread Filippo Diotalevi
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to 
> > position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
> > (65737276 bytes) in read-only mode
>  
>  
> This one looks like an error.
>  
> Can you run nodetool with DEBUG level logging and post the logs ?  

Thank Aaron.
 Which nodetool command are you referring to? (info, cfstats, ring,….)
Do I modify the log4j-tools.properties in $CASSANDRA_HOME/conf to set the 
nodetool logs to DEBUG?

Thanks,
--  
Filippo Diotalevi



On Wednesday, 15 August 2012 at 22:53, aaron morton wrote:

> > WARN 09:02:38,534 Unable to instantiate cache provider 
> > org.apache.cassandra.cache.SerializingCacheProvider; using default 
> > org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
> > instead
>  
> Happens when JNA is not in the path. Nothing to worry about when using the 
> sstableloader.  
>  
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to 
> > position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
> > (65737276 bytes) in read-only mode
>  
> This one looks like an error.  
>  
> Can you run nodetool with DEBUG level logging and post the logs ?  
>  
> Cheers
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
>  
>  
>  
>  
>  
> On 15/08/2012, at 9:32 PM, Filippo Diotalevi <fili...@ntoklo.com> wrote:
> > Hi,  
> > we are trying to use SSTableLoader to bootstrap a new 7-node cassandra (v. 
> > 1.0.10) cluster with the snapshots taken from a 3-node cassandra cluster. 
> > The new cluster is in a different data centre.  
> >  
> > After reading the articles at
> > [1] http://www.datastax.com/dev/blog/bulk-loading
> > [2] 
> > http://geekswithblogs.net/johnsPerfBlog/archive/2011/07/26/how-to-use-cassandrs-sstableloader.aspx
> >  
> > we tried to follow this procedure
> > 1) we took a snapshot of our keyspaces in the old cluster and moved them to
> > the data folder of 3 of the new machines
> > 2) started cassandra in the new cluster
> > but we noticed that some column families were missing, others had missing
> > data.
> >  
> > After that we tried to use sstableloader
> > 1) we reinstalled cassandra in the new cluster
> > 2) run sstableloader (as explained in [2]) to load the keyspaces
> >  
> > SSTableLoader starts, but the progress is always 0 and the transfer rate is 
> > 0MB/s. Some warning and exceptions are present in the logs
> >  
> > ./sstableloader /opt/analytics/analytics/
> > Starting client (and waiting 30 seconds for gossip) ...
> > Streaming revelant part of /opt/analytics/analytics/chart-hd-104-Data.db 
> > /opt/analytics/analytics/chart-hd-105-Data.db 
> > /opt/analytics/analytics/chart-hd-106-Data.db 
> > /opt/analytics/analytics/chart-hd-107-Data.db 
> > /opt/analytics/analytics/chart-hd-108-Data.db to [/1x.xx.xx.xx5, 
> > /1x.xx.xx.xx7, /1x.xx.xx.xx0, /1x.xx.xx.xx7, /1x.xx.xx.xx3, /1x.xx.xx.xx8, 
> > /1x.xx.xx.xx7]
> > WARN 09:02:38,534 Unable to instantiate cache provider 
> > org.apache.cassandra.cache.SerializingCacheProvider; using default 
> > org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
> > instead
> > WARN 09:02:38,549 Unable to instantiate cache provider 
> > org.apache.cassandra.cache.SerializingCacheProvider; using default 
> > org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
> > instead
> >  
> >  
> > [….]
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to 
> > position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
> > (65737276 bytes) in read-only mode
> > at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689)
> > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
> > at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > at java.lang.Thread.run(Thread.java:619)
> > Caused by: java.io.EOFException: unable to seek to position 93069003 in 
> > /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only 
> > mode
> > at 
> > org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:253)
> > at 
> > org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:136)
> > at 
> > org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
> > at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> > ... 3 more
> > Exception in thread "Streaming:1" java.lang.RuntimeException: 
> > java.io.EOFException: unable to seek to position 93069003 in 
> > /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only 
> > mode
> > at org.apache.cassandra.utils.FBUtilities.unchecked(F

Re: indexing question related to playOrm on github

2012-08-16 Thread Hiller, Dean
Yes, the synch may work, and no, I do "not" want a transaction…I want a
different kind of eventual consistency.

That might work.
Let's say server 1 sends a mutation (65 is the pk)
Remove: <bill.65>  Add <tim.65>
Server 2 also sends a mutation (65 is the pk)
Remove: <bill.65>  Add <mike.65>

What everyone does not want is to end up with a row that has <tim.65> and
<mike.65>.  With the wide row pattern, we would like to have ONE or the other.
I am not sure synchronization fixes that……It would be kind of nice if the
column <bill.65> would not actually be removed until after all servers are
eventually consistent AND would keep a reference to the add that was happening
so that when it goes to resolve eventual consistency between the servers, it
would see that <mike.65> is newer and it would decide to drop the first add
completely.

Ie. In a full process it might look like this
Cassandra node 1 receives remove <bill.65>, add <tim.65> AND in the remove
column stores info about the add <tim.65> until eventual consistency is
completed
Cassandra node 2 one ms later receives remove <bill.65> and add <mike.65> AND in
the remove column stores info about the add <mike.65> until eventual
consistency is completed
Eventual consistency starts comparing node 1 and node 2 and finds <bill.65> is
being removed by different servers and finds add info attached to that.  ONLY
THE LAST add info is acknowledged and it makes the row consistent across the
cluster.

That makes everyone's wide row indexing pattern tend to get less corrupt over 
time.

Thanks,
Dean


From: aaron morton <aa...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 15, 2012 8:26 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: indexing question related to playOrm on github

1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a JQL/HQL 
query on a trillion rows in under 100ms (partitioning is the trick so you can 
JQL a partition)
Not sure if we have an ORM-specific page. If it's a client then feel free to add
it to http://wiki.apache.org/cassandra/ClientOptions

I was wondering if cassandra has or will ever support eventual consistency where
it keeps both the REMOVE AND the ADD together until it is on all 3
replicated nodes and in resolving the consistency would end up with an index
that only has the very last one in the index.
Not sure I fully understand but it sounds like you want a transaction, which is 
not going to happen.

Internally when Cassandra updates a secondary index it does the same thing. But 
it synchronises updates around the same row so one thread will apply the 
changes at a time.

Hope that helps.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/08/2012, at 12:34 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a JQL/HQL 
query on a trillion rows in under 100ms (partitioning is the trick so you can 
JQL a partition)
2.  Many applications have a common indexing problem and I was wondering if 
cassandra has or could have any support for this in the future….

When using wide row indexes, you frequently have <indexed value>.<pk> as
the composite key.  This means when you have your object like so in the database

Activity {
  pk: 65
  name: bill
}

And then two servers want to save it as

Activity {
  pk:65
  name:tim
}
Activity {
  pk:65
  name:mike
}

Each server will remove <bill.65> and BOTH servers will add <tim.65> AND
<mike.65>, BUT one of them will really be a lie!  I was wondering if
cassandra has or will ever support eventual consistency where it keeps both the
REMOVE AND the ADD together until it is on all 3 replicated nodes and in
resolving the consistency would end up with an index that only has the very
last one in the index.

Thanks,
Dean




wild card on query

2012-08-16 Thread Swathi Vikas
Hi,
I am trying to run query on cassandra cluster with predicate on row key.

I have column family called "Users" and rows with row key like 
"projectid_userid_photos". Each user within a project can have rows like 
projectid_userid_blog, projectid_userid_status and so on. 


I want to retrieve all the photos from all the users of a certain project. My
SQL-like query would be "select projectid * photos from Users". How can I run
this kind of row key predicate while executing a query on cassandra?


Any suggestion will help.


Thank you,
swat.vikas


Re: indexing question related to playOrm on github

2012-08-16 Thread Hiller, Dean
Maybe this would be a special type of column family that could contain
these as my other tables definitely don't want the feature below by the
way.

Dean

On 8/16/12 6:29 AM, "Hiller, Dean"  wrote:

>Yes, the synch may work, and no, I do "not" want a transaction…I want a
>different kind of eventual consistency.
>
>That might work.
>Let's say server 1 sends a mutation (65 is the pk)
>Remove: <65>  Add <65>
>Server 2 also sends a mutation (65 is the pk)
>Remove: <65> Add <65>
>
>What everyone does not want is to end up with a row that has <65>
>and <65>.  With the wide row pattern, we would like to have ONE or
>the other.  I am not sure synchronization fixes that……It would be kind of
>nice if the column <65> would not actually be removed until after
>all servers are eventually consistent AND would keep a reference to the
>add that was happening so that when it goes to resolve eventually
>consistent between the servers, it would see that <65> is newer and
>it would decide to drop the first add completely.
>
>Ie. In a full process it might look like this
>Cassandra node 1 receives remove <65>, add <65> AND in the
>remove column stores info about the add <65> until eventual
>consistency is completed
>Cassandra node 2 one ms later receives remove <65> and <65>
>AND in the remove column stores info about the add <65> until
>eventual consistency is completed
>Eventual consistency starts comparing node 1 and node 2 and finds
><65> is being removed by different servers and finds add info
>attached to that.  ONLY THE LAST add info is acknowledged and it makes
>the row consistent across the cluster.
>
>That makes everyone's wide row indexing pattern tend to get less corrupt
>over time.
>
>Thanks,
>Dean
>
>
>From: aaron morton <aa...@thelastpickle.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Wednesday, August 15, 2012 8:26 PM
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: Re: indexing question related to playOrm on github
>
>1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a
>JQL/HQL query on a trillion rows in under 100ms (partitioning is the
>trick so you can JQL a partition)
>Not sure if we have an ORM-specific page. If it's a client then feel free
>to add it to http://wiki.apache.org/cassandra/ClientOptions
>
>I was wondering if cassandra has or will ever support eventual consistency
>where it keeps both the REMOVE AND the ADD together such until it is on
>all 3 replicated nodes and in resolving the consistency would end up with
>an index that only has the very last one in the index.
>Not sure I fully understand but it sounds like you want a transaction,
>which is not going to happen.
>
>Internally when Cassandra updates a secondary index it does the same
>thing. But it synchronises updates around the same row so one thread will
>apply the changes at a time.
>
>Hope that helps.
>-
>Aaron Morton
>Freelance Developer
>@aaronmorton
>http://www.thelastpickle.com
>
>On 16/08/2012, at 12:34 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:
>
>1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a
>JQL/HQL query on a trillion rows in under 100ms (partitioning is the
>trick so you can JQL a partition)
>2.  Many applications have a common indexing problem and I was wondering
>if cassandra has or could have any support for this in the future….
>
>When using wide row indexes, you frequently have
>. as the composite key.  This means when you
>have your object like so in the database
>
>Activity {
>  pk: 65
>  name: bill
>}
>
>And then two servers want to save it as
>
>Activity {
>  pk:65
>  name:tim
>}
>Activity {
>  pk:65
>  name:mike
>}
>
>Each server will remove <65> and BOTH servers will add <65>
>AND <65> BUT one of them will really be a lie!  I was wondering
>if cassandra has or will ever support eventual consistency where it keeps
>both the REMOVE AND the ADD together such until it is on all 3 replicated
>nodes and in resolving the consistency would end up with an index that
>only has the very last one in the index.
>
>Thanks,
>Dean
>
>



nodetool repair uses insane amount of disk space

2012-08-16 Thread Michael Morris
Occasionally as I'm doing my regular anti-entropy repair I end up with a
node that uses an exceptional amount of disk space (node should have about
5-6 GB of data on it, but ends up with 25+GB, and consumes the limited
amount of disk space I have available)

How come a node would consume 5x its normal data size during the repair
process?

My setup is kind of strange in that it's only about 80-100GB of data on a
35 node cluster, with 2 data centers and 3 racks, however the rack
assignments are unbalanced.  One data center has 8 nodes, and the other
data center is split into 2 racks with one rack of 9 nodes, and the other
with 18 nodes.  However, within each rack, the tokens are distributed
equally. It's a long sad story about how we ended up this way, but it
basically boils down to having to utilize existing resources to resolve a
production issue.

Additionally, the repair process takes (what I feel is) an extremely long
time to complete (36+ hours), and it always seems that nodes are streaming
data to each other, even on back-to-back executions of the repair.

Any help on these issues is appreciated.

- Mike


Many ParNew collections

2012-08-16 Thread Rene Kochen
Hi

I have a cluster of 7 nodes:

- Windows Server 2008
- Cassandra 0.7.10
- The nodes are identical (hardware, configuration and client request load)
- Standard batch file with 8GB heap
- I use disk_access_mode = standard
- Random partitioner
- TP stats shows no problems
- Ring command shows no problems (data is balanced)

However, there is one node with high read latency and far too many
ParNew collections (compared to other nodes). It also suffers from a
high CPU load (I guess due to the ParNew collections).

What can be the source of so many ParNew collections? The other
identical nodes do not have this behavior.

Logging:

2012-08-16 15:58:46,436906.072: [GC 906.072: [ParNew:
345022K->38336K(345024K), 0.2375976 secs]
3630599K->3516249K(8350272K), 0.2377296 secs] [Times: user=3.21
sys=0.00, real=0.23 secs]
2012-08-16 15:58:46,888906.517: [GC 906.517: [ParNew:
345024K->38336K(345024K), 0.2400594 secs]
3822937K->3743690K(8350272K), 0.2401802 secs] [Times: user=3.48
sys=0.03, real=0.25 secs]
2012-08-16 15:58:46,888 INFO 15:58:46,888 GC for ParNew: 478 ms for 2
collections, 3837792904 used; max is 8550678528
2012-08-16 15:58:47,372907.003: [GC 907.003: [ParNew:
345024K->38336K(345024K), 0.2405363 secs]
4050378K->3971544K(8350272K), 0.2406553 secs] [Times: user=3.34
sys=0.01, real=0.23 secs]
2012-08-16 15:58:47,918907.544: [GC 907.544: [ParNew:
345024K->38336K(345024K), 0.2404339 secs]
4278232K->4193789K(8350272K), 0.2405540 secs] [Times: user=3.31
sys=0.00, real=0.25 secs]
2012-08-16 15:58:47,918 INFO 15:58:47,918 GC for ParNew: 481 ms for 2
collections, 4300939752 used; max is 8550678528
2012-08-16 15:58:48,464908.079: [GC 908.079: [ParNew:
345024K->38336K(345024K), 0.2621174 secs]
4500477K->4434112K(8350272K), 0.2622375 secs] [Times: user=3.64
sys=0.00, real=0.25 secs]
2012-08-16 15:58:48,932 INFO 15:58:48,932 GC for ParNew: 262 ms for 1
collections, 4763583200 used; max is 8550678528
2012-08-16 15:58:49,384909.050: [GC 909.051: [ParNew:
344972K->38336K(345024K), 0.2033453 secs]
4740748K->4563252K(8350272K), 0.2034588 secs] [Times: user=2.89
sys=0.01, real=0.20 secs]
2012-08-16 15:58:49,946 INFO 15:58:49,946 GC for ParNew: 203 ms for 1
collections, 4885945792 used; max is 8550678528
2012-08-16 15:58:50,383909.998: [GC 909.998: [ParNew:
345024K->38336K(345024K), 0.2567542 secs]
4869940K->4740489K(8350272K), 0.2568804 secs] [Times: user=3.60
sys=0.00, real=0.25 secs]
2012-08-16 15:58:50,882910.474: [GC 910.474: [ParNew:
345024K->38336K(345024K), 0.2786205 secs]
5047177K->4962531K(8350272K), 0.2787668 secs] [Times: user=3.48
sys=0.00, real=0.28 secs]
2012-08-16 15:58:50,960 INFO 15:58:50,960 GC for ParNew: 536 ms for 2
collections, 5143423816 used; max is 8550678528
2012-08-16 15:58:51,584911.192: [GC 911.192: [ParNew:
344963K->38334K(345024K), 0.2664316 secs]
5269158K->5196444K(8350272K), 0.2665544 secs] [Times: user=3.74
sys=0.00, real=0.27 secs]
2012-08-16 15:58:52,130911.767: [GC 911.767: [ParNew:
345022K->38336K(345024K), 0.2327209 secs]
5503132K->5406771K(8350272K), 0.2328457 secs] [Times: user=3.35
sys=0.00, real=0.23 secs]
2012-08-16 15:58:52,130 INFO 15:58:52,130 GC for ParNew: 499 ms for 2
collections, 5541845264 used; max is 8550678528
2012-08-16 15:58:52,816912.460: [GC 912.460: [ParNew:
345024K->38334K(345024K), 0.2198399 secs]
5713459K->5605670K(8350272K), 0.2199669 secs] [Times: user=3.29
sys=0.00, real=0.23 secs]
2012-08-16 15:58:53,144 INFO 15:58:53,144 GC for ParNew: 220 ms for 1
collections, 5805870608 used; max is 8550678528
2012-08-16 15:58:55,546915.173: [GC 915.173: [ParNew:
345022K->38334K(345024K), 0.2369585 secs]
5912358K->5702871K(8350272K), 0.2371098 secs] [Times: user=3.18
sys=0.00, real=0.25 secs]
2012-08-16 15:58:56,186 INFO 15:58:56,186 GC for ParNew: 237 ms for 1
collections, 6089002480 used; max is 8550678528
2012-08-16 15:58:56,591916.232: [GC 916.232: [ParNew:
345022K->38336K(345024K), 0.2364850 secs]
6009559K->5914142K(8350272K), 0.2366075 secs] [Times: user=3.32
sys=0.00, real=0.23 secs]
2012-08-16 15:58:57,340916.989: [GC 916.989: [ParNew:
345024K->38334K(345024K), 0.2191751 secs]
6220830K->6107217K(8350272K), 0.2192894 secs] [Times: user=2.92
sys=0.00, real=0.22 secs]
2012-08-16 15:58:57,371917.209: [GC [1 CMS-initial-mark:
6068883K(8005248K)] 6108716K(8350272K), 0.0272472 secs] [Times:
user=0.03 sys=0.00, real=0.03 secs]
2012-08-16 15:58:57,371917.236: [CMS-concurrent-mark-start]
2012-08-16 15:58:57,371 INFO 15:58:57,371 GC for ParNew: 456 ms for 2
collections, 6255865264 used; max is 8550678528
2012-08-16 15:58:57,574917.444: [CMS-concurrent-mark: 0.208/0.208
secs] [Times: user=1.48 sys=0.06, real=0.20 secs]
2012-08-16 15:58:57,574917.444: [CMS-concurrent-preclean-start]
2012-08-16 15:58:57,637917.501: [CMS-concurrent-preclean: 0.057/0.057
secs] [Times: user=0.19 sys=0.00, real=0.06 secs]
2012-08-16 15:58:57,637917.501: [CMS-concurrent-abortable-preclean-start]
2012-08-16 15:58:58,775918.552: [GC 918.552: [ParNew:
345022K->38334K(345024K), 0.0948325 secs]
6413905K->

Opscenter 2.1 vs 1.3

2012-08-16 Thread Robin Verlangen
Hi there,

I just upgraded to opscenter 2.1 (from 1.3). It appears that my writes
have tripled. Is this a change in the display/measuring of opscenter?


Best regards,

Robin Verlangen
*Software engineer*
*
*
W http://www.robinverlangen.nl
E ro...@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.


C++ Bulk loader and Result set streaming.

2012-08-16 Thread Swathi Vikas
Hi All,
 
I am using C++ client libQtCassandra. I have two questions.
 
1) I want to bulk load data into cassandra through the C++ interface. It is 
required by my group where I am doing an internship. I could bulk load using 
sstableloader as specified in the Datastax blog: 
http://www.datastax.com/dev/blog/bulk-loading. But I couldn't find any 
information on bulk loading using the C++ client interface. 
 
2) I want to retrieve all the results of a query (not just the first 100 
results) using the C++ client. Is there any C++ supporting code or information 
on streaming the result set into a file or something?
 
If anyone has any information please direct me where i can look into.
 
Thank you very much,
Swat.vikas

'WHERE' with several indexed columns

2012-08-16 Thread A J
Hi
If I have a WHERE clause in CQL with several 'AND' and each column is
indexed, which index(es) is(are) used ?
Just the first field in the where clause or all the indexes involved
in the clause ?

Also is index used only with an equality operator or also with greater
than /less than comparator as well ?

Thanks.


Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-16 Thread Guillermo Winkler
Hi, I have a cassandra cluster where I'm seeing a lot of thread thrashing
from the mutation pool.

MutationStage:72031

Where threads get created and disposed of in batches of ~100 every few minutes;
since it's a 16 core server, concurrent_writes is set to 100 in the
cassandra.yaml.

concurrent_writes: 100

I've seen in the StageManager class that these pools get created with a 60 second
keepalive time.

DebuggableThreadPoolExecutor -> allowCoreThreadTimeOut(true);

StageManager-> public static final long KEEPALIVE = 60; // seconds to keep
"extra" threads alive for when idle

Is there a reason for it to be this way?

Why not have a fixed size pool with Integer.MAX_VALUE as keepalive since
corePoolSize and maxPoolSize are set at the same size?

Thanks,
Guille


Re: SSTable Index and Metadata - are they cached in RAM?

2012-08-16 Thread aaron morton
> What about SSTable index, 
Not sure what you are referring to there. Each row in an SSTable has a 
bloom filter and may have an index of columns. This is not cached.

See http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ or 
http://www.slideshare.net/aaronmorton/cassandra-sf-2012-technical-deep-dive-query-performance

>  and Metadata?

This is the meta data we hold in memory for every open sstable
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java

Cheers
  

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/08/2012, at 7:34 PM, Maciej Miklas  wrote:

> Hi all,
> 
> bloom filter for row keys is always in RAM. What about SSTable index, and 
> Metadata?
> 
> Is it cached by Cassandra, or does it rely on memory mapped files?
> 
> 
> Thanks,
> Maciej



Re: Migrating to a new cluster (using SSTableLoader or other approaches)

2012-08-16 Thread Filippo Diotalevi
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to 
> > position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
> > (65737276 bytes) in read-only mode
>  
>  
> This one looks like an error.
>  
> Can you run nodetool with DEBUG level logging and post the logs ?

Some of the errors from the sstableloader:

DEBUG [Thread-24] 2012-08-16 22:01:02,494 FileUtils.java (line 51) Deleting 
chart-tmp-hd-1-CompressionInfo.db
DEBUG [Thread-24] 2012-08-16 22:01:02,495 SSTable.java (line 146) Deleted 
/opt/SP/data/cassandra/analytics/chart-tmp-hd-1
INFO [Thread-24] 2012-08-16 22:01:02,495 StreamInSession.java (line 144) 
Streaming of file /opt/test/analytics/analytics/chart-hd-2400-Data.db 
sections=2 progress=0/-25429537 - 0% from 
org.apache.cassandra.streaming.StreamInSession@e66db21 failed: requesting a 
retry.
ERROR [Thread-24] 2012-08-16 22:01:02,496 AbstractCassandraDaemon.java (line 
139) Fatal exception in thread Thread[Thread-24,5,main]
java.lang.AssertionError: attempted to delete non-existing file 
chart-tmp-hd-1-Data.db
at org.apache.cassandra.io.util.FileUtils.deleteWithConfirm(FileUtils.java:49)
at 
org.apache.cassandra.streaming.IncomingStreamReader.retry(IncomingStreamReader.java:173)
at 
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:93)
at 
org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:184)
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:80)

DEBUG [ScheduledTasks:1] 2012-08-16 22:01:08,571 LoadBroadcaster.java (line 86) 
Disseminating load info ...  



--  
Filippo Diotalevi



On Wednesday, 15 August 2012 at 22:53, aaron morton wrote:

> > WARN 09:02:38,534 Unable to instantiate cache provider 
> > org.apache.cassandra.cache.SerializingCacheProvider; using default 
> > org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
> > instead
>  
> Happens when JNA is not in the path. Nothing to worry about when using the 
> sstableloader.  
>  
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to 
> > position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
> > (65737276 bytes) in read-only mode
>  
> This one looks like an error.  
>  
> Can you run nodetool with DEBUG level logging and post the logs ?  
>  
> Cheers
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
>  
>  
>  
>  
>  
> On 15/08/2012, at 9:32 PM, Filippo Diotalevi <fili...@ntoklo.com> wrote:
> > Hi,  
> > we are trying to use SSTableLoader to bootstrap a new 7-node cassandra (v. 
> > 1.0.10) cluster with the snapshots taken from a 3-node cassandra cluster. 
> > The new cluster is in a different data centre.  
> >  
> > After reading the articles at
> > [1] http://www.datastax.com/dev/blog/bulk-loading
> > [2] 
> > http://geekswithblogs.net/johnsPerfBlog/archive/2011/07/26/how-to-use-cassandrs-sstableloader.aspx
> >  
> > we tried to follow this procedure
> > 1) we took a snapshot of our keyspaces in the old cluster and moved them to
> > the data folder of 3 of the new machines
> > 2) started cassandra in the new cluster
> > but we noticed that some column families were missing, others had missing
> > data.
> >  
> > After that we tried to use sstableloader
> > 1) we reinstalled cassandra in the new cluster
> > 2) run sstableloader (as explained in [2]) to load the keyspaces
> >  
> > SSTableLoader starts, but the progress is always 0 and the transfer rate is 
> > 0MB/s. Some warning and exceptions are present in the logs
> >  
> > ./sstableloader /opt/analytics/analytics/
> > Starting client (and waiting 30 seconds for gossip) ...
> > Streaming revelant part of /opt/analytics/analytics/chart-hd-104-Data.db 
> > /opt/analytics/analytics/chart-hd-105-Data.db 
> > /opt/analytics/analytics/chart-hd-106-Data.db 
> > /opt/analytics/analytics/chart-hd-107-Data.db 
> > /opt/analytics/analytics/chart-hd-108-Data.db to [/1x.xx.xx.xx5, 
> > /1x.xx.xx.xx7, /1x.xx.xx.xx0, /1x.xx.xx.xx7, /1x.xx.xx.xx3, /1x.xx.xx.xx8, 
> > /1x.xx.xx.xx7]
> > WARN 09:02:38,534 Unable to instantiate cache provider 
> > org.apache.cassandra.cache.SerializingCacheProvider; using default 
> > org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
> > instead
> > WARN 09:02:38,549 Unable to instantiate cache provider 
> > org.apache.cassandra.cache.SerializingCacheProvider; using default 
> > org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
> > instead
> >  
> >  
> > [….]
> > ERROR 09:02:38,614 Error in ThreadPoolExecutor
> > java.lang.RuntimeException: java.io.EOFException: unable to seek to 
> > position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
> > (65737276 bytes) in read-only mode
> > at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689)
> >

Re: Migrating to a new cluster (using SSTableLoader or other approaches)

2012-08-16 Thread aaron morton
> Which nodetool command are you referring to? (info, cfstats, ring,….)
My bad. I meant to write sstableloader

> Do I modify the log4j-tools.properties in $CASSANDRA_HOME/conf to set the 
> nodetool logs to DEBUG?
You can use the --debug option with sstableloader to get a better exception 
message. 

Also change the logging in log4j-tools.properties to get DEBUG messages so we 
can see what's going on. 
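
For example (the path below is just the one from your earlier mail, and the
exact default contents of log4j-tools.properties can differ between versions):

# conf/log4j-tools.properties - raise the root logger to DEBUG
log4j.rootLogger=DEBUG,stderr

./sstableloader --debug /opt/analytics/analytics/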

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/08/2012, at 8:51 PM, Filippo Diotalevi  wrote:

>>> ERROR 09:02:38,614 Error in ThreadPoolExecutor
>>> java.lang.RuntimeException: java.io.EOFException: unable to seek to 
>>> position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
>>> (65737276 bytes) in read-only mode
>> 
>> 
>> This one looks like an error.
>> 
>> Can you run nodetool with DEBUG level logging and post the logs ?  
> 
> Thank Aaron.
> Which nodetool command are you referring to? (info, cfstats, ring,….)
> Do I modify the log4j-tools.properties in $CASSANDRA_HOME/conf to set the 
> nodetool logs to DEBUG?
> 
> Thanks,
> --  
> Filippo Diotalevi
> 
> 
> 
> On Wednesday, 15 August 2012 at 22:53, aaron morton wrote:
> 
>>> WARN 09:02:38,534 Unable to instantiate cache provider 
>>> org.apache.cassandra.cache.SerializingCacheProvider; using default 
>>> org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
>>> instead
>> 
>> Happens when JNA is not in the path. Nothing to worry about when using the 
>> sstableloader.  
>> 
>>> ERROR 09:02:38,614 Error in ThreadPoolExecutor
>>> java.lang.RuntimeException: java.io.EOFException: unable to seek to 
>>> position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
>>> (65737276 bytes) in read-only mode
>> 
>> This one looks like an error.  
>> 
>> Can you run nodetool with DEBUG level logging and post the logs ?  
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> 
>> 
>> 
>> 
>> 
>> On 15/08/2012, at 9:32 PM, Filippo Diotalevi <fili...@ntoklo.com> wrote:
>>> Hi,  
>>> we are trying to use SSTableLoader to bootstrap a new 7-node cassandra (v. 
>>> 1.0.10) cluster with the snapshots taken from a 3-node cassandra cluster. 
>>> The new cluster is in a different data centre.  
>>> 
>>> After reading the articles at
>>> [1] http://www.datastax.com/dev/blog/bulk-loading
>>> [2] 
>>> http://geekswithblogs.net/johnsPerfBlog/archive/2011/07/26/how-to-use-cassandrs-sstableloader.aspx
>>> 
>>> we tried to follow this procedure
>>> 1) we took a snapshot of our keyspaces in the old cluster and moved them to
>>> the data folder of 3 of the new machines
>>> 2) started cassandra in the new cluster
>>> but we noticed that some column families were missing, others had missing
>>> data.
>>> 
>>> After that we tried to use sstableloader
>>> 1) we reinstalled cassandra in the new cluster
>>> 2) run sstableloader (as explained in [2]) to load the keyspaces
>>> 
>>> SSTableLoader starts, but the progress is always 0 and the transfer rate is 
>>> 0MB/s. Some warning and exceptions are present in the logs
>>> 
>>> ./sstableloader /opt/analytics/analytics/
>>> Starting client (and waiting 30 seconds for gossip) ...
>>> Streaming revelant part of /opt/analytics/analytics/chart-hd-104-Data.db 
>>> /opt/analytics/analytics/chart-hd-105-Data.db 
>>> /opt/analytics/analytics/chart-hd-106-Data.db 
>>> /opt/analytics/analytics/chart-hd-107-Data.db 
>>> /opt/analytics/analytics/chart-hd-108-Data.db to [/1x.xx.xx.xx5, 
>>> /1x.xx.xx.xx7, /1x.xx.xx.xx0, /1x.xx.xx.xx7, /1x.xx.xx.xx3, /1x.xx.xx.xx8, 
>>> /1x.xx.xx.xx7]
>>> WARN 09:02:38,534 Unable to instantiate cache provider 
>>> org.apache.cassandra.cache.SerializingCacheProvider; using default 
>>> org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
>>> instead
>>> WARN 09:02:38,549 Unable to instantiate cache provider 
>>> org.apache.cassandra.cache.SerializingCacheProvider; using default 
>>> org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider@5d59054d 
>>> instead
>>> 
>>> 
>>> [….]
>>> ERROR 09:02:38,614 Error in ThreadPoolExecutor
>>> java.lang.RuntimeException: java.io.EOFException: unable to seek to 
>>> position 93069003 in /opt/analytics/analytics/chart-hd-104-Data.db 
>>> (65737276 bytes) in read-only mode
>>> at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689)
>>> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>> at 
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> at 
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.io.EOFException: unable to seek to position 93069003 in 
>>> /opt/analytics/analytics/chart-hd-104-Data.db (65737276 bytes) in read-only 
>>> mode
>>> at 
>>> org.apache.cassandra.io.util.RandomAcce

Re: wild card on query

2012-08-16 Thread aaron morton
> I want to retrieve all the photos from all the users of certain project. My 
> sql like query will be "select projectid * photos from Users". How can i run 
> this kind of row key predicate while executing query on cassandra?
You cannot / should not do that using the data model you have. (i.e. you could 
do it with a secondary index, but in this case you probably should not).

Try to de-normalise your data. 

Say a CF called ProjectPhotos

* row key is the project_id
* column name is 
* column value is image_url or JSON data about the image. 

You would then slice some columns from one row in the  ProjectPhotos CF. 

You then need to know what images a user has uploaded, with say the UserPhotos 
CF. 

* row key is user_id
* column name is timestamp
* column value is image_url or JSON data about the image. 
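
A rough cassandra-cli sketch of those two CFs (the types, and the choice of a
long timestamp plus photo id for the composite column name, are illustrative
assumptions, not something prescribed above):

create column family ProjectPhotos
  with key_validation_class = UTF8Type
  and comparator = 'CompositeType(LongType, UTF8Type)'
  and default_validation_class = UTF8Type;

create column family UserPhotos
  with key_validation_class = UTF8Type
  and comparator = LongType
  and default_validation_class = UTF8Type;

Reading "all photos for a project" is then a single column slice of the
ProjectPhotos row for that project_id.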

I did a twitter sample app at http://wdcnz.com a couple of weeks ago that shows 
denormalising data  https://github.com/amorton/wdcnz-2012-site and 
http://www.slideshare.net/aaronmorton/hellow-world-cassandra

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 12:39 AM, Swathi Vikas  wrote:

> Hi,
> 
> I am trying to run query on cassandra cluster with predicate on row key.
> 
> I have column family called "Users" and rows with row key like 
> "projectid_userid_photos". Each user within a project can have rows like 
> projectid_userid_blog, projectid_userid_status and so on. 
> 
> I want to retrieve all the photos from all the users of certain project. My 
> sql like query will be "select projectid * photos from Users". How can i run 
> this kind of row key predicate while executing query on cassandra?
> 
> Any sugesstion will help. 
> 
> Thank you,
> swat.vikas



Re: indexing question related to playOrm on github

2012-08-16 Thread aaron morton
>>  I am not sure synchronization fixes that……It would be kind of
>> nice if the column <65> would not actually be removed until after
>> all servers are eventually consistent... 
Not sure that's possible.

You can either serialise updating your custom secondary index on the client
side or resolve the inconsistency on read.

Not sure this fits with your workload, but as an e.g.: when you read from the 
index, if you detect multiple row PK's, resolve the issue on the client and 
leave the data in cassandra as is. Then queue a job that will read the row and 
try to repair its index entries. When repairing the index entry, play with the 
timestamp so any deletions you make only apply to the column as it was when you 
saw the error.
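
(For example: if the bogus index entry you read carried timestamp T, issue the
delete with timestamp T rather than "now"; a legitimate re-insert of that entry
made after your read will have a newer timestamp and so survives the repair.)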

Hope that helps. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 12:47 AM, "Hiller, Dean"  wrote:

> Maybe this would be a special type of column family that could contain
> these as my other tables definitely don't want the feature below by the
> way.
> 
> Dean
> 
> On 8/16/12 6:29 AM, "Hiller, Dean"  wrote:
> 
>> Yes, the synch may work, and no, I do "not" want a transaction…I want a
>> different kind of eventual consistency.
>> 
>> That might work.
>> Let's say server 1 sends a mutation (65 is the pk)
>> Remove: <65>  Add <65>
>> Server 2 also sends a mutation (65 is the pk)
>> Remove: <65> Add <65>
>> 
>> What everyone does not want is to end up with a row that has <65>
>> and <65>.  With the wide row pattern, we would like to have ONE or
>> the other.  I am not sure synchronization fixes that……It would be kind of
>> nice if the column <65> would not actually be removed until after
>> all servers are eventually consistent AND would keep a reference to the
>> add that was happening so that when it goes to resolve eventually
>> consistent between the servers, it would see that <65> is newer and
>> it would decide to drop the first add completely.
>> 
>> Ie. In a full process it might look like this
>> Cassandra node 1 receives remove <65>, add <65> AND in the
>> remove column stores info about the add <65> until eventual
>> consistency is completed
>> Cassandra node 2 one ms later receives remove <65> and <65>
>> AND in the remove column stores info about the add <65> until
>> eventual consistency is completed
>> Eventual consistency starts comparing node 1 and node 2 and finds
>> <65> is being removed by different servers and finds add info
>> attached to that.  ONLY THE LAST add info is acknowledged and it makes
>> the row consistent across the cluster.
>> 
>> That makes everyone's wide row indexing pattern tend to get less corrupt
>> over time.
>> 
>> Thanks,
>> Dean
>> 
>> 
>> From: aaron morton <aa...@thelastpickle.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Wednesday, August 15, 2012 8:26 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: indexing question related to playOrm on github
>> 
>> 1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a
>> JQL/HQL query on a trillion rows in under 100ms (partitioning is the
>> trick so you can JQL a partition)
>> Not sure if we have an ORM-specific page. If it's a client then feel free
>> to add it to http://wiki.apache.org/cassandra/ClientOptions
>> 
>> I was wondering if cassandra has or will ever support eventual consistency
>> where it keeps both the REMOVE AND the ADD together such until it is on
>> all 3 replicated nodes and in resolving the consistency would end up with
>> an index that only has the very last one in the index.
>> Not sure I fully understand but it sounds like you want a transaction,
>> which is not going to happen.
>> 
>> Internally when Cassandra updates a secondary index it does the same
>> thing. But it synchronises updates around the same row so one thread will
>> apply the changes at a time.
>> 
>> Hope that helps.
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 16/08/2012, at 12:34 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:
>> 
>> 1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a
>> JQL/HQL query on a trillion rows in under 100ms (partitioning is the
>> trick so you can JQL a partition)
>> 2.  Many applications have a common indexing problem and I was wondering
>> if cassandra has or could have any support for this in the future….
>> 
>> When using wide row indexes, you frequently have
>> . as the composite key.  This means when you
>> have your object like so in the database
>> 
>> Activity {
>> pk: 65
>> name: bill
>> }
>> 
>> And then two servers want to save it as
>> 
>> Activity {
>> pk:65
>> name:tim
>> }
>> Activity {
>> pk:65
>> name:mike
>> }
>> 
>> Each server will remove <65> and BOTH servers will add <65>
>> AND <65> BUT one of them will reall

Re: nodetool repair uses insane amount of disk space

2012-08-16 Thread aaron morton
What version are you using? There were issues with repair using lots-o-space in
0.8.X; it's fixed in 1.X.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 2:56 AM, Michael Morris  wrote:

> Occasionally as I'm doing my regular anti-entropy repair I end up with a node 
> that uses an exceptional amount of disk space (node should have about 5-6 GB 
> of data on it, but ends up with 25+GB, and consumes the limited amount of 
> disk space I have available)
> 
> How come a node would consume 5x its normal data size during the repair 
> process?
> 
> My setup is kind of strange in that it's only about 80-100GB of data on a 35 
> node cluster, with 2 data centers and 3 racks, however the rack assignments 
> are unbalanced.  One data center has 8 nodes, and the other data center is 
> split into 2 racks with one rack of 9 nodes, and the other with 18 nodes.  
> However, within each rack, the tokens are distributed equally. It's a long 
> sad story about how we ended up this way, but it basically boils down to 
> having to utilize existing resources to resolve a production issue.
> 
> Additionally, the repair process takes (what I feel is) an extremely long 
> time to complete (36+ hours), and it always seems that nodes are streaming 
> data to each other, even on back-to-back executions of the repair.
> 
> Any help on these issues is appreciated.
> 
> - Mike
> 



Re: Opscenter 2.1 vs 1.3

2012-08-16 Thread aaron morton
You may have better luck on the Data Stax forums 
http://www.datastax.com/support-forums/

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 4:36 AM, Robin Verlangen  wrote:

> Hi there,
> 
> I just upgraded to opscenter 2.1 (from 1.3). It appears that my writes have 
> tripled. Is this a change in the display/measuring of opscenter?
> 
> 
> Best regards,
> 
> Robin Verlangen
> Software engineer
> 
> W http://www.robinverlangen.nl
> E ro...@us2.nl
> 
> Disclaimer: The information contained in this message and attachments is 
> intended solely for the attention and use of the named addressee and may be 
> confidential. If you are not the intended recipient, you are reminded that 
> the information remains the property of the sender. You must not use, 
> disclose, distribute, copy, print or rely on this e-mail. If you have 
> received this message in error, please contact the sender immediately and 
> irrevocably delete this message and any copies.
> 



Re: C++ Bulk loader and Result set streaming.

2012-08-16 Thread aaron morton
> But i couldn't find any information on bulk loading using C++ client 
> interface.
You cannot. 
To bulk load data use the sstableloader, otherwise you need to use the RPC / 
CQL API. 

> 2) I want to retrieve all the result of the query(not just first 100 result 
> set) using C++ client. Is there any C++ supporting code or information on 
> streaming the result set into a file or something.
I've not looked at the C++ client, but normally you use the last column 
returned as the start column for the next call. 
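
Not C++, but here is a rough sketch of that paging pattern against the raw
Thrift API (Java only because it is compact to show; "client" is assumed to be
an already connected Cassandra.Client, the row/CF names are made up, and the
checked exceptions the Thrift calls throw are omitted):

// imports: org.apache.cassandra.thrift.*, java.nio.ByteBuffer, java.util.List
SliceRange range = new SliceRange(
    ByteBuffer.wrap(new byte[0]),  // start: empty = beginning of the row
    ByteBuffer.wrap(new byte[0]),  // finish: empty = end of the row
    false,                         // not reversed
    100);                          // page size
SlicePredicate predicate = new SlicePredicate();
predicate.setSlice_range(range);
ByteBuffer key = ByteBuffer.wrap("some_row_key".getBytes());
ColumnParent parent = new ColumnParent("MyColumnFamily");

while (true) {
    List<ColumnOrSuperColumn> page =
        client.get_slice(key, parent, predicate, ConsistencyLevel.QUORUM);
    // process the columns here; on every page after the first, skip the first
    // column because the start column is returned again
    if (page.size() < 100) break;  // short page = no more columns
    // the next call starts at the last column name we saw
    range.setStart(page.get(page.size() - 1).getColumn().name);
}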

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 6:08 AM, Swathi Vikas  wrote:

> Hi All,
>  
> I am using C++ client libQtCassandra. I have two questions.
>  
> 1) I want to bulk load data into cassandra through C++ interface. It is 
> required by my group where i am doing internship. I could bulk load using 
> sstableloader as specified in Datastax 
> :http://www.datastax.com/dev/blog/bulk-loading. But i couldn't find any 
> information on bulk loading using C++ client interface.
>  
> 2) I want to retrieve all the result of the query(not just first 100 result 
> set) using C++ client. Is there any C++ supporting code or information on 
> streaming the result set into a file or something.
>  
> If anyone has any information please direct me where i can look into.
>  
> Thank you very much,
> Swat.vikas



Omitting empty columns from CQL SELECT

2012-08-16 Thread Mat Brown
Hello all,

I've noticed that when performing a SELECT statement with a list of
columns specified, Cassandra returns all columns in the resulting
row(s) even if they have no value. This creates an apparently
considerable amount of transport and deserialization overhead,
particularly in one use case I'm looking at, in which we select a
large collection of columns but expect only a small fraction of them
to contain values. Is there any way to get around this and only
receive columns that have values in the results?

Thanks,
Mat


Re: 'WHERE' with several indexed columns

2012-08-16 Thread aaron morton
> If I have a WHERE clause in CQL with several 'AND' and each column is
> indexed, which index(es) is(are) used ?
The most selective based on the average number of columns per row 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/index/keys/KeysSearcher.java

> Also is index used only with an equality operator or also with greater
equality
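
As a small illustration (table, column and index names made up): in

select * from users where state = 'TX' and birth_year > 1970;

at least one of the ANDed columns (here state) must have an equality clause
backed by an index. Cassandra walks the most selective of the candidate indexes
and then filters the rows it finds against the remaining clauses, including the
non-equality ones.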


Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 7:13 AM, A J  wrote:

> Hi
> If I have a WHERE clause in CQL with several 'AND' and each column is
> indexed, which index(es) is(are) used ?
> Just the first field in the where clause or all the indexes involved
> in the clause ?
> 
> Also is index used only with an equality operator or also with greater
> than /less than comparator as well ?
> 
> Thanks.



Re: nodetool repair uses insane amount of disk space

2012-08-16 Thread Michael Morris
Upgraded to 1.1.3 from 1.0.8 about 2 weeks ago.

On Thu, Aug 16, 2012 at 5:57 PM, aaron morton wrote:

> What version are using ? There were issues with repair using lots-o-space
> in 0.8.X, it's fixed in 1.X
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 17/08/2012, at 2:56 AM, Michael Morris 
> wrote:
>
> Occasionally as I'm doing my regular anti-entropy repair I end up with a
> node that uses an exceptional amount of disk space (node should have about
> 5-6 GB of data on it, but ends up with 25+GB, and consumes the limited
> amount of disk space I have available)
>
> How come a node would consume 5x its normal data size during the repair
> process?
>
> My setup is kind of strange in that it's only about 80-100GB of data on a
> 35 node cluster, with 2 data centers and 3 racks, however the rack
> assignments are unbalanced.  One data center has 8 nodes, and the other
> data center is split into 2 racks with one rack of 9 nodes, and the other
> with 18 nodes.  However, within each rack, the tokens are distributed
> equally. It's a long sad story about how we ended up this way, but it
> basically boils down to having to utilize existing resources to resolve a
> production issue.
>
> Additionally, the repair process takes (what I feel is) an extremely long
> time to complete (36+ hours), and it always seems that nodes are streaming
> data to each other, even on back-to-back executions of the repair.
>
> Any help on these issues is appreciated.
>
> - Mike
>
>
>


Re: Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-16 Thread aaron morton
That's some pretty old code. I would guess it was done that way to conserve 
resources. And _I think_ thread creation is pretty lightweight.
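
As a rough sketch of the two pool configurations being compared (plain
ThreadPoolExecutor calls, not the actual StageManager code; needs the
java.util.concurrent imports):

// current behaviour: core threads are allowed to time out after 60s when idle
ThreadPoolExecutor current = new ThreadPoolExecutor(
        100, 100, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
current.allowCoreThreadTimeOut(true);

// suggested alternative: effectively never retire the core threads
ThreadPoolExecutor fixed = new ThreadPoolExecutor(
        100, 100, Long.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
// allowCoreThreadTimeOut defaults to false, so the 100 threads stay alive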

Jonathan / Brandon / others - opinions ? 

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/08/2012, at 8:09 AM, Guillermo Winkler  wrote:

> Hi, I have a cassandra cluster where I'm seeing a lot of thread thrashing from 
> the mutation pool.
> 
> MutationStage:72031
> 
> Where threads get created and disposed in 100's batches every few minutes, 
> since it's a 16 core server concurrent_writes is set in 100 in the 
> cassandra.yaml. 
> 
> concurrent_writes: 100
> 
> I've seen in the StageManager class this pools get created with 60 seconds 
> keepalive time.
> 
> DebuggableThreadPoolExecutor -> allowCoreThreadTimeOut(true);
> 
> StageManager-> public static final long KEEPALIVE = 60; // seconds to keep 
> "extra" threads alive for when idle
> 
> Is it a reason for it to be this way? 
> 
> Why not have a fixed size pool with Integer.MAX_VALUE as keepalive since 
> corePoolSize and maxPoolSize are set at the same size?
> 
> Thanks,
> Guille
> 



Cassandra 1.0 row deletion

2012-08-16 Thread Terry Cumaranatunge
Hi,

We have a Cassandra 1.0 cluster that we run with RF=3 and perform
operations using a consistency level of quorum. We use batch_mutate for all
inserts and updates for atomicity across column families with the same row
key, but use the thrift interface remove API call in C++ to delete a row so
that we can delete an entire row without having to specify individual
column names. If you use the remove function to delete an entire row, is
that an atomic operation? In other words, can it delete a partial number of
columns in the row and leave other columns around?

In our particular test, we performed a row delete using the remove API, but
we took down one of the Cassandra nodes as part of a fail-over test. So,
the remove call didn't succeed from the client's perspective, but we ended
up with some columns being deleted from the row (not all) and started to
wonder if the remove API call was atomic.


Re: Cassandra 1.0 row deletion

2012-08-16 Thread Derek Williams
On Thu, Aug 16, 2012 at 9:08 PM, Terry Cumaranatunge wrote:
>
> We have a Cassandra 1.0 cluster that we run with RF=3 and perform
> operations using a consistency level of quorum. We use batch_mutate for all
> inserts and updates for atomicity across column families with the same row
> key, but use the thrift interface remove API call in C++ to delete a row so
> that we can delete an entire row without having to specify individual
> column names. If you use the remove function to delete an entire row, is
> that an atomic operation? In other words, can it delete a partial number of
> columns in the row and leave other columns around?
>

It all depends on the timestamp for the column. A row level delete will
place a row tombstone at the timestamp given, causing all columns with an
earlier timestamp to be deleted. If a column has a later timestamp than the
row tombstone, then it won't be deleted.
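
For example (timestamps made up): if the row's columns were written at timestamp
100 and the row delete goes in at timestamp 150, everything written at 100 is
covered by the tombstone, but a column written at timestamp 200 (say by a
concurrent or retried update) is newer than the tombstone and stays visible, so
the row can end up looking only partially deleted.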

More info here: http://wiki.apache.org/cassandra/DistributedDeletes

-- 
Derek Williams