Re: Index interval tuning

2011-05-11 Thread Héctor Izquierdo Seliva
On Wed, 2011-05-11 at 14:24 +1200, aaron morton wrote:
> What version and what were the values for RecentBloomFilterFalsePositives and 
> BloomFilterFalsePositives ?
> 
> The bloom filter metrics are updated in SSTableReader.getPosition(). The only 
> slightly odd thing I can see is that we do not count a key cache hit as a true 
> positive for the bloom filter. If there were a lot of key cache hits and a 
> few false positives the ratio would be wrong. I'll ask around; it does not seem 
> to apply to Hector's case though. 
> 
> Cheers

0.7.5, and I am no longer using key cache. I get the bloom filter stats
via jmx. BloomFilterFalsePositiveRatio is always stuck at 1.0.
RecentBloomFilterFalsePositiveRatio fluctuates from 0 to 1.0 with no
intermediate values. 

As for the index interval settings, I changed it from 128 to 256 and
memory consumption was just a tad lower but read performance was worse
by a few ms, so not much to gain there.
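If Aaron's observation about key cache hits is right, the stuck ratio has a simple arithmetic explanation: with true positives never counted, the ratio can only ever read 0.0 or 1.0. A minimal sketch (the formula fp / (fp + tp) is an assumption about how the metric is derived):

```java
// Why the ratio can only read 0.0 or 1.0 if true positives are never
// counted: the formula fp / (fp + tp) (an assumption about how the
// metric is derived) degenerates once tp is stuck at zero.
public class BloomRatioSketch {
    static double ratio(long falsePositives, long truePositives) {
        long total = falsePositives + truePositives;
        return total == 0 ? 0.0 : (double) falsePositives / total;
    }

    public static void main(String[] args) {
        // Healthy accounting: many true positives dilute a few false ones.
        System.out.println(ratio(5, 100_000));
        // tp never incremented (e.g. every hit served by the key cache):
        System.out.println(ratio(5, 0));   // prints 1.0
        System.out.println(ratio(0, 0));   // prints 0.0 -- the observed flip-flop
    }
}
```

That would also explain Recent* bouncing between exactly 0.0 and 1.0 with no intermediate values.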

> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 11 May 2011, at 10:38, Chris Burroughs wrote:
> 
> > On 05/10/2011 02:12 PM, Peter Schuller wrote:
> >>> That reminds me, my false positive ratio is stuck at 1.0, so I guess
> >>> bloom filters aren't doing a lot for me.
> >> 
> >> That sounds unlikely unless you're hitting some edge case like reading
> >> a particular row that happened to be a collision, and only that row.
> >> This is from JMX stats on the column family store?
> >> 
> > 
> > (From jmx)  I also see BloomFilterFalseRatio stuck at 1.0 on my
> > production nodes.  The only values that RecentBloomFilterFalseRatio had
> > over the past several minutes were 0.0 and 1.0.  While I can't prove
> > that isn't accurate, it is very suspicious.
> > 
> > The code looked reasonable until I got to SSTableReader, which was too
> > complicated to just glance through.
> 




RE: Finding big rows

2011-05-11 Thread Meler Wojciech
Thanks for the reply. My app uses 7-bit ASCII string row keys, so I assume that 
they can be used directly.

I'd like to fetch the whole row. I was able to dump the big row with sstable2json, 
but both my app and the cli are unable to read the row from cassandra.
I see in the json dump that all columns are marked as "deletedAt": 
-9223372036854775808, so SuperColumn::isMarkedForDelete() should return false. 
My cluster is running cassandra 0.7.4 and its upgrade path was 
0.7.0->0.7.2->0.7.3->0.7.4.
What's wrong? The bloom filters seem to be OK - I couldn't find a tool for reading 
them, but the attached program does the job.
I'm sure that both my app and the cli refer to the proper keys. This big row keeps 
getting bigger and bigger as my app appends new super- and sub-columns to it, but I 
can't read it:
get mycf[utf8('my-key')];
Returned 0 results.
I'm really confused - I tried turning debug on, but I can't see anything 
interesting in it. Any ideas what to check next?


Regards,
Wojtek

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Wednesday, May 11, 2011 12:29 AM
To: user@cassandra.apache.org
Subject: Re: Finding big rows

I'm not aware of anything to find the row sizes, and your code looks like a 
good approach. Converting the key bytes to a string only makes sense if your 
app is doing the same thing.

In the cli try using one of the data type functions to format the key the same 
way as your app is, e.g. get FooCF[utf8('my-key')]

The main limitation on Super Columns is that Sub columns are not indexed 
http://wiki.apache.org/cassandra/CassandraLimitations. If you have a huge row 
use the get_slice() api call to get back slices of columns. The cli does not 
support slicing columns.
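The get_slice paging loop can be sketched with an in-memory sorted map standing in for the row's columns; the real call takes a SliceRange whose start is the last column name seen, and every page after the first begins with one overlapping column that must be skipped (the map, names, and page size here are illustrative, not the thrift API):

```java
import java.util.*;

// Paging a huge row, simulated: a TreeMap plays the role of the row's
// sorted columns, and slicePage() plays the role of one get_slice call.
public class SlicePagingSketch {
    // One simulated get_slice page: up to 'count' column names from 'start'.
    static List<String> slicePage(NavigableMap<String, String> row, String start, int count) {
        List<String> page = new ArrayList<>();
        for (String col : row.tailMap(start, true).keySet()) {
            page.add(col);
            if (page.size() == count) break;
        }
        return page;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<>();
        for (int i = 0; i < 10; i++) row.put(String.format("col%02d", i), "v" + i);

        int pageSize = 4;
        String start = "";                      // empty start = beginning of row
        List<String> seen = new ArrayList<>();
        while (true) {
            List<String> page = slicePage(row, start, pageSize);
            for (String col : page)
                if (!col.equals(start)) seen.add(col);   // skip the overlap column
            if (page.size() < pageSize) break;           // short page = no more columns
            start = page.get(page.size() - 1);           // next slice starts at last name
        }
        System.out.println(seen.size());                 // prints 10
    }
}
```

The same loop shape works for a 400MB row: each page stays small enough to fit within the thrift message limits.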

Hope that helps.
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 May 2011, at 20:41, Meler Wojciech wrote:


Hello,

I've noticed very nice stats exposed with JMX. I was quite shocked when I saw 
that MaxRowSize was about 400MB (it was expected to be several MB).
What is the best way to find keys of such big rows?

I couldn't find anything, so I've written a simple program to dump sizes from 
the Index files (see attachment),
and got the keys, but when I used cassandra-cli to get such rows it said 
"Returned 0 results.".
I've realised that my app creates such big rows because it can't read them from 
Cassandra and recreates them every time.

Are there any tunable limits for getting a whole row? Any limits on 
supercolumns?

Regards,
Wojtek


"WIRTUALNA POLSKA" Spolka Akcyjna, with its registered office in Gdansk at ul. Traugutta 115 C, 
entered in the National Court Register (Register of Entrepreneurs) kept by the 
District Court Gdansk-Polnoc in Gdansk under KRS number 068548, 
with share capital of 67,980,024.00 zloty paid in full, 
and Tax Identification Number 957-07-51-216.







BFCheck.java
Description: BFCheck.java


Re: compaction strategy

2011-05-11 Thread Terje Marthinussen
>
>
> Not sure I follow you. 4 sstables is the minimum compaction looks for
> (by default).
> If there is 30 sstables of ~20MB sitting there because compaction is
> behind, you
> will compact those 30 sstables together (unless there is not enough space
> for
> that and considering you haven't changed the max compaction threshold (32
> by
> default)). And you can increase max threshold.
> Don't get me wrong, I'm not pretending this works better than it does, but
> let's not pretend either that it's worse than it is.
>
>
Sorry, I am not trying to pretend anything or blow it out of proportion.
Just reacting to what I see.

This is what I see after some stress testing of some pretty decent HW.

81 Up Normal 181.6 GB  8.33% Token(bytes[30])
82 Up Normal 501.43 GB 8.33% Token(bytes[313230])
83 Up Normal 248.07 GB 8.33% Token(bytes[313437])
84 Up Normal 349.64 GB 8.33% Token(bytes[313836])
85 Up Normal 511.55 GB 8.33% Token(bytes[323336])
86 Up Normal 654.93 GB 8.33% Token(bytes[333234])
87 Up Normal 534.77 GB 8.33% Token(bytes[333939])
88 Up Normal 525.88 GB 8.33% Token(bytes[343739])
89 Up Normal 476.6 GB  8.33% Token(bytes[353730])
90 Up Normal 424.89 GB 8.33% Token(bytes[363635])
91 Up Normal 338.14 GB 8.33% Token(bytes[383036])
92 Up Normal 546.95 GB 8.33% Token(bytes[6a])

Node 81 has been exposed to a full compaction. It had ~370GB before that and the
resulting sstable is 165GB.
The other nodes have only been doing minor compactions.

I think this is a problem.
You are of course free to disagree.

I do however recommend doing a simulation of potential worst case scenarios
where many of the buckets end up with 3 sstables and don't compact for a while.
The disk space requirements get pretty bad even without getting into
theoretical worst cases.
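A rough version of that simulation, assuming size tiers that grow ~4x per level and sit at 3 uncompacted sstables each (both numbers are illustrative, not what any given cluster will see):

```java
// Back-of-envelope for the suggested simulation: with a min compaction
// threshold of 4, every size tier can sit at 3 uncompacted sstables
// indefinitely. Tiers growing ~4x per level is an illustrative assumption.
public class CompactionOverheadSketch {
    static long worstCaseMb(long flushMb, int tiers) {
        long total = 0, tier = flushMb;
        for (int t = 0; t < tiers; t++) {
            total += 3 * tier;    // 3 stragglers parked in this bucket
            tier *= 4;            // next bucket holds ~4x bigger sstables
        }
        return total;
    }

    public static void main(String[] args) {
        // 20MB flushes, 8 tiers (20MB ... ~327GB sstables): ~1.3TB parked on disk
        System.out.printf("worst-case on disk: %,d MB%n", worstCaseMb(20, 8));
        // Full compaction of the biggest tier then needs roughly that much
        // free space again, before counting obsolete sstables pending delete.
    }
}
```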

Regards,
Terje


Re: column bloat

2011-05-11 Thread Terje Marthinussen
On Wed, May 11, 2011 at 8:06 AM, aaron morton wrote:

> For a reasonable large amount of use cases (for me, 2 out of 3 at the
> moment) supercolumns will be units of data where the columns (attributes)
> will never change by themselves or where the data does not change anyway
> (archived data).
>
>
> Can you use a standard CF and pack the multiple columns into one value in
> your app ? It sounds like the super columns are just acting as opaque
> containers, and cassandra does not need to know these are different values.
> Agree this only works if there is no concurrent access on the sub columns.
> I'm suggesting this with one eye on
> https://issues.apache.org/jira/browse/CASSANDRA-2231
>
>
I have a great interest in sharing data across applications using cassandra.
This means I also have a great interest in removing serialization from the
applications :)
That I can get reasonably far without serialization logic in the application
is one of the main reasons I am working on Cassandra.

Yes, I have had this discussion before so I know the next suggestion would
be to build an API on top doing the serialization, but that will further
complicate things if I want to integrate with hadoop or other similar tools,
so why should I if I don't have to? :)

> It would seem like a good optimization to allow a timestamp on the
> supercolumn instead and remove the one on columns?
>
> I believe this may also work as an optimization on compactions? Just skip
> merging of columns under the supercolumn if the supercolumn has a timestamp
> and just replace the entire supercolumn in that case.
>
> Could be just a variation of the supercolumn object on insert. No
> timestamp, use the one in the columns, include timestamp, ignore timestamps
> in columns.
>
>
> SC's are more containers than columns, when it comes to reconciling their
> contents they act like column families: ask the columns to reconcile
> respecting the containers tombstone. Giving the SC a timestamp and making
> them act like columns would be a major change.
>

Not so sure it would be a major change, but if we can make the assumption
that people (or APIs) will be smart enough to feed data where all columns
have the same timestamp if they want to save some disk, I guess this can be
compressed quite efficiently anyway.
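A toy version of the proposed container-level reconcile; the types and the timestamp field are hypothetical, since real SuperColumns carry no timestamp and are merged sub-column by sub-column:

```java
import java.util.*;

// Hypothetical sketch of the suggestion above: if a super column carried
// its own timestamp, reconciliation could replace the whole container
// instead of merging sub-columns. Nothing here matches Cassandra's real
// SuperColumn, which has no timestamp of its own.
public class SuperColumnReconcileSketch {
    record Super(long timestamp, Map<String, String> subColumns) {}

    static Super reconcile(Super a, Super b) {
        return a.timestamp() >= b.timestamp() ? a : b;   // newest container wins
    }

    public static void main(String[] args) {
        Super older = new Super(100, Map.of("attr1", "x", "attr2", "y"));
        Super newer = new Super(200, Map.of("attr1", "x2"));
        System.out.println(reconcile(older, newer).subColumns()); // prints {attr1=x2}
    }
}
```

This is also the compaction shortcut described above: a tier of older containers can be dropped wholesale without visiting their sub-columns.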

Terje


Re: Read time get worse during dynamic snitch reset

2011-05-11 Thread shimi
I finally found some time to get back to this issue.
I turned on DEBUG logging on the StorageProxy and it shows that all of these
requests are read from the other datacenter.
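For reference, the proximity ordering Peter describes in the write-up quoted below (self first, then same rack, then same DC) can be sketched as a comparator; the Node type and fields here are made up for illustration, the real logic lives in AbstractNetworkTopologySnitch.compareEndpoints():

```java
import java.util.*;

// Toy comparator for the quoted proximity ordering: prefer the local node,
// then a node in the same rack, then a node in the same DC. The Node type
// is hypothetical; real code compares InetAddress endpoints via a snitch.
public class ProximitySketch {
    record Node(String addr, String dc, String rack) {}

    static Comparator<Node> proximityTo(Node self) {
        return (a, b) -> {
            // (1) always prefer ourselves
            if (a.equals(self) != b.equals(self)) return a.equals(self) ? -1 : 1;
            // (2) then prefer same rack
            boolean aRack = a.dc().equals(self.dc()) && a.rack().equals(self.rack());
            boolean bRack = b.dc().equals(self.dc()) && b.rack().equals(self.rack());
            if (aRack != bRack) return aRack ? -1 : 1;
            // (3) then prefer same DC
            boolean aDc = a.dc().equals(self.dc());
            boolean bDc = b.dc().equals(self.dc());
            if (aDc != bDc) return aDc ? -1 : 1;
            return 0;
        };
    }

    public static void main(String[] args) {
        Node self = new Node("10.0.1.1", "DC1", "r1");
        List<Node> replicas = new ArrayList<>(List.of(
            new Node("10.1.1.1", "DC2", "r1"),     // other DC
            new Node("10.0.2.1", "DC1", "r2"),     // same DC, other rack
            new Node("10.0.1.2", "DC1", "r1")));   // same rack
        replicas.sort(proximityTo(self));
        System.out.println(replicas.get(0).addr()); // prints 10.0.1.2
    }
}
```

With that ordering in place, a remote-DC replica should only come first when the dynamic snitch's scores override it, which is what makes the reset-interval correlation suspicious.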

Shimi

On Tue, Apr 12, 2011 at 2:31 PM, aaron morton wrote:

> Something feels odd.
>
> From Peters nice write up of the dynamic snitch
> http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html The
> RackInferringSnitch (and the PropertyFileSnitch) derive from the
> AbstractNetworkTopologySnitch and should...
> "
> In the case of the NetworkTopologyStrategy, it inherits the
> implementation in AbstractNetworkTopologySnitch which sorts by
> AbstractNetworkTopologySnitch.compareEndPoints(), which:
>
> (1) Always prefers itself to any other node. So "myself" is always
> "closest", no matter what.
> (2) Else, always prefers a node in the same rack, to a node in a different
> rack.
> (3) Else, always prefers a node in the same dc, to a node in a different
> dc.
> "
>
> AFAIK the (data) request should be going to the local DC even after the
> DynamicSnitch has reset the scores. Because the underlying
> RackInferringSnitch should prefer local nodes.
>
> Just for fun check rack and dc assignments are what you thought using the
> operations on o.a.c.db.EndpointSnitchInfo bean in JConsole. Pass in the ip
> address for the nodes in each dc. If possible can you provide some info on
> the ip's in each dc?
>
> Aaron
>
> On 12 Apr 2011, at 18:24, shimi wrote:
>
> On Tue, Apr 12, 2011 at 12:26 AM, aaron morton wrote:
>
>> The reset interval clears the latency tracked for each node so a bad node
>> will be read from again. The scores for each node are then updated every
>> 100ms (default) using the last 100 responses from a node.
>>
>> How long does the bad performance last for?
>>
> Only a few seconds, but there are a lot of read requests during this
> time
>
>>
>> What CL are you reading at ? At Quorum with RF 4 the read request will be
>> sent to 3 nodes, ordered by proximity and wellness according to the dynamic
>> snitch. (for background recent discussion on dynamic snitch
>> http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)
>>
> I am reading with CL of ONE,  read_repair_chance=0.33, RackInferringSnitch
> and keys_cached = rows_cached = 0
>
>>
>> You can take a look at the weights and timings used by the DynamicSnitch
>> in JConsole under o.a.c.db.DynamicSnitchEndpoint . Also at DEBUG log level
>> you will be able to see which nodes the request is sent to.
>>
> Everything looks OK. The weights are around 3 for the nodes in the same
> data center and around 5 for the others. I will turn on the DEBUG level to
> see if I can find more info.
>
>>
>> My guess is the DynamicSnitch is doing the right thing and the slow down
>> is a node with a problem getting back into the list of nodes used for your
>> read. It's then moved down the list as its bad performance is noticed.
>>
> Looking the DynamicSnitch MBean I don't see any problems with any of the
> nodes. My guess is that during the reset time there are reads that are sent
> to the other data center.
>
>>
>> Hope that helps
>> Aaron
>>
>
> Shimi
>
>
>>
>> On 12 Apr 2011, at 01:28, shimi wrote:
>>
>> I finally upgraded 0.6.x to 0.7.4.  The nodes are running with the new
>> version for several days across 2 data centers.
>> I noticed that the read time on some of the nodes increases by 50-60x every
>> ten minutes.
>> There was no indication in the logs for something that happen at the same
>> time. The only thing that I know that is running every 10 minutes is
>> the dynamic snitch reset.
>> So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes and now I
>> have the problem once in every 20 minutes.
>>
>> I am running all nodes with:
>> replica_placement_strategy:
>> org.apache.cassandra.locator.NetworkTopologyStrategy
>>   strategy_options:
>> DC1 : 2
>> DC2 : 2
>>   replication_factor: 4
>>
>> (DC1 and DC2 are taken from the ips)
>> Is anyone familiar with this kind of behavior?
>>
>> Shimi
>>
>>
>>
>
>


Re: Index interval tuning

2011-05-11 Thread aaron morton
What are the values for RecentBloomFilterFalsePositives and 
BloomFilterFalsePositives, the non-ratio ones?
 
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 11 May 2011, at 19:53, Héctor Izquierdo Seliva wrote:

> On Wed, 2011-05-11 at 14:24 +1200, aaron morton wrote:
>> What version and what were the values for RecentBloomFilterFalsePositives 
>> and BloomFilterFalsePositives ?
>> 
>> The bloom filter metrics are updated in SSTableReader.getPosition(). The only 
>> slightly odd thing I can see is that we do not count a key cache hit as a 
>> true positive for the bloom filter. If there were a lot of key cache hits 
>> and a few false positives the ratio would be wrong. I'll ask around; it does 
>> not seem to apply to Hector's case though. 
>> 
>> Cheers
> 
> 0.7.5, and I am no longer using key cache. I get the bloom filter stats
> via jmx. BloomFilterFalsePositiveRatio is always stuck at 1.0.
> RecentBloomFilterFalsePositiveRatio fluctuates from 0 to 1.0 with no
> intermediate values. 
> 
> As for the index interval settings, I changed it from 128 to 256 and
> memory consumption was just a tad lower but read performance was worse
> by a few ms, so not much to gain there.
> 
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 11 May 2011, at 10:38, Chris Burroughs wrote:
>> 
>>> On 05/10/2011 02:12 PM, Peter Schuller wrote:
> That reminds me, my false positive ratio is stuck at 1.0, so I guess
> bloom filters aren't doing a lot for me.
 
 That sounds unlikely unless you're hitting some edge case like reading
 a particular row that happened to be a collision, and only that row.
 This is from JMX stats on the column family store?
 
>>> 
>>> (From jmx)  I also see BloomFilterFalseRatio stuck at 1.0 on my
>>> production nodes.  The only values that RecentBloomFilterFalseRatio had
>>> over the past several minutes were 0.0 and 1.0.  While I can't prove
>>> that isn't accurate, it is very suspicious.
>>> 
>>> The code looked reasonable until I got to SSTableReader, which was too
>>> complicated to just glance through.
>> 
> 
> 



Re: Finding big rows

2011-05-11 Thread aaron morton
Couple of questions to ask. You may also get some value from the #cassandra 
chat room where you can have a bit more of a conversation. 

- checking: did you run nodetool scrub when upgrading to 0.7.3? (not related to 
the current problem, just asking)
- what client library was used to write the data?
- when you have DEBUG logging and run the get that fails, do you see any log 
messages that say "collecting %s of %s"? (these mean the columns are being read 
by the query even if not returned). 
- not sure how easy it's going to be to pull 400MB of data through the server 
in one call. Take a look at thrift_max_message_length_in_mb and 
thrift_framed_transport_size_in_mb in the config. 

Hope that helps. 

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 11 May 2011, at 20:18, Meler Wojciech wrote:

> Thanks for the reply. My app uses 7-bit ASCII string row keys, so I assume 
> that they can be used directly.
>  
> I'd like to fetch the whole row. I was able to dump the big row with 
> sstable2json, but both my app and the cli are unable to read the row from 
> cassandra.
> I see in the json dump that all columns are marked as "deletedAt": 
> -9223372036854775808, so SuperColumn::isMarkedForDelete() should return 
> false. My cluster is running cassandra 0.7.4 and its upgrade path was 
> 0.7.0->0.7.2->0.7.3->0.7.4.
> What's wrong? The bloom filters seem to be OK - I couldn't find a tool for 
> reading them, but the attached program does the job.
> I'm sure that both my app and the cli refer to the proper keys. This big row 
> keeps getting bigger and bigger as my app appends new super- and sub-columns 
> to it, but I can't read it:
> get mycf[utf8('my-key')];
> Returned 0 results.
> I’m really confused – tried to turn debug on, but I can’t see anything 
> interesting in it. Any ideas what to check next?
>  
>  
> Regards,
> Wojtek
> 
>  
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: Wednesday, May 11, 2011 12:29 AM
> To: user@cassandra.apache.org
> Subject: Re: Finding big rows
>  
> I'm not aware of anything to find the row sizes, and your code looks like a 
> good approach. Converting the key bytes to a string only makes sense if your 
> app is doing the same thing. 
>   
> In the cli try using one of the data type functions to format the key the 
> same way as your app is, e.g. get FooCF[utf8('my-key')]
>  
> The main limitation on Super Columns is that Sub columns are not indexed 
> http://wiki.apache.org/cassandra/CassandraLimitations. If you have a huge row 
> use the get_slice() api call to get back slices of columns. The cli does not 
> support slicing columns. 
>  
> Hope that helps. 
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 10 May 2011, at 20:41, Meler Wojciech wrote:
> 
> 
> Hello,
>  
> I’ve noticed very nice stats exposed with JMX. I was quite shocked when I saw 
> that MaxRowSize was about 400MB (it was expected to be several MB).
> What is the best way to find keys of such big rows?
>  
> I couldn't find anything, so I've written a simple program to dump sizes from 
> the Index files (see attachment),
> and got the keys, but when I used cassandra-cli to get such rows it said 
> "Returned 0 results.".
> I've realised that my app creates such big rows because it can't read them 
> from Cassandra and recreates them every time.
>  
> Are there any tunable limits for getting a whole row? Any limits on 
> supercolumns?
>  
> Regards,
> Wojtek
>  
> "WIRTUALNA POLSKA" Spolka Akcyjna z siedziba w Gdansku przy ul. Traugutta 115 
> C, wpisana do Krajowego Rejestru Sadowego - Rejestru Przedsiebiorcow 
> prowadzonego przez Sad Rejonowy Gdansk - Polnoc w Gdansku pod numerem KRS 
> 068548, o kapitale zakladowym 67.980.024,00 zlotych oplaconym w calosci 
> oraz Numerze Identyfikacji Podatkowej 957-07-51-216.
> 
> 
>  
> 
> 
> 
> 



Data types for cross language access

2011-05-11 Thread Oliver Dungey
I am currently working on a system with Cassandra that is written purely in
Java. I know our end solution will require other languages to access the
data in Cassandra (Python, C++ etc.). What is the best way to store data to
ensure I can do this? Should I serialize everything to strings/json/xml
prior to the byte conversion? We currently use the Hector serializer, I
wondered if we should just switch this to something like Jackson/JAXB? Any
thoughts very welcome.


Re: Index interval tuning

2011-05-11 Thread Héctor Izquierdo Seliva
Sorry aaron, here are the values you requested

RecentBloomFilterFalsePositives = 5;
BloomFilterFalsePositives = 385260;

The uptime of the node is three and a half days, more or less.


On Wed, 2011-05-11 at 22:05 +1200, aaron morton wrote:

> What are the values for RecentBloomFilterFalsePositives and 
> BloomFilterFalsePositives the non ratio ones ?
>  
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 11 May 2011, at 19:53, Héctor Izquierdo Seliva wrote:
> 
> > On Wed, 2011-05-11 at 14:24 +1200, aaron morton wrote:
> >> What version and what were the values for RecentBloomFilterFalsePositives 
> >> and BloomFilterFalsePositives ?
> >> 
> >> The bloom filter metrics are updated in SSTableReader.getPosition(). The 
> >> only slightly odd thing I can see is that we do not count a key cache hit 
> >> as a true positive for the bloom filter. If there were a lot of key cache 
> >> hits and a few false positives the ratio would be wrong. I'll ask around; 
> >> it does not seem to apply to Hector's case though. 
> >> 
> >> Cheers
> > 
> > 0.7.5, and I am no longer using key cache. I get the bloom filter stats
> > via jmx. BloomFilterFalsePositiveRatio is always stuck at 1.0.
> > RecentBloomFilterFalsePositiveRatio fluctuates from 0 to 1.0 with no
> > intermediate values. 
> > 
> > As for the index interval settings, I changed it from 128 to 256 and
> > memory consumption was just a tad lower but read performance was worse
> > by a few ms, so not much to gain there.
> > 
> >> -
> >> Aaron Morton
> >> Freelance Cassandra Developer
> >> @aaronmorton
> >> http://www.thelastpickle.com
> >> 
> >> On 11 May 2011, at 10:38, Chris Burroughs wrote:
> >> 
> >>> On 05/10/2011 02:12 PM, Peter Schuller wrote:
> > That reminds me, my false positive ratio is stuck at 1.0, so I guess
> > bloom filters aren't doing a lot for me.
>  
>  That sounds unlikely unless you're hitting some edge case like reading
>  a particular row that happened to be a collision, and only that row.
>  This is from JMX stats on the column family store?
>  
> >>> 
> >>> (From jmx)  I also see BloomFilterFalseRatio stuck at 1.0 on my
> >>> production nodes.  The only values that RecentBloomFilterFalseRatio had
> >>> over the past several minutes were 0.0 and 1.0.  While I can't prove
> >>> that isn't accurate, it is very suspicious.
> >>> 
> >>> The code looked reasonable until I got to SSTableReader, which was too
> >>> complicated to just glance through.
> >> 
> > 
> > 
> 




How to invoke getNaturalEndpoints with jconsole?

2011-05-11 Thread Maki Watanabe
Hello,
It's a question on jconsole rather than cassandra, how can I invoke
getNaturalEndpoints with jconsole?

org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints

I want to run this method to find nodes which are responsible to store
data for specific row key.
I can find this method in jconsole but I can't invoke it because the
button is grayed out and doesn't accept
clicks.

Thanks,
-- 
maki


Re: EC2 Snitch

2011-05-11 Thread Vijay
We are using this patch in our multi-region testing... yes, this approach is
going to be integrated into
https://issues.apache.org/jira/browse/CASSANDRA-2491 once it is committed
(you might want to wait for that). Yes, this fixes the Amazon infrastructure
problems and it will automatically detect the DC and rack; the only thing is
that we have to allow IPs in the security groups to talk to each
other's region.

Regards,




On Tue, May 10, 2011 at 6:19 PM, Sameer Farooqui wrote:

> Has anybody successfully used EC2 Snitch for cross-region deployments on
> EC2? Brandon Williams has not recommended using this just yet, but I was
> curious if anybody is using it with 0.8.0.
>
> Also, the snitch just lets the cluster automatically discover what the
> different regions (aka data centers) and racks (aka availability zones) are,
> right? So, does this fix the Amazon NATed infrastructure problem where we
> can't use the Amazon external IP as the Cassandra listen address? Cassandra
> can only bind to addresses that are attached to the server and the Amazon
> external IP is NATed so Cassandra can't see it.
>
> The EC2-Snitch patch was kinda unclear on this:
> https://issues.apache.org/jira/browse/CASSANDRA-2452
>
>
> Also, the nightly builds link for Cassandra seems to be down; has it been
> relocated?
>
> (click on Latest Builds/Hudson link on right):
> http://cassandra.apache.org/download/
>
> This link is dead:
> http://hudson.zones.apache.org/hudson/job/Cassandra/lastSuccessfulBuild/artifact/cassandra/build/
>


RE: Finding big rows

2011-05-11 Thread Meler Wojciech
I didn't run nodetool scrub. My app uses the C++ thrift client (0.5.0 and 0.6.1).
As this is a production environment I get a lot of "collecting %s of 
%s" messages, but there is no row key.
I've matched it by uuid and thread - hope it is OK:

[ReadStage:3][org.apache.cassandra.db.filter.SliceQueryFilter] collecting 0 of 
100: SuperColumn(455470c2-60e6-11e0-acae-e41f13798a50 
[4554853a-60e6-11e0-9342-e41f13798a50:false:361745@0,aa92b386-60e6-11e0-bc58-e41f13798a50:false:159@0,ac346cc0-60e6-11e0-bf8e-e41f13798a50:false:53@0,ad16d362-60e6-11e0-ad9c-e41f13798a50:false:66@0,ae57076a-60e6-11e0-8982-e41f13798a50:false:48@0,afc042ba-60e6-11e0-8320-e41f13798a50:false:63@0,b38d912c-60e6-11e0-b3b7-e41f13798a50:false:164@0,49d7dc00-60e7-11e0-9694-e41f13798a50:false:100@0,94e99fa8-60e7-11e0-b621-e41f13798a50:false:233@0,9612c3a0-60e7-11e0-8292-e41f13798a50:false:4049@0,ec245880-60e7-11e0-85be-e41f13798a50:false:148@0,110ac968-60e8-11e0-b325-e41f13798a50:false:125@0,64a45b4c-60e9-11e0-9628-e41f13798a50:false:160@0,cfc39a00-60e9-11e0-9539-e41f13798a50:false:105@0,2a21ab22-60ea-11e0-a1f6-e41f13798a50:false:146@0,95f2d2d6-60ea-11e0-b763-e41f13798a50:false:53@0,97972362-60ea-11e0-9275-e41f13798a50:false:134@0,ce980606-60ea-11e0-bd03-e41f13798a50:false:195@0,517e02dc-60eb-11e0-b0a8-e41f13798a50:false:53@0,5694c74c-60eb-11e0-941f-e41f13798a50:false:170@0,8d48cca2-60eb-11e0-bdbc-e41f13798a50:false:187@0,fc5e0148-60eb-11e0-ac4c-e41f13798a50:false:558@0,fc7e9476-60eb-11e0-bbfe-e41f13798a50:false:161@0,22a860a0-60ec-11e0-8d1b-e41f13798a50:false:138@0,bf054b52-60ec-11e0-a4b1-e41f13798a50:false:56@0,fd3b4822-60ec-11e0-8612-e41f13798a50:false:234@0,0c1d6fe0-60ee-11e0-8bb0-e41f13798a50:false:79@0,43bbddec-60ee-11e0-848f-e41f13798a50:false:79@0,ec50ed8e-60ef-11e0-a8c1-e41f13798a50:false:60@0,57ae0534-60f1-11e0-8fd5-e41f13798a50:false:60@0,81d7bbe8-60f1-11e0-9586-e41f13798a50:false:143@0,08ce8852-60f2-11e0-bfeb-e41f13798a50:false:150@0,c51bf170-60f2-11e0-8a60-e41f13798a50:false:192@0,e98b181e-60f3-11e0-b20b-e41f13798a50:false:108@0,ed931d76-60f3-11e0-90cf-e41f13798a50:false:146@0,f754541a-60f3-11e0-84db-e41f13798a50:false:60@0,24a73220-60f4-11e0-ae03-e41f13798a50:false:139@0,33485db8-60f4-11e0-9f86-e41f13798a50:false:196@0,42fe380e-60f4-11e0-9f0f-e41f13798a50:false:150@0,440afbb0-60f4-11e0-a066-e41f13798a50:false:122@0,64d9d13a-60f5-11e0-a30c-e41f13798a50:false:60@0
,9e1cefc2-60f5-11e0-826e-e41f13798a50:false:205@0,1f642a46-60f6-11e0-8aa3-e41f13798a50:false:298@0,1f6c946a-60f6-11e0-9eab-e41f13798a50:false:117@0,d1a4e8c6-60f6-11e0-9bfa-e41f13798a50:false:58@0,0a40ba5c-60f7-11e0-893f-e41f13798a50:false:170@0,a2e0f740-60f7-11e0-93b0-e41f13798a50:false:108@0,3922060e-60f8-11e0-b850-e41f13798a50:false:147@0,3cdcdf08-60f8-11e0-8320-e41f13798a50:false:60@0,79b33a26-60f8-11e0-b151-e41f13798a50:false:187@0,aae281c8-60f9-11e0-9d14-e41f13798a50:false:60@0,e1b3295a-60f9-11e0-b367-e41f13798a50:false:81@0,8149870c-60fa-11e0-b6d4-e41f13798a50:false:128@0,b483680e-60fa-11e0-a56f-e41f13798a50:false:164@0,3dfdfe78-60fb-11e0-8582-e41f13798a50:false:143@0,15a9b830-60fc-11e0-be81-e41f13798a50:false:217@0,46ae8aae-60fd-11e0-908e-e41f13798a50:false:60@0,7ee58490-60fd-11e0-9072-e41f13798a50:false:81@0,bf4fba30-60ff-11e0-9b5f-e41f13798a50:false:60@0,26201952-6101-11e0-a4cb-e41f13798a50:false:60@0,8e5e995c-6102-11e0-9bb3-e41f13798a50:false:60@0,a4ef01c0-6102-11e0-9bff-e41f13798a50:false:214@0,a5034784-6102-11e0-96ee-e41f13798a50:false:133@0,ad9f3682-6102-11e0-8df5-e41f13798a50:false:172@0,e2e01b36-6102-11e0-8489-e41f13798a50:false:155@0,25561204-6103-11e0-a14b-e41f13798a50:false:155@0,4d39d2a6-6103-11e0-844a-e41f13798a50:false:152@0,74aeb4fa-6103-11e0-9f65-e41f13798a50:false:140@0,7e8802ba-6103-11e0-81f1-e41f13798a50:false:140@0,8e6532f2-6103-11e0-acd4-e41f13798a50:false:141@0,fceadee8-6103-11e0-8beb-e41f13798a50:false:313@0,2d319618-6105-11e0-9efa-e41f13798a50:false:60@0,4d4540bc-6105-11e0-96d1-e41f13798a50:false:128@0,63fcaa02-6105-11e0-ac35-e41f13798a50:false:81@0,67c70114-6105-11e0-9ab8-e41f13798a50:false:167@0,3a104306-6106-11e0-8be9-e41f13798a50:false:145@0,ce884cf4-6106-11e0-92ba-e41f13798a50:false:208@0,fe230516-6107-11e0-a9fa-e41f13798a50:false:60@0,354f86e0-6108-11e0-970f-e41f13798a50:false:81@0,9fbcaef8-6109-11e0-9b5a-e41f13798a50:false:60@0,55b4cca4-610a-11e0-a82d-e41f13798a50:false:129@0,ddc1395c-610a-11e0-9541-e41f13798a50:false:166@0,0934
dec2-610b-11e0-a63b-e41f13798a50:false:192@0,39562880-610c-11e0-af41-e41f13798a50:false:310@0,74296bb6-610c-11e0-bdc4-e41f13798a50:false:192@0,cfe50906-610c-11e0-9fc9-e41f13798a50:false:108@0,dd594b9c-610c-11e0-9db1-e41f13798a50:false:60@0,dd5ce428-610c-11e0-8962-e41f13798a50:false:92@0,1a0804d4-610d-11e0-b783-e41f13798a50:false:178@0,5135e390-610d-11e0-9d71-e41f13798a50:false:62@0,a3f68ada-610d-11e0-a6ef-e41f13798a50:false:60@0,db2a37a4-610d-11e0-b795-e41f13798a50:false:62@0,15802f0c-610f-11e0-abfe-e41f13798a50:false:128@0,42d810be-610f-11e0-8f4d-e41f13798a50:false:62@0,54456e96-610f-11e0-88a0-e41f13798a50:false:

Re: Index interval tuning

2011-05-11 Thread Chris Burroughs
On 05/10/2011 10:24 PM, aaron morton wrote:
> What version and what were the values for RecentBloomFilterFalsePositives and 
> BloomFilterFalsePositives ?
> 
> The bloom filter metrics are updated in SSTableReader.getPosition() the only 
> slightly odd thing I can see is that we do not count a key cache hit a a true 
> positive for the bloom filter. If there were a lot of key cache hits and a 
> few false positives the ratio would be wrong. I'll ask around, does not seem 
> to apply to Hectors case though. 

0.7.1. No key cache.

BloomFilterFalsePositives: 48130
Read Count: 153973494
RecentBloomFilterFalsePositives: 4, 1, 2, 0, 0, 1


Re: How to invoke getNaturalEndpoints with jconsole?

2011-05-11 Thread Nick Bailey
As far as I know you can not call getNaturalEndpoints from jconsole
because it takes a byte array as a parameter and jconsole doesn't
provide a way to input a byte array. You might be able to use the
thrift call 'describe_ring' to do what you want though. You will have
to manually hash your key to see what range it falls in, however.
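For the manual hashing step, here is a sketch of what I believe the RandomPartitioner does (the MD5 digest of the key, interpreted as an absolute BigInteger); verify against your partitioner before relying on it:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hashing a key the way (I believe) the 0.7 RandomPartitioner does:
// abs of the MD5 digest read as a signed big-endian BigInteger. The
// resulting token can be matched against the (start, end] ranges that
// describe_ring returns. Verify against your partitioner before use.
public class KeyTokenSketch {
    static BigInteger token(byte[] key) throws Exception {
        byte[] hash = MessageDigest.getInstance("MD5").digest(key);
        return new BigInteger(hash).abs();
    }

    public static void main(String[] args) throws Exception {
        BigInteger t = token("my-key".getBytes(StandardCharsets.UTF_8));
        System.out.println(t);
        // The key belongs to the replicas whose range (start, end] contains t.
    }
}
```

This does not apply if the cluster uses an order-preserving partitioner, where the token is derived from the key bytes directly.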

On Wed, May 11, 2011 at 6:14 AM, Maki Watanabe  wrote:
> Hello,
> It's a question on jconsole rather than cassandra, how can I invoke
> getNaturalEndpoints with jconsole?
>
> org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
>
> I want to run this method to find nodes which are responsible to store
> data for specific row key.
> I can find this method in jconsole but I can't invoke it because the
> button is grayed out and doesn't accept
> clicks.
>
> Thanks,
> --
> maki
>


Re: How to invoke getNaturalEndpoints with jconsole?

2011-05-11 Thread Maki Watanabe
Thanks,

So my options are:
1. Write a thrift client code to call describe_ring with hashed key
or
2. Write a JMX client code to call getNaturalEndpoints

right?

2011/5/11 Nick Bailey :
> As far as I know you can not call getNaturalEndpoints from jconsole
> because it takes a byte array as a parameter and jconsole doesn't
> provide a way for inputting a byte array. You might be able to use the
> thrift call 'describe_ring' to do what you want though. You will have
> to manually hash your key to see what range it falls in however.
>
> On Wed, May 11, 2011 at 6:14 AM, Maki Watanabe  
> wrote:
>> Hello,
>> It's a question on jconsole rather than cassandra, how can I invoke
>> getNaturalEndpoints with jconsole?
>>
>> org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
>>
>> I want to run this method to find nodes which are responsible to store
>> data for specific row key.
>> I can find this method on jconsole but I can't invoke it because the
>> button is gray out and doesn't accept
>> click.
>>
>> Thanks,
>> --
>> maki
>>
>



-- 
w3m


Re: How to invoke getNaturalEndpoints with jconsole?

2011-05-11 Thread Nick Bailey
Yes.

On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe  wrote:
> Thanks,
>
> So my options are:
> 1. Write a thrift client code to call describe_ring with hashed key
> or
> 2. Write a JMX client code to call getNaturalEndpoints
>
> right?
>
> 2011/5/11 Nick Bailey :
>> As far as I know you can not call getNaturalEndpoints from jconsole
>> because it takes a byte array as a parameter and jconsole doesn't
>> provide a way for inputting a byte array. You might be able to use the
>> thrift call 'describe_ring' to do what you want though. You will have
>> to manually hash your key to see what range it falls in however.
>>
>> On Wed, May 11, 2011 at 6:14 AM, Maki Watanabe  
>> wrote:
>>> Hello,
>>> It's a question on jconsole rather than cassandra, how can I invoke
>>> getNaturalEndpoints with jconsole?
>>>
>>> org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
>>>
>>> I want to run this method to find nodes which are responsible to store
>>> data for specific row key.
>>> I can find this method on jconsole but I can't invoke it because the
>>> button is gray out and doesn't accept
>>> click.
>>>
>>> Thanks,
>>> --
>>> maki
>>>
>>
>
>
>
> --
> w3m
>


Re: compaction strategy

2011-05-11 Thread Jonathan Ellis
You are of course free to reduce the min per bucket to 2.

The fundamental idea of sstables + compaction is to trade disk space
for higher write performance. For most applications this is the right
trade to make on modern hardware... I don't think you'll get very far
trying to get the 2nd without the 1st.
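A toy model of the minor-compaction bucketing under discussion: sstables of similar size are grouped, and a bucket becomes a compaction candidate once it reaches the minimum threshold. The names and the 0.5x/1.5x size band here are illustrative, not the exact 0.7 code:

```python
def candidate_buckets(sstable_sizes_mb, bucket_low=0.5, bucket_high=1.5,
                      min_threshold=4):
    """Group sstables of similar size; return buckets big enough to compact."""
    buckets = []
    for size in sorted(sstable_sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # join an existing bucket if this sstable is about the same size
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return [b for b in buckets if len(b) >= min_threshold]
```

Under this model, thirty ~20MB sstables all land in one bucket, so a single minor compaction (capped at the max compaction threshold, 32 by default) merges them together.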

On Wed, May 11, 2011 at 3:49 AM, Terje Marthinussen
 wrote:
>>
>> Not sure I follow you. 4 sstables is the minimum compaction look for
>> (by default).
>> If there is 30 sstables of ~20MB sitting there because compaction is
>> behind, you
>> will compact those 30 sstables together (unless there is not enough space
>> for
>> that and considering you haven't changed the max compaction threshold (32
>> by
>> default)). And you can increase max threshold.
>> Don't get me wrong, I'm not pretending this works better than it does, but
>> let's not pretend either that it's worse than it is.
>>
>
> Sorry, I am not trying to pretend anything or blow it out of proportion.
> Just reacting to what I see.
> This is what I see after some stress testing of some pretty decent HW.
> 81     Up     Normal  181.6 GB        8.33%   Token(bytes[30])
>
> 82     Up     Normal  501.43 GB       8.33%   Token(bytes[313230])
>
> 83     Up     Normal  248.07 GB       8.33%   Token(bytes[313437])
>
> 84     Up     Normal  349.64 GB       8.33%   Token(bytes[313836])
>
> 85     Up     Normal  511.55 GB       8.33%   Token(bytes[323336])
>
> 86     Up     Normal  654.93 GB       8.33%   Token(bytes[333234])
>
> 87     Up    Normal  534.77 GB       8.33%   Token(bytes[333939])
>
> 88     Up   Normal  525.88 GB       8.33%   Token(bytes[343739])
>
> 89     Up     Normal  476.6 GB        8.33%   Token(bytes[353730])
>
> 90     Up     Normal  424.89 GB       8.33%   Token(bytes[363635])
>
> 91     Up     Normal  338.14 GB       8.33%   Token(bytes[383036])
>
> 92     Up     Normal  546.95 GB       8.33%   Token(bytes[6a])
> .81 has been exposed to a full compaction. It had ~370GB before that and the
> resulting sstable is 165GB.
> The other nodes has only been doing minor compactions
> I think this is a problem.
> You are of course free to disagree.
> I do however recommend doing a simulation on potential worst case scenarios
> if many of the buckets end up with 3 sstables and don't compact for a while.
> The disk space requirements  get pretty bad even without getting into
> theoretical worst cases.
> Regards,
> Terje



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Finding big rows

2011-05-11 Thread Peter Schuller
> What is the best way to find keys of such big rows?

One, if not necessarily the best, way is to check system.log for the
large-row warnings that are triggered for rows large enough to be compacted
lazily. Grep for 'azy' (or 'lazy', case-insensitively) and you should find them.
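That log-scraping approach can be sketched in Python. The exact warning text varies by version (0.7.x logs something like "Compacting large row <key> (<n> bytes) incrementally"), so treat the regex as an assumption to adjust against your own system.log:

```python
import re

# Adjust the pattern to your version's "large row" warning; this one
# fits a 0.7-style "large row <key> (<n> bytes)" message.
LARGE_ROW = re.compile(r"large row (\S+) \((\d+) bytes\)", re.IGNORECASE)

def biggest_rows(log_lines, top=10):
    """Return (size_in_bytes, key) pairs for the largest rows logged."""
    hits = []
    for line in log_lines:
        match = LARGE_ROW.search(line)
        if match:
            hits.append((int(match.group(2)), match.group(1)))
    return sorted(hits, reverse=True)[:top]
```

Feed it `open('/var/log/cassandra/system.log')` and the offending row keys come back largest first.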

-- 
/ Peter Schuller


Online text search with Hadoop/Brisk

2011-05-11 Thread Ben Scholl
I keep reading that Hadoop/Brisk is not suitable for online querying, only
for offline/batch processing. What exactly are the reasons it is unsuitable?
My use case is a fairly high query load, and each query ideally would return
within about 20 seconds. The queries will use indexes to narrow down the
result set first, but they also need to support text search on one of the
fields. I was thinking of simulating the SQL LIKE statement, by running each
query as a MapReduce job so that the text search gets distributed between
nodes.

I know the recommended approach is to keep a separate full-text index, but
that could be quite space-intensive, and also means you can only search on
complete words. Any thoughts on this approach?

Thanks,

Ben


Re: How to invoke getNaturalEndpoints with jconsole?

2011-05-11 Thread Maki Watanabe
Add a new faq:
http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg

2011/5/11 Nick Bailey :
> Yes.
>
> On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe  
> wrote:
>> Thanks,
>>
>> So my options are:
>> 1. Write a thrift client code to call describe_ring with hashed key
>> or
>> 2. Write a JMX client code to call getNaturalEndpoints
>>
>> right?
>>
>> 2011/5/11 Nick Bailey :
>>> As far as I know you can not call getNaturalEndpoints from jconsole
>>> because it takes a byte array as a parameter and jconsole doesn't
>>> provide a way for inputting a byte array. You might be able to use the
>>> thrift call 'describe_ring' to do what you want though. You will have
>>> to manually hash your key to see what range it falls in however.
>>>
>>> On Wed, May 11, 2011 at 6:14 AM, Maki Watanabe  
>>> wrote:
 Hello,
 It's a question on jconsole rather than cassandra, how can I invoke
 getNaturalEndpoints with jconsole?

 org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints

 I want to run this method to find nodes which are responsible to store
 data for specific row key.
 I can find this method on jconsole but I can't invoke it because the
 button is gray out and doesn't accept
 click.

 Thanks,
 --
 maki

>>>
>>
>>
>>
>> --
>> w3m
>>
>



-- 
w3m


Re: How to invoke getNaturalEndpoints with jconsole?

2011-05-11 Thread Jonathan Ellis
Thanks!

On Wed, May 11, 2011 at 10:20 AM, Maki Watanabe  wrote:
> Add a new faq:
> http://wiki.apache.org/cassandra/FAQ#jconsole_array_arg
>
> 2011/5/11 Nick Bailey :
>> Yes.
>>
>> On Wed, May 11, 2011 at 8:25 AM, Maki Watanabe  
>> wrote:
>>> Thanks,
>>>
>>> So my options are:
>>> 1. Write a thrift client code to call describe_ring with hashed key
>>> or
>>> 2. Write a JMX client code to call getNaturalEndpoints
>>>
>>> right?
>>>
>>> 2011/5/11 Nick Bailey :
 As far as I know you can not call getNaturalEndpoints from jconsole
 because it takes a byte array as a parameter and jconsole doesn't
 provide a way for inputting a byte array. You might be able to use the
 thrift call 'describe_ring' to do what you want though. You will have
 to manually hash your key to see what range it falls in however.

 On Wed, May 11, 2011 at 6:14 AM, Maki Watanabe  
 wrote:
> Hello,
> It's a question on jconsole rather than cassandra, how can I invoke
> getNaturalEndpoints with jconsole?
>
> org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints
>
> I want to run this method to find nodes which are responsible to store
> data for specific row key.
> I can find this method on jconsole but I can't invoke it because the
> button is gray out and doesn't accept
> click.
>
> Thanks,
> --
> maki
>

>>>
>>>
>>>
>>> --
>>> w3m
>>>
>>
>
>
>
> --
> w3m
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Data types for cross language access

2011-05-11 Thread Luke Biddell
I wouldn't mind knowing how other people are approaching this problem too.

On 11 May 2011 11:27, Oliver Dungey  wrote:
> I am currently working on a system with Cassandra that is written purely in
> Java. I know our end solution will require other languages to access the
> data in Cassandra (Python, C++ etc.). What is the best way to store data to
> ensure I can do this? Should I serialize everything to strings/json/xml
> prior to the byte conversion? We currently use the Hector serializer, I
> wondered if we should just switch this to something like Jackson/JAXB? Any
> thoughts very welcome.


Re: Index interval tuning

2011-05-11 Thread Jonathan Ellis
Close: the problem is we don't count *any* true positives *unless*
cache is enabled.

Fix attached to https://issues.apache.org/jira/browse/CASSANDRA-2637.
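The arithmetic behind the stuck metric is worth spelling out: the recent ratio is fp / (fp + tp), so if true positives are never counted (the bug above), the ratio can only ever be 0.0 (no false positives) or 1.0 (any false positive at all) — exactly the two values reported earlier in the thread. A minimal sketch:

```python
def false_positive_ratio(false_positives, true_positives):
    # ratio = fp / (fp + tp); with true positives stuck at zero this
    # can only evaluate to 0.0 (fp == 0) or 1.0 (fp > 0)
    total = false_positives + true_positives
    return false_positives / total if total else 0.0
```

With correct true-positive accounting, 48130 false positives against ~154M reads would report a tiny, healthy ratio instead.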

On Wed, May 11, 2011 at 7:04 AM, Chris Burroughs
 wrote:
> On 05/10/2011 10:24 PM, aaron morton wrote:
>> What version and what were the values for RecentBloomFilterFalsePositives 
>> and BloomFilterFalsePositives ?
>>
>> The bloom filter metrics are updated in SSTableReader.getPosition() the only 
>> slightly odd thing I can see is that we do not count a key cache hit as a 
>> true positive for the bloom filter. If there were a lot of key cache hits 
>> and a few false positives the ratio would be wrong. I'll ask around, does 
>> not seem to apply to Hectors case though.
>
> 0.7.1  No key cache.
>
> BloomFilterFalsePositives: 48130
> Read Count: 153973494
> RecentBloomFilterFalsePositives: 4, 1, 2, 0, 0, 1
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Data types for cross language access

2011-05-11 Thread Alex Araujo

On 5/11/11 5:27 AM, Oliver Dungey wrote:
> I am currently working on a system with Cassandra that is written
> purely in Java. I know our end solution will require other languages
> to access the data in Cassandra (Python, C++ etc.). What is the best
> way to store data to ensure I can do this? Should I serialize
> everything to strings/json/xml prior to the byte conversion? We
> currently use the Hector serializer, I wondered if we should just
> switch this to something like Jackson/JAXB? Any thoughts very welcome.

I believe most high level (non-Thrift) clients convert types to/from
bytes consistently without additional serialization (XML, JSON, etc).
There may be a few tricks to working with TimeUUIDs for slices, but at
least the Java and Python versions appear to be compatible. It's
probably worth writing a few tests using your target languages to make sure:


http://wiki.apache.org/cassandra/ClientOptions

Don't see a C++ client, but a quick Google search turned up:

http://github.com/posulliv/libcassandra


Re: Data types for cross language access

2011-05-11 Thread Nate McCall
You should have no problems with byte conversion consistencies. For
the serialization test cases in Hector, we verify the most of the
results with o.a.c.utils.ByteBufferUtil from Cassandra source.
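As a concrete illustration of that consistency, the standard encodings are simple big-endian layouts, so a Python sketch (helper names are mine, not any client's API) lines up with what Java's ByteBuffer produces by default:

```python
import struct
import uuid

def long_to_bytes(value):
    # matches Java's ByteBuffer.putLong: 8 bytes, big-endian, signed
    return struct.pack(">q", value)

def utf8_to_bytes(text):
    # matches String.getBytes("UTF-8") on the Java side
    return text.encode("utf-8")

def uuid_to_bytes(u):
    # a (Time)UUID serializes as its raw 16 bytes
    return u.bytes
```

Any client that writes these same byte layouts can read the other's data back without an extra JSON/XML layer.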

On Wed, May 11, 2011 at 10:23 AM, Luke Biddell  wrote:
> I wouldn't mind knowing how other people are approaching this problem too.
>
> On 11 May 2011 11:27, Oliver Dungey  wrote:
>> I am currently working on a system with Cassandra that is written purely in
>> Java. I know our end solution will require other languages to access the
>> data in Cassandra (Python, C++ etc.). What is the best way to store data to
>> ensure I can do this? Should I serialize everything to strings/json/xml
>> prior to the byte conversion? We currently use the Hector serializer, I
>> wondered if we should just switch this to something like Jackson/JAXB? Any
>> thoughts very welcome.
>


Re: Data types for cross language access

2011-05-11 Thread Eric tamme
> On Wed, May 11, 2011 at 10:23 AM, Luke Biddell  wrote:
>> I wouldn't mind knowing how other people are approaching this problem too.
>>
>> On 11 May 2011 11:27, Oliver Dungey  wrote:
>>> I am currently working on a system with Cassandra that is written purely in
>>> Java. I know our end solution will require other languages to access the
>>> data in Cassandra (Python, C++ etc.). What is the best way to store data to
>>> ensure I can do this? Should I serialize everything to strings/json/xml
>>> prior to the byte conversion? We currently use the Hector serializer, I
>>> wondered if we should just switch this to something like Jackson/JAXB? Any
>>> thoughts very welcome.
>>
>

We have clients that use python and C++.  We just generated thrift
bindings for C++ and use thrift directly - it really is not bad at
all, the cassandra.h file generated defines most all methods that any
higher level API would, it just doesn't have built in pooling, or
reconnect etc.

As far as data formats go - we are capturing data (packets) using
libpcap and storing the raw bytes  (literally the entire raw packet
including ethernet and ip headers) into cassandra which are later read
out by python scripts for ascii display, or conversion to pcap files
to open in wireshark.

-Eric


Talk on DataStax Brisk on Monday at Cassandra London

2011-05-11 Thread Dave Gardner
Hi all,

Any London-based people who are interested in Brisk should come along to the
Cassandra London meetup on Monday. There will be a talk and live demo.

http://www.meetup.com/Cassandra-London/events/16643691/


Dave


Choice of Index

2011-05-11 Thread Baskar Duraikannu
Hello -
I am using 0.8 Beta 2 and have a CF containing COMPANY, ACCOUNTNUMBER and
some account related data.  I have index on both Company and AccountNumber.

If I run a query -

SELECT  FROM COMPANYCF WHERE COMPANY='XXX' AND ACCOUNTNUMBER = 'YYY'


Even though ACCOUNTNUMBER based Index is a better index to use for the above
query compared to COMPANY based Index, Cassandra seems to pick COMPANY
index.

Does Cassandra always use the index on the first "where" clause?

I can always change the above query to have account number as the first
"where" clause.
Just wanted to understand whether any kind of index optimization built into
Cassandra 0.8.

Thanks
Baskar Duraikannu


Re: Online text search with Hadoop/Brisk

2011-05-11 Thread Edward Capriolo
On Wed, May 11, 2011 at 11:19 AM, Ben Scholl  wrote:
> I keep reading that Hadoop/Brisk is not suitable for online querying, only
> for offline/batch processing. What exactly are the reasons it is unsuitable?
> My use case is a fairly high query load, and each query ideally would return
> within about 20 seconds. The queries will use indexes to narrow down the
> result set first, but they also need to support text search on one of the
> fields. I was thinking of simulating the SQL LIKE statement, by running each
> query as a MapReduce job so that the text search gets distributed between
> nodes.
> I know the recommended approach is to keep a seperate full-text index, but
> that could be quite space-intensive, and also means you can only search on
> complete words. Any thoughts on this approach?
> Thanks,
> Ben

Brisk was made to be a tight integration of Cassandra, Hadoop, and Hive.

If you are looking to do full-text searches you should look at Solandra,
https://github.com/tjake/Solandra, which is a Cassandra backend for
the Solr/Lucene indexes.

Edward


Excessive allocation during hinted handoff

2011-05-11 Thread Gabriel Tataranu
Greetings,

I'm experiencing some issues with 2 nodes (out of more than 10). Right
after startup (Listening for thrift clients...) the nodes will create
objects at high rate using all available CPU cores:

 INFO 18:13:15,350 GC for PS Scavenge: 292 ms, 494902976 reclaimed
leaving 2024909864 used; max is 6658457600
 INFO 18:13:20,393 GC for PS Scavenge: 252 ms, 478691280 reclaimed
leaving 2184252600 used; max is 6658457600

 INFO 18:15:23,909 GC for PS Scavenge: 283 ms, 452943472 reclaimed
leaving 5523891120 used; max is 6658457600
 INFO 18:15:24,912 GC for PS Scavenge: 273 ms, 466157568 reclaimed
leaving 5594606128 used; max is 6658457600

This will eventually trigger old-gen GC and then the process repeats
until hinted handoff finishes.

The build version was updated from 0.7.2 to 0.7.5 but the behavior was
exactly the same.

Thank you.



jsvc hangs shell

2011-05-11 Thread Anton Belyaev
Hello,

I installed 0.7.5 to my Ubuntu 11.04 64 bit from package at
deb http://www.apache.org/dist/cassandra/debian 07x main

And I met really strange problem.
Any shell command that requires Cassandra's jsvc command line (for
example, "ps -ef", or "top" with cmdline args) - just hangs.
Using STRACE I found out that commands hang during reading
/proc/<pid>/cmdline.
I tried to "cat" the file - shell hung.

I tried both OpenJDK and Sun JDK - the bug remains.
I tried 0.6.13 on the same machine - works fine.
I tried 0.7.5 on another machine (with older Ubuntu) - works fine.

I believe this is not a Cassandra bug. But I am not sure where to ask
help with the problem.
Could you please advise what should I check to find out where is the problem?

Thanks.
Anton.


Re: Choice of Index

2011-05-11 Thread Jonathan Ellis
No, Cassandra uses statistics to see which index will result in fewer
rows to check.
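In other words, the planner's choice can be pictured as picking the indexed clause with the smallest estimated row count and applying the remaining clauses as filters over those candidate rows — a simplified sketch with invented names, not the 0.8 code:

```python
def choose_primary_index(clauses, estimated_rows):
    """clauses: list of (column, value) pairs from the WHERE list.
    estimated_rows: column -> mean number of rows an index lookup on
    that column is expected to return."""
    indexed = [c for c in clauses if c[0] in estimated_rows]
    if not indexed:
        raise ValueError("no indexed clause to drive the query")
    # drive the scan with the most selective index, regardless of the
    # order the clauses were written in
    return min(indexed, key=lambda c: estimated_rows[c[0]])
```

With `estimated_rows = {'COMPANY': 50000, 'ACCOUNTNUMBER': 1}`, the ACCOUNTNUMBER clause wins even when COMPANY is written first.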

On Wed, May 11, 2011 at 12:42 PM, Baskar Duraikannu
 wrote:
> Hello -
> I am using 0.8 Beta 2 and have a CF containing COMPANY, ACCOUNTNUMBER and
> some account related data.  I have index on both Company and AccountNumber.
> If I run a query -
>
> SELECT  FROM COMPANYCF WHERE COMPANY='XXX' AND ACCOUNTNUMBER = 'YYY'
>
> Even though ACCOUNTNUMBER based Index is a better index to use for the above
> query compared to COMPANY based Index, Cassandra seems to pick COMPANY
> index.
> Does Cassandra always uses Index on first "where" clause?
> I can always change the above query to have account number as the first
> "where" clause.
> Just wanted to understand whether any kind of index optimization built into
> Cassandra 0.8.
> Thanks
> Baskar Duraikannu
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Index interval tuning

2011-05-11 Thread aaron morton
Thanks
A

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 May 2011, at 03:44, Jonathan Ellis wrote:

> Close: the problem is we don't count *any* true positives *unless*
> cache is enabled.
> 
> Fix attached to https://issues.apache.org/jira/browse/CASSANDRA-2637.
> 
> On Wed, May 11, 2011 at 7:04 AM, Chris Burroughs
>  wrote:
>> On 05/10/2011 10:24 PM, aaron morton wrote:
>>> What version and what were the values for RecentBloomFilterFalsePositives 
>>> and BloomFilterFalsePositives ?
>>> 
>>> The bloom filter metrics are updated in SSTableReader.getPosition() the 
>>> only slightly odd thing I can see is that we do not count a key cache hit as 
>>> a true positive for the bloom filter. If there were a lot of key cache hits 
>>> and a few false positives the ratio would be wrong. I'll ask around, does 
>>> not seem to apply to Hectors case though.
>> 
>> 0.7.1  No key cache.
>> 
>> BloomFilterFalsePositives: 48130
>> Read Count: 153973494
>> RecentBloomFilterFalsePositives: 4, 1, 2, 0, 0, 1
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com



Re: jsvc hangs shell

2011-05-11 Thread jonathan . colby
We use the Java Service Wrapper from Tanuki Software and are very happy  
with it. It's a lot more robust than jsvc.


http://wrapper.tanukisoftware.com/doc/english/download.jsp

The free community version will be enough in most cases.

Jon

On May 11, 2011 10:30pm, Anton Belyaev  wrote:
> Hello,
>
> I installed 0.7.5 to my Ubuntu 11.04 64 bit from package at
> deb http://www.apache.org/dist/cassandra/debian 07x main
>
> And I met really strange problem.
> Any shell command that requires Cassandra's jsvc command line (for
> example, "ps -ef", or "top" with cmdline args) - just hangs.
> Using STRACE I found out that commands hang during reading
> /proc/<pid>/cmdline.
> I tried to "cat" the file - shell hung.
>
> I tried both OpenJDK and Sun JDK - the bug remains.
> I tried 0.6.13 on the same machine - works fine.
> I tried 0.7.5 on another machine (with older Ubuntu) - works fine.
>
> I believe this is not a Cassandra bug. But I am not sure where to ask
> help with the problem.
> Could you please advise what should I check to find out where is the problem?
>
> Thanks.
> Anton.



Re: jsvc hangs shell

2011-05-11 Thread Anton Belyaev
I guess it is not trivial to modify the package to make it use JSW
instead of JSVC.
I am still not sure JSVC itself is the culprit. Maybe something is
wrong in my setup.

2011/5/12  :
> We use the Java Service Wrapper from Tanuki Software and are very happy with
> it. It's a lot more robust than jsvc.
>
> http://wrapper.tanukisoftware.com/doc/english/download.jsp
>
> The free community version will be enough in most cases.
>
> Jon
>
> On May 11, 2011 10:30pm, Anton Belyaev  wrote:
>> Hello,
>>
>>
>>
>> I installed 0.7.5 to my Ubuntu 11.04 64 bit from package at
>>
>> deb http://www.apache.org/dist/cassandra/debian 07x main
>>
>>
>>
>> And I met really strange problem.
>>
>> Any shell command that requires Cassandra's jsvc command line (for
>>
>> example, "ps -ef", or "top" with cmdline args) - just hangs.
>>
>> Using STRACE I found out that commands hang during reading
>>
>> /proc/<pid>/cmdline.
>>
>> I tried to "cat" the file - shell hung.
>>
>>
>>
>> I tried both OpenJDK and Sun JDK - the bug remains.
>>
>> I tried 0.6.13 on the same machine - works fine.
>>
>> I tried 0.7.5 on another machine (with older Ubuntu) - works fine.
>>
>>
>>
>> I believe this is not a Cassandra bug. But I am not sure where to ask
>>
>> help with the problem.
>>
>> Could you please advise what should I check to find out where is the
>> problem?
>>
>>
>>
>> Thanks.
>>
>> Anton.
>>


Keyspace creation error on 0.8 beta2

2011-05-11 Thread Sameer Farooqui
When I run this from the Cassandra CMD-Line:
create keyspace MyKeySpace with placement_strategy =
'org.apache.cassandra.locator.SimpleStrategy' and strategy_options =
[{replication_factor:2}];

I get this error: Internal error processing system_add_keyspace

My syntax is correct for creating the keyspace (I think) because I got it
from the "help create keyspace;" examples from the CMD-line.

Cassandra system log shows:
ERROR [pool-2-thread-1] 2011-05-11 22:52:04,577 Cassandra.java (line 3918)
Internal error processing system_add_keyspace
java.lang.RuntimeException: java.util.concurrent.ExecutionException:
java.lang.NoSuchMethodError:
org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
at
org.apache.cassandra.thrift.CassandraServer.applyMigrationOnStage(CassandraServer.java:793)
at
org.apache.cassandra.thrift.CassandraServer.system_add_keyspace(CassandraServer.java:881)
at
org.apache.cassandra.thrift.Cassandra$Processor$system_add_keyspace.process(Cassandra.java:3912)
at
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.ExecutionException:
java.lang.NoSuchMethodError:
org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at
org.apache.cassandra.thrift.CassandraServer.applyMigrationOnStage(CassandraServer.java:785)
... 7 more
Caused by: java.lang.NoSuchMethodError:
org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
at
org.apache.cassandra.config.KSMetaData.toString(KSMetaData.java:114)
at
org.apache.cassandra.db.migration.AddKeyspace.toString(AddKeyspace.java:94)
at
org.apache.cassandra.db.migration.Migration.apply(Migration.java:119)
at
org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:778)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
... 3 more
ERROR [MigrationStage:1] 2011-05-11 22:52:04,580
AbstractCassandraDaemon.java (line 112) Fatal exception in thread
Thread[MigrationStage:1,5,main]
java.lang.NoSuchMethodError:
org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
at
org.apache.cassandra.config.KSMetaData.toString(KSMetaData.java:114)
at
org.apache.cassandra.db.migration.AddKeyspace.toString(AddKeyspace.java:94)
at
org.apache.cassandra.db.migration.Migration.apply(Migration.java:119)
at
org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:778)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)


network topology issue

2011-05-11 Thread Anurag Gujral
Hi All,
         I am testing the network topology strategy in Cassandra. I am using
two nodes, one node in each of two data centers.
Since the nodes are in different DCs, I assigned token 0 to both nodes.
I added both nodes as seeds in cassandra.yaml and I am using
PropertyFileSnitch as the endpoint snitch, where I have specified the colo
details.

I started the first node; then when I started the second node I got an error
that token "0" is already being used. Why am I getting this error?

Second question: I already have Cassandra running in two different data
centers and I want to add a new keyspace which uses NetworkTopologyStrategy;
in light of the above errors, how can I accomplish this?


Thanks
Anurag


Re: Ec2 Stress Results

2011-05-11 Thread Alex Araujo

On 5/9/11 9:49 PM, Jonathan Ellis wrote:
> On Mon, May 9, 2011 at 5:58 PM, Alex Araujo wrote:
>>> How many replicas are you writing?
>>
>> Replication factor is 3.
>
> So you're actually spot on the predicted numbers: you're pushing
> 20k*3=60k "raw" rows/s across your 4 machines.
>
> You might get another 10% or so from increasing memtable thresholds,
> but bottom line is you're right around what we'd expect to see.
> Furthermore, CPU is the primary bottleneck which is what you want to
> see on a pure write workload.

That makes a lot more sense.  I upgraded the cluster to 4 m2.4xlarge 
instances (68GB of RAM/8 CPU cores) in preparation for application 
stress tests and the results were impressive @ 200 threads per client:


+--------------+--------------+--------------+---------+---------+---------+------------+------------+--------------+
| Server Nodes | Client Nodes | --keep-going | Columns | Client  | Total   | Rep Factor | Test Rate  | Cluster Rate |
|              |              |              |         | Threads | Threads |            | (writes/s) | (writes/s)   |
+==============+==============+==============+=========+=========+=========+============+============+==============+
|      4       |      3       |      N       |  1000   |   200   |   600   |     3      |   44644    |    133931    |
+--------------+--------------+--------------+---------+---------+---------+------------+------------+--------------+

The issue I'm seeing with app stress tests is that the rate will be 
comparable/acceptable at first (~100k w/s) and will degrade considerably 
(~48k w/s) until a flush and restart.  CPU usage will correspondingly be 
high at first (500-700%) and taper down to 50-200%.  My data model is 
pretty standard (<...> marks pseudo-type information):


Users
"UserId<32CharHash>" : {
"email": "a...@b.com",
"first_name": "John",
"last_name": "Doe"
}

UserGroups
"GroupId": {
"UserId<32CharHash>": {
"date_joined": "2011-05-10 13:14.789",
"date_left": "2011-05-11 13:14.789",
"active": "0|1"
}
}

UserGroupTimeline
"GroupId": {
"date_joined": "UserId<32CharHash>"
}

UserGroupStatus
"CompositeId('GroupId:UserId<32CharHash>')": {
"active": "0|1"
}

Every new User has a row in Users and a ColumnOrSuperColumn in the other 
3 CFs (total of 4 operations).  One notable difference is that the RAID0 
on this instance type (surprisingly) only contains two ephemeral volumes 
and appear a bit more saturated in iostat, although not enough to 
clearly stand out as the bottleneck.  Is the bottleneck in this scenario 
likely memtable flush and/or commitlog rotation settings?


RF = 2; ConsistencyLevel = One; -Xmx = 6GB; concurrent_writes: 64; all 
other settings are the defaults.  Thanks, Alex.


Re: network topology issue

2011-05-11 Thread Sameer Farooqui
Anurag,

The Cassandra ring spans datacenters, so you can't use token 0 on both
nodes. Cassandra’s ring is from 0 to 2**127 in size.

Try assigning one node the token of 0 and the second node 8.50705917 × 10^37
(input this as a single long number).
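The second token suggested above is just the halfway point of the ring; evenly spaced initial tokens for N nodes can be computed as i * 2**127 / N, for example:

```python
def initial_tokens(num_nodes):
    # evenly space initial_token values around the 0..2**127 ring
    # used by RandomPartitioner
    ring_size = 2 ** 127
    return [i * ring_size // num_nodes for i in range(num_nodes)]
```

`initial_tokens(2)` yields [0, 85070591730234615865843651857942052864], i.e. 0 and ~8.507 x 10^37, matching the suggestion above.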

To add a new keyspace in 0.8, run this from the CLI:
create keyspace KEYSPACENAME with placement_strategy =
'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options =
[{replication_factor:2}];

If using 0.7, run "help create keyspace;" from the CLI and it'll show you
the correct syntax.


More info on tokens:
http://journal.paul.querna.org/articles/2010/09/24/cassandra-token-selection/

http://wiki.apache.org/cassandra/Operations#Token_selection

On Wed, May 11, 2011 at 4:58 PM, Anurag Gujral wrote:

> Hi All,
>  I am testing network topology strategy in cassandra I am using
> two nodes , one node each in different data center.
> Since the nodes are in different dc I assigned token 0 to both the nodes.
> I added both the nodes as seeds in the cassandra.yaml and  I am  using
> properyfilesnitch as endpoint snitch where I have specified the colo
> details.
>
> I started first node then I when I restarted second node I got an error
> that token "0" is already being used.Why am I getting this error.
>
> Second Question: I already have cassandra running in two different data
> centers I want to add a new keyspace which uses networkTopology strategy
> in the light of above errors how can I accomplish this.
>
>
> Thanks
> Anurag
>


Re: Keyspace creation error on 0.8 beta2

2011-05-11 Thread Sameer Farooqui
FYI - creating the keyspace with the syntax below works in beta1, just not
beta2.

jeromatron on the IRC channel commented that it looks like the java
classpath is using the wrong library dependency for commons lang in beta2.

- Sameer


On Wed, May 11, 2011 at 4:09 PM, Sameer Farooqui wrote:

> When I run this from the Cassandra CMD-Line:
> create keyspace MyKeySpace with placement_strategy =
> 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options =
> [{replication_factor:2}];
>
> I get this error: Internal error processing system_add_keyspace
>
> My syntax is correct for creating the keyspace (I think) because I got it
> from the "help create keyspace;" examples from the CMD-line.
>
> Cassandra system log shows:
> ERROR [pool-2-thread-1] 2011-05-11 22:52:04,577 Cassandra.java (line 3918)
> Internal error processing system_add_keyspace
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> java.lang.NoSuchMethodError:
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at
> org.apache.cassandra.thrift.CassandraServer.applyMigrationOnStage(CassandraServer.java:793)
> at
> org.apache.cassandra.thrift.CassandraServer.system_add_keyspace(CassandraServer.java:881)
> at
> org.apache.cassandra.thrift.Cassandra$Processor$system_add_keyspace.process(Cassandra.java:3912)
> at
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> at
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.NoSuchMethodError:
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at
> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at
> org.apache.cassandra.thrift.CassandraServer.applyMigrationOnStage(CassandraServer.java:785)
> ... 7 more
> Caused by: java.lang.NoSuchMethodError:
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at
> org.apache.cassandra.config.KSMetaData.toString(KSMetaData.java:114)
> at
> org.apache.cassandra.db.migration.AddKeyspace.toString(AddKeyspace.java:94)
> at
> org.apache.cassandra.db.migration.Migration.apply(Migration.java:119)
> at
> org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:778)
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> ... 3 more
> ERROR [MigrationStage:1] 2011-05-11 22:52:04,580
> AbstractCassandraDaemon.java (line 112) Fatal exception in thread
> Thread[MigrationStage:1,5,main]
> java.lang.NoSuchMethodError:
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at
> org.apache.cassandra.config.KSMetaData.toString(KSMetaData.java:114)
> at
> org.apache.cassandra.db.migration.AddKeyspace.toString(AddKeyspace.java:94)
> at
> org.apache.cassandra.db.migration.Migration.apply(Migration.java:119)
> at
> org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:778)
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
>
>


Re: Unable to add columns to empty row in Column family: Cassandra

2011-05-11 Thread anuya joshi
Thanks Jaydeep.

For the first insertion, I inserted data using the Thrift API
programmatically, so I could specify the timestamp, which is the current
system time. However, for deleting the columns I used the command-line
client that comes with Cassandra. I have no way to specify the delete
timestamp in the command-line client, so I don't know what the delete
timestamp actually is.

Unless I know the delete timestamp, how can I compare timestamps?

Thanks,
Anuya


On Mon, May 2, 2011 at 11:54 PM, chovatia jaydeep <
chovatia_jayd...@yahoo.co.in> wrote:

> One small correction in my mail below.
> Second insertion time-stamp has to be greater than delete time-stamp
> in-order to retrieve the data.
>
> Thank you,
> Jaydeep
> --
> *From:* chovatia jaydeep 
> *To:* "user@cassandra.apache.org" 
> *Sent:* Monday, 2 May 2011 11:52 PM
>
> *Subject:* Re: Unable to add columns to empty row in Column family:
> Cassandra
>
> Hi Anuya,
>
> > However, columns are not being inserted.
>
> Do you mean to say that after insert operation you couldn't retrieve the
> same data? If so, then please check the time-stamp when you reinserted
> after delete operation. Your second insertion time-stamp has to be greater
> than the previous insertion.
>
> Thank you,
> Jaydeep
> --
> *From:* anuya joshi 
> *To:* user@cassandra.apache.org
> *Sent:* Monday, 2 May 2011 11:34 PM
> *Subject:* Re: Unable to add columns to empty row in Column family:
> Cassandra
>
> Hello,
>
> I am using Cassandra for my application.My Cassandra client uses Thrift
> APIs directly. The problem I am facing currently is as follows:
>
> 1) I added a row and columns in it dynamically via Thrift API Client
> 2) Next, I used command line client to delete row which actually deleted
> all the columns in it, leaving empty row with original row id.
> 3) Now, I am trying to add columns dynamically using client program into
> this empty row with same row key
> However, columns are not being inserted.
> But, when tried from command line client, it worked correctly.
>
> Any pointer on this would be of great use
>
> Thanks in  advance,
>
> Regards,
> Anuya
>
>
>
>
>

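A likely cause of the behavior in this thread is a timestamp-unit mismatch: the CLI stamps operations in microseconds since the epoch, so if a Thrift client stamps inserts in a smaller unit (milliseconds or seconds), later inserts still carry lower timestamps than the delete's tombstone and are silently dropped. A sketch of the mismatch (assumes microsecond stamps on the CLI side, matching 0.7's convention):

```python
import time

def micros_now():
    # Microseconds since the epoch -- the convention the Cassandra CLI
    # and most high-level clients use for column timestamps.
    return int(time.time() * 1000000)

def millis_now():
    # A client mistakenly stamping in milliseconds produces values
    # roughly 1000x smaller than microsecond stamps.
    return int(time.time() * 1000)

delete_ts = micros_now()   # tombstone written via the CLI
insert_ts = millis_now()   # "later" insert from a mis-stamping client
# Last-write-wins: the insert is silently shadowed by the tombstone.
assert insert_ts < delete_ts
```

Checking what timestamps the client actually sends (and re-inserting with a value known to exceed the delete's) is a quick way to confirm or rule this out.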

Re: network topology issue

2011-05-11 Thread Anurag Gujral
Thanks Sameer for your answer.
I am using two DCs, DC1 and DC2, each with one node. My strategy_options
values are DC1:1,DC2:1. I am not sure what my RF should be;
should it be 1 or 2?
Please Advise
Thanks
Anurag

On Wed, May 11, 2011 at 5:27 PM, Sameer Farooqui wrote:

> Anurag,
>
> The Cassandra ring spans datacenters, so you can't use token 0 on both
> nodes. Cassandra’s ring is from 0 to 2**127 in size.
>
> Try assigning one node the token of 0 and the second node 8.50705917 ×
> 10^37 (input this as a single long number).
>
> To add a new keyspace in 0.8, run this from the CLI:
> create keyspace KEYSPACENAME with placement_strategy =
> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options =
> [{replication_factor:2}];
>
> If using 0.7, run "help create keyspace;" from the CLI and it'll show you
> the correct syntax.
>
>
> More info on tokens:
>
> http://journal.paul.querna.org/articles/2010/09/24/cassandra-token-selection/
> 
> http://wiki.apache.org/cassandra/Operations#Token_selection
>
>
> On Wed, May 11, 2011 at 4:58 PM, Anurag Gujral wrote:
>
>> Hi All,
>>  I am testing network topology strategy in cassandra I am
>> using two nodes , one node each in different data center.
>> Since the nodes are in different dc I assigned token 0 to both the nodes.
>> I added both the nodes as seeds in the cassandra.yaml and  I am  using
>> properyfilesnitch as endpoint snitch where I have specified the colo
>> details.
>>
>> I started first node then I when I restarted second node I got an error
>> that token "0" is already being used.Why am I getting this error.
>>
>> Second Question: I already have cassandra running in two different data
>> centers I want to add a new keyspace which uses networkTopology strategy
>> in the light of above errors how can I accomplish this.
>>
>>
>> Thanks
>> Anurag
>>
>
>


Re: Unable to add columns to empty row in Column family: Cassandra

2011-05-11 Thread anuya joshi
Thanks aaron. here come the details:

1) Version: 0.7.4
2) Its a two node cluster with RF=2
3) It works perfectly after the 1st get. Then I delete all the columns in a
row. Finally, I try to insert into the same row with the same row id.
However, it's not getting inserted programmatically.

Thanks,
Anuya

On Tue, May 3, 2011 at 2:03 AM, aaron morton wrote:

> If your are still having problems can you say what version, how many nodes,
> what RF, what CL and if after inserting and failing on the first get it
> works on a subsequent get.
>
>
> Thanks
> Aaron
>
> On 3 May 2011, at 18:54, chovatia jaydeep wrote:
>
> One small correction in my mail below.
> Second insertion time-stamp has to be greater than delete time-stamp
> in-order to retrieve the data.
>
> Thank you,
> Jaydeep
> --
> *From:* chovatia jaydeep 
> *To:* "user@cassandra.apache.org" 
> *Sent:* Monday, 2 May 2011 11:52 PM
> *Subject:* Re: Unable to add columns to empty row in Column family:
> Cassandra
>
> Hi Anuya,
>
> > However, columns are not being inserted.
>
> Do you mean to say that after insert operation you couldn't retrieve the
> same data? If so, then please check the time-stamp when you reinserted
> after delete operation. Your second insertion time-stamp has to be greater
> than the previous insertion.
>
> Thank you,
> Jaydeep
> --
> *From:* anuya joshi 
> *To:* user@cassandra.apache.org
> *Sent:* Monday, 2 May 2011 11:34 PM
> *Subject:* Re: Unable to add columns to empty row in Column family:
> Cassandra
>
> Hello,
>
> I am using Cassandra for my application.My Cassandra client uses Thrift
> APIs directly. The problem I am facing currently is as follows:
>
> 1) I added a row and columns in it dynamically via Thrift API Client
> 2) Next, I used command line client to delete row which actually deleted
> all the columns in it, leaving empty row with original row id.
> 3) Now, I am trying to add columns dynamically using client program into
> this empty row with same row key
> However, columns are not being inserted.
> But, when tried from command line client, it worked correctly.
>
> Any pointer on this would be of great use
>
> Thanks in  advance,
>
> Regards,
> Anuya
>
>
>
>
>
>


Re: network topology issue

2011-05-11 Thread Narendra Sharma
My understanding is that the replication factor is for the entire ring. Even
if you have 2 DCs the nodes are part of the same ring. What you get
additionally from NTS is that you can specify how many replicas to place in
each DC.

So RF = 1 and DC1:1, DC2:1 looks incorrect to me.

What is possible with NTS is following:
RF=3, DC1=1, DC2=2

I'll wait for others' comments to see if my understanding is correct.

-Naren

On Wed, May 11, 2011 at 5:41 PM, Anurag Gujral wrote:

> Thanks Sameer for your answer.
> I am using two DCs DC1 , DC2 with both having one node each, my
> straegy_options values are DC1:1,DC2:1  I am not sure what my RF should be ,
> should it be 1 or 2?
> Please Advise
> Thanks
> Anurag
>
>
> On Wed, May 11, 2011 at 5:27 PM, Sameer Farooqui 
> wrote:
>
>> Anurag,
>>
>> The Cassandra ring spans datacenters, so you can't use token 0 on both
>> nodes. Cassandra’s ring is from 0 to 2**127 in size.
>>
>> Try assigning one node the token of 0 and the second node 8.50705917 ×
>> 10^37 (input this as a single long number).
>>
>> To add a new keyspace in 0.8, run this from the CLI:
>> create keyspace KEYSPACENAME with placement_strategy =
>> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options =
>> [{replication_factor:2}];
>>
>> If using 0.7, run "help create keyspace;" from the CLI and it'll show you
>> the correct syntax.
>>
>>
>> More info on tokens:
>>
>> http://journal.paul.querna.org/articles/2010/09/24/cassandra-token-selection/
>> 
>> http://wiki.apache.org/cassandra/Operations#Token_selection
>>
>>
>> On Wed, May 11, 2011 at 4:58 PM, Anurag Gujral 
>> wrote:
>>
>>> Hi All,
>>>  I am testing network topology strategy in cassandra I am
>>> using two nodes , one node each in different data center.
>>> Since the nodes are in different dc I assigned token 0 to both the nodes.
>>> I added both the nodes as seeds in the cassandra.yaml and  I am  using
>>> properyfilesnitch as endpoint snitch where I have specified the colo
>>> details.
>>>
>>> I started first node then I when I restarted second node I got an error
>>> that token "0" is already being used.Why am I getting this error.
>>>
>>> Second Question: I already have cassandra running in two different data
>>> centers I want to add a new keyspace which uses networkTopology strategy
>>> in the light of above errors how can I accomplish this.
>>>
>>>
>>> Thanks
>>> Anurag
>>>
>>
>>
>


-- 
Narendra Sharma
Solution Architect
*http://www.persistentsys.com*
*http://narendrasharma.blogspot.com/*


Re: network topology issue

2011-05-11 Thread Sameer Farooqui
Yeah, Narendra is correct.

If you have 2 nodes, one in each data center, use RF=2 and do reads and
writes with either level ONE or QUORUM (which means 2 in this case).

However, if you had 2 nodes in DC1 and 1 node in DC2, then you could use
RF=3 and use LOCAL_QUORUM for reads and writes.

For writes, LOCAL_QUORUM means: ensure that the write has been written
to <ReplicationFactor> / 2 + 1 nodes within the local datacenter
(requires NetworkTopologyStrategy).

For reads, LOCAL_QUORUM means: Returns the record with the most recent
timestamp once a majority of replicas within the local datacenter have
replied.

- Sameer
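The quorum sizes above all follow from the formula replicas / 2 + 1 (integer division); a quick sketch, assuming LOCAL_QUORUM counts only the replicas placed in the local DC:

```python
def quorum(replica_count):
    # A majority of replicas: floor(replica_count / 2) + 1.
    return replica_count // 2 + 1

# RF=2, one replica per DC: QUORUM needs both nodes, so a single
# node failure blocks quorum reads and writes.
assert quorum(2) == 2
# Three replicas total: a classic quorum is 2 of 3.
assert quorum(3) == 2
# Two replicas in DC1 (the RF=3, DC1:2, DC2:1 layout): LOCAL_QUORUM
# in DC1 needs 2 of the 2 local replicas.
print(quorum(2))
```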

On Wed, May 11, 2011 at 5:49 PM, Narendra Sharma
wrote:

> My understanding is that the replication factor is for the entire ring.
> Even if you have 2 DCs the nodes are part of the same ring. What you get
> additionally from NTS is that you can specify how many replicas to place in
> each DC.
>
> So RF = 1 and DC1:1, DC2:1 looks incorrect to me.
>
> What is possible with NTS is following:
> RF=3, DC1=1, DC2=2
>
> Would wait for others comments to see if my understand is correct.
>
> -Naren
>
>
> On Wed, May 11, 2011 at 5:41 PM, Anurag Gujral wrote:
>
>> Thanks Sameer for your answer.
>> I am using two DCs DC1 , DC2 with both having one node each, my
>> straegy_options values are DC1:1,DC2:1  I am not sure what my RF should be ,
>> should it be 1 or 2?
>> Please Advise
>> Thanks
>>  Anurag
>>
>>
>> On Wed, May 11, 2011 at 5:27 PM, Sameer Farooqui > > wrote:
>>
>>> Anurag,
>>>
>>> The Cassandra ring spans datacenters, so you can't use token 0 on both
>>> nodes. Cassandra’s ring is from 0 to 2**127 in size.
>>>
>>> Try assigning one node the token of 0 and the second node 8.50705917 ×
>>> 10^37 (input this as a single long number).
>>>
>>> To add a new keyspace in 0.8, run this from the CLI:
>>> create keyspace KEYSPACENAME with placement_strategy =
>>> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options =
>>> [{replication_factor:2}];
>>>
>>> If using 0.7, run "help create keyspace;" from the CLI and it'll show you
>>> the correct syntax.
>>>
>>>
>>> More info on tokens:
>>>
>>> http://journal.paul.querna.org/articles/2010/09/24/cassandra-token-selection/
>>> 
>>> http://wiki.apache.org/cassandra/Operations#Token_selection
>>>
>>>
>>> On Wed, May 11, 2011 at 4:58 PM, Anurag Gujral 
>>> wrote:
>>>
 Hi All,
  I am testing network topology strategy in cassandra I am
 using two nodes , one node each in different data center.
 Since the nodes are in different dc I assigned token 0 to both the
 nodes.
 I added both the nodes as seeds in the cassandra.yaml and  I am  using
 properyfilesnitch as endpoint snitch where I have specified the colo
 details.

 I started first node then I when I restarted second node I got an error
 that token "0" is already being used.Why am I getting this error.

 Second Question: I already have cassandra running in two different data
 centers I want to add a new keyspace which uses networkTopology strategy
 in the light of above errors how can I accomplish this.


 Thanks
 Anurag

>>>
>>>
>>
>
>
> --
> Narendra Sharma
> Solution Architect
> *http://www.persistentsys.com*
> *http://narendrasharma.blogspot.com/*
>
>
>


Re: Ec2 Stress Results

2011-05-11 Thread Adrian Cockcroft
Hi Alex,

This has been a useful thread, we've been comparing your numbers with
our own tests.

Why did you choose four big instances rather than more smaller ones?

For $8/hr you get four m2.4xl with a total of 8 disks.
For $8.16/hr you could have twelve m1.xl with a total of 48 disks, 3x
disk space, a bit less total RAM and much more CPU

When an instance fails, you have a 25% loss of capacity with 4 or an
8% loss of capacity with 12.

I don't think it makes sense (especially on EC2) to run fewer than 6
instances, we are mostly starting at 12-15.
We can also spread the instances over three EC2 availability zones,
with RF=3 and one copy of the data in each zone.

Cheers
Adrian
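The 25% vs. 8% figures above are just 1/N, assuming evenly balanced tokens; a tiny check:

```python
def capacity_loss_pct(node_count):
    # Share of cluster capacity lost when a single instance fails,
    # assuming evenly balanced tokens across the ring.
    return 100.0 / node_count

print(round(capacity_loss_pct(4)))   # four m2.4xl -> 25
print(round(capacity_loss_pct(12)))  # twelve m1.xl -> 8
```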


On Wed, May 11, 2011 at 5:25 PM, Alex Araujo
 wrote:
> On 5/9/11 9:49 PM, Jonathan Ellis wrote:
>>
>> On Mon, May 9, 2011 at 5:58 PM, Alex Araujo wrote:
>>>> How many replicas are you writing?
>>>
>>> Replication factor is 3.
>>
>> So you're actually spot on the predicted numbers: you're pushing
>> 20k*3=60k "raw" rows/s across your 4 machines.
>>
>> You might get another 10% or so from increasing memtable thresholds,
>> but bottom line is you're right around what we'd expect to see.
>> Furthermore, CPU is the primary bottleneck which is what you want to
>> see on a pure write workload.
>>
> That makes a lot more sense.  I upgraded the cluster to 4 m2.4xlarge
> instances (68GB of RAM/8 CPU cores) in preparation for application stress
> tests and the results were impressive @ 200 threads per client:
>
> +--------------+--------------+--------------+---------+----------------+---------------+------------+----------------------+-------------------------+
> | Server Nodes | Client Nodes | --keep-going | Columns | Client Threads | Total Threads | Rep Factor | Test Rate (writes/s) | Cluster Rate (writes/s) |
> +==============+==============+==============+=========+================+===============+============+======================+=========================+
> |      4       |      3       |      N       |  1000   |      200       |      600      |     3      |        44644         |         133931          |
> +--------------+--------------+--------------+---------+----------------+---------------+------------+----------------------+-------------------------+
>
> The issue I'm seeing with app stress tests is that the rate will be
> comparable/acceptable at first (~100k w/s) and will degrade considerably
> (~48k w/s) until a flush and restart.  CPU usage will correspondingly be
> high at first (500-700%) and taper down to 50-200%.  My data model is pretty
> standard (<angle brackets> indicate pseudo-type information):
>
> Users
> "UserId<32CharHash>" : {
>    "email": "a...@b.com",
>    "first_name": "John",
>    "last_name": "Doe"
> }
>
> UserGroups
> "GroupId": {
>    "UserId<32CharHash>": {
>        "date_joined": "2011-05-10 13:14.789",
>        "date_left": "2011-05-11 13:14.789",
>        "active": "0|1"
>    }
> }
>
> UserGroupTimeline
> "GroupId": {
>    "date_joined": "UserId<32CharHash>"
> }
>
> UserGroupStatus
> "CompositeId('GroupId:UserId<32CharHash>')": {
>    "active": "0|1"
> }
>
> Every new User has a row in Users and a ColumnOrSuperColumn in the other 3
> CFs (total of 4 operations).  One notable difference is that the RAID0 on
> this instance type (surprisingly) only contains two ephemeral volumes and
> appear a bit more saturated in iostat, although not enough to clearly stand
> out as the bottleneck.  Is the bottleneck in this scenario likely memtable
> flush and/or commitlog rotation settings?
>
> RF = 2; ConsistencyLevel = One; -Xmx = 6GB; concurrent_writes: 64; all other
> settings are the defaults.  Thanks, Alex.
>


Re: Keyspace creation error on 0.8 beta2

2011-05-11 Thread Jeremy Hanna
I downloaded a fresh 0.8 beta2 and can create keyspaces fine - including the
ones below.

I don't know if there are relics of a previous install somewhere or something 
wonky about the classpath.  You said that you might have /var/lib/cassandra 
data left over so one thing to try is starting fresh there as well.

(we talked about this in IRC but just updating the user thread)

On May 11, 2011, at 7:31 PM, Sameer Farooqui wrote:

> FYI - creating the keyspace with the syntax below works in beta1, just not 
> beta2.
> 
> jeromatron on the IRC channel commented that it looks like the java classpath 
> is using the wrong library dependency for commons lang in beta2.
> 
> - Sameer
> 
> 
> On Wed, May 11, 2011 at 4:09 PM, Sameer Farooqui  
> wrote:
> When I run this from the Cassandra CMD-Line:
> create keyspace MyKeySpace with placement_strategy = 
> 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = 
> [{replication_factor:2}];
> 
> I get this error: Internal error processing system_add_keyspace
> 
> My syntax is correct for creating the keyspace (I think) because I got it 
> from the "help create keyspace;" examples from the CMD-line.
> 
> Cassandra system log shows: 
> ERROR [pool-2-thread-1] 2011-05-11 22:52:04,577 Cassandra.java (line 3918) 
> Internal error processing system_add_keyspace
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.NoSuchMethodError: 
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at 
> org.apache.cassandra.thrift.CassandraServer.applyMigrationOnStage(CassandraServer.java:793)
> at 
> org.apache.cassandra.thrift.CassandraServer.system_add_keyspace(CassandraServer.java:881)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor$system_add_keyspace.process(Cassandra.java:3912)
> at 
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.NoSuchMethodError: 
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at 
> org.apache.cassandra.thrift.CassandraServer.applyMigrationOnStage(CassandraServer.java:785)
> ... 7 more
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at 
> org.apache.cassandra.config.KSMetaData.toString(KSMetaData.java:114)
> at 
> org.apache.cassandra.db.migration.AddKeyspace.toString(AddKeyspace.java:94)
> at 
> org.apache.cassandra.db.migration.Migration.apply(Migration.java:119)
> at 
> org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:778)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> ... 3 more
> ERROR [MigrationStage:1] 2011-05-11 22:52:04,580 AbstractCassandraDaemon.java 
> (line 112) Fatal exception in thread Thread[MigrationStage:1,5,main]
> java.lang.NoSuchMethodError: 
> org.apache.commons.lang.StringUtils.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;
> at 
> org.apache.cassandra.config.KSMetaData.toString(KSMetaData.java:114)
> at 
> org.apache.cassandra.db.migration.AddKeyspace.toString(AddKeyspace.java:94)
> at 
> org.apache.cassandra.db.migration.Migration.apply(Migration.java:119)
> at 
> org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:778)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 
> 



Re: PIG Cassandra - IPs of nodes in a ring

2011-05-11 Thread aaron morton
People have been using that sort of configuration in EC2 deployments to run the 
listen_address through a VPN and rpc_address on the private IP. 

Are you still having trouble connecting?

 
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
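For reference, the split configuration described (gossip on one interface, Thrift on another) is set per node in conf/cassandra.yaml; the addresses below are placeholders, not values from this thread:

```yaml
# conf/cassandra.yaml (illustrative addresses)
listen_address: 10.0.0.5      # interface for inter-node gossip/storage traffic
rpc_address: 203.0.113.5      # interface Thrift clients connect to
```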

On 11 May 2011, at 00:53, Jeremy Hanna wrote:

> Anyone have any thoughts on this thread - about configuring cassandra with a 
> different ip for listen address and rpc address? 
> 
> moving this to the cassandra user list as it more involves cassandra 
> configuration at this point.
> 
> On May 10, 2011, at 12:58 AM, Badrinarayanan S wrote:
> 
>> Hi, after further digging, the issue is related to describe_ring function of
>> Cassandra and more details are available at
>> https://issues.apache.org/jira/browse/CASSANDRA-1777. So till it is resolved
>> opting to have only public ips for both gossip and thrift.
>> 
>> 
>> -Original Message-
>> From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com] 
>> Sent: Saturday, May 07, 2011 1:36 AM
>> To: u...@pig.apache.org
>> Subject: Re: PIG Cassandra - IPs of nodes in a ring
>> 
>> Hmmm - if that's the case, then you might try the cassandra user list or ask
>> someone like driftx (brandon) in the #cassandra channel on IRC.  He might
>> know what implications there are for that setup.
>> 
>> On May 6, 2011, at 1:13 PM, Badrinarayanan S wrote:
>> 
>>> Hi, I am running from one of the nodes in the cluster. 
>>> 
>>> I too believe it is something to do with different address for rpc_address
>>> and listen_address but not sure what it is...
>>> 
>>> 
>>> 
>>> -Original Message-
>>> From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com] 
>>> Sent: Friday, May 06, 2011 11:10 PM
>>> To: u...@pig.apache.org
>>> Subject: Re: PIG Cassandra - IPs of nodes in a ring
>>> 
>>> Where are you running the pig script from - your local machine or one of
>> the
>>> nodes in the cluster or ?  I would think it wouldn't matter which address
>>> you use, but what interface it's using.  So if the internal and public
>>> address are both using the same interface, then you should be able to
>>> connect to cassandra from your local machine using the public address.
>>> That's what I do with EC2.  I use the internal address to connect when I'm
>>> connecting within the region and the public address when I'm connecting
>> from
>>> my local machine.
>>> 
>>> I've never done a different address for rpc_address and listen_address for
>>> that configuration, so there might be peculiarities there that I wouldn't
>>> have seen.
>>> 
>>> On May 6, 2011, at 11:37 AM, Badrinarayanan S wrote:
>>> 
 Hi,
 
 
 
 I got a cluster with seven Cassandra nodes. The ring is formed using the
 private ips of each of the nodes. The rpc_address of the nodes is set to
 private and listen_address of the nodes set to public mainly to
>> facilitate
 cross data centre ring. When I ring the nodes, it shows all nodes are up
 pointing to private ip.
 
 
 
 However when I setup Hadoop/PIG and try to run a PIG script, I get an
 exception like java.io.IOException: failed connecting to all endpoints
<ip1>, <ip2>. The ip1 and ip2 are the public IPs of nodes that are part of
>>> the ring.
 
 
 
 Any suggestion on why it is looking for public ip when the rpc_addr of
>>> nodes
 and ring is pointing to private ips.
 
 
 
 
 
 Regards,
 
 badri
 
 
 
>>> 
>> 
> 



Re: Finding big rows

2011-05-11 Thread aaron morton
Let me know if you get anywhere, I'm on there as aaron_morton but I'm also way 
over in New Zealand. 

If you are using your own client and writing data you cannot read back,
check that the byte encoding is always the same and that you are setting
appropriate timestamps on every call. In the log message all the columns
have a 0 timestamp. Is that deliberate?

Does the count in "collecting 0 of 100" increase? If it does, it means the
query is collecting rows that are "live" and match the query filter. If not,
it means all that data is under a (probably row-level) tombstone. Which makes
me ask: is this a row that gets a lot of inserts and then deletes?

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
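The zero timestamps matter because of how reconciliation works: the cell (or tombstone) with the higher timestamp wins, so data written at timestamp 0 loses to any tombstone with a positive timestamp. A simplified sketch of last-write-wins (not Cassandra's actual code; tie-breaking and tombstone precedence are more subtle in the real implementation):

```python
# Each cell is (timestamp, value); value None marks a tombstone.
def reconcile(a, b):
    # Keep the cell with the higher timestamp; on a tie, prefer the
    # tombstone (a simplification of Cassandra's rules).
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    return a if a[1] is None else b

insert_at_zero = (0, "payload")        # columns stamped 0, as in the log
tombstone = (1305158400000000, None)   # a delete stamped in microseconds
assert reconcile(insert_at_zero, tombstone)[1] is None  # data stays shadowed
```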

On 11 May 2011, at 23:30, Meler Wojciech wrote:

> I didn't run nodetool scrub. My app uses the C++ Thrift client (0.5.0 and
> 0.6.1). As this is a production environment, I get a lot of "collecting %s
> of %s" messages, but there is no row key.
> I’ve matched it by uuid and thread – hope it is ok:
>  
> [ReadStage:3][org.apache.cassandra.db.filter.SliceQueryFilter] collecting 0 
> of 100: SuperColumn(455470c2-60e6-11e0-acae-e41f13798a50 
> [4554853a-60e6-11e0-9342-e41f13798a50:false:361745@0,aa92b386-60e6-11e0-bc58-e41f13798a50:false:159@0,ac346cc0-60e6-11e0-bf8e-e41f13798a50:false:53@0,ad16d362-60e6-11e0-ad9c-e41f13798a50:false:66@0,ae57076a-60e6-11e0-8982-e41f13798a50:false:48@0,afc042ba-60e6-11e0-8320-e41f13798a50:false:63@0,b38d912c-60e6-11e0-b3b7-e41f13798a50:false:164@0,49d7dc00-60e7-11e0-9694-e41f13798a50:false:100@0,94e99fa8-60e7-11e0-b621-e41f13798a50:false:233@0,9612c3a0-60e7-11e0-8292-e41f13798a50:false:4049@0,ec245880-60e7-11e0-85be-e41f13798a50:false:148@0,110ac968-60e8-11e0-b325-e41f13798a50:false:125@0,64a45b4c-60e9-11e0-9628-e41f13798a50:false:160@0,cfc39a00-60e9-11e0-9539-e41f13798a50:false:105@0,2a21ab22-60ea-11e0-a1f6-e41f13798a50:false:146@0,95f2d2d6-60ea-11e0-b763-e41f13798a50:false:53@0,97972362-60ea-11e0-9275-e41f13798a50:false:134@0,ce980606-60ea-11e0-bd03-e41f13798a50:false:195@0,517e02dc-60eb-11e0-b0a8-e41f13798a50:false:53@0,5694c74c-60eb-11e0-941f-e41f13798a50:false:170@0,8d48cca2-60eb-11e0-bdbc-e41f13798a50:false:187@0,fc5e0148-60eb-11e0-ac4c-e41f13798a50:false:558@0,fc7e9476-60eb-11e0-bbfe-e41f13798a50:false:161@0,22a860a0-60ec-11e0-8d1b-e41f13798a50:false:138@0,bf054b52-60ec-11e0-a4b1-e41f13798a50:false:56@0,fd3b4822-60ec-11e0-8612-e41f13798a50:false:234@0,0c1d6fe0-60ee-11e0-8bb0-e41f13798a50:false:79@0,43bbddec-60ee-11e0-848f-e41f13798a50:false:79@0,ec50ed8e-60ef-11e0-a8c1-e41f13798a50:false:60@0,57ae0534-60f1-11e0-8fd5-e41f13798a50:false:60@0,81d7bbe8-60f1-11e0-9586-e41f13798a50:false:143@0,08ce8852-60f2-11e0-bfeb-e41f13798a50:false:150@0,c51bf170-60f2-11e0-8a60-e41f13798a50:false:192@0,e98b181e-60f3-11e0-b20b-e41f13798a50:false:108@0,ed931d76-60f3-11e0-90cf-e41f13798a50:false:146@0,f754541a-60f3-11e0-84db-e41f13798a50:false:60@0,24a73220-60f4-11e0-ae03-e41f13798a50:false:139@0,33485db8-60f4-11e0-9f86-e41f13798a50:false:196@0,42fe380e-60f4-11e0-9f0f-e41f13798a50:false:150@0,440afbb0-60f4-11e0-a066-e41f13798a50:false:122@0,64d9d13a-60f5-11e0-a30c-e41f13798a50:false:60
@0,9e1cefc2-60f5-11e0-826e-e41f13798a50:false:205@0,1f642a46-60f6-11e0-8aa3-e41f13798a50:false:298@0,1f6c946a-60f6-11e0-9eab-e41f13798a50:false:117@0,d1a4e8c6-60f6-11e0-9bfa-e41f13798a50:false:58@0,0a40ba5c-60f7-11e0-893f-e41f13798a50:false:170@0,a2e0f740-60f7-11e0-93b0-e41f13798a50:false:108@0,3922060e-60f8-11e0-b850-e41f13798a50:false:147@0,3cdcdf08-60f8-11e0-8320-e41f13798a50:false:60@0,79b33a26-60f8-11e0-b151-e41f13798a50:false:187@0,aae281c8-60f9-11e0-9d14-e41f13798a50:false:60@0,e1b3295a-60f9-11e0-b367-e41f13798a50:false:81@0,8149870c-60fa-11e0-b6d4-e41f13798a50:false:128@0,b483680e-60fa-11e0-a56f-e41f13798a50:false:164@0,3dfdfe78-60fb-11e0-8582-e41f13798a50:false:143@0,15a9b830-60fc-11e0-be81-e41f13798a50:false:217@0,46ae8aae-60fd-11e0-908e-e41f13798a50:false:60@0,7ee58490-60fd-11e0-9072-e41f13798a50:false:81@0,bf4fba30-60ff-11e0-9b5f-e41f13798a50:false:60@0,26201952-6101-11e0-a4cb-e41f13798a50:false:60@0,8e5e995c-6102-11e0-9bb3-e41f13798a50:false:60@0,a4ef01c0-6102-11e0-9bff-e41f13798a50:false:214@0,a5034784-6102-11e0-96ee-e41f13798a50:false:133@0,ad9f3682-6102-11e0-8df5-e41f13798a50:false:172@0,e2e01b36-6102-11e0-8489-e41f13798a50:false:155@0,25561204-6103-11e0-a14b-e41f13798a50:false:155@0,4d39d2a6-6103-11e0-844a-e41f13798a50:false:152@0,74aeb4fa-6103-11e0-9f65-e41f13798a50:false:140@0,7e8802ba-6103-11e0-81f1-e41f13798a50:false:140@0,8e6532f2-6103-11e0-acd4-e41f13798a50:false:141@0,fceadee8-6103-11e0-8beb-e41f13798a50:false:313@0,2d319618-6105-11e0-9efa-e41f13798a50:false:60@0,4d4540bc-6105-11e0-96d1-e41f13798a50:false:128@0,63fcaa02-6105-11e0-ac35-e41f13798a50:false:81@0,67c70114-6105-11e0-9ab8-e41f13798a50:false:167@0,3a104306-6106-11e0-8be9-e41f13798a50:false:145@0,ce884cf4-6106-11e0-92ba-e41f13798a50:false:208@0,fe230516-6107-11e0-

Re: Excessive allocation during hinted handoff

2011-05-11 Thread aaron morton
I'm assuming the two nodes are the ones receiving the HH after they were down. 

Are there a lot of hints collected while they are down? You can check the 
HintedHandOffManager MBean in JConsole.

What does the TPStats look like on the nodes under pressure ? And how many 
nodes are delivering hints to the nodes when they restart?

Finally hinted_handoff_throttle_delay_in_ms in conf/cassandra.yaml will let you 
slow down the delivery rate if HH is indeed the problem. 
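For reference, a minimal cassandra.yaml fragment (the value shown is only an illustration; tune it for your cluster):

```yaml
# Delay (in ms) inserted between hint deliveries; a larger value
# slows hint replay and reduces pressure on the recovering node.
hinted_handoff_throttle_delay_in_ms: 50
```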

Hope that helps.

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 May 2011, at 06:55, Gabriel Tataranu wrote:

> Greetings,
> 
> I'm experiencing some issues with 2 nodes (out of more than 10). Right
> after startup (Listening for thrift clients...) the nodes will create
> objects at high rate using all available CPU cores:
> 
> INFO 18:13:15,350 GC for PS Scavenge: 292 ms, 494902976 reclaimed
> leaving 2024909864 used; max is 6658457600
> INFO 18:13:20,393 GC for PS Scavenge: 252 ms, 478691280 reclaimed
> leaving 2184252600 used; max is 6658457600
> 
> INFO 18:15:23,909 GC for PS Scavenge: 283 ms, 452943472 reclaimed
> leaving 5523891120 used; max is 6658457600
> INFO 18:15:24,912 GC for PS Scavenge: 273 ms, 466157568 reclaimed
> leaving 5594606128 used; max is 6658457600
> 
> This will eventually trigger old-gen GC and then the process repeats
> until hinted handoff finishes.
> 
> The build version was updated from 0.7.2 to 0.7.5 but the behavior was
> exactly the same.
> 
> Thank you.
> 



Re: Ec2 Stress Results

2011-05-11 Thread Alex Araujo

Hey Adrian -

Why did you choose four big instances rather than more smaller ones?
Mostly to see the impact of additional CPUs on a write only load.  The 
portion of the application we're migrating from MySQL is very write 
intensive.  The other 8 core option was c1.xl with 7GB of RAM.  I will 
very likely need more than that once I add reads as some things can 
benefit significantly from the row cache.  I also thought that m2.4xls 
would come with 4 disks instead of two.

For $8/hr you get four m2.4xl with a total of 8 disks.
For $8.16/hr you could have twelve m1.xl with a total of 48 disks, 3x
disk space, a bit less total RAM and much more CPU

When an instance fails, you have a 25% loss of capacity with 4 or an
8% loss of capacity with 12.

I don't think it makes sense (especially on EC2) to run fewer than 6
instances, we are mostly starting at 12-15.
We can also spread the instances over three EC2 availability zones,
with RF=3 and one copy of the data in each zone.
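A quick sanity check of the capacity-loss figures above (capacity_loss_pct is just an illustrative helper, not anything from Cassandra or EC2):

```python
# Losing one node out of N removes roughly 1/N of the cluster's capacity.
def capacity_loss_pct(n):
    """Percent of capacity lost when one of n equal nodes fails."""
    return round(100 / n)

print(capacity_loss_pct(4))   # 4 big instances
print(capacity_loss_pct(12))  # 12 smaller instances
```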
Agree on all points.  The reason I'm keeping the cluster small now is to 
more easily monitor what's going on/find where things break down.  
Eventually it will be an 8+ node cluster spread across AZs as you 
mentioned (and likely m2.4xls as they do seem to provide the most 
value/$ for this type of system).


I'm interested in hearing about your experience(s) and will continue to 
share mine.  Alex.


Re: Unable to add columns to empty row in Column family: Cassandra

2011-05-11 Thread aaron morton
How do you delete the data in the CLI? Is it a row delete, e.g. del 
MyCF['my-key'];

What client are you using to insert the row the second time? e.g. a custom 
Thrift wrapper or pycassa 

How is the second read done, via the CLI? 

Does the same test work when you only use your app? 
 
Cassandra-cli will be using the current time as its timestamp for the delete. 
If I had to guess what was happening, it would be a problem with the timestamps 
your app is creating.

Thanks

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 May 2011, at 12:42, anuya joshi wrote:

> Thanks aaron. here come the details:
> 
> 1) Version: 0.7.4
> 2) Its a two node cluster with RF=2
> 3) It works perfectly after the 1st get. Then I delete all the columns in a row. 
> Finally, I try to insert into the same row with the same row id. However, it's not 
> getting inserted programmatically. 
> 
> Thanks,
> Anuya
> 
> On Tue, May 3, 2011 at 2:03 AM, aaron morton  wrote:
> If your are still having problems can you say what version, how many nodes, 
> what RF, what CL and if after inserting and failing on the first get it works 
> on a subsequent get. 
> 
> 
> Thanks
> Aaron
> 
> On 3 May 2011, at 18:54, chovatia jaydeep wrote:
> 
>> One small correction in my mail below. 
>> Second insertion time-stamp has to be greater than delete time-stamp 
>> in-order to retrieve the data.
>> 
>> Thank you,
>> Jaydeep
>> From: chovatia jaydeep 
>> To: "user@cassandra.apache.org" 
>> Sent: Monday, 2 May 2011 11:52 PM
>> Subject: Re: Unable to add columns to empty row in Column family: Cassandra
>> 
>> Hi Anuya,
>> 
>> > However, columns are not being inserted.
>> 
>> Do you mean to say that after insert operation you couldn't retrieve the 
>> same data? If so, then please check the time-stamp when you reinserted after 
>> delete operation. Your second insertion time-stamp has to be greater than 
>> the previous insertion.
>> 
>> Thank you,
>> Jaydeep
>> From: anuya joshi 
>> To: user@cassandra.apache.org
>> Sent: Monday, 2 May 2011 11:34 PM
>> Subject: Re: Unable to add columns to empty row in Column family: Cassandra
>> 
>> Hello,
>> 
>> I am using Cassandra for my application.My Cassandra client uses Thrift APIs 
>> directly. The problem I am facing currently is as follows:
>> 
>> 1) I added a row and columns in it dynamically via Thrift API Client
>> 2) Next, I used command line client to delete row which actually deleted all 
>> the columns in it, leaving empty row with original row id.
>> 3) Now, I am trying to add columns dynamically using client program into 
>> this empty row with same row key
>> However, columns are not being inserted.
>> But, when tried from command line client, it worked correctly.
>> 
>> Any pointer on this would be of great use
>> 
>> Thanks in  advance,
>> 
>> Regards,
>> Anuya
>> 
>> 
>> 
>> 
> 
> 



Re: network topology issue

2011-05-11 Thread aaron morton
When creating a multi DC deployment tokens should be evenly distributed in 
*each* dc, see this recent discussion for an example
http://www.mail-archive.com/user@cassandra.apache.org/msg12975.html (I'll also 
update the wiki when I get time, making a note now) But no two nodes in the 
global ring can have the same token, hence the error.
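As a sketch of that token scheme (assuming the RandomPartitioner's 0..2**127 ring; dc_tokens is an illustrative helper, not a Cassandra API): tokens are spaced evenly within each DC, and the second DC's tokens are offset by 1 so no two nodes in the global ring collide.

```python
RING = 2 ** 127  # RandomPartitioner token space

def dc_tokens(nodes_per_dc, dc_index):
    """Evenly spaced tokens for one DC, offset by the DC's index
    so tokens stay unique across the global ring."""
    return [RING // nodes_per_dc * i + dc_index for i in range(nodes_per_dc)]

dc1 = dc_tokens(2, 0)  # two nodes in DC1
dc2 = dc_tokens(2, 1)  # two nodes in DC2, offset by 1
```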

 When using NTS, the RF must be set per DC using the strategy_options 
clause in the `create keyspace` statement. The global RF is just the sum 
of the per-DC values. 
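For example, a cassandra-cli statement in the 0.7 syntax that places one replica in each DC (the keyspace and DC names are placeholders; DC names must match your snitch configuration):

```
create keyspace MyKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = [{DC1:1, DC2:1}];
```

Here the global RF is 1 + 1 = 2, the sum of the per-DC values.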

 Hope that helps. 

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 12 May 2011, at 12:59, Sameer Farooqui wrote:

> Yeah, Narendra is correct.
> 
> If you have 2 nodes, one in each data center, use RF=2 and do reads and 
> writes with either level ONE or QUORUM (which means 2 in this case).
> 
> However, if you had 2 nodes in DC1 and 1 node in DC2, then you could use RF=3 
> and use LOCAL_QUORUM for reads and writes.
> 
> For writes, LOCAL_QUORUM means: Ensure that the write has been written to 
> <ReplicationFactor> / 2 + 1 nodes within the local datacenter (requires 
> NetworkTopologyStrategy)
> 
> For reads, LOCAL_QUORUM means: Returns the record with the most recent 
> timestamp once a majority of replicas within the local datacenter have 
> replied.
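The quorum arithmetic described above can be sketched as follows (local_quorum is an illustrative helper, not part of any client API):

```python
def local_quorum(replicas_in_dc):
    """Number of replica acks LOCAL_QUORUM waits for within one DC:
    floor(RF / 2) + 1."""
    return replicas_in_dc // 2 + 1

# With 2 replicas in the local DC, LOCAL_QUORUM needs both (2 // 2 + 1 == 2),
# i.e. quorum of 2 is effectively ALL; with 3 replicas it needs only 2.
print(local_quorum(2))
print(local_quorum(3))
```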
> 
> - Sameer
> 
> On Wed, May 11, 2011 at 5:49 PM, Narendra Sharma  
> wrote:
> My understanding is that the replication factor is for the entire ring. Even 
> if you have 2 DCs the nodes are part of the same ring. What you get 
> additionally from NTS is that you can specify how many replicas to place in 
> each DC.
> 
> So RF = 1 and DC1:1, DC2:1 looks incorrect to me.
> 
> What is possible with NTS is following:
> RF=3, DC1=1, DC2=2
> 
> Would wait for others' comments to see if my understanding is correct.
> 
> -Naren
> 
> 
> On Wed, May 11, 2011 at 5:41 PM, Anurag Gujral  
> wrote:
> Thanks Sameer for your answer. 
> I am using two DCs, DC1 and DC2, each with one node; my 
> strategy_options values are DC1:1, DC2:1. I am not sure what my RF should be, 
> should it be 1 or 2?
> Please Advise
> Thanks
> Anurag
> 
> 
> On Wed, May 11, 2011 at 5:27 PM, Sameer Farooqui  
> wrote:
> Anurag,
> 
> The Cassandra ring spans datacenters, so you can't use token 0 on both nodes. 
> Cassandra’s ring is from 0 to 2**127 in size.
> 
> Try assigning one node the token of 0 and the second node 8.50705917 × 10^37 
> (input this as a single long number).
> 
> To add a new keyspace in 0.8, run this from the CLI:
> create keyspace KEYSPACENAME with placement_strategy = 
> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options = 
> [{replication_factor:2}];
> 
> If using 0.7, run "help create keyspace;" from the CLI and it'll show you the 
> correct syntax.
> 
> 
> More info on tokens:
> http://journal.paul.querna.org/articles/2010/09/24/cassandra-token-selection/
> http://wiki.apache.org/cassandra/Operations#Token_selection
> 
> 
> On Wed, May 11, 2011 at 4:58 PM, Anurag Gujral  
> wrote:
> Hi All,
>  I am testing network topology strategy in cassandra I am using 
> two nodes , one node each in different data center.
> Since the nodes are in different dc I assigned token 0 to both the nodes.
> I added both the nodes as seeds in the cassandra.yaml, and I am using 
> PropertyFileSnitch as the endpoint snitch, where I have specified the colo details.
> 
> I started the first node; then when I started the second node I got an error that 
> token "0" is already being used. Why am I getting this error?
> 
> Second question: I already have Cassandra running in two different data 
> centers, and I want to add a new keyspace which uses NetworkTopologyStrategy; 
> in light of the above errors, how can I accomplish this?
> 
> 
> Thanks
> Anurag
> 
> 
> 
> 
> 
> -- 
> Narendra Sharma
> Solution Architect
> http://www.persistentsys.com
> http://narendrasharma.blogspot.com/
> 
> 
> 



Re: Excessive allocation during hinted handoff

2011-05-11 Thread Jonathan Ellis
Doesn't really look abnormal to me for a heavy write load situation
which is what "receiving hints" is.

On Wed, May 11, 2011 at 1:55 PM, Gabriel Tataranu  wrote:
> Greetings,
>
> I'm experiencing some issues with 2 nodes (out of more than 10). Right
> after startup (Listening for thrift clients...) the nodes will create
> objects at high rate using all available CPU cores:
>
>  INFO 18:13:15,350 GC for PS Scavenge: 292 ms, 494902976 reclaimed
> leaving 2024909864 used; max is 6658457600
>  INFO 18:13:20,393 GC for PS Scavenge: 252 ms, 478691280 reclaimed
> leaving 2184252600 used; max is 6658457600
> 
>  INFO 18:15:23,909 GC for PS Scavenge: 283 ms, 452943472 reclaimed
> leaving 5523891120 used; max is 6658457600
>  INFO 18:15:24,912 GC for PS Scavenge: 273 ms, 466157568 reclaimed
> leaving 5594606128 used; max is 6658457600
>
> This will eventually trigger old-gen GC and then the process repeats
> until hinted handoff finishes.
>
> The build version was updated from 0.7.2 to 0.7.5 but the behavior was
> exactly the same.
>
> Thank you.
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com