Repair has no effect

2012-09-02 Thread Patricio Echagüe
Hey folks, perhaps a dumb question, but I ran into this situation and it's a
bit unclear what's going on.

We are running a 3-node cluster with RF 3 (cass 1.0.11). We had an issue
with one node and it was down for about 1 hour.

I brought the node up again as soon as I realized it was down (ganglia +
nagios triggered some alerts). But when running repair to make sure it had
all the data, I get this message:

 INFO [AntiEntropySessions:6] 2012-09-02 15:46:23,022
AntiEntropyService.java (line 663) [repair #%s] No neighbors to repair with
on range %s: session completed

Am I missing something?

thanks


Re: Repair has no effect

2012-09-02 Thread Jim Cistaro
What does "nodetool ring" show (on this node and on the other two)?

That may provide some clues.
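
For example (host is a placeholder; run it against each of the three nodes and
compare):

  nodetool -h <node_host> ring

The output should show whether all three nodes agree on the ring and whether
any node is still marked Down.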



Re: performance is drastically degraded after 0.7.8 --> 1.0.11 upgrade

2012-09-02 Thread aaron morton
The whole test run is taking longer? So it could be slower queries or slower
test setup / tear down?

If you are creating and truncating the KS for each of the 500 tests, is that
taking longer? (Schema code has changed a lot between 0.7 and 1.0.)
Can you log the execution time for the tests and find the ones that are taking longer?
 
There are full request metrics available on the StorageProxy JMX object. 
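
For example, a minimal sketch of polling them over JMX (the MBean name and
attribute names are what I recall from the 1.0-era StorageProxyMBean, and 7199
is the default JMX port -- verify both in jconsole against your build):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class StorageProxyLatency {
    public static void main(String[] args) throws Exception {
        // Connect to the node's JMX port (7199 is the default in 1.0)
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName proxy = new ObjectName("org.apache.cassandra.db:type=StorageProxy");
            // Request-level latencies reported by StorageProxy, in microseconds
            System.out.println("recent read latency (us): "
                    + mbs.getAttribute(proxy, "RecentReadLatencyMicros"));
            System.out.println("recent write latency (us): "
                    + mbs.getAttribute(proxy, "RecentWriteLatencyMicros"));
        } finally {
            connector.close();
        }
    }
}

Sampling those before and after a test run would at least tell you whether the
extra time is going into the requests themselves or into setup/teardown.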

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/08/2012, at 4:45 PM, Илья Шипицин  wrote:

> we are using functional tests (~500 tests per run).
> it is hard to tell which query is slower, it is "slower in general".
>  
> same hardware: 1 node, 32 GB RAM, 8 GB heap, default cassandra settings.
> since these are functional tests, we recreate the KS just before the tests
> are run.
>  
> I do not know how to record the queries (there are a lot of them); if you are 
> interested, I can set up a test rig for you.
> 
> 2012/8/31 aaron morton 
>>> we are running somewhat queue-like with aggressive write-read patterns.
> We'll need some more details…
> 
> How much data ?
> How many machines ?
> What is the machine spec ?
> How many clients ?
> Is there an example of a slow request ? 
> How are you measuring that it's slow ? 
> Is there anything unusual in the log ? 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 31/08/2012, at 3:30 AM, Edward Capriolo  wrote:
> 
>> If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild the sstables as
>> soon as possible. If you have large bloom filters you can hit a bug
>> where the bloom filters will not work properly.
>> 
>> 
>> On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин  wrote:
>>> we are running a somewhat queue-like workload with aggressive write-read patterns.
>>> I was looking for a way to capture queries from a live Cassandra installation, but I
>>> didn't find any.
>>> 
>>> is there something like a thrift proxy or some other query logging/capture
>>> engine?
>>> 
>>> 2012/8/30 aaron morton 
 
 in terms of our high-rate write load, cassandra-1.0.11 is about 3 (three!!)
 times slower than cassandra-0.7.8
 
 We've not had any reports of a performance drop-off. All tests so far have
 shown improvements in both read and write performance.
 
 I agree, such digests save some network IO, but they seem to be very bad
 in terms of CPU and disk IO.
 
 The sha1 is created so we can diagnose corruptions in the -Data component
 of the SSTables. They are not used to save network IO.
 It is calculated while streaming the Memtable to disk, so it has no impact on
 disk IO. While not the fastest algorithm, I would assume its CPU overhead in
 this case is minimal.
 
 there's already relatively small Bloom filter file, which can be used for
 saving network traffic instead of sha1 digest.
 
 Bloom filters are used to test if a row key may exist in an SSTable.
 
 any explanation ?
 
 If you can provide some more information on your use case we may be able
 to help.
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 30/08/2012, at 5:18 AM, Илья Шипицин  wrote:
 
 in terms of our high-rate write load, cassandra-1.0.11 is about 3 (three!!)
 times slower than cassandra-0.7.8
 after some investigation I noticed files with a "sha1" extension
 (which are not present for Cassandra-0.7.8)
 
 in the maybeWriteDigest() function I see no option for switching sha1 digests
 off.
 
 I agree, such digests save some network IO, but they seem to be very bad
 in terms of CPU and disk IO.
 why use one more digest (which has to be calculated)? There is already a
 relatively small Bloom filter file, which could be used to save network
 traffic instead of a sha1 digest.
 
 any explanation ?
 
 Ilya Shipitsin
 
 
>>> 
> 
> 
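
On Edward's point about rebuilding the sstables after the 0.7 -> 1.0 upgrade:
that is done per node with nodetool, along these lines (keyspace and column
family names are placeholders; newer 1.0.x releases also have an
upgradesstables command for this, if yours includes it):

  nodetool -h <node_host> scrub <Keyspace> <ColumnFamily>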



Re: force gc?

2012-09-02 Thread Alexander Shutyaev
Hi Jeffrey,

I think I described the problem wrong :) I don't want to do Java's memory
GC. I want to do Cassandra's GC - that is, I want to "really" remove deleted
rows from a column family and get my disk space back.

2012/8/31 Jeffrey Kesselman 

> Cassandra at least used to do disc cleanup as a side effect of
> garbage collection through finalizers.  (This is a mistake for the
> reason outlined below.)
>
> It is important to understand that you can *never* *force* a gc in Java.
> Even calling System.gc() is merely a hint to the VM. What you are doing is
> telling the VM that you are *willing* to give up some processor time right
> now to gc; how much it chooses to actually collect or not collect is totally
> up to the VM.
>
> The *only* garbage collection guarantee in Java is that it will make a
> "best effort" to collect what it can to avoid an out-of-memory exception at
> the time that it runs out of memory.  You are not guaranteed when, *if
> ever*, a given object will actually be collected.  Since finalizers happen
> when an object is collected, and not when it becomes a candidate for
> collection, the same is true of the finalizer.  You are
> not guaranteed when, if ever, it will run.
>
>
> On Fri, Aug 31, 2012 at 9:03 AM, Alexander Shutyaev wrote:
>
>> Hi All!
>>
>> I have a problem with using Cassandra. Our application does a lot of
>> overwrites and deletes. If I understand correctly, Cassandra does not
>> actually delete these objects until gc_grace seconds have passed. I tried
>> to "force" gc by setting gc_grace to 0 on an existing column family and
>> running a major compaction afterwards. However, I did not get disk space back,
>> although I'm pretty sure that my column family should occupy many
>> times less space. We also have a PostgreSQL db and we duplicate each
>> data operation in both dbs, and the PostgreSQL table is much
>> smaller than the corresponding Cassandra column family. Does anyone have
>> any suggestions on how I can analyze my problem? Or maybe I'm doing
>> something wrong and there is another way to force gc on an existing column
>> family.
>>
>> Thanks in advance,
>> Alexander
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>


Re: Composite row keys with SSTableSimpleUnsortedWriter for Cassandra 1.0?

2012-09-02 Thread aaron morton
I think you want the o.a.c.db.marshal.TypeParser. 

You can pass a CLI-format composite type to the parse() func. 

It's in 1.0.X.
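
A minimal sketch (the type string uses the CLI syntax; in 1.0.x parse() throws
a checked ConfigurationException, hence the throws clause -- double-check the
exact signature against your 1.0.10 source):

import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.db.marshal.TypeParser;

public class CompositeComparatorExample {
    public static void main(String[] args) throws Exception {
        // Parse the CLI-style definition of a UTF8/UTF8 composite into a comparator
        AbstractType<?> compositeUtf8Utf8Type =
                TypeParser.parse("CompositeType(UTF8Type,UTF8Type)");
        System.out.println(compositeUtf8Utf8Type);
    }
}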

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/09/2012, at 6:44 AM, Jeff Schmidt  wrote:

> Hello:
> 
> I'm using DataStax Enterprise 2.1, which is based on Cassandra 1.0.10 from 
> what I can tell.  For my project, I perform a content build that generates a 
> number of SSTables using SSTableSimpleUnsortedWriter. These are loaded using 
> either JMX or sstableloader depending on the environment.
> 
> I want to introduce a composite row key into some of the generated SSTables.  
> Also, I will be referring to these keys by using composite column names.
> 
> I can define the desired composite type and provide it to the 
> SSTableSimpleUnsortedWriter constructor:
> 
>   List<AbstractType<?>> compositeList = new ArrayList<AbstractType<?>>();
>   compositeList.add(UTF8Type.instance);
>   compositeList.add(UTF8Type.instance);
>   compositeUtf8Utf8Type = CompositeType.getInstance(compositeList);
>   
>   articleWriter = new SSTableSimpleUnsortedWriter(
>       cassandraOutputDir,
>       "IngenuityContent",
>       "Articles",
>       compositeUtf8Utf8Type,
>       null,
>       64);
> 
> I then figured I could use compositeUtf8Utf8Type when creating composite row 
> keys and column names of the kind I require.  Cassandra 1.1.x introduces the 
> CompositeType.Builder class for creating actual composite values, but that's 
> not available to me.  I've also  seen examples of using Hector's Composite to 
> create composite values.
> 
> But, I need to create these values using the various classes within Cassandra 
> 1.0 itself to work with SSTableSimpleUnsortedWriter. For that, I'm not 
> finding any examples on how one does that.
> 
> As far as I can tell, composite columns at least have been around since 
> Cassandra 0.8.x?  Is there the support I need in Cassandra 1.0.x?
> 
> Many thanks!
> 
> Jeff
> --
> Jeff Schmidt
> 535 Consulting
> j...@535consulting.com
> http://www.535consulting.com
> (650) 423-1068
> 



Re: force gc?

2012-09-02 Thread Peter Schuller
> I think I described the problem wrong :) I don't want to do Java's memory
> GC. I want to do cassandra's GC - that is I want to "really" remove deleted
> rows from a column family and get my disc space back.

I think that was clear from your post. I don't see a problem with your
process. Setting gc grace to 0 and forcing compaction should indeed
return you to the smallest possible on-disk size.
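
For reference, that process is roughly the following (keyspace and column
family names are placeholders; the compaction has to be run on each node). In
cassandra-cli, after "use MyKeyspace;":

  update column family MyCF with gc_grace = 0;

and then from the shell:

  nodetool -h <node_host> compact MyKeyspace MyCF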

Did you really not see a *decrease*, or are you just comparing the
final size with that of PostgreSQL? Keep in mind that in many cases
(especially if not using compression) the Cassandra on-disk format is
not as compact as PostgreSQL. For example, column names are repeated
in each row, and each row key is stored twice (once in the index, once
in the data file).

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: force gc?

2012-09-02 Thread Peter Schuller
> I think that was clear from your post. I don't see a problem with your
> process. Setting gc grace to 0 and forcing compaction should indeed
> return you to the smallest possible on-disk size.

(But may be unsafe as documented; can cause deleted data to pop back up, etc.)

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: force gc?

2012-09-02 Thread Alexander Shutyaev
Hi Peter,

I don't compare it with the PostgreSQL size, I just make some estimations. This
table / column family stores xml documents with an average raw size of
2 MB each and a total size of about 5 GB. However, the space Cassandra occupies
on disk is 70 GB (after gc_grace was set to 0 and a major compaction was run).

Maybe there is some tool to analyze it? It would be great if I could
somehow export each row of a column family into a separate file - so I
could see their count and sizes. Is there any such tool? Or maybe you have
some better thoughts...

2012/9/3 Peter Schuller 

> > I think that was clear from your post. I don't see a problem with your
> > process. Setting gc grace to 0 and forcing compaction should indeed
> > return you to the smallest possible on-disk size.
>
> (But may be unsafe as documented; can cause deleted data to pop back up,
> etc.)
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>


Re: force gc?

2012-09-02 Thread Peter Schuller
> Maybe there is some tool to analyze it? It would be great if I could somehow
> export each row of a column family into a separate file - so I could see
> their count and sizes. Is there any such tool? Or maybe you have some better
> thoughts...

Use something like pycassa to non-obnoxiously iterate over all rows:

 for row_id, row in your_column_family.get_range():


https://github.com/pycassa/pycassa
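
A minimal sketch along those lines (keyspace, column family, and host are
placeholders; assumes column values are plain byte strings):

 from pycassa.pool import ConnectionPool
 from pycassa.columnfamily import ColumnFamily

 pool = ConnectionPool('MyKeyspace', ['localhost:9160'])
 cf = ColumnFamily(pool, 'MyColumnFamily')

 row_count = 0
 total_bytes = 0
 for row_key, columns in cf.get_range():
     row_count += 1
     # rough per-row size: column names plus column values
     total_bytes += sum(len(name) + len(value) for name, value in columns.items())

 print 'rows:', row_count, 'approx bytes:', total_bytes

That should at least tell you whether the 70 GB is mostly live data or mostly
overhead.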

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: force gc?

2012-09-02 Thread Derek Williams
On Fri, Aug 31, 2012 at 7:03 AM, Alexander Shutyaev wrote:

> Does anyone have any suggestions on how can I analyze my problem? Or maybe
> I'm doing something wrong and there is another way to force gc on an
> existing column family.
>

Are you using leveled compaction? I haven't looked into it too much, but I
think forcing a major compaction when using leveled strategy doesn't have
the same effect as with size tiered.

-- 
Derek Williams


Re: force gc?

2012-09-02 Thread Alexander Shutyaev
Hi Derek,

I'm using size-tiered compaction.

2012/9/3 Derek Williams 

> On Fri, Aug 31, 2012 at 7:03 AM, Alexander Shutyaev wrote:
>
>> Does anyone have any suggestions on how can I analyze my problem? Or
>> maybe I'm doing something wrong and there is another way to force gc on an
>> existing column family.
>>
>
> Are you using leveled compaction? I haven't looked into it too much, but I
> think forcing a major compaction when using leveled strategy doesn't have
> the same effect as with size tiered.
>
> --
> Derek Williams
>
>


Re: How to set LeveledCompactionStrategy for an existing table

2012-09-02 Thread Data Craftsman 木匠
We have the same problem.

On Friday, August 31, 2012, Jean-Armel Luce  wrote:
> Hello Aaron.
>
> Thanks for your answer
>
> Jira ticket 4597 created:
https://issues.apache.org/jira/browse/CASSANDRA-4597
>
> Jean-Armel
>
> 2012/8/31 aaron morton 
>
> Looks like a bug.
> Can you please create a ticket on
https://issues.apache.org/jira/browse/CASSANDRA and update the email thread?
> Can you include this: CFPropDefs.applyToCFMetadata() does not set the
compaction class on CFM
> Thanks
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> On 31/08/2012, at 7:05 AM, Jean-Armel Luce  wrote:
>
> I tried as you said with cassandra-cli, and still without success
>
> [default@unknown] use test1;
> Authenticated to keyspace: test1
> [default@test1] UPDATE COLUMN FAMILY pns_credentials with
compaction_strategy='LeveledCompactionStrategy';
> 8ed12919-ef2b-327f-8f57-4c2de26c9d51
> Waiting for schema agreement...
> ... schemas agree across the cluster
>
> And then, when I check the compaction strategy, it is still
SizeTieredCompactionStrategy
> [default@test1] describe pns_credentials;
> ColumnFamily: pns_credentials
>   Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>   Default column value validator:
org.apache.cassandra.db.marshal.UTF8Type
>   Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 0.1
>   DC Local Read repair chance: 0.0
>   Replicate on write: true
>   Caching: KEYS_ONLY
>   Bloom Filter FP chance: default
>   Built indexes: []
>   Column Metadata:
> Column Name: isnew
>   Validation Class: org.apache.cassandra.db.marshal.Int32Type
> Column Name: ts
>   Validation Class: org.apache.cassandra.db.marshal.DateType
> Column Name: mergestatus
>   Validation Class: org.apache.cassandra.db.marshal.Int32Type
> Column Name: infranetaccount
>   Validation Class: org.apache.cassandra.db.marshal.UTF8Type
> Column Name: user_level
>   Validation Class: org.apache.cassandra.db.marshal.Int32Type
> Column Name: msisdn
>   Validation Class: org.apache.cassandra.db.marshal.LongType
> Column Name: mergeusertype
>   Validation Class: org.apache.cassandra.db.marshal.Int32Type
>   Compaction Strategy:
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>   Compression Options:
> sstable_compression:
org.apache.cassandra.io.compress.SnappyCompressor
>
>
>
> I also tried to create a new table with LeveledCompactionStrategy (using
cqlsh), and when I check the compaction strategy,
SizeTieredCompactionStrategy is set for this table.
>
> cqlsh:test1> CREATE TABLE pns_credentials3 (
>  ...   ise text PRIMARY KEY,
>  ...   isnew int,
>  ...   ts timestamp,
>  ...   mergestatus int,
>  ...   infranetaccount text,
>  ...   user_level int,
>  ...   msisdn bigint,
>  ...   mergeusertype int
>  ... ) WITH
>  ...   comment='' AND
>

-- 
Thanks,

Charlie (@mujiang) 木匠
===
Data Architect Developer
http://mujiang.blogspot.com