Re: Compressed families not created on new node

2012-01-16 Thread Alexis Lauthier
The I/O errors are caused by a disk failure. Syslog contains entries like these:


Jan 16 09:53:24 --- kernel: [7065781.460804] sd 4:0:0:0: [sda]  Add. Sense: Unrecovered read error
Jan 16 09:53:24 --- kernel: [7065781.460810] sd 4:0:0:0: [sda] CDB: Read(10): 28 00 11 cf 60 70 00 00 08 00
Jan 16 09:53:24 --- kernel: [7065781.460820] end_request: I/O error, dev sda, sector 298803312



Scrub failed:



 INFO [CompactionExecutor:5818] 2012-01-16 09:45:20,650 CompactionManager.java (line 477) Scrubbing SSTableReader(path='/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db')
ERROR [CompactionExecutor:5818] 2012-01-16 09:47:51,531 PrecompactedRow.java (line 119) Skipping row DecoratedKey(Token(bytes[01f9332e566a3a8d5a1cc17e530ae46e]), 01f9332e566a3a8d5a1cc17e530ae46e) in /home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db
java.io.IOException: (/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db) failed to read 13705 bytes from offset 3193541.
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:87)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:75)
    at org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
    at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:354)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:147)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:232)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:115)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:102)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:133)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:139)
    at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:565)
    at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:472)
    at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
    at org.apache.cassandra.db.compaction.CompactionManager$3.call(CompactionManager.java:224)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
 WARN [CompactionExecutor:5818] 2012-01-16 09:47:51,531 CompactionManager.java (line 581) Non-fatal error reading row (stacktrace follows)
java.lang.NullPointerException
 WARN [CompactionExecutor:5818] 2012-01-16 09:47:51,532 CompactionManager.java (line 623) Row at 14740167 is unreadable; skipping to next
ERROR [CompactionExecutor:5818] 2012-01-16 09:53:24,395 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:5818,1,RMI Runtime]
java.io.IOException: (/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db) failed to read 13705 bytes from offset 3193541.
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:87)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:75)
    at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:259)
    at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:625)
    at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:472)
    at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
    at org.apache.cassandra.db.compaction.CompactionManager$3.call(CompactionManager.java:224)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)



The same kind of "failed to read" IOExceptions have been routine

Re: About initial token, autobootstraping and load balance

2012-01-16 Thread Віталій Тимчишин
Yep, I think I can. Here you are: https://github.com/tivv/cassandra-balancer
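
The script itself is Groovy; the arithmetic underneath is basically evenly spaced RandomPartitioner tokens. A minimal Java sketch of that math (not the actual script; the class and method names are made up):

import java.math.BigInteger;

public class BalancedTokens {
    // For RandomPartitioner the ring spans 0 .. 2^127, so a balanced cluster of
    // nodeCount nodes uses token(i) = i * 2^127 / nodeCount for i = 0 .. nodeCount-1.
    public static BigInteger token(int i, int nodeCount) {
        return BigInteger.valueOf(2).pow(127)
                .multiply(BigInteger.valueOf(i))
                .divide(BigInteger.valueOf(nodeCount));
    }

    public static void main(String[] args) {
        int nodeCount = 4;
        for (int i = 0; i < nodeCount; i++) {
            System.out.println(token(i, nodeCount));
        }
    }
}

For nodeCount = 4 this prints 0, 42535295865117307932921825928971026432, 85070591730234615865843651857942052864 and 127605887595351923798765477786913079296.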

2012/1/15 Carlos Pérez Miguel 

> If you can share it, that would be great
>
> Carlos Pérez Miguel
>
>
>
> 2012/1/15 Віталій Тимчишин :
> > Yep. I've written a groovy script this Friday to perform autobalancing :) I am
> > going to add it to my jenkins soon.
> >
> >
> > 2012/1/15 Maxim Potekhin 
> >>
> >> I see. Sure, that's a bit more complicated and you'd have to move tokens
> >> after adding a machine.
> >>
> >> Maxim
> >>
> >>
> >>
> >> On 1/15/2012 4:40 AM, Віталій Тимчишин wrote:
> >>
> >> There's nothing wrong with it for 3 nodes. It's a problem for a growing
> >> cluster of 20+ nodes.
> >>
> >> 2012/1/14 Maxim Potekhin 
> >>>
> >>> I'm just wondering -- what's wrong with manual specification of tokens?
> >>> I'm so glad I did it and have not had problems with balancing and all.
> >>>
> >>> Before, I was indeed stuck with a 25/25/50 setup in a 3-machine cluster and
> >>> had to move tokens to make it 33/33/33. I screwed up a little in that the
> >>> first one did not start with 0, which is not a good idea.
> >>>
> >>> Maxim
> >>>
> >>>
> >>
> >> --
> >> Best regards,
> >>  Vitalii Tymchyshyn
> >>
> >>
> >
> >
> >
> > --
> > Best regards,
> >  Vitalii Tymchyshyn
>



-- 
Best regards,
 Vitalii Tymchyshyn


nodetool ring question

2012-01-16 Thread Michael Vaknine
Hi,

 

I have a 4 nodes cluster 1.0.3 version

 

This is what I get when I run nodetool ring

 

Address         DC          Rack    Status State   Load      Owns    Token
                                                                     127605887595351923798765477786913079296
10.8.193.87     datacenter1 rack1   Up     Normal  46.47 GB  25.00%  0
10.5.7.76       datacenter1 rack1   Up     Normal  48.01 GB  25.00%  42535295865117307932921825928971026432
10.8.189.197    datacenter1 rack1   Up     Normal  53.7 GB   25.00%  85070591730234615865843651857942052864
10.5.3.17       datacenter1 rack1   Up     Normal  43.49 GB  25.00%  127605887595351923798765477786913079296

 

I have finished running repair on all 4 nodes.

 

I have less than 10 GB in the /var/lib/cassandra/data/ folders.

 

My question is: why does nodetool report almost 50 GB on each node?

 

Thanks

Michael



[RELEASE] Apache Cassandra 1.0.7 released

2012-01-16 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra
version 1.0.7.

Cassandra is a highly scalable second-generation distributed database,
bringing together Dynamo's fully distributed design and Bigtable's
ColumnFamily-based data model. You can read more here:

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a maintenance/bug fix release[1]. As always, please pay
attention to the release notes[2] and let us know[3] if you encounter any
problems.

Have fun!

[1]: http://goo.gl/t92dy (CHANGES.txt)
[2]: http://goo.gl/glkt5 (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA


Configuring leveled compaction

2012-01-16 Thread Radim Kolar
Is it technically possible, without breaking the basic levelDB algorithm, to
have configurable sstable size and count on different levels?


something like:

level 1 - 10 x 50 MB tables
level 2 - 60 x 40 MB tables
level 3 - 150 x 30 MB tables

I am interested in deeper leveldb research, because it currently generates too
much compaction IO.
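
For reference, the only per-CF knob I'm aware of today is the overall sstable target size; something like this from cassandra-cli (the column family name is just an example, and I may have the option syntax slightly off):

update column family MyCF
  with compaction_strategy = 'LeveledCompactionStrategy'
  and compaction_strategy_options = {sstable_size_in_mb: 50};

The 10x fan-out between levels is, as far as I can tell, fixed in the current implementation.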


RE: JMX BulkLoad weirdness

2012-01-16 Thread Scott Fines
Unfortunately, I'm not doing a 1-1 migration; I'm moving data from a 15-node to 
a 6-node cluster. In this case, that means an excessive amount of time spent 
repairing data put on to the wrong machines.

Also, the bulkloader's requirement of having either a different IP address or a 
different machine is something that I don't really want to bother with, if I 
can activate it through JMX.
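
For reference, this is roughly how I'm invoking it (host, port and the SSTable directory are placeholders; I'm assuming the bulkLoad(String) operation on the StorageService MBean):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxBulkLoad {
    public static void main(String[] args) throws Exception {
        String host = args[0]; // a development C* node
        String dir  = args[1]; // directory holding the rsynced SSTables
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = jmxc.getMBeanServerConnection();
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");
            // invoke the bulkLoad(String) operation exposed over JMX
            conn.invoke(storageService, "bulkLoad",
                    new Object[] { dir }, new String[] { "java.lang.String" });
        } finally {
            jmxc.close();
        }
    }
}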

It seems like the JMX bulkloader works fine except for the error that I mentioned 
below. So I suppose I'll ask again: is that error something to be concerned 
about?

Thanks,

Scott

From: aaron morton [aa...@thelastpickle.com]
Sent: Sunday, January 15, 2012 12:07 PM
To: user@cassandra.apache.org
Subject: Re: JMX BulkLoad weirdness

If you are doing a straight one-to-one copy from one cluster to another try…

1) nodetool snapshot on each prod node for the system and application key 
spaces.
2) rsync system and app key space snapshots
3) update the yaml files on the new cluster to have the correct initial_tokens. 
This is not strictly necessary, as they are stored in the system KS, but it 
limits surprises later.
4) Start the new cluster.

For bulk load you will want to use the sstableloader 
http://www.datastax.com/dev/blog/bulk-loading
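
In command form, steps 1 and 2 look roughly like this (host names, keyspace name, snapshot tag and paths are placeholders, and the data directory layout may differ on your install):

# on each production node
nodetool -h prod-node1 snapshot

# copy the snapshotted system and application keyspace SSTables to the matching new node
rsync -av prod-node1:/var/lib/cassandra/data/MyKeyspace/snapshots/my-tag/ \
      new-node1:/var/lib/cassandra/data/MyKeyspace/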


Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/01/2012, at 3:32 AM, Scott Fines wrote:

Hi all,

I'm trying to copy a column family from our production cluster to our 
development one for testing purposes, so I thought I would try the bulkload 
API. Since I'm lazy, I'm using the Cassandra bulkLoad JMX call from one of the 
development machines. Here are the steps I followed:

1. (on production C* node): nodetool flush  
2. rsync SSTables from production C* node to development C* node
3. bulkLoad SSTables through JMX

But when I do that, on one of the development C* nodes, I keep getting this 
exception:

java.lang.NullPointerException
at org.apache.cassandra.io.sstable.SSTable.getMinimalKey(SSTable.java:156)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:334)
at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:302)
at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:156)
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:88)
at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:184)

After which, the node itself seems to stream data successfully (I'm in the 
middle of checking that right now).

Is this an error that I should be concerned about?

Thanks,

Scott





Brisk with standard C* cluster

2012-01-16 Thread Mohit Anchlia
Is it possible to add Brisk-only nodes to a standard C* cluster? So if
we have nodes A, B, C with standard C*, can we then add Brisk nodes D, E, F for
analytics?


Hector + Range query problem

2012-01-16 Thread Philippe
Hello,
I've been trying to retrieve rows based on key range but every single time
I test, Hector retrieves ALL the rows, no matter the range I give it.
What can I possibly be doing wrong? Thanks.

I'm doing a test on a single-node RF=1 cluster (c* 1.0.5) with one column
family (I've added & truncated the CF quite a few times during my tests).
Each row has a single column whose name is the byte value "2". The keys are
0,1,2,3 (shifted by a number of bits). The values are 0,1,2,3.
list in the CLI gives me

Using default limit of 100
---
RowKey: 02
=> (column=02, value=00, timestamp=1326750723079000)
---
RowKey: 010002
=> (column=02, value=01, timestamp=1326750723239000)
---
RowKey: 020002
=> (column=02, value=02, timestamp=1326750723329000)
---
RowKey: 030002
=> (column=02, value=03, timestamp=1326750723416000)

4 Rows Returned.



Hector code:

> RangeSlicesQuery query = HFactory.createRangeSlicesQuery(keyspace, keySerializer,
>         columnNameSerializer, BytesArraySerializer.get());
> query.setColumnFamily(overlay).setKeys(keyStart, keyEnd).setColumnNames((byte) 2);
> query.execute();


The execution log shows

> 1359 [main] INFO  com.sensorly.heatmap.drawing.cassandra.CassandraTileDao - Range query from TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] to TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] => morton codes = [02,010002]
> getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] with 1 columns, morton = 02
> getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] with 1 columns, morton = 010002
> getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=1, zoom=2] with 1 columns, morton = 020002
> getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=1, zoom=2] with 1 columns, morton = 030002

=> ALL rows are returned when I really expect it to only return the 1st one.


Re: Compressed families not created on new node

2012-01-16 Thread aaron morton
eeek, HW errors. 

I would guess (that's all it is) that an IO error may have stopped the schema 
from migrating. 

Stop cassandra on that node and copy the files off as best you can. 

I would then try a node replacement:

First remove the failed new node with nodetool decommission or removetoken. 

You are now down to one server. 

Copy the yaml file from the old machine (with IO errors) to a new one. To make 
things potentially less complicated, bump the initial token slightly (e.g. add 1 
to it) so the new node will not be exactly replacing the old one. 

Now start the new node. The other node will notice the schema is out of date 
and send it across. 

Once all the CFs are added and the schemas match, stop the new node, copy the 
SSTable data from the old node to the new one, and restart it. There are other 
ways to do this; this is the simplest, though. 

With the new node in place the ring should now show the IO error node as down, 
the new node with a token very close to the IO error node and the one other 
node. 

You can now remove the IO error node with decommission or removetoken. 

Now run a repair from the new node.

At any stage the rollback plan is to simply turn the IO error node back on. 
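
Roughly, in command form (host names and tokens are placeholders):

# retire the failed new node (or run decommission on it if it is still up)
nodetool -h live-node removetoken failed-node-token

# on the replacement machine: copy cassandra.yaml from the IO error node, set
# initial_token to the old token + 1, start cassandra, wait for the schema to arrive,
# then stop it, copy the old SSTables into its data directory and start it again

# finally retire the IO error node and repair from the new node
nodetool -h live-node removetoken io-error-node-token
nodetool -h new-node repair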

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/01/2012, at 10:41 PM, Alexis Lauthier wrote:

> The I/O errors are caused by disk failure. Syslog contains some of those things:
> 
> Jan 16 09:53:24 --- kernel: [7065781.460804] sd 4:0:0:0: [sda]  Add. Sense: Unrecovered read error
> Jan 16 09:53:24 --- kernel: [7065781.460810] sd 4:0:0:0: [sda] CDB: Read(10): 28 00 11 cf 60 70 00 00 08 00
> Jan 16 09:53:24 --- kernel: [7065781.460820] end_request: I/O error, dev sda, sector 298803312

Re: nodetool ring question

2012-01-16 Thread aaron morton
You can cross check the load with the SSTable Live metric for each CF in 
nodetool cfstats. 

Can you also double-check what you are seeing on disk? (Sorry, got to ask :) )

Finally, compare du -h and df -h to make sure they match. (I'm sure they will; it's 
just a simple way to check that the disk usage makes sense.) 
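
Concretely, something like (the IP is just the first one from your ring output):

nodetool -h 10.8.193.87 cfstats   # the per-CF "Space used (live)" figures should roughly add up to Load
du -sh /var/lib/cassandra/data
df -h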

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/01/2012, at 11:04 PM, Michael Vaknine wrote:

> Hi,
>  
> I have a 4 nodes cluster 1.0.3 version
>  
> This is what I get when I run nodetool ring
>  
> Address         DC          Rack    Status State   Load      Owns    Token
>                                                                      127605887595351923798765477786913079296
> 10.8.193.87     datacenter1 rack1   Up     Normal  46.47 GB  25.00%  0
> 10.5.7.76       datacenter1 rack1   Up     Normal  48.01 GB  25.00%  42535295865117307932921825928971026432
> 10.8.189.197    datacenter1 rack1   Up     Normal  53.7 GB   25.00%  85070591730234615865843651857942052864
> 10.5.3.17       datacenter1 rack1   Up     Normal  43.49 GB  25.00%  127605887595351923798765477786913079296
>  
> I have finished running repair on all 4 nodes.
>  
> I have less than 10 GB in the /var/lib/cassandra/data/ folders.
>  
> My question is: why does nodetool report almost 50 GB on each node?
>  
> Thanks
> Michael