Re: Compressed families not created on new node
The I/O errors are caused by disk failure. Syslog contains entries like these:

Jan 16 09:53:24 --- kernel: [7065781.460804] sd 4:0:0:0: [sda] Add. Sense: Unrecovered read error
Jan 16 09:53:24 --- kernel: [7065781.460810] sd 4:0:0:0: [sda] CDB: Read(10): 28 00 11 cf 60 70 00 00 08 00
Jan 16 09:53:24 --- kernel: [7065781.460820] end_request: I/O error, dev sda, sector 298803312

Scrub failed:

 INFO [CompactionExecutor:5818] 2012-01-16 09:45:20,650 CompactionManager.java (line 477) Scrubbing SSTableReader(path='/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db')
ERROR [CompactionExecutor:5818] 2012-01-16 09:47:51,531 PrecompactedRow.java (line 119) Skipping row DecoratedKey(Token(bytes[01f9332e566a3a8d5a1cc17e530ae46e]), 01f9332e566a3a8d5a1cc17e530ae46e) in /home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db
java.io.IOException: (/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db) failed to read 13705 bytes from offset 3193541.
	at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:87)
	at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:75)
	at org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
	at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
	at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
	at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
	at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
	at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:354)
	at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
	at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:147)
	at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:232)
	at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:115)
	at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:102)
	at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:133)
	at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:139)
	at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:565)
	at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:472)
	at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
	at org.apache.cassandra.db.compaction.CompactionManager$3.call(CompactionManager.java:224)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
 WARN [CompactionExecutor:5818] 2012-01-16 09:47:51,531 CompactionManager.java (line 581) Non-fatal error reading row (stacktrace follows)
java.lang.NullPointerException
 WARN [CompactionExecutor:5818] 2012-01-16 09:47:51,532 CompactionManager.java (line 623) Row at 14740167 is unreadable; skipping to next
ERROR [CompactionExecutor:5818] 2012-01-16 09:53:24,395 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:5818,1,RMI Runtime]
java.io.IOException: (/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db) failed to read 13705 bytes from offset 3193541.
	at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:87)
	at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:75)
	at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:259)
	at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:625)
	at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:472)
	at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
	at org.apache.cassandra.db.compaction.CompactionManager$3.call(CompactionManager.java:224)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)

The same kind of "failed to read" IOExceptions have been routine
Re: About initial token, autobootstrapping and load balance
Yep, I think I can. Here you are: https://github.com/tivv/cassandra-balancer

2012/1/15 Carlos Pérez Miguel
> If you can share it, it would be great
>
> Carlos Pérez Miguel
>
>
> 2012/1/15 Віталій Тимчишин :
> > Yep. Have written a groovy script this friday to perform autobalancing :) I am
> > going to add it to my jenkins soon.
> >
> >
> > 2012/1/15 Maxim Potekhin
> >>
> >> I see. Sure, that's a bit more complicated and you'd have to move tokens
> >> after adding a machine.
> >>
> >> Maxim
> >>
> >>
> >> On 1/15/2012 4:40 AM, Віталій Тимчишин wrote:
> >>
> >> There's nothing wrong with it for 3 nodes. It's a problem for a cluster of 20+ nodes,
> >> growing.
> >>
> >> 2012/1/14 Maxim Potekhin
> >>>
> >>> I'm just wondering -- what's wrong with manual specification of tokens?
> >>> I'm so glad I did it and have not had problems with balancing and all.
> >>>
> >>> Before I was indeed stuck with a 25/25/50 setup in a 3 machine cluster,
> >>> when I had to move tokens to make it 33/33/33, and I screwed up a little in
> >>> that the first one did not start with 0, which is not a good idea.
> >>>
> >>> Maxim
> >>>
> >>
> >> --
> >> Best regards,
> >> Vitalii Tymchyshyn
> >>
> >
> >
> > --
> > Best regards,
> > Vitalii Tymchyshyn
>

--
Best regards,
Vitalii Tymchyshyn
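
[Editorial aside: for anyone who prefers computing tokens by hand rather than running a balancer script, a balanced RandomPartitioner ring just means N initial_token values spaced evenly over 0..2^127, i.e. token(i) = i * 2^127 / N. The sketch below is only an illustration; the node count of 3 matches the 3-machine cluster Maxim mentions and the class name is a placeholder.]

import java.math.BigInteger;

public class TokenCalculator {
    public static void main(String[] args) {
        // Number of nodes in the ring; 3 matches the example cluster discussed above.
        int nodeCount = 3;
        BigInteger ringSize = BigInteger.valueOf(2).pow(127);
        for (int i = 0; i < nodeCount; i++) {
            // Evenly spaced tokens: token(i) = i * 2^127 / nodeCount
            BigInteger token = ringSize.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(nodeCount));
            System.out.println("node " + i + ": initial_token = " + token);
        }
    }
}

For 3 nodes this prints 0, 56713727820156410577229101238628035242 and 113427455640312821154458202477256070485, which gives the 33/33/33 ownership described above.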
nodetool ring question
Hi,

I have a 4 node cluster, version 1.0.3. This is what I get when I run nodetool ring:

Address         DC          Rack   Status State   Load      Owns    Token
                                                                    127605887595351923798765477786913079296
10.8.193.87     datacenter1 rack1  Up     Normal  46.47 GB  25.00%  0
10.5.7.76       datacenter1 rack1  Up     Normal  48.01 GB  25.00%  42535295865117307932921825928971026432
10.8.189.197    datacenter1 rack1  Up     Normal  53.7 GB   25.00%  85070591730234615865843651857942052864
10.5.3.17       datacenter1 rack1  Up     Normal  43.49 GB  25.00%  127605887595351923798765477786913079296

I have finished running repair on all 4 nodes. I have less than 10 GB in the /var/lib/cassandra/data/ folders.

My question is: why does nodetool report almost 50 GB on each node?

Thanks
Michael
[RELEASE] Apache Cassandra 1.0.7 released
The Cassandra team is pleased to announce the release of Apache Cassandra version 1.0.7.

Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here: http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download section: http://cassandra.apache.org/download/

This version is a maintenance/bug fix release[1]. As always, please pay attention to the release notes[2] and let us know[3] if you encounter any problems.

Have fun!

[1]: http://goo.gl/t92dy (CHANGES.txt)
[2]: http://goo.gl/glkt5 (NEWS.txt)
[3]: https://issues.apache.org/jira/browse/CASSANDRA
Configuring leveled compaction
Is it technically possible, without breaking the basic LevelDB algorithm, to make the SSTable size and count configurable per level? Something like:

level 1 - 10 x 50 MB tables
level 2 - 60 x 40 MB tables
level 3 - 150 x 30 MB tables

I am interested in digging deeper into the LevelDB approach, because it currently generates too much compaction IO.
RE: JMX BulkLoad weirdness
Unfortunately, I'm not doing a 1-1 migration; I'm moving data from a 15-node to a 6-node cluster. In this case, that means an excessive amount of time spent repairing data put onto the wrong machines. Also, the bulkloader's requirement of having either a different IP address or a different machine is something I don't really want to bother with if I can activate it through JMX. The JMX bulkloader seems to work perfectly fine, except for the error I mentioned below. So I'll ask again: is that error something to be concerned about?

Thanks,
Scott

From: aaron morton [aa...@thelastpickle.com]
Sent: Sunday, January 15, 2012 12:07 PM
To: user@cassandra.apache.org
Subject: Re: JMX BulkLoad weirdness

If you are doing a straight one-to-one copy from one cluster to another try…

1) nodetool snapshot on each prod node for the system and application key spaces.
2) rsync the system and app key space snapshots.
3) update the yaml files on the new cluster to have the correct initial_tokens. This is not strictly necessary as the tokens are stored in the system KS, but it limits surprises later.
4) Start the new cluster.

For bulk load you will want to use the sstableloader http://www.datastax.com/dev/blog/bulk-loading

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/01/2012, at 3:32 AM, Scott Fines wrote:

Hi all,

I'm trying to copy a column family from our production cluster to our development one for testing purposes, so I thought I would try the bulkload API. Since I'm lazy, I'm using the Cassandra bulkLoad JMX call from one of the development machines. Here are the steps I followed:

1. (on production C* node): nodetool flush
2. rsync SSTables from production C* node to development C* node
3. bulkLoad SSTables through JMX

But when I do that, on one of the development C* nodes, I keep getting this exception:

java.lang.NullPointerException
	at org.apache.cassandra.io.sstable.SSTable.getMinimalKey(SSTable.java:156)
	at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:334)
	at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:302)
	at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:156)
	at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:88)
	at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:184)

After which, the node itself seems to stream data successfully (I'm in the middle of checking that right now). Is this an error I should be concerned about?

Thanks,
Scott
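
[Editorial aside: the bulkLoad JMX call mentioned above can also be driven programmatically rather than from a JMX console. The sketch below is only an illustration: it assumes the StorageService MBean (org.apache.cassandra.db:type=StorageService) exposes a bulkLoad operation taking a single String directory argument, and the host, port and path are placeholders.]

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxBulkLoad {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port: the JMX endpoint of the node that should load the data.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://dev-node-1:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url, null);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");
            // Assumed signature: bulkLoad(String directory), where the directory
            // holds the SSTables that were rsynced over from the source cluster.
            mbs.invoke(storageService,
                       "bulkLoad",
                       new Object[] { "/var/lib/cassandra/load/MyKeyspace" },
                       new String[] { "java.lang.String" });
        } finally {
            connector.close();
        }
    }
}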
Brisk with standard C* cluster
Is it possible to add Brisk-only nodes to a standard C* cluster? So if we have nodes A, B, C running standard C*, can we then add Brisk nodes D, E, F for analytics?
Hector + Range query problem
Hello,

I've been trying to retrieve rows based on a key range, but every single time I test, Hector retrieves ALL the rows, no matter what range I give it. What could I possibly be doing wrong? Thanks.

I'm doing the test on a single-node RF=1 cluster (C* 1.0.5) with one column family (I've added & truncated the CF quite a few times during my tests). Each row has a single column whose name is the byte value "2". The keys are 0,1,2,3 (shifted by a number of bits). The values are 0,1,2,3.

list in the CLI gives me:

Using default limit of 100
---
RowKey: 02
=> (column=02, value=00, timestamp=1326750723079000)
---
RowKey: 010002
=> (column=02, value=01, timestamp=1326750723239000)
---
RowKey: 020002
=> (column=02, value=02, timestamp=1326750723329000)
---
RowKey: 030002
=> (column=02, value=03, timestamp=1326750723416000)

4 Rows Returned.

Hector code:

> RangeSlicesQuery query = HFactory.createRangeSlicesQuery(keyspace, keySerializer, columnNameSerializer, BytesArraySerializer.get());
> query.setColumnFamily(overlay).setKeys(keyStart, keyEnd).setColumnNames((byte) 2);
> query.execute();

The execution log shows:

> 1359 [main] INFO com.sensorly.heatmap.drawing.cassandra.CassandraTileDao - Range query from TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] to TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] => morton codes = [02,010002]
> getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] with 1 columns, morton = 02
> getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] with 1 columns, morton = 010002
> getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=1, zoom=2] with 1 columns, morton = 020002
> getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=1, zoom=2] with 1 columns, morton = 030002

=> ALL rows are returned when I really expect it to only return the first one.
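
[Editorial aside: one thing worth checking here is the partitioner. With RandomPartitioner, a range slice is evaluated in token (MD5) order rather than key order, so setKeys(start, end) does not act as a lexical key filter; the usual pattern is to page through the token-ordered rows instead. The sketch below is only an illustration of that paging loop, using placeholder names (a "Tiles" column family and String row keys) rather than the TileKey type and serializers from the original code.]

import java.util.List;

import me.prettyprint.cassandra.serializers.BytesArraySerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public class RangeScanExample {

    // Pages through all rows of a column family in token order.
    // "Tiles", the String keys and the page size are placeholders for illustration.
    public static void scanAll(Keyspace keyspace) {
        String startKey = "";
        final int pageSize = 100;
        while (true) {
            RangeSlicesQuery<String, byte[], byte[]> query =
                    HFactory.createRangeSlicesQuery(keyspace,
                            StringSerializer.get(),
                            BytesArraySerializer.get(),
                            BytesArraySerializer.get());
            query.setColumnFamily("Tiles")
                 .setKeys(startKey, "")              // "" = scan to the end of the ring
                 .setColumnNames(new byte[] { 2 })   // only the single column named 0x02
                 .setRowCount(pageSize);

            QueryResult<OrderedRows<String, byte[], byte[]>> result = query.execute();
            List<Row<String, byte[], byte[]>> rows = result.get().getList();

            for (Row<String, byte[], byte[]> row : rows) {
                // process row.getKey() and row.getColumnSlice() here
            }

            if (rows.size() < pageSize) {
                break;                               // last page reached
            }
            // The next page starts at the last key seen; that row comes back twice.
            startKey = rows.get(rows.size() - 1).getKey();
        }
    }
}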
Re: Compressed families not created on new node
eeek, HW errors.

I would guess (that's all it is) that an IO error may have stopped the schema from migrating.

Stop cassandra on that node and copy the files off as best you can. I would then try a node replacement.

First remove the failed new node with nodetool decommission or removetoken. You are now down to one server.

Copy the yaml file from the old machine (with IO errors) to a new one. To make things potentially less complicated, bump the initial token slightly (e.g. add 1 to it) so the new node will not be exactly replacing the old one. Now start the new node. The other node will notice the schema is out of date and send it across.

Once all the CF's are added and the schemas match, stop the new node, copy the SSTable data from the old node to the new one, and restart it. There are other ways to do this; this is the simplest though.

With the new node in place the ring should now show the IO error node as down, the new node with a token very close to the IO error node, and the one other node. You can now remove the IO error node with decommission or removetoken. Then run a repair from the new node.

At any stage the rollback plan is to simply turn the IO error node back on.

Hope that helps.

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/01/2012, at 10:41 PM, Alexis Lauthier wrote:

> The I/O errors are caused by disk failure. Syslog contains entries like these:
>
> Jan 16 09:53:24 --- kernel: [7065781.460804] sd 4:0:0:0: [sda] Add. Sense: Unrecovered read error
> Jan 16 09:53:24 --- kernel: [7065781.460810] sd 4:0:0:0: [sda] CDB: Read(10): 28 00 11 cf 60 70 00 00 08 00
> Jan 16 09:53:24 --- kernel: [7065781.460820] end_request: I/O error, dev sda, sector 298803312
>
> Scrub failed:
>
> INFO [CompactionExecutor:5818] 2012-01-16 09:45:20,650 CompactionManager.java (line 477) Scrubbing SSTableReader(path='/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db')
> ERROR [CompactionExecutor:5818] 2012-01-16 09:47:51,531 PrecompactedRow.java (line 119) Skipping row DecoratedKey(Token(bytes[01f9332e566a3a8d5a1cc17e530ae46e]), 01f9332e566a3a8d5a1cc17e530ae46e) in /home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db
> java.io.IOException: (/home/cassprod/data/ptprod/UrlInfo-hb-1326-Data.db) failed to read 13705 bytes from offset 3193541.
> at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:87)
> at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:75)
> at org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
> at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
> at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
> at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
> at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
> at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:354)
> at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
> at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
> at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:147)
> at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:232)
> at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:115)
> at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:102)
> at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:133)
> at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:139)
> at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:565)
> at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:472)
> at org.apache.cassandra.db.compaction.CompactionManager.access$300(CompactionManager.java:63)
> at org.apache.cassandra.db.compaction.CompactionManager$3.call(CompactionManager.java:224)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> WARN [CompactionExecutor:5818] 2012-01-16 09:47:51,531 CompactionManager.java (line 581) Non-fatal error reading row (stacktrace follows)
> java.lang.NullPointerException
> WARN [CompactionExecutor:5818] 2012-01-16 09:47:51,
Re: nodetool ring question
You can cross check the load with the SSTable Live metric for each CF in nodetool cfstats.

Can you also double check what you are seeing on disk? (sorry, got to ask :) )

Finally, compare du -h and df -h to make sure they match. (Sure they will, it's just a simple way to check that the disk usage makes sense.)

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/01/2012, at 11:04 PM, Michael Vaknine wrote:

> Hi,
>
> I have a 4 node cluster, version 1.0.3.
>
> This is what I get when I run nodetool ring:
>
> Address         DC          Rack   Status State   Load      Owns    Token
>                                                                     127605887595351923798765477786913079296
> 10.8.193.87     datacenter1 rack1  Up     Normal  46.47 GB  25.00%  0
> 10.5.7.76       datacenter1 rack1  Up     Normal  48.01 GB  25.00%  42535295865117307932921825928971026432
> 10.8.189.197    datacenter1 rack1  Up     Normal  53.7 GB   25.00%  85070591730234615865843651857942052864
> 10.5.3.17       datacenter1 rack1  Up     Normal  43.49 GB  25.00%  127605887595351923798765477786913079296
>
> I have finished running repair on all 4 nodes.
>
> I have less than 10 GB in the /var/lib/cassandra/data/ folders.
>
> My question is: why does nodetool report almost 50 GB on each node?
>
> Thanks
> Michael