Great! I'm not using PIG. Thanks.
-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Wednesday, May 18, 2011 3:07 PM
To: user@cassandra.apache.org
Subject: Re: AssertionError

The compose() and decompose() methods of AbstractType are used only by the PIG driver (in 0.7 at least; in 0.8 I think CQL uses them too). If you're not using PIG, you're safe making those functions simple pass-throughs, i.e., something along the lines of:

class CustomComparator extends AbstractType<ByteBuffer> {
    ...
    public ByteBuffer compose(ByteBuffer v) { return v; }
    public ByteBuffer decompose(ByteBuffer v) { return v; }
}

I'm not a PIG expert, but even if you are using it, I'm not sure how useful it is to actually diverge from what's above, since PIG probably doesn't know much about your type. In any case, those functions are not called during "normal" queries.

Sylvain

On Wed, May 18, 2011 at 2:40 PM, Desimpel, Ignace <ignace.desim...@nuance.com> wrote:
> Hi Sylvain,
>
> I did the upgrade from 0.7.4 to 0.7.5 and the exception does not occur
> anymore (on Windows ...). Thanks for pointing me to the bug fix.
> From the 0.7.5 version I upgraded to the 0.7.6 version, and this is also
> OK, without any code changes and while still keeping the same data files
> generated with the 0.7.4 version.
>
> Could you still give me a comment on the question regarding the
> AbstractType class change? To be on the safe side, I could simply make new
> array-backed byte buffers (which is what I need). But I ask the question
> because I want to avoid allocating any objects that are not really needed,
> since I know that I will query a lot of data of that type.
>
> Ignace
>
> -----Original Message-----
> From: Desimpel, Ignace [mailto:ignace.desim...@nuance.com]
> Sent: Tuesday, May 17, 2011 3:33 PM
> To: user@cassandra.apache.org
> Subject: RE: AssertionError
>
> It seems that the AbstractType class has changed going from 0.7.4 to
> 0.7.5: it is now required to implement a compose and a decompose method.
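[Editor's note: the pass-through idea suggested above can be sketched standalone. The Codec interface below is a hypothetical stand-in for the compose()/decompose() pair on AbstractType<ByteBuffer>, since the Cassandra classes are not on the classpath here; the point is only that identity pass-throughs allocate nothing and return the very same buffer instance.]

```java
import java.nio.ByteBuffer;

public class PassThroughDemo {
    // Hypothetical stand-in for the compose()/decompose() pair on
    // AbstractType<ByteBuffer>; only the pass-through shape matters here.
    interface Codec<T> {
        T compose(ByteBuffer bytes);
        ByteBuffer decompose(T value);
    }

    // Both directions hand back the buffer they were given: no conversion,
    // no allocation.
    static final Codec<ByteBuffer> PASS_THROUGH = new Codec<ByteBuffer>() {
        public ByteBuffer compose(ByteBuffer v) { return v; }
        public ByteBuffer decompose(ByteBuffer v) { return v; }
    };

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.wrap(new byte[] {1, 2, 3});
        // Round-tripping returns the identical instance, which is why this
        // shape addresses the "avoid allocating" concern raised in the thread.
        System.out.println(PASS_THROUGH.decompose(PASS_THROUGH.compose(raw)) == raw);
    }
}
```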
> I already did that, and it starts up with the 0.7.5 code using the 0.7.4
> data and configuration (using a smaller extra test database). Below I made
> a sample implementation to illustrate another question: in the compose
> method, can I simply create my own AbstractType class and use the given
> ByteBuffer? Or, as in the decompose example, do I need to duplicate the
> ByteBuffer? Could the paramT object be reused, or should I make a complete
> copy?
>
> @Override
> public Object compose(ByteBuffer paramByteBuffer) {
>     ReverseCFFloatValues oNew = new ReverseCFFloatValues();
>     oNew.paramByteBuffer = paramByteBuffer;
>     return oNew;
> }
>
> @Override
> public ByteBuffer decompose(Object paramT) {
>     return ((ReverseCFFloatValues) paramT).paramByteBuffer.duplicate();
> }
>
> -----Original Message-----
> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
> Sent: Tuesday, May 17, 2011 2:50 PM
> To: user@cassandra.apache.org
> Subject: Re: AssertionError
>
> On Tue, May 17, 2011 at 1:46 PM, Desimpel, Ignace
> <ignace.desim...@nuance.com> wrote:
>> Ok, I will do that (the next test will be done on some Linux boxes being
>> installed now, but for now I need to go on with the current Windows
>> setup).
>> Question: can I use the 0.7.4 data files as is? Do I need to back up the
>> data files in order to be able to get back to the 0.7.4 version if needed?
>
> Yes, you can use the 0.7.4 data files as is. And I can't think of a reason
> why you should have problems getting back to 0.7.4 if needed, though
> snapshotting beforehand cannot hurt.
>
>> Ignace
>>
>> -----Original Message-----
>> From: Sylvain Lebresne [mailto:sylv...@datastax.com]
>> Sent: Tuesday, May 17, 2011 1:16 PM
>> To: user@cassandra.apache.org
>> Subject: Re: AssertionError
>>
>> The first thing to do would be to update to 0.7.5.
>>
>> The AssertionError you're running into is an assertion where we check
>> whether a skipBytes call actually skipped all the bytes we asked it to.
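[Editor's note: the duplicate-or-reuse question above hinges on ByteBuffer.duplicate() semantics, which a small JDK-only sketch can confirm: duplicate() copies no bytes and protects only the cursor state (position/limit/mark), not the contents, so a full copy is needed only if callers may mutate the underlying bytes.]

```java
import java.nio.ByteBuffer;

public class DuplicateDemo {
    public static void main(String[] args) {
        ByteBuffer original = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});

        // duplicate() shares the backing bytes but has its own
        // position/limit/mark, so reading it never moves the original.
        ByteBuffer dup = original.duplicate();
        dup.get(); // advances dup only
        System.out.println(original.position()); // prints 0

        // The content really is shared: a write through one buffer is
        // visible through the other.
        original.put(0, (byte) 99);
        System.out.println(dup.get(0)); // prints 99
    }
}
```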
>> As it turns out, the spec for skipBytes authorizes it to skip fewer bytes
>> than asked, even with no good reason. I'm pretty sure that on a Linux box
>> skipBytes on a file will always skip the requested number of bytes unless
>> it reaches EOF, but I see you're running Windows, so who knows what can
>> happen.
>>
>> Anyway, long story short, it's a "bug" in 0.7.4 that has been fixed in
>> 0.7.5. If you still run into this on 0.7.5, at least we'll know it's
>> something else (and we will have a more helpful error message).
>>
>> --
>> Sylvain
>>
>> On Tue, May 17, 2011 at 12:41 PM, Desimpel, Ignace
>> <ignace.desim...@nuance.com> wrote:
>>> I use a custom comparator class, so I think there is a high chance
>>> that I am doing something wrong there. I was thinking that the stack
>>> trace could give a clue and help me on the way, perhaps because someone
>>> already got the same error.
>>>
>>> Anyway, here is some more information you requested.
>>>
>>> Yaml definition:
>>>
>>> name: ForwardStringValues
>>> column_type: Super
>>> compare_with: be.landc.services.search.server.db.cassandra.node.ForwardCFStringValues
>>> compare_subcolumns_with: BytesType
>>> keys_cached: 100000
>>> rows_cached: 0
>>> comment: Stores the values of functions returning string
>>> memtable_throughput_in_mb: 64
>>> memtable_operations_in_millions: 15
>>> min_compaction_threshold: 2
>>> max_compaction_threshold: 5
>>>
>>> Column Family: ForwardStringValues
>>> SSTable count: 8
>>> Space used (live): 131311776690
>>> Space used (total): 131311776690
>>> Memtable Columns Count: 0
>>> Memtable Data Size: 0
>>> Memtable Switch Count: 0
>>> Read Count: 1
>>> Read Latency: 404.890 ms.
>>> Write Count: 0
>>> Write Latency: NaN ms.
>>> Pending Tasks: 0
>>> Key cache capacity: 100000
>>> Key cache size: 8
>>> Key cache hit rate: 1.0
>>> Row cache: disabled
>>> Compacted row minimum size: 150
>>> Compacted row maximum size: 7152383774
>>> Compacted row mean size: 3064535
>>>
>>> No secondary indexes.
>>> Total database disk size: 823 GB
>>> disk_access_mode: auto, on a 64-bit Windows OS
>>> partitioner: org.apache.cassandra.dht.ByteOrderedPartitioner
>>> Data was stored over a period of 5 days.
>>> Cassandra 0.7.4 was running as an embedded server.
>>> Batch insert, using StorageProxy.mutate.
>>> No errors were logged during the batch insert period.
>>>
>>> The row key is a string representation of a positive integer value.
>>> The same row key is used during many different mutate calls, but all
>>> super column names are different for each call.
>>> The name of each super column stored is composed of 32 bytes, the bytes
>>> of 2 integer (positive and negative) values, and the bytes (UTF-8) of
>>> the string value: [32 bytes][4 int bytes][4 int bytes][string bytes].
>>> The custom comparator class ...ForwardCFStringValues sorts the names by
>>> first sorting the string, then the 32 bytes, and then the two integer
>>> values.
>>> For each column name, two subcolumns are inserted with a fixed name and
>>> some small binary value (about 40 bytes).
>>>
>>> The query:
>>> get_slice using thrift.
>>> Params:
>>> Row key: the string representation of the positive integer '1788',
>>> thus hex values 31 37 38 38.
>>> ColumnParent: the column family ForwardStringValues.
>>> SlicePredicate: SlicePredicate(slice_range:SliceRange(start:00 00 00 00
>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 00 00 00 00 FF FF FF FF FF FF FF FF 55 52 49 4E 41 52 59 20 54 52 41 43
>>> 54 20 49 4E 46 45 43 54 49 4F 4E, finish:7F 7F 7F 7F 7F 7F 7F 7F 7F 7F
>>> 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F 7F FF FF
>>> FF 7F FF FF FF 7F 55 52 49 4E 41 52 59 20 54 52 41 43 54 20 49 4E 46 45
>>> 43 54 49 4F 4E, reversed:false, count:10000))
>>>
>>> This SlicePredicate is supposed to fetch all the columns with the string
>>> '55 52 49 4E 41 52 59 20 54 52 41 43 54 20 49 4E 46 45 43 54 49 4F 4E'
>>> ("URINARY TRACT INFECTION"), regardless of the other bytes in the column
>>> name. So the start and finish have the same string bytes. The rest of
>>> the bytes for the start value are set to the lowest possible value (32
>>> zero bytes and the bytes FF FF FF FF representing the integer value -1);
>>> the finish is set to the highest possible value (32 bytes with value 7F,
>>> ...).
>>>
>>> I tested the same code but with a small data set, and all seemed to be
>>> OK. Even on the same database I get back results without exception if I
>>> use different string values. I'm almost sure that there should be
>>> columns with that string. If the string is not present, I don't get the
>>> error.
>>>
>>> From: Aaron Morton [mailto:aa...@thelastpickle.com]
>>> Sent: Monday, May 16, 2011 11:33 PM
>>> To: user@cassandra.apache.org
>>> Subject: Re: AssertionError
>>>
>>> The code is trying to follow the column index for a row in an sstable,
>>> but it cannot skip as many bytes as it would like to get to the column.
>>> Helpfully, the documentation says that running out of bytes is only one
>>> of the reasons why this could happen :)
>>>
>>> Can you provide some more information about the query and the data, and
>>> also the upgrade history for your cluster?
>>>
>>> Thanks
>>> Aaron
>>>
>>> On 17/05/2011, at 3:07 AM, "Desimpel, Ignace"
>>> <ignace.desim...@nuance.com> wrote:
>>>
>>> Environment: Java 64-bit server, Java client, thrift get_slice method,
>>> Cassandra 0.7.4, single node.
>>>
>>> Depending on the data I pass for a query on a CF, I get the exception
>>> listed below. Any suggestions on what could be wrong, based on the stack
>>> trace?
>>>
>>> java.lang.AssertionError
>>>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:176)
>>>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:120)
>>>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:48)
>>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>     at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
>>>     at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:282)
>>>     at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:325)
>>>     at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:229)
>>>     at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
>>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>     at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:116)
>>>     at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:130)
>>>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1368)
>>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
>>>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
>>>     at org.apache.cassandra.db.Table.getRow(Table.java:333)
>>>     at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
>>>     at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
>>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>     at java.lang.Thread.run(Thread.java:662)
>>>
>>> Ignace Desimpel
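[Editor's note: the skipBytes caveat discussed in this thread can be demonstrated with plain JDK streams. DataInput.skipBytes (like InputStream.skip) is allowed to skip fewer bytes than requested, so robust code loops until the full count is gone or EOF is reached. The skipFully helper below is an illustrative sketch of that pattern, not Cassandra's actual 0.7.5 fix.]

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class SkipFully {
    // Keeps calling skipBytes until n bytes are gone; a single call is
    // allowed to skip fewer bytes than asked, even mid-stream.
    static void skipFully(DataInputStream in, int n) throws IOException {
        while (n > 0) {
            int skipped = in.skipBytes(n);
            if (skipped <= 0) {
                // No progress: distinguish a genuine EOF from a stall.
                if (in.read() < 0) {
                    throw new EOFException(n + " bytes left to skip at EOF");
                }
                skipped = 1; // the single read() above consumed one byte
            }
            n -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50};
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        skipFully(in, 3);
        System.out.println(in.read()); // prints 40, the byte after the skipped three
    }
}
```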