Strange things happen.

It wasn't a single row, but one single "part" file of the Hadoop input that failed - we didn't manage to find a specific row that causes the problem. However, it keeps failing only on production, where we can't experiment with it much. We tried to reproduce it in a few ways on 3 different environments, but we weren't able to.

We have to leave this problem for now.
Thanks for the help anyway :-)

M.

On 02.04.2013 10:02, Michal Michalski wrote:
Thanks for the reply, Aaron. Unfortunately, I think that's not the case - we did
some quick tests last week and for now it _seems_ that:

1) There was no empty / zero-length key in the data we loaded - that was the
first thing we checked
2) By "bisecting" the data, we found out that the row that makes the
problem is the one with longest key (184 characters; much longer that
other keys we have in this file, but it's still not much and definitely
far, far below the 64K limit mentioned here:
http://wiki.apache.org/cassandra/FAQ#max_key_size ) - not sure yet if it
matters, but it's the only thing that makes him different. It has only
one, short column - nothing special.
3) Loading the same data using Thrift finished with no error, but the
row we have a problem with is NOT present in Cassandra - this is so
strange that I'll double-check it.

However, we'll try to do a few more tests in the next few days to make 100%
sure what in our data causes the problem. I'll update you if we learn
something new.

M.

On 31.03.2013 12:01, aaron morton wrote:
  but yesterday one of 600 mappers failed

:)

 From what I can understand by looking into the C* source, it seems
to me that the problem is caused by an empty (or unexpectedly
truncated?) input buffer, causing the token to be set to -1, which is
invalid for RandomPartitioner:
Yes, there is a zero-length key which has a -1 token.

However, I can't figure out what the root cause of this problem is.
Any ideas?
mmm, the BulkOutputFormat uses an SSTableSimpleUnsortedWriter, and
neither of them checks for zero-length row keys. I would look there first.

There is no validation in the AbstractSSTableSimpleWriter; not sure
if that is by design or an oversight. Can you catch the zero-length
key in your map job?
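
Something along these lines in the mapper would catch it - just a rough sketch, assuming the usual ByteBuffer / List<Mutation> output types for BulkOutputFormat; extractKey() and buildMutations() are placeholders for your own code:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.thrift.Mutation;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: drop zero-length keys before they reach BulkOutputFormat's record writer.
// extractKey() / buildMutations() are placeholders for the job-specific logic.
public class GuardedBulkLoadMapper extends Mapper<LongWritable, Text, ByteBuffer, List<Mutation>>
{
    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException
    {
        ByteBuffer rowKey = extractKey(line);       // placeholder: however you derive the row key
        if (rowKey == null || rowKey.remaining() == 0)
        {
            context.getCounter("bulkload", "skipped-empty-keys").increment(1);
            return;                                 // never hand a zero-length key to the writer
        }
        context.write(rowKey, buildMutations(line));
    }

    private ByteBuffer extractKey(Text line) { return null; /* placeholder */ }
    private List<Mutation> buildMutations(Text line) { return null; /* placeholder */ }
}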

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 2:26 PM, Michal Michalski <mich...@opera.com> wrote:

We're streaming data to Cassandra directly from a MapReduce job using
BulkOutputFormat. It's been working for more than a year without any
problems, but yesterday one of 600 mappers failed and we got a
strange-looking exception on one of the C* nodes.
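
For context, the job is wired up roughly like this (simplified sketch; the initial address is a placeholder and OurMapper stands in for the real mapper class):

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.hadoop.BulkOutputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Simplified job setup - map-only, streaming SSTables straight from the mappers.
public class BulkLoadJob
{
    public static void main(String[] args) throws Exception
    {
        Job job = new Job(new Configuration(), "bulk-load");
        job.setJarByClass(BulkLoadJob.class);
        job.setMapperClass(OurMapper.class);            // placeholder for the real mapper
        job.setNumReduceTasks(0);                       // mappers write directly to C*
        job.setOutputFormatClass(BulkOutputFormat.class);
        job.setOutputKeyClass(ByteBuffer.class);
        job.setOutputValueClass(List.class);
        ConfigHelper.setOutputColumnFamily(job.getConfiguration(), "production", "IndexedValues");
        ConfigHelper.setOutputInitialAddress(job.getConfiguration(), "cassandra-node");  // placeholder
        ConfigHelper.setOutputRpcPort(job.getConfiguration(), "9160");
        ConfigHelper.setOutputPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}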

IMPORTANT: It happens on one node and in one cluster only. We've
loaded the same data to a test cluster and it worked.


ERROR [Thread-1340977] 2013-03-28 06:35:47,695 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-1340977,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(5664330507961197044404922676062547179, 302c6461696c792c32303133303332352c312c646f6d61696e2c756e6971756575736572732c633a494e2c433a6d63635f6d6e635f636172726965725f43656c6c4f6e655f4b61726e6174616b615f2842616e67616c6f7265295f494e2c643a53616d73756e675f47542d49393037302c703a612c673a3133) >= current key DecoratedKey(-1, ) writing into /cassandra/production/IndexedValues/production-IndexedValues-tmp-ib-240346-Data.db
    at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:133)
    at org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:209)
    at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:179)
    at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)
    at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)



 From what I can understand by looking into the C* source, it seems
to me that the problem is caused by an empty (or unexpectedly
truncated?) input buffer, causing the token to be set to -1, which is
invalid for RandomPartitioner:

public BigIntegerToken getToken(ByteBuffer key)
{
    if (key.remaining() == 0)
        return MINIMUM;        // Which is -1
    return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
}
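
Just to illustrate the branch above (not code we actually run) - an empty buffer is the only way I can see to end up with token -1:

import java.nio.ByteBuffer;

public class EmptyKeyTokenCheck
{
    public static void main(String[] args)
    {
        // A zero-length key is exactly the case that takes the MINIMUM branch above,
        // which would explain the DecoratedKey(-1, ) in the exception.
        ByteBuffer emptyKey = ByteBuffer.allocate(0);
        System.out.println(emptyKey.remaining() == 0);  // true -> getToken() would return MINIMUM, i.e. -1
    }
}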

However, I can't figure out what the root cause of this problem is.
Any ideas?

Of course I can't exclude a bug in my code which streams this data,
but - as I said - it works when loading the same data to the test cluster
(which has a different number of nodes, and thus a different token
assignment, which might be a factor too).

Michał


