Thanks for tracking that down! In 0.7, OPP adds checks that keys are valid UTF-8 (and if you're starting from scratch you should use BOP instead), so it shouldn't be an issue there.
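For anyone following along, here is a minimal sketch of the kind of well-formed-UTF-8 check an order-preserving partitioner has to apply to keys before it can treat them as strings. This is illustrative only, not the actual 0.7 OPP code; the KeyValidation class and validateUtf8Key method are made-up names:

    import java.nio.ByteBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.CharsetDecoder;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;

    public class KeyValidation {
        // Reject any key whose bytes are not well-formed UTF-8.
        // Illustrative sketch only, not the real Cassandra check.
        public static void validateUtf8Key(ByteBuffer key) {
            CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT);
            try {
                decoder.decode(key.duplicate());
            } catch (CharacterCodingException e) {
                throw new IllegalArgumentException("row key is not valid UTF-8", e);
            }
        }
    }

Rejecting such keys at write time means the string view and the byte view of the key space can never get out of step in the first place.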
On Mon, May 2, 2011 at 7:39 AM, Daniel Doubleday <daniel.double...@gmx.net> wrote:
> Just for the record:
>
> The problem had nothing to do with bad memory. After some more digging it
> turned out that, due to a bug, we wrote invalid UTF-8 sequences as row keys.
> In 0.6 the key tokens are constructed from string-decoded bytes; this no
> longer happens in 0.7 files. So what apparently happened during compaction was:
>
> 1. read the sstable and order the rows by string-decoded keys
> 2. write the new file based on that order
> 3. read the compacted file based on raw-bytes order -> crash
>
> That bug never made it to production, so we are fine.
>
> On Apr 29, 2011, at 10:32 AM, Daniel Doubleday wrote:
>
>> Bad == Broken
>>
>> That means you cannot rely on 1 == 1. In such a scenario anything can
>> happen, including data loss.
>> That's why you want ECC memory on production servers. Our cheapo dev
>> boxes don't.
>>
>> On Apr 28, 2011, at 7:46 PM, mcasandra wrote:
>>
>>> What do you mean by bad memory? Is it too little heap, OOM issues, or
>>> something else? What happens in such a scenario, is there data loss?
>>>
>>> Sorry for the many questions, just trying to understand, since the data
>>> is critical after all :)
>>>
>>> --
>>> View this message in context:
>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Strange-corrupt-sstable-tp6314052p6314218.html
>>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
>>> Nabble.com.
>>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
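To make the mismatch described in the quoted message concrete: once an invalid UTF-8 sequence is involved, ordering keys by their string-decoded form can disagree with ordering them by raw bytes, because the invalid bytes decode to the replacement character U+FFFD, which sorts very differently from the original lead byte. The following is only a sketch of that effect under those assumptions, not Cassandra code; the class and helper names are made up:

    import java.nio.charset.StandardCharsets;

    public class OrderingMismatch {
        public static void main(String[] args) {
            byte[] a = { (byte) 0x80 };                 // lone continuation byte: invalid UTF-8
            byte[] b = { (byte) 0xC3, (byte) 0xA9 };    // valid UTF-8 for "é" (U+00E9)

            // Decoding replaces the invalid byte with U+FFFD, so as strings a sorts AFTER b...
            String decodedA = new String(a, StandardCharsets.UTF_8); // "\uFFFD"
            String decodedB = new String(b, StandardCharsets.UTF_8); // "é"
            System.out.println("string-decoded: " + Integer.signum(decodedA.compareTo(decodedB))); // 1

            // ...but as raw unsigned bytes a sorts BEFORE b (0x80 < 0xC3).
            System.out.println("raw bytes:      " + Integer.signum(compareUnsigned(a, b)));        // -1
        }

        // Lexicographic comparison of unsigned bytes: the "raw-bytes order" view.
        static int compareUnsigned(byte[] x, byte[] y) {
            int n = Math.min(x.length, y.length);
            for (int i = 0; i < n; i++) {
                int cmp = (x[i] & 0xFF) - (y[i] & 0xFF);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return x.length - y.length;
        }
    }

A file written in the first order and then read back assuming the second looks out of order, which is consistent with the crash appearing only after compaction.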