Absolutely right! Thanks, fixed for 0.7.4.

On Fri, Mar 11, 2011 at 4:14 PM, Erik Forkalsrud <eforkals...@cj.com> wrote:
> On 03/11/2011 12:13 PM, Jonathan Ellis wrote:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-2158, fixed in 0.7.3
>>
>> you could have saved a lot of time just by upgrading first. :)
>
>
> It looks like the fix isn't entirely correct. The bug is still in 0.7.3.
> In Memtable.java, the line:
>
>    THRESHOLD = cfs.getMemtableThroughputInMB() * 1024 * 1024;
>
> should be changed to:
>
>    THRESHOLD = cfs.getMemtableThroughputInMB() * 1024L * 1024L;
>
>
> Here's some code that illustrates the difference:
>
>    public void testMultiplication() {
>        int memtableThroughputInMB = 2300;
>        long thresholdA = memtableThroughputInMB * 1024 * 1024;
>        long thresholdB = memtableThroughputInMB * 1024L * 1024L;
>        System.out.println("a=" + thresholdA + " b=" + thresholdB);
>    }
>
>
> - Erik -
>
>
>> On Fri, Mar 11, 2011 at 2:02 PM, Erik Forkalsrud <eforkals...@cj.com>
>> wrote:
>>>
>>> On 03/11/2011 04:56 AM, Zhu Han wrote:
>>>>
>>>> When I run it on my laptop (Fedora 14, 64-bit, 4 cores, 8GB RAM) it
>>>> flushes one Memtable with 5000 operations
>>>> When I run it on a server (RHEL5, 64-bit, 16 cores, 96GB RAM) it
>>>> flushes 100 Memtables with anywhere between 1 operation and 359
>>>> operations (35 bytes and 12499 bytes)
>>>
>>> What are the settings for commit log flush, periodic or batch?
>>>
>>>
>>> It's whatever the default setting is (in the cassandra.yaml that is
>>> packaged in the apache-cassandra-0.7.3-bin.tar.gz download), specifically:
>>>
>>>    commitlog_rotation_threshold_in_mb: 128
>>>    commitlog_sync: periodic
>>>    commitlog_sync_period_in_ms: 10000
>>>    flush_largest_memtables_at: 0.75
>>>
>>> If I describe keyspace I get:
>>>
>>>    [default@unknown] describe keyspace Events;
>>>    Keyspace: Events:
>>>      Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>>        Replication Factor: 1
>>>      Column Families:
>>>        ColumnFamily: Event
>>>          Columns sorted by: org.apache.cassandra.db.marshal.TimeUUIDType
>>>          Row cache size / save period: 0.0/0
>>>          Key cache size / save period: 200000.0/14400
>>>          Memtable thresholds: 14.109375/3010/1440
>>>          GC grace seconds: 864000
>>>          Compaction min/max thresholds: 4/32
>>>          Read repair chance: 1.0
>>>          Built indexes: []
>>>
>>>
>>> It turns out my suspicion was right. When I tried overriding the JVM
>>> memory parameters calculated in conf/cassandra-env.sh to use the values
>>> calculated on my 8GB laptop like this:
>>>
>>>    MAX_HEAP_SIZE=3932m HEAP_NEWSIZE=400m ./mutate.sh
>>>
>>> That made the server behave much nicer. This time it kept all 5000
>>> operations in a single Memtable. Also, when running with these memory
>>> settings the Memtable thresholds changed to "1.1390625/243/1440" (from
>>> "14.109375/3010/1440"). All the other output from "describe keyspace"
>>> remains the same.
>>>
>>> So it looks like something goes wrong when Cassandra gets too much
>>> memory.
>>>
>>>
>>> --
>>> Erik Forkalsrud
>>> Commission Junction
>>>
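
A slightly fuller version of Erik's test, as a minimal sketch: the class and method names below are made up for illustration and are not from the Cassandra source. It runs both forms of the multiplication over a few throughput values, including the 243 and 3010 that appear in the "describe keyspace" output quoted above.

```java
// Hypothetical demo class; not part of the Cassandra source.
class ThresholdOverflow {

    // int * int * int is evaluated in 32-bit arithmetic and only the
    // (possibly wrapped) result is widened to long on return.
    static long thresholdInt(int throughputInMB) {
        return throughputInMB * 1024 * 1024;
    }

    // The first long literal promotes the whole expression to 64-bit
    // arithmetic, so the product cannot wrap for any int input.
    static long thresholdLong(int throughputInMB) {
        return throughputInMB * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // 243 and 3010 come from the quoted "describe keyspace" output,
        // 2300 from Erik's test; 2047 and 2048 bracket the int limit.
        int[] samples = {243, 2047, 2048, 2300, 3010};
        for (int mb : samples) {
            System.out.println(mb + " MB: int=" + thresholdInt(mb)
                    + " long=" + thresholdLong(mb));
        }
    }
}
```

With int arithmetic, any value above 2047 MB wraps to a negative number, so 3010 yields a negative threshold while 243 does not. That would be consistent with the 96GB server flushing after only a few operations and the 8GB settings behaving normally.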
-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com