On Mon, Apr 20, 2015 at 10:16 AM, Kevin Burton <bur...@spinn3r.com> wrote:
> On Mon, Apr 20, 2015 at 6:24 AM, Tim Bain <tb...@alumni.duke.edu> wrote:
>
> > I'm confused about what would drive the need for this.
> >
> > Is it the ability to hold more messages than your JVM size allows? If so,
> > we already have both KahaDB and LevelDB; what does Chronicle offer that
> > those other two don't?
>
> The ability to pack more messages into the JVM but keep it in memory, which
> is MUCH faster than disk, even with message serialization.
>
> Also, the ability to avoid GC lock pauses during compaction of large heaps.
>
> This is already a technique used in other databases. For example,
> Cassandra does slab allocation, where memtables are stored off heap for the
> same reasons.
>
> > Is it because you see some kind of inefficiency in how ActiveMQ uses memory
> > or how the JVM's GC strategies work?
>
> Both, and how hash table re-allocation works.
>
> > If so, can you elaborate on what
> > you're concerned about? (You made a statement that sounds like "the JVM
> > can only use half its memory, because the other half has to be kept free
> > for GCing",
>
> I misspoke. I meant to say hash re-allocation.
>
> The three things this solves are:
>
> 1. MUCH tighter storage of objects in memory. A 15-20x memory saving is
> possible because Java is very bad at representing objects in memory.

I hadn't realized it could drive that large of a reduction in memory usage;
that makes sense for why it would be valuable for people who are keeping
lots of messages in memory. I'm not sure we'd make the same tradeoff (though
we might to save money on RAM), so as usual my response is "great idea, but
make it a configurable option rather than the only way to do it", but now I
can certainly see why you'd want it.

> 2. Lower full GC pauses since the JVM doesn't have to copy 10GB of RAM
> each full GC.
>
> 3. Additional free memory because less is needed during hashtable
> re-allocation (though I haven't been able to duplicate this in practice
> with LinkedHashMap).
>
> > which doesn't match my experience at all. I've observed G1GC
> > to successfully GC when the heap was nearly 100% full; I'm certain it's not
> > a problem for CMS because CMS is a non-compacting Old Gen GC strategy -
> > that's why it's subject to fragmentation - and I believe that ParallelGC
> > does in-place compaction, so it wouldn't require additional memory, though
> > I haven't directly observed it during a GC. Please either correct my
> > interpretation of your statement or provide the data you're basing it
> > on.)
>
> #1 above is my BIG motivating factor. I haven't seen #2 or #3 in
> production, though I may have seen #3 with parallel GC. I resolved it by
> allocating more memory.

If you're not using G1GC, you really should consider it. Pure throughput is
a bit lower (G1 comes with more overhead than Parallel GC), but the
frequency of full GCs is way lower because it's able to do incremental
collection of most Old Gen objects. Some Old Gen objects fail to collect
during the incremental phase but successfully collect during the full GC
(and we haven't spent the time to see if tuning the G1GC parameters allows
all of them to collect), but it's still a huge improvement over how often
full GCs happened under Parallel GC.
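Also, just to make sure we're picturing the same thing on #1: here's a rough
sketch of the kind of packed, off-heap layout I assume you have in mind.
This is only an illustration with made-up names (it's not Chronicle's actual
API or anything from ActiveMQ): message bodies written as length-prefixed
UTF-8 bytes into a single direct buffer, instead of living on the heap as
object graphs with per-object headers and references.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    // Illustration only: message bodies packed into one off-heap slab as
    // length-prefixed UTF-8 bytes, rather than as heap object graphs.
    public class PackedMessageBuffer {

        // 64 MB off-heap slab; a real implementation would grow or chain slabs.
        private final ByteBuffer slab = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        /** Appends a message body and returns its offset within the slab. */
        public synchronized int append(String body) {
            byte[] utf8 = body.getBytes(StandardCharsets.UTF_8);
            int offset = slab.position();
            slab.putInt(utf8.length); // 4-byte length prefix
            slab.put(utf8);           // packed payload: no object header, no padding
            return offset;
        }

        /** Materializes a body back into a normal, short-lived String. */
        public String read(int offset) {
            ByteBuffer view = slab.duplicate(); // independent position, shared storage
            view.position(offset);
            byte[] utf8 = new byte[view.getInt()];
            view.get(utf8);
            return new String(utf8, StandardCharsets.UTF_8); // young-gen garbage after use
        }
    }

Per entry that's four bytes of framing overhead instead of object headers,
references, and a char[] at two bytes per character, which is where a large
(though obviously workload-dependent) saving would come from.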
> > One difference in GC behavior with what you're proposing is that under your
> > algorithm you'd GC each message at least twice (once when it's received and
> > put into Chronicle, and once when it's pulled from Chronicle and sent
> > onward,
>
> Oh yes... but it's in the young generation. It's just normal garbage.

If you're using G1GC, even messages that make it to Old Gen are just normal
garbage...

> > plus any additional reads needed to operate on the message, such as
> > if a new subscriber with a non-matching selector connected to the broker)
> > instead of just once under the current algorithm.
>
> This is a traditional space / time tradeoff.
>
> I'm trading a bit more CPU time for a LOT more memory. So if it's 5% more
> CPU time for 15x more memory, that's a big $ savings.
>
> It can still do something like 180M messages per second encode/decode,
> which is more than fine. If you're doing THAT many messages on your broker,
> you probably have other performance issues.
>
> We're doing about 1000 transactions per second on our broker.
>
> > One other thing: this would give compression at rest, but not in motion,
> > and it comes at the expense of two serialization/deserialization and
> > compression/decompression operations per broker traversed.
>
> Yes, not in motion. When you read a message, it's temporarily decompressed,
> then the message is in Java as a real message, and then it's just discarded
> by a young-generation GC.
>
> But it's part of normal GC at this point, and won't waste any more memory.

My point was simply that this doesn't work instead of compression; it's a
complementary feature, not an improved replacement. But as I thought about
it, I realized that when you read the message body to serialize it to
Chronicle, you're likely to invoke the decompression code and end up undoing
the compression of the message. (Maybe you realized the same thing in your
last line; I wasn't sure.) So you either need a flag that'll tell you to
recompress it when you read it back out of Chronicle (which then means
you're doing wasteful decompress/compress operations) or you'll need to
access the compressed bytes without running through the decompression code
(e.g. using reflection to grab the field directly). Either way, make sure
the message is still compressed when you send it onwards if it was
compressed when you received it...

> Also, there's going to be a memory savings here WITHOUT compression.
> Strings can be stored as UTF-8 internally and will be MUCH more efficient.
> Plus, you don't have the JVM memory overhead of non-packed objects. The
> Chronicle objects would be packed, so you would get REALLY good storage
> efficiency.
>
> You could use Snappy compression if you wanted to, but it would depend on
> your benchmarks. Maybe it's not needed. Snappy is VERY fast, though. Like
> 150MB/s on a single core.
>
> > Maybe being
> > able to store more messages in a given amount of memory is worth it to you
> > (your volumes seem a lot higher than ours, and than most installations'),
> > but latency and throughput matter more to us than memory usage, so we'd
> > live with using more memory to avoid the extra operations.
>
> Yes. Latency and throughput matter MUCH more to us too... which is why
> we're using memory.
>
> And the latencies and throughput in this situation would be MUCH MUCH
> higher than what you can get from KahaDB and LevelDB.

Absolutely. And lower than what you could get from using Chronicle-less
memory. So it could be a great additional option, for when the tradeoff was
worth it to people.
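To make the recompress-flag idea from a few paragraphs up a bit more
concrete, here's roughly what I'm picturing: store the body bytes exactly as
they arrived plus a "compressed" flag, hand the bytes back untouched when
forwarding, and only inflate them when the broker genuinely needs to look
inside. This is only a sketch; it assumes zlib-deflated content and made-up
class/field names, not ActiveMQ's actual message classes or framing.

    import java.io.ByteArrayOutputStream;
    import java.util.zip.DataFormatException;
    import java.util.zip.Inflater;

    // Sketch of keeping a message body compressed at rest: forwarding returns
    // the stored bytes untouched, and decompression only happens when the body
    // itself is needed (e.g. for a body-based selector). Assumes zlib-deflated
    // content; not ActiveMQ's real message layout or field names.
    public class StoredBody {

        private final byte[] bytes;        // possibly still compressed, exactly as received
        private final boolean compressed;  // the "flag" from the discussion above

        public StoredBody(byte[] bytes, boolean compressed) {
            this.bytes = bytes;
            this.compressed = compressed;
        }

        /** For forwarding onwards: no decompress/compress round trip. */
        public byte[] bytesForForwarding() {
            return bytes;
        }

        /** Only pay for decompression when the broker actually needs the body. */
        public byte[] bodyForInspection() throws DataFormatException {
            if (!compressed) {
                return bytes;
            }
            Inflater inflater = new Inflater();
            inflater.setInput(bytes);
            ByteArrayOutputStream out = new ByteArrayOutputStream(bytes.length * 4);
            byte[] buffer = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated or odd input; good enough for a sketch
                }
                out.write(buffer, 0, n);
            }
            inflater.end();
            return out.toByteArray();
        }
    }

The detail of whether that flag lives on the stored record or gets re-derived
from the message's own compressed marker when it's rehydrated doesn't matter
much; the point is just that the forwarding path never touches the inflater.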
> > The question about why to use message bodies at all is an interesting one,
> > though the ability to compress the body once and have it stay compressed
> > through multiple network writes is a compelling reason in the near term.
>
> Ah, yes. That's a good point. The network stack would need to be
> rewritten to avoid a decompression from memory, then a compression again
> when it's sent over the wire.

I don't think it's the network stack where that code lives; I'm pretty sure
the message itself does the decompression when the body is accessed via the
getter. But when you read the message body to serialize it to Chronicle,
you're likely to invoke that decompression code and end up undoing the
compression of the message. So you could use a flag that'll tell you to
recompress it when you read it back out of Chronicle, but then you're doing
wasteful decompress/compress operations. It would be better to access the
compressed bytes without running through the decompression code (e.g. using
reflection to grab the field directly) to avoid those wasteful operations.
Either way, make sure the message is still compressed when you send it
onwards if it was compressed when you received it...

> Kevin
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
>
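P.S. Since I keep hand-waving at "reflection to grab the field directly",
this is the sort of thing I mean. The field name you'd pass in is an
assumption on my part (I haven't checked which field the message class
actually uses), and if there turns out to be a public accessor that returns
the raw bytes without inflating them, obviously use that instead.

    import java.lang.reflect.Field;

    // Sketch of the "reflection to grab the field directly" idea: read a
    // message's raw (possibly still-compressed) body out of its backing field
    // so that serializing it off-heap doesn't go through the decompressing
    // getter. The field name supplied by the caller is an assumption for
    // illustration, not a verified ActiveMQ identifier.
    public final class RawContentAccessor {

        private RawContentAccessor() {
        }

        /** Returns the raw value of the named field, or null if it can't be read. */
        public static Object rawFieldValue(Object message, String fieldName) {
            try {
                Field field = findField(message.getClass(), fieldName);
                field.setAccessible(true);
                return field.get(message);
            } catch (ReflectiveOperationException e) {
                return null; // in real code, fall back to the normal (decompressing) getter
            }
        }

        // Walk up the class hierarchy, since the field may sit on a superclass.
        private static Field findField(Class<?> type, String name) throws NoSuchFieldException {
            for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                try {
                    return c.getDeclaredField(name);
                } catch (NoSuchFieldException ignored) {
                    // keep looking in the superclass
                }
            }
            throw new NoSuchFieldException(name);
        }
    }

Usage would be something like rawFieldValue(message, "content"), where
"content" is only my guess at the field name and the returned object might be
a byte-holder wrapper rather than a plain byte[].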