Yea, you're definitely on the right track. Have you considered systems programming, Friso? :)
Hopefully should have a candidate patch to LZO later today.

-Todd

On Wed, Jan 12, 2011 at 1:20 PM, Friso van Vollenhoven <fvanvollenho...@xebia.com> wrote:
> Hi,
> My guess is indeed that it has to do with using the reinit() method on
> compressors and making them long lived instead of throwaway, together with
> the LZO implementation of reinit(), which magically causes NIO buffer
> objects not to be finalized and as a result not release their native
> allocations. It's just a theory and I haven't had the time to properly verify
> it (unfortunately, I spend most of my time writing application code), but
> Todd said he will be looking into it further. I browsed the LZO code to see
> what was going on there, but with my limited knowledge of the HBase code it
> would be bold to say that this is for sure the case. It would be my first
> direction of investigation. I would add some logging to the LZO code where
> new direct byte buffers are created, to log how often that happens and what
> size they are, and then redo the workload that shows the leak. Together with
> some profiling you should be able to see how long it takes for these to get
> finalized.
>
> Cheers,
> Friso
>
>
> On 12 jan 2011, at 20:08, Stack wrote:
>
> > 2011/1/12 Friso van Vollenhoven <fvanvollenho...@xebia.com>:
> >> No, I haven't. But the Hadoop (mapreduce) LZO compression is not the
> >> problem. Compressing the map output using LZO works just fine. The
> >> problem is HBase LZO compression. The region server process is the one
> >> with the memory leak...
> >
> > (Sorry for the dumb question, Friso) But HBase is leaking because we
> > make use of the Compression API in a manner that produces leaks?
> > Thanks,
> > St.Ack

--
Todd Lipcon
Software Engineer, Cloudera
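For anyone following along: Friso's theory can be illustrated with a minimal sketch. This is NOT the actual hadoop-lzo code — the class name, buffer size, and reinit() body below are hypothetical — it just shows why replacing a direct ByteBuffer on every reinit() of a long-lived compressor can grow native memory: allocateDirect() takes memory outside the Java heap, and that memory is only returned when the old buffer object is garbage collected and finalized, which may be a long time if the heap itself is under no pressure.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the suspected leak pattern (not hadoop-lzo code).
public class LeakySketch {
    static final int BUFFER_SIZE = 64 * 1024; // assumed block size

    ByteBuffer directBuf;

    // Mimics a Compressor.reinit()-style call that replaces its buffer.
    void reinit() {
        // The previous direct buffer becomes unreachable here, but its
        // native allocation is only freed once GC finalizes the object.
        directBuf = ByteBuffer.allocateDirect(BUFFER_SIZE);
        // The kind of logging Friso suggests adding at allocation sites:
        System.out.println("allocated direct buffer of " + BUFFER_SIZE + " bytes");
    }

    public static void main(String[] args) {
        LeakySketch compressor = new LeakySketch();
        for (int i = 0; i < 5; i++) {
            // Each call strands another BUFFER_SIZE bytes of native
            // memory until the discarded buffer happens to be collected.
            compressor.reinit();
        }
    }
}
```

If the region server reuses compressors instead of throwing them away, and reinit() allocates rather than reuses the buffer, native memory can climb even though the Java heap looks healthy — which matches the reported symptom.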