Re: compression performance

2013-08-16 Thread Jan Kotek
> you see that with no compression 80% of the time goes to FileChannel.write,
> But with snappy enabled only 5% goes to writing data, 50% of the time goes
> to byte copying and allocation, and only about 22% goes to actual
I had a similar problem with MapDB; it was solved by using memory mapped fil
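As a rough illustration of the memory-mapped approach mentioned in the message above, the sketch below writes the same records once through FileChannel.write and once through a MappedByteBuffer. It is not code from MapDB or Kafka: the class MappedAppendSketch and both method names are hypothetical, and the truncated message does not say how MapDB actually did it. Whether mapping helps in practice depends on the workload.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not MapDB or Kafka code.
final class MappedAppendSketch {

    // Baseline: one FileChannel.write call (or more) per record.
    static void writeWithChannel(Path file, byte[] record, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            for (int i = 0; i < count; i++) {
                buf.rewind();
                // write() may be partial, so loop until the buffer is drained.
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
        }
    }

    // Alternative: map the target region once, then copy records into it.
    static void writeWithMapping(Path file, byte[] record, int count) throws IOException {
        long size = (long) record.length * count;
        // READ_WRITE mapping needs a channel opened for both reading and writing.
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // Mapping past the current end of file grows the file to 'size' bytes.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (int i = 0; i < count; i++) {
                map.put(record); // plain memory copy, no system call per record
            }
            map.force(); // ask the OS to flush the dirty pages to storage
        }
    }
}
```

Both methods should leave identical bytes on disk; the difference is only in how the bytes get there, which is what the profile numbers quoted above are about.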

Re: compression performance

2013-08-15 Thread Jay Kreps
Sriram, I think I agree. Guozhang's proposal is clever but it exposes a lot of complexity to the consumer. But I think it is good to have the complete discussion. Chris, we will certainly not mess up the uncompressed case, don't worry. I think your assumption is that compression needs to be slow.

Re: compression performance

2013-08-15 Thread Chris Hogue
I would generally agree with the key goals you've suggested. I'm just coming to this discussion after some recent testing with 0.8 so I may be missing some background. The reference I found to this discussion is the JIRA issue below. Please let me know if there are other things I should look at.

Re: compression performance

2013-08-15 Thread Sriram Subramanian
We need to first decide on the right behavior before optimizing on the implementation. A few key goals that I would put forward are -
1. Decoupling compression codec of the producer and the log
2. Ensuring message validity by the server on receiving bytes. This is done by the iterator today and thi

Re: compression performance

2013-08-15 Thread Jay Kreps
Here is a comment from Guozhang on this issue. He posted it on the compression byte-copying issue, but it is really about not needing to do compression. His suggestion is interesting though it ends up pushing more complexity into consumers. Guozhang Wang commented on KAFKA-527: ---