Here's my summary of the state of the compression discussion:

   1. We all agree that current compression performance isn't very good and
   it would be nice to improve it.
   2. This is not entirely due to actual (de)compression; in large part it
   is inefficiencies in the current implementation. Snappy runs at roughly
   300 MB/sec/core, so it should not be a bottleneck. We could probably hugely
   improve performance without any fundamental changes. See:
   https://issues.apache.org/jira/browse/KAFKA-527
   3. There are really three separate things that get conflated:
      1. De-compression on the server
      2. Re-compression on the server
      3. De-compression and re-compression in mirror maker
   4. Getting rid of de-compression on the server is unlikely to happen
   because de-compression is required to validate the data sent. In the very
   early days of Kafka we did indeed just append whatever the client sent us
   to the binary log without validation. Then we realized that any bug in any
   of the clients in any of the languages could corrupt the log and thus
   potentially bring down the whole cluster. You can imagine how we
   realized this! This is why basically no system in the world appends client
   data directly to its binary on-disk structures. Decompression can
   potentially be highly optimized, though, by not fully instantiating
   messages.
   5. The current compression code re-compresses the data to assign it
   sequential offsets. It would be possible to improve this by allowing some
   kind of relative offset scheme where the individual messages would have
   offsets like (-3,-2,-1, 0) and this would be interpreted relative to the
   offset of the batch. This would let us avoid recompression for co-operating
   clients (see the sketch after this list).
   6. This would likely require bumping the log version. Prior to doing
   this we need to have better backwards compatibility support in place to
   make this kind of upgrade easy to do.
   7. Optimizing de-compression and re-compression in mirror maker requires
   APIs that give back uncompressed messages and let you send
   already-compressed batches. This might be possible, but it would break a
   lot of things like the proposed filters in mirror maker. We would also
   need to do this in a way that isn't too gross of an API.
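
To make point 5 concrete, here is a rough sketch of the relative-offset idea
(hypothetical names, not a worked-out wire format): each message stores a
small offset relative to the batch, and the broker only stamps the batch
offset, so assigning offsets no longer requires decompressing and
recompressing the batch.

    // Sketch only: relative offsets like -3, -2, -1, 0 for a 4-message batch,
    // interpreted relative to the batch offset (the offset of the last message).
    case class RelativeMessage(relativeOffset: Int, payload: Array[Byte])

    case class MessageBatch(var batchOffset: Long, messages: Seq[RelativeMessage]) {
      // Absolute offset = batch offset + relative offset.
      def absoluteOffset(m: RelativeMessage): Long = batchOffset + m.relativeOffset
    }

    // Broker side: offset assignment becomes a single field update per batch
    // instead of rewriting (and recompressing) every message.
    def assignOffsets(batch: MessageBatch, nextOffset: Long): Long = {
      batch.batchOffset = nextOffset + batch.messages.size - 1 // last message's offset
      nextOffset + batch.messages.size                         // next unused offset
    }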

-Jay

On Sun, Feb 22, 2015 at 2:16 AM, Jan Filipiak <jan.filip...@trivago.com>
wrote:

> I just want to bring up that idea of no server-side de/recompression
> again. Features like KAFKA-1499
> <https://issues.apache.org/jira/browse/KAFKA-1499> seem to steer the
> project in a different direction, and the fact that tickets like KAFKA-845
> <https://issues.apache.org/jira/browse/KAFKA-845> are not getting much
> attention gives the same impression. This is something my head keeps
> spinning around almost 24/7 recently.
>
> The problem I see is that CPUs are not the cheapest part of a new server,
> and if you can spare a few gigahertz or some cores just by making sure your
> configs are the same across all producers, I would always opt for that
> operational overhead instead of bigger servers. I think this will
> usually decrease the TCO of Kafka installations.
>
> I am currently not familiar enough with the codebase to judge whether
> server-side decompression happens before the acknowledgement is sent. If
> so, sparing the de/recompression would also shave some additional
> milliseconds off the response time.
>
> Those are my thoughts about server side de/recompression. It would be
> great if I could get some responses and thoughts back.
>
> Jan
>
>
>
>
> On 07.11.2014 00:23, Jay Kreps wrote:
>
>> I suspect it is possible to save and reuse the CRCs though it might be a
>> bit of an invasive change. I suspect the first usage is when we are
>> checking the validity of the messages and the second is when we
>> rebuild the compressed message set (I'm assuming you guys are using
>> compression because I think we optimize this out otherwise). Technically I
>> think the CRCs stay the same.
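>>
>> Purely as an illustration of that first idea (hypothetical names, not the
>> actual kafka.message.Message API): memoize the checksum computed during
>> validation so that rebuilding the compressed message set reuses it instead
>> of running Crc32 over the payload a second time.
>>
>>     import java.util.zip.CRC32
>>
>>     // Sketch only: a message wrapper that computes its CRC once and caches it.
>>     class CachedCrcMessage(payload: Array[Byte]) {
>>       private var cached: Long = -1L  // CRC32 values are always non-negative
>>
>>       def checksum: Long = {
>>         if (cached < 0) {
>>           val crc = new CRC32()
>>           crc.update(payload, 0, payload.length)
>>           cached = crc.getValue
>>         }
>>         cached
>>       }
>>     }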
>>
>> An alternative approach, though, would be to remove the need for
>> recompression on the broker entirely by making the offsets in the
>> compressed messages relative to the base offset of the message set. This is
>> a much more invasive change, but potentially better, since that
>> recompression is also quite CPU heavy.
>>
>> -Jay
>>
>> On Thu, Nov 6, 2014 at 2:36 PM, Allen Wang <aw...@netflix.com.invalid>
>> wrote:
>>
>>> Sure. Here is the link to the screenshot of jmc with the JFR file
>>> loaded:
>>>
>>> http://picpaste.com/fligh-recorder-crc.png
>>>
>>>
>>>
>>> On Thu, Nov 6, 2014 at 2:12 PM, Neha Narkhede <neha.narkh...@gmail.com>
>>> wrote:
>>>
>>>> Allen,
>>>>
>>>> Apache mailing lists don't allow attachments. Could you please link to a
>>>> pastebin or something?
>>>>
>>>> Thanks,
>>>> Neha
>>>>
>>>> On Thu, Nov 6, 2014 at 12:02 PM, Allen Wang <aw...@netflix.com.invalid>
>>>> wrote:
>>>>
>>>>> After digging more into the stack trace obtained from flight recorder
>>>>> (which is attached), it seems that Kafka (0.8.1.1) can optimize the
>>>>> usage of Crc32. The stack trace shows that Crc32 is invoked twice from
>>>>> Log.append(). The first is from line 231:
>>>>>
>>>>> val appendInfo = analyzeAndValidateMessageSet(messages)
>>>>>
>>>>> The second is from line 252 in the same function:
>>>>>
>>>>> validMessages = validMessages.assignOffsets(offset, appendInfo.codec)
>>>>>
>>>>> If one of the Crc32 invocations can be eliminated, we are looking at
>>>>> saving at least 7% of CPU usage.
>>>>>
>>>>> Thanks,
>>>>> Allen
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 5, 2014 at 6:32 PM, Allen Wang <aw...@netflix.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Using flight recorder, we have observed high CPU usage of CRC32
>>>>>> (kafka.utils.Crc32.update()) on the Kafka broker. It uses as much as
>>>>>> 25% of CPU on an instance. Tracking down the stack trace, this method
>>>>>> is invoked by ReplicaFetcherThread.
>>>>>>
>>>>>> Is there any tuning we can do to reduce this?
>>>>>>
>>>>>> Also on the topic of CPU utilization, we observed that overall CPU
>>>>>> utilization is proportional to the AllTopicsBytesInPerSec metric. Does
>>>>>> this metric include incoming replication traffic?
>>>>>>
>>>>>> Thanks,
>>>>>> Allen
>>>>>>
>>>>>>
>>>>>>
>
