>> … number of children) into smaller messages, having to parse the large
>> xml root document and re-wrap each child element with a shallow clone of
>> its parents as I iterated the stream.
>>
>> -Luke
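
A minimal sketch of that splitting approach using StAX, assuming a
hypothetical <records> root with repeated <record> children. For brevity only
the root element is cloned (Luke's version re-wrapped the full ancestor
chain), and the println is a stand-in for the actual Kafka send:

import java.io.FileInputStream;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class XmlSplitter {
    public static void main(String[] args) throws Exception {
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        // "big.xml" and the element names are illustrative assumptions.
        XMLStreamReader reader =
                inFactory.createXMLStreamReader(new FileInputStream("big.xml"));

        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "record".equals(reader.getLocalName())) {
                StringWriter small = new StringWriter();
                XMLStreamWriter writer = outFactory.createXMLStreamWriter(small);
                writer.writeStartElement("records"); // shallow clone of the root
                copySubtree(reader, writer);         // this child, verbatim
                writer.writeEndElement();
                writer.close();
                System.out.println(small); // one Kafka-sized message
            }
        }
    }

    // Copies the element the reader is currently positioned on, plus its
    // whole subtree, to the writer.
    private static void copySubtree(XMLStreamReader r, XMLStreamWriter w)
            throws XMLStreamException {
        int depth = 0;
        do {
            switch (r.getEventType()) {
                case XMLStreamConstants.START_ELEMENT:
                    w.writeStartElement(r.getLocalName());
                    for (int i = 0; i < r.getAttributeCount(); i++) {
                        w.writeAttribute(r.getAttributeLocalName(i),
                                r.getAttributeValue(i));
                    }
                    depth++;
                    break;
                case XMLStreamConstants.CHARACTERS:
                    w.writeCharacters(r.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    w.writeEndElement();
                    depth--;
                    break;
            }
            if (depth > 0) r.next();
        } while (depth > 0);
    }
}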
> From: Denny Lee
> Sent: Tuesday, June 24, 2014 10:35 AM
>
> To: users@kafka.apache.org
> Subject: Experiences with larger message sizes
>
> By any chance has anyone worked with using Kafka with message sizes that
> are approximately 50MB in size? Based on some of the previous threads
> there are probably some concerns on memory pressure due to the compression
> on the broker and decompression on the consumer and a best practices on …
Thanks for the info Joe - yes, I do think this will be very useful. Will look
out for this, eh?!
On June 24, 2014 at 10:32:08 AM, Joe Stein (joe.st...@stealth.ly) wrote:
You could then chunk the data (wrapped in an outer message so you have
metadata like file name, total size, current chunk size) and produce that
with the partition key being the filename.

We are working on a system for doing file loading to Kafka (which will
eventually support both chunke…
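
A rough sketch of that outer-message idea against the 0.8-era producer API.
The topic name, the 1MB chunk size, and the pipe-delimited header are all
assumptions for illustration; a real envelope would use something like Avro
or protobuf. Keying by filename keeps every chunk of a file in order on the
same partition:

import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ChunkedFileProducer {
    private static final int CHUNK_SIZE = 1024 * 1024; // 1MB chunks (assumption)

    public static void main(String[] args) throws Exception {
        String fileName = args[0];
        long totalSize = new java.io.File(fileName).length();

        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092"); // assumption
        props.put("serializer.class", "kafka.serializer.DefaultEncoder");
        props.put("key.serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, byte[]> producer =
                new Producer<>(new ProducerConfig(props));

        try (FileInputStream in = new FileInputStream(fileName)) {
            byte[] buf = new byte[CHUNK_SIZE];
            int chunkIndex = 0, read;
            while ((read = in.read(buf)) > 0) {
                // Outer message: header (file name, total size, chunk index,
                // chunk size) followed by the chunk bytes.
                byte[] header = String.format("%s|%d|%d|%d\n",
                        fileName, totalSize, chunkIndex++, read)
                        .getBytes(StandardCharsets.UTF_8);
                ByteBuffer msg = ByteBuffer.allocate(header.length + read);
                msg.put(header).put(buf, 0, read);
                // Partition key = filename, per the suggestion above.
                producer.send(new KeyedMessage<>("file-chunks", fileName,
                        msg.array()));
            }
        }
        producer.close();
    }
}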
Hey Joe,
Yes, I have - my original plan was to do something similar to what you
suggested, which was to simply push the data into HDFS / S3 and then have
only the event information within Kafka, so that multiple consumers can just
read the event information and ping HDFS/S3 for the actual me…
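
The producer side of that plan might look like the sketch below: the payload
is written to HDFS/S3 out of band, and only a small pointer event flows
through Kafka. The path, topic name, and JSON field layout are illustrative
assumptions:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class FileEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092"); // assumption
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, String> producer =
                new Producer<>(new ProducerConfig(props));

        // The 50MB file is assumed to be in HDFS already; Kafka only
        // carries the pointer.
        String path = "hdfs://namenode/incoming/file-001.xml";
        String event = String.format(
                "{\"path\":\"%s\",\"size\":%d,\"ts\":%d}",
                path, 52428800L, System.currentTimeMillis());
        producer.send(new KeyedMessage<>("file-events", path, event));
        producer.close();
    }
}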
Hi Denny, have you considered saving those files to HDFS and sending the
"event" information to Kafka?
You could then pass that off to Apache Spark in a consumer and get data
locality for the file saved (or something of the sort [no pun intended]).
You could also stream every line (or however you…
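
And the matching consumer side, using the 0.8 high-level consumer: read the
small event records from Kafka, then fetch the actual payload from HDFS/S3
(for example with Hadoop's FileSystem API, or hand the path to Spark as
suggested above). Group id and topic match the producer sketch and are
assumptions:

import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class FileEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumption
        props.put("group.id", "file-event-readers");
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        Map<String, Integer> topicCount =
                Collections.singletonMap("file-events", 1);
        KafkaStream<byte[], byte[]> stream =
                connector.createMessageStreams(topicCount)
                        .get("file-events").get(0);

        for (MessageAndMetadata<byte[], byte[]> record : stream) {
            String event = new String(record.message());
            // Parse the path out of the event and read the file from
            // HDFS/S3 here; Kafka itself never sees the 50MB payload.
            System.out.println("fetch payload for: " + event);
        }
    }
}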
By any chance has anyone worked with using Kafka with message sizes that are
approximately 50MB in size? Based on some of the previous threads there are
probably some concerns on memory pressure due to the compression on the
broker and decompression on the consumer and a best practices on …
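
For reference, these are the knobs that typically need raising for messages
this large on 0.8-era Kafka; the 64MB values are assumptions, not
recommendations. Broker side (server.properties):
message.max.bytes=67108864 (largest message the broker accepts) and
replica.fetch.max.bytes=67108864 (must be at least message.max.bytes or
replication of large messages fails). Consumer side, as a sketch:

import java.util.Properties;

public class LargeMessageConfig {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumption
        props.put("group.id", "large-messages");
        // Must cover the largest message the broker will return; note each
        // fetch of this size is buffered in memory per partition, which is
        // where the memory-pressure concern comes from.
        props.put("fetch.message.max.bytes", String.valueOf(64 * 1024 * 1024));
        return props;
    }
}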