You mentioned that you saw a few corrupted messages (< 0.1%). If so, are you able to see some corrupted messages if you produce, say, 10M messages?

On Wed, Mar 23, 2016 at 9:40 PM, sunil kalva <kalva.ka...@gmail.com> wrote:

> I am using the Java client and Kafka 0.8.2. Since the events are corrupted
> in the Kafka broker, I can't read and replay them again.
>
> On Thu, Mar 24, 2016 at 9:42 AM, Becket Qin <becket....@gmail.com> wrote:
>
> > Hi Sunil,
> >
> > The messages in Kafka have a CRC stored with each of them. When the
> > consumer receives a message, it will compute the CRC from the message
> > bytes and compare it to the stored CRC. If the computed CRC and the
> > stored CRC do not match, that indicates the message has been corrupted.
> > I am not sure why the message is corrupted in your case. Corrupted
> > messages seem to be pretty rare because the broker actually validates
> > the CRC before it stores the messages on the disk.
> >
> > Is this problem reproducible? If so, can you find out which messages
> > are corrupted? Also, are you using the Java clients or some other
> > clients?
> >
> > Jiangjie (Becket) Qin
> >
> > On Wed, Mar 23, 2016 at 8:28 PM, sunil kalva <kalva.ka...@gmail.com>
> > wrote:
> >
> > > Can someone help me out here?
> > >
> > > On Wed, Mar 23, 2016 at 7:36 PM, sunil kalva <kalva.ka...@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > > I am seeing a few messages getting corrupted in Kafka. It is not
> > > > happening frequently and the percentage is also very small (less
> > > > than 0.1%).
> > > >
> > > > Basically I am publishing Thrift events in byte array format to
> > > > Kafka topics (without encoding like base64), and I also see more
> > > > events than I publish (I confirmed this by looking at the offset
> > > > for that topic).
> > > > For example, if I publish 100 events, I see 110 as the offset for
> > > > that topic. (Since it is in production I could not get the exact
> > > > messages which are causing this problem, and we only realize there
> > > > is a problem when we consume, because our Thrift deserialization
> > > > fails.)
> > > >
> > > > So my question is: is there any magic byte that determines the
> > > > boundary of a message and that could be the same as a byte I am
> > > > sending? Or, because of network issues, could a message get chopped
> > > > and stored as multiple messages on the server side?
> > > >
> > > > tx
> > > > SunilKalva
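
As a rough illustration of the CRC check described in the quoted reply, here is a minimal Python sketch. It is not Kafka's actual wire format: the record layout (a 4-byte CRC32, then a 4-byte length, then the payload) and the helper names `frame`, `unframe` and `flip_bit` are assumptions made up for this example. It only shows the principle that a checksum stored with the message is recomputed on read, and that a length prefix, rather than a delimiter byte, marks where a binary payload ends.

```python
import struct
import zlib

HEADER = struct.Struct(">II")  # hypothetical layout: CRC32, payload length


def frame(payload: bytes) -> bytes:
    """Store a CRC32 and a length in front of the raw payload bytes."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return HEADER.pack(crc, len(payload)) + payload


def unframe(record: bytes) -> bytes:
    """Recompute the CRC over the payload and compare it to the stored one."""
    if len(record) < HEADER.size:
        raise ValueError("truncated header")
    stored_crc, length = HEADER.unpack(record[:HEADER.size])
    payload = record[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated record")
    computed_crc = zlib.crc32(payload) & 0xFFFFFFFF
    if computed_crc != stored_crc:
        raise ValueError("CRC mismatch: record is corrupted")
    return payload


def flip_bit(record: bytes, index: int) -> bytes:
    """Simulate corruption by flipping the lowest bit of one byte."""
    corrupted = bytearray(record)
    corrupted[index] ^= 0x01
    return bytes(corrupted)


if __name__ == "__main__":
    # Arbitrary binary payload (stand-in for Thrift bytes), no base64 needed.
    event = bytes([0x0B, 0x00, 0x01, 0x00, 0xFF])
    record = frame(event)
    print(unframe(record) == event)
    try:
        unframe(flip_bit(record, len(record) - 1))
    except ValueError as err:
        print(err)
```

Because the payload length is stored explicitly in this sketch, a byte inside the payload can take any value without being mistaken for a message boundary.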