Hi, Bo,

I am glad that your issue is resolved! As for the upgrade to Kafka 0.9, you do not need to wait if you want to upgrade the broker version. The bug is on the broker side, so upgrading the Kafka brokers to 0.9 brings in the fix. Kafka 0.9 brokers also support clients of a lower version (e.g. the version Samza currently uses: 0.8.2). LinkedIn has been running this combination since Samza 0.10 was released, and it has proven to be reliable.
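
In case it helps while you keep monitoring the cleaner: below is a minimal sketch, not an official tool, that compares two snapshots of a broker's cleaner-offset-checkpoint file and lists the partitions whose cleaned offset has not moved. It assumes the file layout is a version line, an entry-count line, and then one "topic partition offset" line per entry; please verify that against your broker version. The function names are made up for illustration.

```python
from typing import Dict, List, Tuple

TopicPartition = Tuple[str, int]


def parse_cleaner_checkpoint(text: str) -> Dict[TopicPartition, int]:
    """Parse the contents of a cleaner-offset-checkpoint file.

    Assumed layout (an assumption, not confirmed in this thread):
      line 1: format version
      line 2: number of entries
      rest:   "<topic> <partition> <offset>", one entry per line
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("checkpoint content is too short")
    expected = int(lines[1])
    entries: Dict[TopicPartition, int] = {}
    for ln in lines[2:]:
        topic, partition, offset = ln.rsplit(None, 2)
        entries[(topic, int(partition))] = int(offset)
    if len(entries) != expected:
        raise ValueError(
            "expected %d entries, found %d" % (expected, len(entries))
        )
    return entries


def stalled_partitions(
    before: Dict[TopicPartition, int],
    after: Dict[TopicPartition, int],
) -> List[TopicPartition]:
    """Return partitions whose cleaned offset is identical in both snapshots."""
    return sorted(tp for tp, off in after.items() if before.get(tp) == off)
```

An unchanged offset is only a hint: a partition that received no new writes between the two snapshots will look the same, so it is worth cross-checking against the cleaner log.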

Please let us know if you have any further questions on this.

Thanks!

-Yi

On Mon, May 9, 2016 at 9:05 PM, Liu Bo <diabl...@gmail.com> wrote:

> Hi Yi
>
> Thanks a lot for the reply and the hint. It's more of a Kafka issue; the
> lucky thing is that Samza experts "happen to" be Kafka experts. :)
>
> I just checked the cleaner log and found out we ran into this issue:
> https://issues.apache.org/jira/browse/KAFKA-1641
> The log cleaner had stopped for about a month for the checkpoint topic.
>
> I removed the cleaner-offset-checkpoint for the corresponding brokers and
> restarted them to let the log cleaner run again from the beginning.
>
> Now the checkpoint size is reduced to KB level after the first cleanup; a
> healthy log cleaner thread will definitely solve this problem. I will keep
> monitoring the cleaner log.
>
> The other thing is that the issue is fixed in Kafka 0.9.0.0
> <https://issues.apache.org/jira/browse/KAFKA/fixforversion/12328745>, and
> I'm really looking forward to Samza support for Kafka 0.9.0. I saw some
> discussion about this topic on the mailing list, so I guess I have to wait
> for a while.
>
> On 10 May 2016 at 01:24, Yi Pan <nickpa...@gmail.com> wrote:
>
> > Hi, Bo,
> >
> > I embedded my answers in-between:
> >
> > On Sun, May 8, 2016 at 9:00 PM, Liu Bo <diabl...@gmail.com> wrote:
> >
> > > The other thing is that log retention is set to 24 hours or 30GB, but
> > > it seems not to be working for the checkpoint topic: all the *.log
> > > files are still there, unlike the data topic, which only has recent
> > > ones.
> >
> > When your topic cleanup policy is set to log compaction, the
> > time-retention policy no longer takes effect. Hence, the reduction of
> > the checkpoint topic size depends purely on log compaction on the Kafka
> > broker.
> >
> > > I am going to dig further into this (I have never configured
> > > compaction before), and your suggestions would be appreciated.
> >
> > It would be good to check whether the log compaction thread on your
> > Kafka broker is healthy, and how often it is triggered.
> >
> > > --
> > > All the best
> > >
> > > Liu Bo
>
> --
> All the best
>
> Liu Bo