Re: KafkaCheckPointManager is too slow

2015-11-04 Thread Chinmay Soman
Hey Jae, we've turned off log compaction because of issues seen earlier with the log compaction thread on Kafka. We're still on 0.8.2 so we can't really turn it on. Our best bet is to set retention to like 3 hours or something. On Nov 4, 2015 12:18 AM, "Bae, Jae Hyeon" wrote: > Also, I am using

Re: KafkaCheckPointManager is too slow

2015-11-04 Thread Bae, Jae Hyeon
Hi Yi There are 8 partitions in the input topic. The size of checkpoint topic is 26 MB. segment.bytes=26214400, cleanup.policy=compact On Tue, Nov 3, 2015 at 11:12 PM, Yi Pan wrote: > Hi, Jae, > > That's correct. I mentioned that just to confirm that checkpoint topic > should be log-compacted. The

Re: KafkaCheckPointManager is too slow

2015-11-04 Thread Bae, Jae Hyeon
Also, I am using Samza 0.9.1. On Wed, Nov 4, 2015 at 12:18 AM, Bae, Jae Hyeon wrote: > Hi Yi > > There are 8 partitions in the input topic. The size of checkpoint topic is > 26 MB. > > segment.bytes=26214400, cleanup.policy=compact > > On Tue, Nov 3, 2015 at 11:12 PM, Yi Pan wrote: > >> Hi, Jae, >>

Re: KafkaCheckPointManager is too slow

2015-11-03 Thread Yi Pan
Hi, Jae, That's correct. I mentioned that just to confirm that checkpoint topic should be log-compacted. Then, the next question is: what's the size of the data in the checkpoint topic and how many input topic partitions you have in the job? It would also be helpful if you can share which version

Re: KafkaCheckPointManager is too slow

2015-11-03 Thread Bae, Jae Hyeon
Hi Yi My colleague found that Samza automatically sets log compaction when creating the checkpoint topic. Topic:__samza_checkpoint_ver_1_for_xxx_1 PartitionCount:1 ReplicationFactor:3 Configs:segment.bytes=26214400,cleanup.policy=compact Topic: __samza_checkpoint_ver_1_for_xxx_1 Partition: 0 L
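
For anyone who wants to check this programmatically, here is a minimal sketch that reads the header line of the topic description quoted above and reports whether `cleanup.policy=compact` is set. The helper names `parse_topic_configs` and `is_log_compacted` are hypothetical, not part of Samza or Kafka, and the sketch assumes the `Configs:` field is a comma-separated list of `key=value` pairs with no spaces, as in the output shown in this message.

```python
def parse_topic_configs(describe_line):
    """Return the key=value pairs from the 'Configs:' field as a dict.

    Assumes the field looks like 'Configs:k1=v1,k2=v2' with no spaces,
    as in the describe output quoted in the thread.
    """
    _, found, rest = describe_line.partition("Configs:")
    tokens = rest.split()
    if not found or not tokens:
        return {}
    configs = {}
    for pair in tokens[0].split(","):
        key, sep, value = pair.partition("=")
        if sep:  # skip anything that is not a key=value pair
            configs[key] = value
    return configs


def is_log_compacted(describe_line):
    """True when the topic's cleanup.policy is 'compact'."""
    return parse_topic_configs(describe_line).get("cleanup.policy") == "compact"


# Header line copied from the topic description in this message.
header = (
    "Topic:__samza_checkpoint_ver_1_for_xxx_1 PartitionCount:1 "
    "ReplicationFactor:3 Configs:segment.bytes=26214400,cleanup.policy=compact"
)
print(is_log_compacted(header))
```

Note that this only confirms the topic-level setting. Whether the broker's log cleaner is actually running is a separate question that the topic description does not answer.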

Re: KafkaCheckPointManager is too slow

2015-11-03 Thread Bae, Jae Hyeon
I didn't know I have to manually set the checkpoint topic as log-compaction. Is there any reason why log-compaction is not being set by default on the checkpoint topic? I will try and see how it works. Thank you Best, Jae On Tue, Nov 3, 2015 at 8:33 PM, Yi Pan wrote: > Hi, Bae, > > Where did y

Re: KafkaCheckPointManager is too slow

2015-11-03 Thread Yi Pan
Hi, Bae, Where did you see this log? Is it in JobRunner? Or AppMaster? Or SamzaContainer? There are a few factors that may have the impact: 1. How many system stream partitions you have as the input? And how many tasks are there? 2. Did you set your checkpoint topic as log-compact topic in Kafka?

KafkaCheckPointManager is too slow

2015-11-03 Thread Bae, Jae Hyeon
Hi Samza Dev Do you know why the following job is taking too long? 2015-11-03 23:58:17 KafkaCheckpointManager [INFO] Get latest offset 3386930 for topic __samza_checkpoint_ver_1_for_xxx_1 and partition 0. This is seriously slowing down development. How can I fix this problem? Thank you Best, Ja