Ok, I got your point. Currently we check the log segment constraints (segment.bytes, segment.ms) only before appending new messages, so a new log segment will not be created until new data arrives.
In your case, your approach (sending a periodic dummy/ping message) should be fine.

On Tue, Jun 16, 2015 at 7:19 PM, Shayne S <shaynest...@gmail.com> wrote:

> Thank you for the response!
>
> Unfortunately, those improvements would not help. It is the lack of
> activity resulting in a new segment that prevents compaction.
>
> I was confused by what qualifies as the active segment. The active segment
> is the last segment, as opposed to the segment that would be written to if
> something were received right now.
>
> On Tue, Jun 16, 2015 at 8:38 AM, Manikumar Reddy <ku...@nmsworks.co.in>
> wrote:
>
> > Hi,
> >
> > Your observation is correct. We never compact the active segment.
> > Some improvements are proposed here:
> > https://issues.apache.org/jira/browse/KAFKA-1981
> >
> > Manikumar
> >
> > On Tue, Jun 16, 2015 at 5:35 PM, Shayne S <shaynest...@gmail.com> wrote:
> >
> > > Some further information, and is this a bug? I'm using 0.8.2.1.
> > >
> > > Log compaction will only occur on the non-active segments. Intentional
> > > or not, it seems that the last segment is always the active segment.
> > > In other words, an expired segment will not be cleaned until a new
> > > segment has been created.
> > >
> > > As a result, a log won't be compacted until new data comes in (per
> > > partition). Does this mean I need to send the equivalent of a pig
> > > (https://en.wikipedia.org/wiki/Pigging) through each partition in
> > > order to force compaction? Or can I force the cleaning somehow?
> > >
> > > Here are the steps to recreate:
> > >
> > > 1. Create a new topic with a 5 minute segment.ms:
> > >
> > > kafka-topics.sh --zookeeper localhost:2181 --create --topic TEST_TOPIC
> > > --replication-factor 1 --partitions 1 --config cleanup.policy=compact
> > > --config min.cleanable.dirty.ratio=0.01 --config segment.ms=300000
> > >
> > > 2. Repeatedly add messages with identical keys (3x):
> > >
> > > echo "ABC123,{\"test\": 1}" | kafka-console-producer.sh --broker-list
> > > localhost:9092 --topic TEST_TOPIC --property parse.key=true --property
> > > key.separator=, --new-producer
> > >
> > > 3. Wait 5+ minutes and confirm no log compaction has occurred.
> > > 4. Once satisfied, send a new message:
> > >
> > > echo "DEF456,{\"test\": 1}" | kafka-console-producer.sh --broker-list
> > > localhost:9092 --topic TEST_TOPIC --property parse.key=true --property
> > > key.separator=, --new-producer
> > >
> > > 5. Log compaction will occur soon after.
> > >
> > > Is my use case of infrequent writes not supported? Is this intentional
> > > behavior? It's unnecessarily challenging to target each partition with
> > > a dummy message to trigger compaction.
> > >
> > > Also, I believe there is another issue with logs originally configured
> > > without a segment timeout that led to my original issue. I still
> > > cannot get those logs to compact.
> > >
> > > Thanks!
> > > Shayne
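For anyone wanting to script the "pig" workaround discussed above, here is a minimal sketch. The topic name, key prefix, and broker address are assumptions for illustration, not anything from the thread. One caveat: the console producer hashes the message key to pick a partition, so distinct keys do not guarantee that every partition receives a pig; reliably covering all partitions would need a producer that sets the partition explicitly.

```shell
#!/bin/sh
# Sketch: generate one dummy "pig" record per partition of a compacted topic,
# in key,value form, so appending them rolls the active segment and lets the
# cleaner compact the older segments. Key prefix "__pig__" is an arbitrary
# choice; the pigs themselves will later compact down to one record per key.

pig_messages() {
  # Emit $1 lines of "key,value" suitable for kafka-console-producer.sh
  # with parse.key=true and key.separator=, (distinct keys per line, but
  # see the caveat above about key hashing vs. explicit partitioning).
  n=$1
  i=0
  while [ "$i" -lt "$n" ]; do
    echo "__pig__${i},{\"pig\": ${i}}"
    i=$((i + 1))
  done
}

# Hypothetical usage against the TEST_TOPIC from the repro steps:
#   pig_messages 1 | kafka-console-producer.sh --broker-list localhost:9092 \
#     --topic TEST_TOPIC --property parse.key=true --property key.separator=,
pig_messages 3
```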