> On Oct. 15, 2014, 4:19 a.m., Joel Koshy wrote:
> > core/src/main/scala/kafka/log/Log.scala, line 515
> > <https://reviews.apache.org/r/26663/diff/2/?file=721126#file721126line515>
> >
> >     Thinking about this a bit more: do you think it would be safer to interpret jitter as an additive value to segmentMs?
> >
> >     i.e., the actual age for rolling will be config.segmentMs + segment.rollJitterMs (and limit segment.rollJitterMs to an interval of [0, config.segmentMs], which you are already doing).
> >
> >     Otherwise, if a user happens to set a high jitter time, then nearly empty segments roll often (with high probability).
> >
> >     Another way to interpret it is as a jitter window, i.e., the actual age for rolling will be config.segmentMs + segment.rollJitterMs, with segment.rollJitterMs limited to an interval of [-config.segmentMs / 2, config.segmentMs / 2].
> >
> >     Thoughts?
>
> Ewen Cheslack-Postava wrote:
>     I considered all of these options. The reason I went with the approach in the patch is that it preserves the meaning of segmentJitterMs, which says it is a maximum, albeit soft, time. That said, since this needs to be explicitly enabled by setting the new parameter to be non-zero, I don't think it would be unreasonable to expect someone enabling it to understand the implications. I personally think the final option, which causes the average time to be config.segmentMs, is the most intuitive, but as long as the effect is clearly documented they are all effectively equivalent, assuming uniform sampling.
>
> Neha Narkhede wrote:
>     Thinking about this more in the context of how we initially found this problem: users want to use time-based rolling alongside time-based retention, since it makes it easier to reason about the time window of data in a segment. This is mainly useful when resetting offsets based on time, since offsets are returned only on segment boundaries. From a user perspective, time-based rolling is just supposed to work out of the box and not have performance implications in large clusters, which today it does. This is also very nuanced for most users to understand and work around, and almost everyone would just expect Kafka to do the right thing. Essentially, I'm arguing that this should not be a configurable constant value but a value derived from performance tests done by us. Even if it has to be exposed through a config, it seems better for it to be a function of the segment roll time instead of a constant value. This way, people don't have to worry about it except in rare cases where it needs to be tuned, and even then it is difficult to screw up.
>
>     It might make sense to run a stress test (possibly using some tweaked version of StressTestLog.scala).
>
> Joel Koshy wrote:
>     I think Ewen's evaluation criterion is a useful one, i.e., what is the average age going to be? In the current patch, the age ranges over [segment.ms - randomJitter, segment.ms], where randomJitter ranges over [0, min(jitterMs, segment.ms)]; if jitterMs == segment.ms, the average age will be segment.ms / 2. If the age instead ranges over [segment.ms, segment.ms + randomJitter], the average age will be segment.ms + segment.ms / 2 (again with jitterMs == segment.ms). If the age ranges over [segment.ms - randomJitter / 2, segment.ms + randomJitter / 2), the average age will be segment.ms, which is the most intuitive.
>
>     @Neha I actually think all this will be an interim solution until we get timestamps into the message metadata and log index. There is a thread discussing that. When that is done, I think we can do with only size-based rolling, and time-based retention can be done using message header metadata.
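
To make the trade-off concrete, here is a small standalone sketch (not code from the patch; the segmentMs and jitterMs values are only illustrative) that compares the expected segment age under the three interpretations discussed above:

```scala
// Standalone sketch comparing the three jitter interpretations; not the patch itself.
import scala.util.Random

object RollJitterSketch {
  val segmentMs = 7L * 24 * 60 * 60 * 1000   // default roll time of one week
  val jitterMs  = 15L * 60 * 1000            // e.g. 15 minutes of jitter

  // Option 1 (current patch): jitter only shortens the age, so segment.ms stays a
  // soft maximum. Average age = segmentMs - jitterMs / 2.
  def subtractive(rnd: Random): Long =
    segmentMs - (rnd.nextDouble() * math.min(jitterMs, segmentMs)).toLong

  // Option 2: jitter is purely additive. Average age = segmentMs + jitterMs / 2.
  def additive(rnd: Random): Long =
    segmentMs + (rnd.nextDouble() * math.min(jitterMs, segmentMs)).toLong

  // Option 3: jitter window centered on segment.ms. Average age = segmentMs.
  def centered(rnd: Random): Long = {
    val halfWindow = math.min(jitterMs, segmentMs) / 2
    segmentMs - halfWindow + (rnd.nextDouble() * 2 * halfWindow).toLong
  }

  def main(args: Array[String]): Unit = {
    val rnd = new Random()
    val n = 100000
    def avg(f: Random => Long): Long = (0 until n).map(_ => f(rnd)).sum / n
    println(s"subtractive avg age: ${avg(subtractive)} ms")
    println(s"additive    avg age: ${avg(additive)} ms")
    println(s"centered    avg age: ${avg(centered)} ms")
  }
}
```

Running it shows average ages of roughly segment.ms - jitter/2, segment.ms + jitter/2, and segment.ms respectively, which is the difference being debated.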
The concerns here are very much opposed -- Joel seems to be interested in good, intuitive control over the exact results of using jitter, but Neha wants things to just work. I assume this issue only comes up when you have a large enough deployment that a lot of logs can roll at once, in which case you're probably tweaking a bunch of settings anyway. I'm also not sure we could come up with one good constant, since the problem scales with the # of partitions. I think the best we could do is try to come up with a conservative maximum # of partitions/logs (per disk?) to support well without tweaking, then measure an average fsync time and choose a default based on that. Then again, for the "just works" case, the default roll time is 1 week, so even a small jitter (e.g. minutes) would have little impact on the timing and be more than enough jitter. I think the most useful input here would be from an ops person who could say what their ideal is (and whether they think a constant value would reasonably solve the problem).


- Ewen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26663/#review56652
-----------------------------------------------------------


On Oct. 14, 2014, 10:33 p.m., Ewen Cheslack-Postava wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26663/
> -----------------------------------------------------------
>
> (Updated Oct. 14, 2014, 10:33 p.m.)
>
>
> Review request for kafka.
>
>
> Bugs: KAFKA-979
>     https://issues.apache.org/jira/browse/KAFKA-979
>
>
> Repository: kafka
>
>
> Description
> -------
>
> Add new options log.roll.jitter.ms and log.roll.jitter.hours to
> add random jitter to time-based log rolling so logs aren't likely to
> roll at exactly the same time. Jitter always reduces the timeout, so
> log.roll.ms still provides a soft maximum. Defaults to 0, so no jitter is
> added by default.
>
> Addressing warning and Util.abs comments.
>
>
> Diffs
> -----
>
>   core/src/main/scala/kafka/log/Log.scala a123cdc52f341a802b3e4bfeb29a6154332e5f73
>   core/src/main/scala/kafka/log/LogCleaner.scala c20de4ad4734c0bd83c5954fdb29464a27b91dff
>   core/src/main/scala/kafka/log/LogConfig.scala d2cc9e3d6b7a4fd24516d164eb3673e6ce052129
>   core/src/main/scala/kafka/log/LogSegment.scala 7597d309f37a0b3756381f9500100ef763d466ba
>   core/src/main/scala/kafka/server/KafkaConfig.scala 7fcbc16da898623b03659c803e2a20c7d1bd1011
>   core/src/main/scala/kafka/server/KafkaServer.scala 3e9e91f2b456bbdeb3055d571e18ffea8675b4bf
>   core/src/test/scala/unit/kafka/log/LogSegmentTest.scala 7b97e6a80753a770ac094e101c653193dec67e68
>   core/src/test/scala/unit/kafka/log/LogTest.scala a0cbd3bbbeeabae12caa6b41aec31a8f5dfd034b
>
> Diff: https://reviews.apache.org/r/26663/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Ewen Cheslack-Postava
>
>
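
For reference, the behavior the Description above specifies -- jitter sampled per segment and always subtracted from the roll timeout, so log.roll.ms stays a soft maximum -- could be sketched roughly as follows. The names here (SegmentTimingSketch, rollJitterMs, shouldRoll) are illustrative and are not the identifiers used in the actual patch:

```scala
// Rough sketch, not the patch: jitter is sampled once per segment, uniform in
// [0, min(rollJitterMs, segmentMs)), and the roll check subtracts it from the
// configured roll time so segments never live longer than segmentMs.
import scala.util.Random

final case class SegmentTimingSketch(segmentMs: Long, rollJitterMs: Long) {
  private val created = System.currentTimeMillis()
  private val jitter =
    if (rollJitterMs <= 0) 0L
    else (Random.nextDouble() * math.min(rollJitterMs, segmentMs)).toLong

  /** Roll once the segment's age exceeds the jittered timeout. */
  def shouldRoll(now: Long = System.currentTimeMillis()): Boolean =
    now - created > segmentMs - jitter
}
```

With rollJitterMs left at 0 the check degenerates to plain time-based rolling, which matches the default-off behavior described in the patch summary.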