Re: log.retention.size

András Serény Fri, 23 May 2014 02:27:07 -0700


Hi Kafka users,

this feature would also be very useful for us. With lots of topics of different volume (and as they grow in number) it could become tedious to maintain topic-level settings.

As a start, I think uniform reduction is a good idea. Logs wouldn't be retained as long as you want, but that's already the case when a log.retention.bytes setting is specified. As for early rolling, I don't think it's necessary: currently, if there is no log segment eligible for deletion, log.retention.bytes and log.retention.hours settings won't kick in, so it's possible to exceed these limits, which is completely fine (please correct me if I'm mistaken here).

All in all, introducing a global threshold doesn't seem to induce a considerable change in current retention logic.


Regards,
András
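
The global threshold discussed in this thread could be sketched roughly as follows. This is only an illustration, not Kafka's retention code: the `Segment` class, the `segments_to_delete` function, and the oldest-first ordering across all partitions are assumptions made for the example. Active segments are never deleted, so the total can still exceed the limit when only active segments remain, which is the edge case raised further down the thread.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one log segment file on a broker."""
    partition: str
    last_modified: float   # seconds since epoch
    size: int              # bytes
    active: bool = False   # segment currently being written; never deleted


def segments_to_delete(segments, global_limit_bytes):
    """Pick the oldest non-active segments, across all partitions, until
    the aggregate size fits under the (hypothetical) global limit.

    Returns the segments chosen for deletion and the size that remains.
    """
    total = sum(s.size for s in segments)
    candidates = sorted(
        (s for s in segments if not s.active),
        key=lambda s: s.last_modified,
    )
    doomed = []
    for seg in candidates:
        if total <= global_limit_bytes:
            break
        doomed.append(seg)
        total -= seg.size
    return doomed, total


segs = [
    Segment("a", 1, 40), Segment("a", 3, 40), Segment("a", 5, 10, active=True),
    Segment("b", 2, 30), Segment("b", 6, 5, active=True),
]
doomed, remaining = segments_to_delete(segs, 60)
print([(s.partition, s.last_modified) for s in doomed], remaining)
```

With a limit of 0 in the same example, all three non-active segments are selected and 15 bytes of active segments remain, so the limit is exceeded without any early rolling.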

On 5/8/2014 2:00 AM, vinh wrote:

Agreed…a global knob is a bit tricky for exactly the reason you've identified.  
Perhaps the problem could be simplified though by considering the context and 
purpose of Kafka.  I would use a persistent message queue because I want to 
guarantee that data/messages don't get lost.  But, since Kafka is not meant to 
be a long term storage solution (other products can be used for that), I would 
clarify that guarantee to apply only to the most recent messages up until a 
certain configured threshold (i.e. max 24 hrs, max 500GB, etc).  Once those 
thresholds are reached, old messages are deleted first.

To ensure no message loss (up to a limit), I must ensure Kafka is highly 
available.  There's a small chance that the message deletion rate equals the 
receive rate, for example when the incoming volume is so high that 
the size threshold is reached before the time threshold.  But, I may be ok with 
that because if Kafka goes down, it can cause upstream applications to fail.  
This can result in higher losses overall, and particularly of the most *recent* 
messages.

In other words, in a persistent but ephemeral message queue, I would give 
higher precedence to recent messages over older ones.  On the flip side, by 
allowing Kafka to go down when a disk is full, applications are forced to deal 
with the issue.  This adds complexity to apps, but perhaps it's not a bad 
thing.  After all, in scalability, all apps should be designed to handle 
failure.

Having said that, next is to decide which messages to delete first.  I believe 
that's a separate issue and has its own complexities, too.

The main idea though is that a global knob would provide flexibility, even if 
not used.  From an operation perspective, if we can't ensure HA for all 
applications/components, it would be good if we can for at least some of the 
core ones, like Kafka.  This is much easier said than done though.


On May 5, 2014, at 9:16 AM, Jun Rao <jun...@gmail.com> wrote:

Yes, your understanding is correct. A global knob that controls aggregate
log size may make sense. What would be the expected behavior when that
limit is reached? Would you reduce the retention uniformly across all
topics? Then, it just means that some of the logs may not be retained as
long as you want. Also, we need to think through what happens when every
log has only 1 segment left and yet the total size still exceeds the limit.
Do we roll log segments early?

Thanks,

Jun


On Sun, May 4, 2014 at 4:31 AM, vinh <v...@loggly.com> wrote:

Thanks Jun.  So if I understand this correctly, there really is no master
property to control the total aggregate size of all Kafka data files on a
broker.

log.retention.size and log.file.size are great for managing data at the
application level.  In our case, application needs change frequently, and
performance itself is an ever evolving feature.  This means various configs
are constantly changing, like topics, # of partitions, etc.

What rarely changes though is provisioned hardware resources.  So a
setting to control the total aggregate size of Kafka logs (or persisted
data, for better clarity) would definitely simplify things at an
operational level, regardless of what happens at the application level.


On May 2, 2014, at 7:49 AM, Jun Rao <jun...@gmail.com> wrote:

log.retention.size controls the total size in a log dir (per partition).
log.file.size controls the size of each log segment in the log dir.

Thanks,

Jun
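
Since log.retention.size applies per partition, a rough aggregate bound can be worked out by hand. The sketch below assumes that each partition can overshoot its quota by about one segment, because only whole, non-active segments are deleted; the function names are made up for this illustration and the partition counts are example inputs, not figures from the thread.

```python
RETENTION_SIZE = 107374182400  # log.retention.size from the original post
SEGMENT_SIZE = 536870912       # log.file.size from the original post


def worst_case_bytes(num_partitions, retention_size, segment_size):
    """Rough upper bound on disk usage when every partition fills its quota.

    Assumption: a partition may exceed its quota by roughly one segment.
    """
    return num_partitions * (retention_size + segment_size)


def per_partition_quota(disk_bytes, num_partitions, segment_size):
    """Per-partition retention size that keeps the aggregate under disk_bytes,
    under the same one-segment-overshoot assumption."""
    return disk_bytes // num_partitions - segment_size


# Two partitions at the posted settings already allow more than 200 GB.
print(worst_case_bytes(2, RETENTION_SIZE, SEGMENT_SIZE))

# Hypothetical: 20 partitions sharing a 1 TB (10**12 byte) disk.
print(per_partition_quota(10**12, 20, SEGMENT_SIZE))
```

The catch raised in this thread still applies: the quota has to be recomputed whenever the number of topics or partitions changes.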


On Thu, May 1, 2014 at 9:31 PM, vinh <v...@loggly.com> wrote:

In the 0.7 docs, the descriptions for log.retention.size and log.file.size
sound very much the same.  In particular, that they apply to a single log
file (or log segment file).

http://kafka.apache.org/07/configuration.html

I'm beginning to think there is no setting to control the max aggregate
size of all logs.  If this is correct, what would be a good approach to
enforce this requirement?  In my particular scenario, I have a lot of data
being written to Kafka at a very high rate.  So a 1TB disk can easily be
filled up in 24hrs or so.  One option is to add more Kafka brokers to add
more disk space to the pool, but I'd like to avoid that and see if I can
simply configure Kafka to not write more than 1TB aggregate.  Else, Kafka
will OOM and kill itself, and possibly crash the node itself because
the disk is full.


On May 1, 2014, at 9:21 PM, vinh <v...@loggly.com> wrote:

Using Kafka 0.7.2, I have the following in server.properties:

log.retention.hours=48
log.retention.size=107374182400
log.file.size=536870912

My interpretation of this is:
a) a single log segment file over 48hrs old will be deleted
b) the total combined size of *all* logs is 100GB
c) a single log segment file is limited to 500MB in size before a new segment file is spawned

d) a "log file" can be composed of many "log segment files"
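
As a quick check of the byte values in the settings above, the conversion to binary units can be done as follows (a small sketch; the variable names are only for illustration):

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

retention_size = 107374182400  # log.retention.size
file_size = 536870912          # log.file.size

print(retention_size / GIB)         # 100.0 (GiB)
print(file_size / MIB)              # 512.0 (MiB)
print(retention_size // file_size)  # 200 full-size segments fit in the quota
```

So the configured segment size is exactly 512 MiB, slightly above the "500MB" quoted in (c).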

But, even after setting the above, I find that the total combined size
of all Kafka logs on disk is 200GB right now.  Isn't log.retention.size
supposed to limit it to 100GB?  Am I missing something?  The docs are not
really clear, especially when it comes to distinguishing between a "log
file" and a "log segment file".

I have disk monitoring.  But like anything else in software, even
monitoring can fail.  Via configuration, I'd like to make sure that Kafka
does not write more than the available disk space.  Or something like
log4j, where I can set a max number of log files and the max size per file,
which essentially allows me to set a max aggregate size limit across all
logs.

Thanks,
-Vinh
