[
https://issues.apache.org/jira/browse/KAFKA-7137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brett Rann updated KAFKA-7137:
------------------------------
Description:
Just spent some time wrapping my head around the inner workings of compaction
and tombstoning, with a view to providing guarantees that previous values of
tombstoned keys are deleted from Kafka within a desired time.
There are a couple of good posts that touch on this:
https://www.confluent.io/blog/handling-gdpr-log-forget/
http://www.shayne.me/blog/2015/2015-06-25-everything-about-kafka-part-2/
Some existing controls:
{color:red}log.cleaner.min.cleanable.ratio{color} - ratio of duplicates in a
log before it will be considered for compaction
{color:red}min.cleanable.dirty.ratio{color} - topic level override for the above
{color:red}min.compaction.lag.ms{color} - minimum time a record will exist (eg:
to ensure it can be consumed before being compacted)
{color:red}delete.retention.ms{color} - how long a tombstone record is kept
before it may be compacted away (ie: so downstream consumers can be given time
to see it)
{color:red}segment.ms{color} - maximum time before a new segment is rolled
(compaction only happens on inactive segments)
{color:red}segment.bytes{color} - the size of the segment. (compaction only
happens on inactive segments)
{color:red}log.cleaner.io.max.bytes.per.second{color} - global setting limiting
IO of the log cleaner thread
The existing controls focus on guaranteeing a minimum time that records and
delete records will exist before they /may/ be compacted. But if you want to
guarantee they /will/ be compacted by a specific time, there is no control.
To achieve this now, {color:red}log.cleaner.min.cleanable.ratio{color} or
{color:red}min.cleanable.dirty.ratio{color} is hijacked to force aggressive
compaction (by setting it to 0, or 0.000000001 depending on what you read).
Along with segment.ms, that can provide a timing guarantee that a tombstone
will result in any other values for the key being deleted within a desired
time, /if/ a new record comes in to trigger a new segment roll.
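For illustration, the timing that the dirty.ratio hack can at best achieve works out roughly as below. This is a hedged back-of-envelope sketch, not an exact model of the cleaner: cleaner scheduling, delete.retention.ms, and IO throttling all add further delay, and the function name is made up for this example.

```python
# Rough conservative bound on how long a tombstoned key's previous values
# can survive once min.cleanable.dirty.ratio=0 forces compaction on every
# segment roll. Assumes a new record arrives to trigger the roll, and
# ignores log-cleaner scheduling latency and delete.retention.ms.

def worst_case_removal_ms(segment_ms, min_compaction_lag_ms):
    # The tombstone's segment must first become inactive (a roll, up to
    # segment.ms after the tombstone is written), and the records must be
    # past min.compaction.lag.ms before the cleaner may touch them.
    # Summing the two gives a conservative upper bound.
    return segment_ms + min_compaction_lag_ms

HOUR = 60 * 60 * 1000
# eg: hourly segment rolls, one-day compaction lag -> at best ~25 hours
print(worst_case_removal_ms(segment_ms=1 * HOUR,
                            min_compaction_lag_ms=24 * HOUR))  # -> 90000000
```

The point of the sketch: even with the hack, the guarantee is only as tight as segment.ms plus the minimum lag, and only holds if traffic keeps arriving.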
But that sacrifices the utility of min.cleanable.dirty.ratio (and to a lesser
extent, control over segment sizes). On any duplicate key and a new segment
roll it will run compaction, when otherwise it might be preferable to allow a
more generous dirty.ratio in the case of plain old duplicates.
It would be useful to have control over triggering a compaction without losing
the utility of the dirty.ratio setting. The real need here is to specify a
maximum time the log cleaner can go without running (ie: a deadline for
compaction) on a topic whose keys have been replaced by a tombstone message
and are past the minimum retention times provided by min.compaction.lag.ms.
Something like a {color:red}log.cleaner.max.delay.ms{color}, a topic
{color:red}max.cleanable.delay.ms{color} (or
{color:red}max.compaction.lag.ms{color} ?) and an {color:red}API to trigger
compaction{color}, with some nuances to be fleshed out.
In the meantime, this can be worked around with some duct tape:
* make sure any values you want deleted by a tombstone have passed the minimum
retention configs
* set the global log.cleaner.io.max.bytes.per.second to what you want for the
compaction task
* set min.cleanable.dirty.ratio=0 for the topic
* set a small segment.ms
* wait for a new segment to roll (segment.ms elapsing + a message coming in)
and wait for compaction to kick in. GDPR met!
* undo the hacks
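The apply-then-undo steps above can be sketched as data. This is only an illustration: the config names match the real topic-level configs, but the override and restore values shown are examples (the restore values are the broker defaults), and actually applying them would go through kafka-configs.sh or the admin API.

```python
# The duct-tape workaround as an ordered plan: temporary topic-level
# overrides to apply, a wait for compaction, then the values to restore.

TEMP_OVERRIDES = {
    "min.cleanable.dirty.ratio": "0",    # compact on any dirt at all
    "segment.ms": "60000",               # roll segments quickly (1 min)
}

RESTORE = {
    "min.cleanable.dirty.ratio": "0.5",  # broker default
    "segment.ms": "604800000",           # broker default (7 days)
}

def plan(overrides, restore):
    """Return the apply/wait/undo steps in order as (action, config, value)."""
    steps = [("set", k, v) for k, v in overrides.items()]
    steps.append(("wait", "segment roll + compaction", None))
    steps += [("set", k, v) for k, v in restore.items()]
    return steps

for step in plan(TEMP_OVERRIDES, RESTORE):
    print(step)
```

Writing the undo values down before applying the hacks is the important part: the workaround is only safe if the normal dirty.ratio and segment.ms actually come back afterwards.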
Another workaround is to set {color:red}min.compaction.lag.ms{color} to, say,
31 days, along with dirty.ratio=0, and then every 30 days set it to 0 until
compaction happens. That permanently loses the normal dirty.ratio value that
the above workaround restores, though.
> ability to trigger compaction for tombstoning and GDPR
> ------------------------------------------------------
>
> Key: KAFKA-7137
> URL: https://issues.apache.org/jira/browse/KAFKA-7137
> Project: Kafka
> Issue Type: Wish
> Reporter: Brett Rann
> Priority: Minor
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)