[ https://issues.apache.org/jira/browse/KAFKA-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liquan Pei updated KAFKA-7023:
------------------------------
    Description: 
We observed frequent L0 -> L1 compaction during Kafka Streams state recovery. 

During bulk loading, the following options are set: 
[https://github.com/facebook/rocksdb/blob/master/options/options.cc] 
{code:cpp}
Options* Options::PrepareForBulkLoad() {
  // Never slow down ingest.
  level0_file_num_compaction_trigger = (1 << 30);
  level0_slowdown_writes_trigger = (1 << 30);
  level0_stop_writes_trigger = (1 << 30);
  soft_pending_compaction_bytes_limit = 0;
  hard_pending_compaction_bytes_limit = 0;

  // No auto compactions please. The application should issue a
  // manual compaction after all data is loaded into L0.
  disable_auto_compactions = true;
  // A manual compaction run should pick all files in L0 in
  // a single compaction run.
  max_compaction_bytes = (static_cast<uint64_t>(1) << 60);

  // It is better to have only 2 levels, otherwise a manual
  // compaction would compact at every possible level, thereby
  // increasing the total time needed for compactions.
  num_levels = 2;

  // Need to allow more write buffers to allow more parallelism
  // of flushes.
  max_write_buffer_number = 6;
  min_write_buffer_number_to_merge = 1;

  // When compaction is disabled, more parallel flush threads can
  // help with write throughput.
  max_background_flushes = 4;

  // Prevent a memtable flush from automatically promoting files
  // to L1. This is helpful so that all files that are
  // input to the manual compaction are at L0.
  max_background_compactions = 2;

  // The compaction would create large files in L1.
  target_file_size_base = 256 * 1024 * 1024;
  return this;
}
{code}
In particular, the following values are set to very large numbers to avoid compactions 
and ensure all files stay on L0: 
{code:cpp}
level0_file_num_compaction_trigger = (1 << 30);
level0_slowdown_writes_trigger = (1 << 30);
level0_stop_writes_trigger = (1 << 30);
{code}
However, in the openDB code of RocksDBStore.java, we first call 
options.prepareForBulkLoad() and only then apply the configs from the customized 
RocksDBConfigSetter. This may overwrite the configs set by the 
prepareForBulkLoad call. The fix is to move the prepareForBulkLoad call to after 
the customized RocksDBConfigSetter configs are applied. 
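The ordering problem can be sketched with a minimal, self-contained example. The FakeOptions class and applyUserConfig method below are simplified stand-ins for org.rocksdb.Options and a custom RocksDBConfigSetter (not the real APIs); they model only the single compaction-trigger setting needed to show the overwrite:
{code:java}
// Simplified stand-in for org.rocksdb.Options (illustration only).
class FakeOptions {
    long level0FileNumCompactionTrigger = 4; // RocksDB default

    // Mirrors what PrepareForBulkLoad() does for this one setting.
    FakeOptions prepareForBulkLoad() {
        level0FileNumCompactionTrigger = 1L << 30;
        return this;
    }
}

public class BulkLoadOrdering {
    // Stand-in for a user's custom RocksDBConfigSetter that tunes
    // the same option.
    static void applyUserConfig(FakeOptions options) {
        options.level0FileNumCompactionTrigger = 8;
    }

    public static void main(String[] args) {
        // Buggy order: bulk-load settings first, user config second.
        // The user config clobbers the bulk-load trigger.
        FakeOptions buggy = new FakeOptions().prepareForBulkLoad();
        applyUserConfig(buggy);
        System.out.println(buggy.level0FileNumCompactionTrigger);  // 8

        // Fixed order: user config first, bulk-load settings last.
        // The bulk-load trigger is honored during state recovery.
        FakeOptions fixed = new FakeOptions();
        applyUserConfig(fixed);
        fixed.prepareForBulkLoad();
        System.out.println(fixed.level0FileNumCompactionTrigger);  // 1073741824
    }
}
{code}
With the fixed ordering, user-supplied tunings still apply for normal operation, but the bulk-load overrides win during restore, which is the intent of calling prepareForBulkLoad in the first place.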

> Kafka Streams RocksDB bulk loading config may not be honored with customized 
> RocksDBConfigSetter 
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7023
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7023
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.1.0
>            Reporter: Liquan Pei
>            Assignee: Liquan Pei
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
