[ https://issues.apache.org/jira/browse/KAFKA-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798585#comment-16798585 ]

Matthias J. Sax commented on KAFKA-8042:
----------------------------------------

[~ableegoldman] From my understanding, the fast forwarding you mention is only 
done for a batch of records. If there is a larger backlog to restore, this 
would only provide a small lookahead and maybe skip over creating some 
segments. However, I believe that we need KAFKA-7934 for a proper fix to do a 
"full look ahead" that allows us to not create any old segments at all.

I am just wondering if this ticket is a duplicate of KAFKA-7934? It would be 
great if [~amccague] could confirm this. If it's not a duplicate, we should 
document the difference explicitly.

> Kafka Streams creates many segment stores on state restore
> ----------------------------------------------------------
>
>                 Key: KAFKA-8042
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8042
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.1.0, 2.1.1
>            Reporter: Adrian McCague
>            Priority: Major
>         Attachments: StateStoreSegments-StreamsConfig.txt
>
>
> Note that this is from the perspective of one instance of an application, 
> where there are 8 instances total, with partition count 8 for all topics and, 
> of course, stores. Standby replicas = 1.
> The process runs multiple instances of {{KafkaStreams}}, so the detail below 
> is from one of these.
> h2. Actual Behaviour
> During state restore of an application, many segment stores are created (I 
> am using MANIFEST files as a marker, since RocksDB preallocates 4MB for 
> each). As can be seen below, this topology has 5 joins - which is the extent 
> of its state.
> {code:java}
> bash-4.2# pwd
> /data/fooapp/0_7
> bash-4.2# for dir in $(find . -maxdepth 1 -type d); do echo "${dir}: $(find ${dir} -type f -name 'MANIFEST-*' -printf x | wc -c)"; done
> .: 8058
> ./KSTREAM-JOINOTHER-0000000025-store: 851
> ./KSTREAM-JOINOTHER-0000000040-store: 819
> ./KSTREAM-JOINTHIS-0000000024-store: 851
> ./KSTREAM-JOINTHIS-0000000029-store: 836
> ./KSTREAM-JOINOTHER-0000000035-store: 819
> ./KSTREAM-JOINOTHER-0000000030-store: 819
> ./KSTREAM-JOINOTHER-0000000045-store: 745
> ./KSTREAM-JOINTHIS-0000000039-store: 819
> ./KSTREAM-JOINTHIS-0000000044-store: 685
> ./KSTREAM-JOINTHIS-0000000034-store: 819
> There are many (x800 per store, as counted above) of these segment files:
> ./KSTREAM-JOINOTHER-0000000025-store.1551466290000
> ./KSTREAM-JOINOTHER-0000000025-store.1551559020000
> ./KSTREAM-JOINOTHER-0000000025-store.1551492690000
> ./KSTREAM-JOINOTHER-0000000025-store.1551548790000
> ./KSTREAM-JOINOTHER-0000000025-store.1551698610000
> ./KSTREAM-JOINOTHER-0000000025-store.1551530640000
> ./KSTREAM-JOINOTHER-0000000025-store.1551484440000
> ./KSTREAM-JOINOTHER-0000000025-store.1551556710000
> ./KSTREAM-JOINOTHER-0000000025-store.1551686730000
> ./KSTREAM-JOINOTHER-0000000025-store.1551595650000
> ./KSTREAM-JOINOTHER-0000000025-store.1551757350000
> ./KSTREAM-JOINOTHER-0000000025-store.1551685740000
> ./KSTREAM-JOINOTHER-0000000025-store.1551635250000
> ./KSTREAM-JOINOTHER-0000000025-store.1551652410000
> ./KSTREAM-JOINOTHER-0000000025-store.1551466620000
> ./KSTREAM-JOINOTHER-0000000025-store.1551781770000
> ./KSTREAM-JOINOTHER-0000000025-store.1551587400000
> ./KSTREAM-JOINOTHER-0000000025-store.1551681450000
> ./KSTREAM-JOINOTHER-0000000025-store.1551662310000
> ./KSTREAM-JOINOTHER-0000000025-store.1551721710000
> ./KSTREAM-JOINOTHER-0000000025-store.1551750750000
> ./KSTREAM-JOINOTHER-0000000025-store.1551630960000
> ./KSTREAM-JOINOTHER-0000000025-store.1551615120000
> ./KSTREAM-JOINOTHER-0000000025-store.1551792330000
> ./KSTREAM-JOINOTHER-0000000025-store.1551462660000
> ./KSTREAM-JOINOTHER-0000000025-store.1551536910000
> ./KSTREAM-JOINOTHER-0000000025-store.1551592350000
> ./KSTREAM-JOINOTHER-0000000025-store.1551527340000
> ./KSTREAM-JOINOTHER-0000000025-store.1551606870000
> ./KSTREAM-JOINOTHER-0000000025-store.1551744150000
> ./KSTREAM-JOINOTHER-0000000025-store.1551508200000
> ./KSTREAM-JOINOTHER-0000000025-store.1551486420000
> ... etc
> {code}
> Once rebalancing and state restoration are complete, the redundant segment 
> files are deleted and the segment count drops to 508 total (where the 
> above-mentioned state directory is one of many).
> We have seen the number of these segment stores grow to as many as 15000 
> over the baseline 508, which can fill smaller volumes. *This means that a 
> state volume that would normally have ~300MB total disk usage can use in 
> excess of 30GB during rebalancing*, mostly preallocated MANIFEST files.
> h2. Expected Behaviour
> For this particular application we expect 508 segment folders in total to be 
> active and existing throughout rebalancing, give or take migrated tasks that 
> are subject to {{state.cleanup.delay.ms}}.
> h2. Preliminary investigation
> * This behaviour does not appear to occur in v1.1.0: with our application 
> the number of state directories only grows to 670 (over the baseline 508).
> * The MANIFEST files were not preallocated to 4MB in v1.1.0; they are in 
> v2.1.x. This appears to be expected RocksDB behaviour, but it exacerbates 
> the impact of the many segment stores.
> * We suspect https://github.com/apache/kafka/pull/5253 to be the source of 
> this change in behaviour.
> A workaround is to use {{rocksdb.config.setter}} to set the preallocation 
> size for MANIFEST files to a lower value such as 64KB; however, the number 
> of segment stores appears to be unbounded, so disk volumes may still fill up 
> for a heavier application. A sketch of such a setter follows.
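> A minimal sketch of such a setter (the class name is hypothetical; it uses 
> the RocksDB Java option {{setManifestPreallocationSize}} to shrink the 
> preallocation, and is registered via the {{rocksdb.config.setter}} 
> property):
> {code:java}
> import java.util.Map;
> import org.apache.kafka.streams.state.RocksDBConfigSetter;
> import org.rocksdb.Options;
> 
> // Register with:
> // props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG,
> //           ManifestPreallocationConfigSetter.class);
> public class ManifestPreallocationConfigSetter implements RocksDBConfigSetter {
>     @Override
>     public void setConfig(final String storeName,
>                           final Options options,
>                           final Map<String, Object> configs) {
>         // RocksDB preallocates 4MB per MANIFEST by default; 64KB keeps the
>         // many transient segment stores from filling the volume.
>         options.setManifestPreallocationSize(64 * 1024L);
>     }
> }
> {code}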



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
