[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070489#comment-14070489 ]

Anton Karamanov commented on KAFKA-1414:
----------------------------------------

Here are some measurements with 4 disks, each data directory on its own disk,
and a 1 GB log segment size:

log-level parallelization:
{code}
1 thread:
[2014-07-21 17:33:32,750] WARN start loading logs (kafka.log.LogManager)
[2014-07-21 18:43:06,473] WARN finish loading logs (kafka.log.LogManager)
Time: 1h  9m 34s

4 threads:
[2014-07-21 18:46:16,615] WARN start loading logs (kafka.log.LogManager)
[2014-07-21 19:18:30,485] WARN finish loading logs (kafka.log.LogManager)
Time: 0h 32m 14s

8 threads:
[2014-07-21 19:40:47,666] WARN start loading logs (kafka.log.LogManager)
[2014-07-21 20:06:37,269] WARN finish loading logs (kafka.log.LogManager)
Time: 0h 25m 50s

16 threads:
[2014-07-21 20:12:54,167] WARN start loading logs (kafka.log.LogManager)
[2014-07-21 20:39:39,080] WARN finish loading logs (kafka.log.LogManager)
Time: 0h 26m 45s
{code}

dir-level parallelization:
{code}
1 thread:
[2014-07-22 18:21:43,840] WARN start loading logs (kafka.log.LogManager)
[2014-07-22 19:28:20,939] WARN finish loading logs (kafka.log.LogManager)
Time: 1h  6m 37s

4 threads:
[2014-07-22 01:36:44,065] WARN start loading logs (kafka.log.LogManager)
[2014-07-22 01:57:18,926] WARN finish loading logs (kafka.log.LogManager)
Time: 0h 20m 34s
{code}

It seems that parallelizing at the directory level is faster after all, even
though multiple threads reading from the same disk did not lead to any
noticeable performance degradation in the log-level runs.
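
For comparison, here is a rough sketch of what the log-level variant can look
like: every topic-partition directory from every data dir is submitted to one
shared fixed-size pool, so several threads may end up reading from the same
disk concurrently. This is illustrative only, not the attached patch; loadLog
stands in for the existing per-log recovery logic in LogManager:
{code}
import java.io.File
import java.util.concurrent.{Executors, TimeUnit}

// Sketch of log-level parallelization: all logs from all data dirs share
// one fixed-size pool. "loadLog" is a stand-in for the per-log recovery code.
def loadLogsLogLevel(dirs: Seq[File], numThreads: Int)(loadLog: File => Unit) {
  val pool = Executors.newFixedThreadPool(numThreads)
  try {
    val jobs = for {
      dataDir <- dirs
      logDir  <- Option(dataDir.listFiles).getOrElse(Array.empty[File])
      if logDir.isDirectory
    } yield pool.submit(new Runnable { def run() { loadLog(logDir) } })
    jobs.foreach(_.get())   // surface the first failure, if any
  } finally {
    pool.shutdown()
    pool.awaitTermination(Long.MaxValue, TimeUnit.MILLISECONDS)
  }
}
{code}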

> Speedup broker startup after hard reset
> ---------------------------------------
>
>                 Key: KAFKA-1414
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1414
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>    Affects Versions: 0.8.1.1, 0.8.2, 0.9.0
>            Reporter: Dmitry Bugaychenko
>            Assignee: Jay Kreps
>         Attachments: 
> 0001-KAFKA-1414-Speedup-broker-startup-after-hard-reset-a.patch, 
> KAFKA-1414-rev1.patch, KAFKA-1414-rev2.fixed.patch, KAFKA-1414-rev2.patch, 
> KAFKA-1414-rev3-interleaving.patch, KAFKA-1414-rev3.patch, freebie.patch, 
> parallel-dir-loading-0.8.patch, 
> parallel-dir-loading-trunk-fixed-threadpool.patch, 
> parallel-dir-loading-trunk-threadpool.patch, parallel-dir-loading-trunk.patch
>
>
> After a hard reset due to power failure the broker takes way too much time 
> recovering unflushed segments in a single thread. This could be easily 
> improved by launching multiple threads (one per data directory, assuming that 
> typically each data directory is on a dedicated drive). Locally we tried this 
> simple patch to LogManager.loadLogs and it seems to work; however, I'm too new 
> to Scala, so do not take it literally:
> {code}
>   /**
>    * Recover and load all logs in the given data directories
>    */
>   private def loadLogs(dirs: Seq[File]) {
>     val threads : Array[Thread] = new Array[Thread](dirs.size)
>     var i: Int = 0
>     val me = this
>     for(dir <- dirs) {
>       val thread = new Thread( new Runnable {
>         def run()
>         {
>           val recoveryPoints = me.recoveryPointCheckpoints(dir).read
>           /* load the logs */
>           val subDirs = dir.listFiles()
>           if(subDirs != null) {
>             val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
>             if(cleanShutDownFile.exists())
>               info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
>             for(dir <- subDirs) {
>               if(dir.isDirectory) {
>                 info("Loading log '" + dir.getName + "'")
>                 val topicPartition = Log.parseTopicPartitionName(dir.getName)
>                 val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
>                 val log = new Log(dir,
>                   config,
>                   recoveryPoints.getOrElse(topicPartition, 0L),
>                   scheduler,
>                   time)
>                 val previous = addLogWithLock(topicPartition, log)
>                 if(previous != null)
>                   throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
>               }
>             }
>             cleanShutDownFile.delete()
>           }
>         }
>       })
>       thread.start()
>       threads(i) = thread
>       i = i + 1
>     }
>     for(thread <- threads) {
>       thread.join()
>     }
>   }
>   def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
>     logCreationOrDeletionLock synchronized {
>       this.logs.put(topicPartition, log)
>     }
>   }
> {code}
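
A hypothetical thread-pool version of the dir-level approach above (probably
close in spirit to the *-threadpool attachments, but only a sketch, not the
attached patches): one task per data directory, with the pool size controlling
how many directories are loaded concurrently. loadLogsInDir stands in for the
per-directory loop from the snippet above (recovery points, clean-shutdown
check, Log construction):
{code}
import java.io.File
import java.util.concurrent.{Executors, TimeUnit}

// Sketch of dir-level parallelization with a configurable pool: one task per
// data directory, each task loading that directory's logs sequentially.
// "loadLogsInDir" is a stand-in for the per-directory loop shown above.
def loadLogsDirLevel(dirs: Seq[File], numThreads: Int)(loadLogsInDir: File => Unit) {
  val pool = Executors.newFixedThreadPool(numThreads)
  try {
    val jobs = dirs.map { dir =>
      pool.submit(new Runnable { def run() { loadLogsInDir(dir) } })
    }
    jobs.foreach(_.get())   // surface the first failure, if any
  } finally {
    pool.shutdown()
    pool.awaitTermination(Long.MaxValue, TimeUnit.MILLISECONDS)
  }
}
{code}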



--
This message was sent by Atlassian JIRA
(v6.2#6252)
