[ 
https://issues.apache.org/jira/browse/KAFKA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506083#comment-13506083
 ] 

Neha Narkhede commented on KAFKA-521:
-------------------------------------

Thanks for v5, few things (mostly minor, except for 3.1) that I realized while 
reading v5 -

1. Log
1.1 The param name maxLogFileSize is invalid
1.2 The numbering in the API docs for analyzeAndValidateMessageSet regressed to 
all 1s :)
1.3 API doc bug above maybeFlush - @param The instead of @param 
numberOfMessages. Same for truncateTo() and truncateFullyAndStartAt() as well
2. LogManager
2.1 Same as 1.2 for createAndValidateLogDirs
3. LogSegment
3.1 The read API returns null if startPosition is null. Earlier it used to 
return empty set. At least one usage of this API depends on it returning a 
non-null value. For example nextOffset -

    val ms = read(index.lastOffset, None, log.sizeInBytes)
    ms.lastOption match {
      case None => baseOffset
      case Some(last) => last.nextOffset
    }

You took care of the other in Log.read. I personally prefer non null since it 
forces you to handle the null case, but I understand there is a minor 
performance implication and you probably prefer the null approach :)
4. Probably best to leave out the change to producer.num.retries. We are 
working on it on 0.8 and it will probably collide on the merge anyway. But if 
you want to leave it in, that's fine as well.


+1 otherwise
                
> Refactor Log subsystem
> ----------------------
>
>                 Key: KAFKA-521
>                 URL: https://issues.apache.org/jira/browse/KAFKA-521
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jay Kreps
>         Attachments: KAFKA-521-v1.patch, KAFKA-521-v2.patch, 
> KAFKA-521-v3.patch, KAFKA-521-v4.patch, KAFKA-521-v5.patch
>
>
> There are a number of items it would be nice to cleanup in the log subsystem:
> 1. Misc. funky apis in Log and LogManager
> 2. Much of the functionality in Log should move into LogSegment along with 
> corresponding tests
> 3. We should remove SegmentList and instead use a ConcurrentSkipListMap
> The general idea of the refactoring fall into two categories. First, improve 
> and thoroughly document the public APIs. Second, have a clear delineation of 
> responsibility between the various layers:
> 1. LogManager is responsible for the creation and deletion of logs as well as 
> the retention of data in log segments. LogManager is the only layer aware of 
> partitions and topics. LogManager consists of a bunch of individual Log 
> instances and interacts with them only through their public API (mostly true 
> today).
> 2. Log represents a totally ordered log. Log is responsible for reading, 
> appending, and truncating the log. A log consists of a bunch of LogSegments. 
> Currently much of the functionality in Log should move into LogSegment with 
> Log interacting only through the Log interface. Currently we reach around 
> this a lot to call into FileMessageSet and OffsetIndex.
> 3. A LogSegment consists of an OffsetIndex and a FileMessageSet. It supports 
> largely the same APIs as Log, but now localized to a single segment.
> This cleanup will simplify testing and debugging because it will make the 
> responsibilities and guarantees at each layer more clear.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to