Can someone please review this?

On Fri, Mar 11, 2016 at 12:09 AM, Achanta Vamsi Subhash <
achanta.va...@flipkart.com> wrote:

> Hi,
>
> I would like to get this into 0.10.0.0, so can someone look into this and
> review?
>
> On Wed, Mar 9, 2016 at 10:29 PM, Achanta Vamsi Subhash <
> achanta.va...@flipkart.com> wrote:
>
>> Hi all,
>>
>> https://github.com/apache/kafka/pull/1035
>> This pull request makes log-segment loading parallel, with two
>> configurable properties: "log.recovery.threads" and
>> "log.recovery.max.interval.ms".
>>
>> Currently, on startup after an unclean shutdown, the log segments within
>> a logDir are loaded sequentially. Because logSegment.recover(..) is
>> called for every segment, this takes a long time on brokers with many
>> partitions (we have noticed ~40 mins for 2k partitions).
>>
>> Logic:
>> 1. Define a thread pool of fixed size (log.recovery.threads).
>> 2. Submit each logSegment recovery as a job to the thread pool and add
>>    the returned future to a job list.
>> 3. Wait until all the jobs are done within the required time
>>    (log.recovery.max.interval.ms, default set to Long.Max).
>> 4. If they are done and the futures all return null (meaning that the
>>    jobs completed successfully), recovery is considered done.
>> 5. If any of the recovery jobs failed, the failure is logged and
>>    LogRecoveryFailedException is thrown.
>> 6. If the timeout is reached, LogRecoveryFailedException is thrown.
>>
>> The logic is backward compatible with the current sequential
>> implementation, as the default thread count is set to 1.
>>
>> JIRA link is here:
>> https://issues.apache.org/jira/browse/KAFKA-3359
>>
>> Please review and give me suggestions. I will incorporate them and
>> contribute.
>> Thanks.
>>
>> On Wed, Mar 9, 2016 at 7:57 PM, vamsi-subhash <g...@git.apache.org> wrote:
>>
>>> GitHub user vamsi-subhash opened a pull request:
>>>
>>>     https://github.com/apache/kafka/pull/1035
>>>
>>>     Parallel log-recovery of un-flushed segments on startup
>>>
>>>     Did not find any tests for the method. Will be adding them.
>>>
>>> You can merge this pull request into a Git repository by running:
>>>
>>>     $ git pull https://github.com/vamsi-subhash/kafka trunk
>>>
>>> Alternatively you can review and apply these changes as the patch at:
>>>
>>>     https://github.com/apache/kafka/pull/1035.patch
>>>
>>> To close this pull request, make a commit to your master/trunk branch
>>> with (at least) the following in the commit message:
>>>
>>>     This closes #1035
>>>
>>> ----
>>> commit ecab815203a2b6396703660d5a2f9d9bb00efcf3
>>> Author: Vamsi Subhash Achanta <vamsi...@gmail.com>
>>> Date:   2016-03-09T14:24:37Z
>>>
>>>     Made log-recovery parallel
>>>
>>> ----
>>
>> --
>> Regards
>> Vamsi Subhash
>
> --
> Regards
> Vamsi Subhash

--
Regards
Vamsi Subhash
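
For reviewers who want the six "Logic" steps quoted above in concrete form, here is a minimal Java sketch of the same pattern. It is not the code in the pull request: the class name `ParallelRecoverySketch`, the method `recoverAll`, and the use of plain `Runnable` jobs in place of logSegment.recover(..) are all assumptions made for illustration, and the exception class is only a stand-in for the LogRecoveryFailedException named in the proposal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ParallelRecoverySketch {

    /** Hypothetical stand-in for the LogRecoveryFailedException named in the proposal. */
    static class LogRecoveryFailedException extends RuntimeException {
        LogRecoveryFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs every recovery job on a fixed-size pool and waits at most
     * maxIntervalMs in total (not per job) for all of them to finish.
     */
    static void recoverAll(List<Runnable> recoveryJobs, int threads, long maxIntervalMs) {
        // Step 1: fixed-size pool (the proposal's log.recovery.threads).
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Step 2: submit each job and keep the returned future.
            List<Future<?>> futures = new ArrayList<>();
            for (Runnable job : recoveryJobs) {
                futures.add(pool.submit(job));
            }
            // Step 3: one shared time budget; toNanos saturates at Long.MAX_VALUE.
            long startNanos = System.nanoTime();
            long budgetNanos = TimeUnit.MILLISECONDS.toNanos(maxIntervalMs);
            for (Future<?> future : futures) {
                long remaining = budgetNanos - (System.nanoTime() - startNanos);
                try {
                    // Step 4: get() returns null when a Runnable finished normally.
                    future.get(Math.max(remaining, 0L), TimeUnit.NANOSECONDS);
                } catch (ExecutionException e) {
                    // Step 5: a job threw; surface its cause.
                    throw new LogRecoveryFailedException("Segment recovery failed", e.getCause());
                } catch (TimeoutException e) {
                    // Step 6: the overall budget ran out.
                    throw new LogRecoveryFailedException("Segment recovery timed out", e);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new LogRecoveryFailedException("Interrupted during recovery", e);
                }
            }
        } finally {
            // Interrupt anything still running so the pool does not outlive the call.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Runnable> jobs = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int segment = i;
            jobs.add(() -> System.out.println("recovered segment " + segment));
        }
        recoverAll(jobs, 2, Long.MAX_VALUE);
    }
}
```

With `threads` set to 1 the jobs run one after another in submission order, which matches the backward-compatibility point made in the proposal. The sketch does no logging; a real implementation would log the failed segment before throwing.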