1. a. I think startup is a public method on KafkaServer so for people
embedding Kafka in some way this helps guarantee correctness.
b. I think KafkaScheduler tries to be a bit too clever; there is a patch
out there that just moves to global synchronization for the whole class,
which is easier to reason about. Technically startup is not called from
multiple threads, but the class's correctness should not depend on the
current usage, so it should work correctly even if it were.
c. I think in cases where you actually just want to start and run N
threads, using Thread directly is sensible. ExecutorService is useful but
does have a ton of gadgets and gizmos that obscure the basic usage in that
case.
d. Yeah we should probably wait until the processor threads start as well.
I think it probably doesn't cause misbehavior as is, but it would be better
if the postcondition of startup was that all threads had started.
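To make points 1b through 1d concrete, here is a rough sketch of a startup
method that is synchronized as a whole, uses Thread directly, and returns only
once every thread is running. This is not Kafka's code: StartupSketch,
liveThreads and awaitQuietly are hypothetical names, and the workers just park
on a latch instead of doing real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a server that starts N worker threads.
class StartupSketch {
    private final List<Thread> threads = new ArrayList<>();
    private final CountDownLatch stop = new CountDownLatch(1);
    private boolean started = false; // guarded by "this", so no volatile needed

    // Whole-method synchronization: safe even if callers race on startup.
    public synchronized void startup(int n) {
        if (started) throw new IllegalStateException("already started");
        CountDownLatch running = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                running.countDown();  // signal "I am running"
                awaitQuietly(stop);   // placeholder for the real work loop
            }, "worker-" + i);
            threads.add(t);
            t.start();
        }
        // Postcondition: every worker has started before startup returns.
        awaitQuietly(running);
        started = true;
    }

    // One-shot shutdown: releases the workers and waits for them to exit.
    public synchronized void shutdown() {
        stop.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public synchronized int liveThreads() {
        int alive = 0;
        for (Thread t : threads) {
            if (t.isAlive()) alive++;
        }
        return alive;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The point of the sketch is only that holding one lock for the whole method
makes the volatile flag unnecessary, and that awaiting the latch gives startup
a clear postcondition.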

2. a. There are different ways to do this. My overwhelming experience has
been that any attempt to share a selector across threads is very painful.
Making the selector loops single threaded just really really simplifies
everything, but also the performance tends to be a lot better because there
is far less locking inside that selector loop.
b. Yeah, I share your skepticism of that call. I'm not sure why it is there
or if it is needed. I agree that wakeup should only be needed from other
threads. It would be good to untangle that mystery. I wonder what happens
if it is removed.
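One way to poke at both points in isolation is a small experiment with
java.nio directly. The sketch below is not SocketServer code, and
SelectorLoopSketch and its method names are made up for illustration. The
first method calls wakeup() while nothing is blocked in select; the
java.nio.channels.Selector documentation says the next select then returns
immediately, so such a call is not lost but does cost one spurious pass through
the loop. The second method shows the single-threaded-loop pattern, where
another thread hands work over through a queue and calls wakeup().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical experiment, not part of Kafka.
class SelectorLoopSketch {

    // Calls wakeup() with no select in progress, then measures how long
    // the following select(timeoutMs) actually blocks.
    static long selectAfterEarlyWakeupMillis(long timeoutMs) {
        try (Selector selector = Selector.open()) {
            selector.wakeup();
            long start = System.nanoTime();
            selector.select(timeoutMs); // should return without waiting out the timeout
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One thread owns the select loop; the caller hands it work through a
    // queue and uses wakeup() to get it out of select().
    static boolean handOffFromOtherThread(long timeoutMs) {
        ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
        CountDownLatch handled = new CountDownLatch(1);
        try (Selector selector = Selector.open()) {
            Thread loop = new Thread(() -> {
                try {
                    while (handled.getCount() > 0) {
                        selector.select(); // no channels registered: blocks until wakeup
                        if (pending.poll() != null) handled.countDown();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, "selector-loop");
            loop.start();
            pending.add("new-connection"); // enqueue first, then wake the loop
            selector.wakeup();
            boolean ok = handled.await(timeoutMs, TimeUnit.MILLISECONDS);
            loop.join(timeoutMs);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because the loop thread is the only one that ever runs select, the only
cross-thread coordination is the queue plus wakeup().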

-Jay

On Wed, Jan 21, 2015 at 1:58 PM, Chittaranjan Hota <chitts.h...@gmail.com>
wrote:

> Hello,
> Congratulations to the folks behind kafka. It has been a smooth ride
> dealing with multi TB data when the same set up in JMS fell apart often.
>
> Although I have been using kafka for more than a few days now, I only
> started looking into the code base yesterday and already have doubts at
> the very beginning. I would like some input on why the implementation is
> done the way it is.
>
> Version : 0.8.1
>
> THREADING RELATED
> 1. Why is the startup code synchronized? Who are the competing threads?
>     a. startReporters func is synchronized
>     b. KafkaScheduler startup is synchronized? There is also a volatile
> variable declared when the whole synchronized block is itself guaranteeing
> "happens before".
>    c. Use of native new Thread syntax instead of relying on Executor
> service
>    d. processor thread uses a CountDownLatch but main thread doesn't wait
> for processors to signal that startup is complete.
>
>
> NIO RELATED
> 2.
>    a. Acceptor, and each Processor thread have their own selector (since
> they are extending from abstract class AbstractServerThread). Ideally a
> single selector suffices for multiplexing. Is there any reason why multiple
> selectors are used?
>    b. selector wake up calls by Processors in the read method (line 362
> SocketServer.scala) are MISSED calls since there is no thread waiting on
> the select at that point.
>
> Looking forward to learning the code further!
> Thanks in advance.
>
> Regards,
> Chitta
>
