That's a good question about the number of sockets a single consumer opens. I'll let someone from LinkedIn comment on whether each consumer has a socket per topic/partition or one per broker, since I'm not familiar with that part of the code.
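In the meantime, here is a back-of-the-envelope sketch in Python of the arithmetic in the question quoted below. It is not derived from the Kafka code: the three models and the function names are assumptions for illustration only, and which of them (if any) matches the real consumer is exactly what I can't answer.

```python
def sockets_per_topic_partition(topics: int, partitions_per_topic: int) -> int:
    """Assumed model A: one socket for every topic/partition the consumer reads."""
    return topics * partitions_per_topic


def sockets_per_topic_and_broker(topics: int, brokers: int) -> int:
    """Assumed model B: one socket per topic on each broker."""
    return topics * brokers


def sockets_per_broker(brokers: int) -> int:
    """Assumed model C: one socket per broker, shared by all topics."""
    return brokers


# Small case from the question: 3 topics, 3 brokers, 3 or 6 partitions per topic.
small_a = sockets_per_topic_partition(topics=3, partitions_per_topic=3)
small_a_6 = sockets_per_topic_partition(topics=3, partitions_per_topic=6)

# Large hypothetical case: 100 topics, 16 brokers, 200 partitions per topic.
large_a = sockets_per_topic_partition(topics=100, partitions_per_topic=200)
large_b = sockets_per_topic_and_broker(topics=100, brokers=16)
large_c = sockets_per_broker(brokers=16)

print(small_a, small_a_6)         # model A, small case
print(large_a, large_b, large_c)  # models A, B, C, large case
```

Model A reproduces the 9, 18 and 20000 figures in the question, model B the 1600 figure, and model C is the "one socket to each broker" assumption from the original post.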
On Wed, May 29, 2013 at 9:53 AM, Withers, Robert <robert.with...@dish.com> wrote:
> Thanks for the info. Are you saying that even with a single connector,
> with say 3 topics and 3 threads per topic and 3 brokers with 3 partitions
> for all 3 topics on all 3 brokers, that a consumer box would have 9 sockets
> open? What if there are 6 partitions per topic, would that be 18 open
> sockets?
>
> I have read somewhere that a high partition number, per topic, is
> desirable, to scale out the consumers and to be prepared to dynamically
> scale out consumption during a traffic spike. Is it so? 100 topics, with
> 16 brokers and 200 partitions per topic with 1 consumer connector (just
> hypothetically so) would be 1600 sockets or 20000 sockets?
>
> For sure these boxes have plenty of ports. I am just thinking through
> possible failures and what flexibility we have in configuration of
> producers/consumers to topics. Really the question is best practices in
> this area. A producer server handling 100+ msg types could also connect
> quite a bit. So, perhaps it is best to restrict producer and consumer
> servers to process a restricted "class" of types. Certainly if the
> producer is also hosting a web server, but perhaps not as dire on the
> consumer side.
>
> thanks,
> rob
> ________________________________________
> From: Chris Curtin [curtin.ch...@gmail.com]
> Sent: Wednesday, May 29, 2013 7:36 AM
> To: users
> Subject: Re: one consumerConnector or many?
>
> I'd look at a variation of #2. Can your messages be grouped into a 'class'
> (for lack of a better term) that are consumed together? For example a
> 'class' of 'auditing events' or 'sensor events'. The idea would be to then
> have a topic per 'class'.
>
> A couple of benefits to this:
> - you can define your consumption of a 'class's resources by value. So the
> 'audit' topic may only get a 2-threaded consumer while the 'sensor' class
> gets a 10-threaded consumer.
> - you can stop processing a 'class' of messages if you need to without
> taking all the consumers offline (assuming you have different processors
> or a way while running to alter your number of threads per topic).
>
> Since it sounds like you may be frequently adding new message types, this
> approach also allows you to decide if you want to shut down only a part of
> your processing to add the new code to handle the message.
>
> Finally, why the concern about socket use? A well configured Windows or
> Linux machine can have thousands of open sockets without problems. Since
> 0.8.0 only connects to the Broker with the topic/partition, you end up with
> 1 socket per topic/partition and consumer.
>
> Hope this helps,
>
> Chris
>
>
> On Wed, May 29, 2013 at 9:13 AM, Rob Withers <reefed...@gmail.com> wrote:
>
> > In thinking about the design of consumption, we have in mind a generic
> > consumer server which would consume from more than one message type. The
> > handling of each type of message would be different. I suppose we could
> > have upwards of say 50 different message types, eventually, maybe 100+
> > different types. Which of the following designs would be best and why
> > would the other options be bad?
> >
> > 1) Have all message types go through one topic and use a dispatcher
> > pattern to select the correct handler. Use one consumerConnector.
> >
> > 2) Use a different topic for each message type, but still use one
> > consumerConnector and a dispatcher pattern.
> >
> > 3) Use a different topic for each message type and have a separate
> > consumerConnector for each topic.
> >
> > I am struggling with whether my assumptions are correct. It seems that a
> > single connector for a topic would establish one socket to each broker, as
> > rebalancing assigns various partitions to that thread. Option 2 would pull
> > messages from more than one topic through a single socket to a particular
> > broker, is it so?
> > Would option 3 be reasonable, establishing upwards of 100 sockets per
> > broker?
> >
> > I am guesstimating that option 2 is the right way forward, to bound socket
> > use, and we'll need to figure out a way to parameterize stream consumption
> > with the right handlers for a particular msg type. If we add a topic, do
> > you think we should create a new connector or restart the original
> > connector with the new topic in the map?
> >
> > Thanks,
> >
> > rob
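
For what it's worth, the dispatcher mentioned in options 1 and 2 above could look roughly like the sketch below. It is plain Python with no Kafka client involved: the `(topic, payload)` pairs stand in for whatever the streams from a single connector would deliver, and `Dispatcher`, `register`, `dispatch` and `run` are hypothetical names, not part of any Kafka API.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Handler = Callable[[bytes], None]


class Dispatcher:
    """Routes each message to the handler registered for its topic."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, topic: str, handler: Handler) -> None:
        # Adding a new message type means registering one more handler.
        self._handlers[topic] = handler

    def dispatch(self, topic: str, payload: bytes) -> bool:
        """Run the topic's handler; return False if no handler is registered."""
        handler = self._handlers.get(topic)
        if handler is None:
            return False
        handler(payload)
        return True


def run(messages: Iterable[Tuple[str, bytes]], dispatcher: Dispatcher) -> int:
    """Feed (topic, payload) pairs to the dispatcher; return the unhandled count."""
    unhandled = 0
    for topic, payload in messages:
        if not dispatcher.dispatch(topic, payload):
            unhandled += 1
    return unhandled


# Hypothetical usage with two message "classes" and one unknown topic.
seen: List[Tuple[str, bytes]] = []
dispatcher = Dispatcher()
dispatcher.register("audit", lambda p: seen.append(("audit", p)))
dispatcher.register("sensor", lambda p: seen.append(("sensor", p)))

unhandled = run(
    [("audit", b"login"), ("sensor", b"21.5"), ("billing", b"invoice")],
    dispatcher,
)
```

Keeping the topic-to-handler map separate from the consuming loop is one way to parameterize stream consumption by message type. It does not settle whether adding a topic needs a new connector or a restart of the existing one.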