If # of connections is not an issue, option 3 is fine too.

Thanks,

Jun

On Wed, May 29, 2013 at 9:09 AM, Withers, Robert <robert.with...@dish.com> wrote:

> Thanks, Jun. We have considered doing message filtering in the consumer.
> However, the thrust of my question below is not filtering, but
> dispatching. If we take Chris' recommendation and pump a small set of msg
> types, belonging to the same "class" of messages, such as Account History,
> through the same topic, we will want to process all the messages, but we
> will want to process each msg type within the "class" differently, so we
> will want to dispatch to different handlers.
>
> I totally see your point that if we only want to process a subset of the
> messages, then we really ought to filter in the producer and send the
> filtered message stream to its own topic.
>
> I am leaning toward the architecture of having a different
> consumerConnector per topic, as there ARE plenty of ports. This allows per
> topic control, which is useful. Do you see any issues with this approach?
>
> Thanks,
> rob
>
> -----Original Message-----
> From: Jun Rao [mailto:jun...@gmail.com]
> Sent: Wednesday, May 29, 2013 9:58 AM
> To: users@kafka.apache.org
> Subject: Re: one consumerConnector or many?
>
> Rob,
>
> You are correct that each instance of consumer will use a single socket to
> connect to a broker, independent of # topics/partitions. One thing that's
> good to avoid is to read all data and filter in the consumer, especially
> when the data is consumed multiple times by different consumers. In this
> case, it's better to put the filtered data in a separate topic and let all
> consumers consume the filtered data directly.
>
> Thanks,
>
> Jun
>
> On Wed, May 29, 2013 at 6:13 AM, Rob Withers <reefed...@gmail.com> wrote:
>
> > In thinking about the design of consumption, we have in mind a generic
> > consumer server which would consume from more than one message type.
> > The handling of each type of message would be different. I suppose we
> > could have upwards of say 50 different message types, eventually,
> > maybe 100+ different types. Which of the following designs would be
> > best and why would the other options be bad?
> >
> > 1) Have all message types go through one topic and use a dispatcher
> > pattern to select the correct handler. Use one consumerConnector.
> >
> > 2) Use a different topic for each message type, but still use one
> > consumerConnector and a dispatcher pattern.
> >
> > 3) Use a different topic for each message type and have a separate
> > consumerConnector for each topic.
> >
> > I am struggling with whether my assumptions are correct. It seems
> > that a single connector for a topic would establish one socket to each
> > broker, as rebalancing assigns various partitions to that thread.
> > Option 2 would pull messages from more than one topic through a single
> > socket to a particular broker, is it so? Would option 3 be
> > reasonable, establishing upwards of 100 sockets per broker?
> >
> > I am guestimating that option 2 is the right way forward, to bound
> > socket use, and we'll need to figure out a way to parameterize stream
> > consumption with the right handlers for a particular msg type. If we
> > add a topic, do you think we should create a new connector or restart
> > the original connector with the new topic in the map?
> >
> > Thanks,
> >
> > rob
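
For readers of this thread, the dispatcher pattern mentioned in options 1 and 2 might look roughly like the sketch below. It is not code from any of the posters and it does not use the Kafka client at all: the `Dispatcher` class, the `consume` function, and the message type names are all hypothetical, and the "stream" is a plain list of (type, payload) pairs standing in for whatever a consumerConnector would deliver. Under option 2 the key would presumably be the topic name; under option 1 it would be a type field carried in the message itself.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A handler takes the raw payload of one message (hypothetical signature).
Handler = Callable[[bytes], None]


class Dispatcher:
    """Routes each message to the handler registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        # Messages with no registered handler are kept here for inspection
        # instead of being silently dropped.
        self.unhandled: List[Tuple[str, bytes]] = []

    def register(self, msg_type: str, handler: Handler) -> None:
        self._handlers[msg_type] = handler

    def dispatch(self, msg_type: str, payload: bytes) -> bool:
        """Returns True if a handler was found and called."""
        handler = self._handlers.get(msg_type)
        if handler is None:
            self.unhandled.append((msg_type, payload))
            return False
        handler(payload)
        return True


def consume(stream: Iterable[Tuple[str, bytes]], dispatcher: Dispatcher) -> int:
    """Drains a stand-in stream and returns how many messages were handled."""
    handled = 0
    for msg_type, payload in stream:
        if dispatcher.dispatch(msg_type, payload):
            handled += 1
    return handled


# Usage with made-up message types.
deposits: List[bytes] = []
withdrawals: List[bytes] = []

dispatcher = Dispatcher()
dispatcher.register("deposit", deposits.append)
dispatcher.register("withdrawal", withdrawals.append)

stream = [("deposit", b"100"), ("withdrawal", b"40"), ("refund", b"5")]
handled_count = consume(stream, dispatcher)
```

Because handlers are looked up in a table at dispatch time, a new message type can be supported by calling `register` again. Whether the connector itself has to be restarted to pick up a new topic is a separate question that this sketch does not address.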