Thanks for the replies. Unfortunately, I don't think Kafka quite fits our use case. To answer Edward's question in the abstract: in a system with lots of users, we were considering having a topic per user. An individual user could then connect from any number of endpoints and receive events, including events that were sent while the user was disconnected. Persisting the events to disk and using offsets means we wouldn't have to track which events each individual endpoint has received.
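For concreteness, here is a rough sketch of the behaviour we were after. It is an in-memory stand-in only, not Kafka client code: `PerUserLog` and `Endpoint` are names made up for illustration, and a Python list stands in for the on-disk log. The point is that each endpoint only remembers its own next offset, so the server keeps no per-endpoint delivery state.

```python
from collections import defaultdict


class PerUserLog:
    """Hypothetical stand-in for one append-only log ("topic") per user."""

    def __init__(self):
        # user_id -> ordered list of events; a list index plays the role of an offset
        self._logs = defaultdict(list)

    def append(self, user_id, event):
        """Append an event to the user's log and return its offset."""
        log = self._logs[user_id]
        log.append(event)
        return len(log) - 1

    def read_from(self, user_id, offset):
        """Return every event for the user at or after the given offset."""
        return self._logs[user_id][offset:]


class Endpoint:
    """Hypothetical client device that tracks only its own position in the log."""

    def __init__(self, log, user_id):
        self.log = log
        self.user_id = user_id
        self.next_offset = 0  # kept on the endpoint, not on the server

    def poll(self):
        """Fetch events not yet seen by this endpoint and advance its offset."""
        events = self.log.read_from(self.user_id, self.next_offset)
        self.next_offset += len(events)
        return events
```

An endpoint that was disconnected while events arrived catches up on its next `poll()`, independently of what the user's other endpoints have already read.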
On 14 November 2013 04:38, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Zookeeper will not be the only problem. The first is that each topic is a
> directory on the file system. Each of those is going to have files inside
> it. This is going to be fairly overwhelming for the file system. Also I can
> not speak for the internals but there may be cases where this many topics
> allocates a big array or some other non-optimal behaviour.
>
> Like a RDBMS with this many tables one might ask, why? Isn't there a way to
> design the system multi-tennent where so many physical topics are not
> needed?
>
>
> On Wed, Nov 13, 2013 at 9:41 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
> > At those many topics, zookeeper will be the main bottleneck. Leader
> > election process will take very long increasing the unavailability window
> > of the cluster.
> >
> > Thanks,
> > Neha
> > On Nov 13, 2013 4:49 AM, "Joe Freeman" <joe.free...@bitroot.com> wrote:
> >
> > > Would I be correct in assuming that a Kafka cluster won't scale well to
> > > support lots (tens of millions) of topics? If I understand correctly, a
> > > node being added or removed would involve a leader election for each
> > > topic, which is a relatively expensive operation?

--
Bitroot - http://bitroot.com