If I understand your comment correctly, you are trying to use Kafka topics as per-endpoint message queues.
I may be mistaken, but to me Kafka does not really seem like a good match for that. For such a purpose you will eventually want something that is not actually a queue: a means to perform state compression, event-type-based and priority-based dequeue, and other operations that are not appropriate to a throughput/FIFO-oriented system.

On Nov 14, 2013, at 6:18 AM, Joe Freeman <joe.free...@bitroot.com> wrote:

> Thanks for the replies. I don't think Kafka quite fits our use case,
> unfortunately. To abstractly answer Edward's question: in a system with
> lots of users, we were considering having a topic per user (such that an
> individual user can connect from a number of endpoints and receive events,
> including events that were sent while the user was disconnected -
> persisting the events to disk and using offsets means we don't have to
> track which events each individual endpoint has received).
>
> On 14 November 2013 04:38, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
>> Zookeeper will not be the only problem. The first is that each topic is a
>> directory on the file system. Each of those is going to have files inside
>> it. This is going to be fairly overwhelming for the file system. Also I can
>> not speak for the internals but there may be cases where this many topics
>> allocates a big array or some other non-optimal behaviour.
>>
>> Like a RDBMS with this many tables one might ask, why? Isn't there a way to
>> design the system multi-tennent where so many physical topics are not
>> needed?
>>
>> On Wed, Nov 13, 2013 at 9:41 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>
>>> At those many topics, zookeeper will be the main bottleneck. Leader
>>> election process will take very long increasing the unavailability window
>>> of the cluster.
>>>
>>> Thanks,
>>> Neha
>>> On Nov 13, 2013 4:49 AM, "Joe Freeman" <joe.free...@bitroot.com> wrote:
>>>
>>>> Would I be correct in assuming that a Kafka cluster won't scale well to
>>>> support lots (tens of millions) of topics? If I understand correctly, a
>>>> node being added or removed would involve a leader election for each topic,
>>>> which is a relatively expensive operation?
>
> --
> Bitroot - http://bitroot.com
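As a rough illustration of the "not actually a queue" structure mentioned at the top of this reply, here is a minimal Python sketch of a per-user mailbox that does state compression (a newer event with the same key replaces the older one) and priority-based dequeue. Everything in it is an assumption made for illustration: `UserMailbox`, `publish`, `poll`, the key names, and the priority scheme are hypothetical, it is in-memory only, and it does not use or model any Kafka API. It also ignores the multi-endpoint offset tracking from the original question.

```python
import heapq
import itertools


class UserMailbox:
    """Hypothetical per-user mailbox; not a FIFO queue and not a Kafka API."""

    def __init__(self):
        self._latest = {}              # event key -> (priority, seq, payload)
        self._heap = []                # (priority, seq, key); may hold stale entries
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def publish(self, key, payload, priority=10):
        """Store an event; a newer event with the same key supersedes the older one."""
        seq = next(self._seq)
        self._latest[key] = (priority, seq, payload)
        heapq.heappush(self._heap, (priority, seq, key))

    def poll(self):
        """Return the most urgent pending (key, payload), or None if nothing is pending."""
        while self._heap:
            _priority, seq, key = heapq.heappop(self._heap)
            current = self._latest.get(key)
            # Skip heap entries that were superseded by a later publish.
            if current is not None and current[1] == seq:
                del self._latest[key]
                return key, current[2]
        return None


box = UserMailbox()
box.publish("presence:alice", "online")
box.publish("presence:alice", "away")                      # replaces "online"
box.publish("alert:billing", "card declined", priority=0)  # lower number = more urgent

first = box.poll()
second = box.poll()
third = box.poll()
```

Here the alert is delivered before the presence update even though it arrived last, and only the latest presence state is delivered at all. Neither behaviour falls out naturally from an append-only, offset-addressed log.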