Philip,
That is right. There is a huge amount of data flushed into the topic every 6 minutes. Then at the end of each 6-minute window, I only want to read from that specific topic, and the data in it has to be processed as fast as possible. I was originally using a Redis queue for this purpose, but it takes much longer to process a Redis queue than a Kafka topic (our test data is 2M messages). Since we already have Kafka infrastructure set up, rather than seeking out other tools (ActiveMQ, RabbitMQ, etc.), I would rather make use of Kafka, although this does not seem like a common Kafka use case.
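To make this concrete, the pattern looks roughly like the following (a minimal sketch using the Java producer client; the "events.<window>" topic naming, the broker address, and the serializer choices are illustrative assumptions, not settled conventions):

// A minimal sketch of the windowed-topic pattern: derive the topic
// name from the event timestamp so every message in the same 6-minute
// window lands in the same topic.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class WindowedTopicProducer {
    static final long WINDOW_MS = 6 * 60 * 1000L; // one 6-minute bucket

    static String topicFor(long timestampMs) {
        return "events." + (timestampMs / WINDOW_MS);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // assumed address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            long now = System.currentTimeMillis();
            producer.send(new ProducerRecord<>(topicFor(now), "key", "payload"));
        }
        // At the end of each window, a consumer job subscribes to the
        // just-closed topic (topicFor(now - WINDOW_MS)) and drains it.
    }
}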
Chen

On Mon, Aug 11, 2014 at 5:01 PM, Philip O'Toole
<philip.oto...@yahoo.com.invalid> wrote:

> I'd love to know more about what you're trying to do here. It sounds like
> you're trying to create topics on a schedule, to make it easy to locate
> data for a given time range? I'm not sure it makes sense to use Kafka in
> this manner.
>
> Can you provide more detail?
>
> Philip
>
> -----------------------------------------
> http://www.philipotoole.com
>
> On Monday, August 11, 2014 4:45 PM, Chen Wang <chen.apache.s...@gmail.com>
> wrote:
>
> Todd,
> I actually only intend to keep each topic valid for 3 days at most. Each of
> our topics has 3 partitions, so that is around 3 * 240 * 3 = 2160 partitions.
> Since there is no API for deleting topics, I guess I could set up a cron job
> to delete the outdated topics (folders) from ZooKeeper (a rough sketch of
> what I mean is at the bottom of this message).
> Do you know when the delete-topic API will be available in Kafka?
> Chen
>
> On Mon, Aug 11, 2014 at 3:47 PM, Todd Palino <tpal...@linkedin.com.invalid>
> wrote:
>
> > You need to consider your total partition count as you do this. After 30
> > days, assuming 1 partition per topic, you have 7200 partitions. Depending
> > on how many brokers you have, this can start to be a problem. We just
> > found an issue on one of our clusters that has over 70k partitions:
> > there's now a problem with doing actions like a preferred replica
> > election for all topics, because the JSON object that gets written to
> > the ZooKeeper node to trigger it is too large for ZooKeeper's default
> > 1 MB data size.
> >
> > You also need to think about the number of open file handles. Even with
> > no data, there will be open files for each topic.
> >
> > -Todd
> >
> > On 8/11/14, 2:19 PM, "Chen Wang" <chen.apache.s...@gmail.com> wrote:
> >
> > > Folks,
> > > Is there any potential issue with creating 240 topics every day?
> > > Although the retention of each topic is set to 2 days, I am a little
> > > concerned that, since right now there is no delete-topic API, the
> > > ZooKeeper ensemble might become overloaded.
> > > Thanks,
> > > Chen
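P.S. Here is a rough sketch of the cron-style cleanup mentioned above, using the ZooKeeper Java client directly. The "events.<window>" topic naming, the ZooKeeper address, and the 3-day cutoff are assumptions for illustration. One caveat worth stating: deleting the znode removes only the topic metadata; the log segments on the brokers' disks are left behind, which is exactly why a real delete-topic API is needed.

// Sketch: scan /brokers/topics and remove znodes for windowed topics
// older than the retention cutoff. Metadata-only; use with caution.
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class StaleTopicCleaner {
    static final long WINDOW_MS = 6 * 60 * 1000L;
    static final long RETENTION_WINDOWS = 3L * 24 * 60 / 6; // 3 days of 6-min windows

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, event -> { });
        long oldestLiveWindow =
                System.currentTimeMillis() / WINDOW_MS - RETENTION_WINDOWS;

        for (String topic : zk.getChildren("/brokers/topics", false)) {
            if (!topic.startsWith("events.")) continue; // assumed prefix
            long window = Long.parseLong(topic.substring("events.".length()));
            if (window < oldestLiveWindow) {
                deleteRecursive(zk, "/brokers/topics/" + topic);
            }
        }
        zk.close();
    }

    // delete() is not recursive, so remove any child znodes first.
    static void deleteRecursive(ZooKeeper zk, String path)
            throws KeeperException, InterruptedException {
        for (String child : zk.getChildren(path, false)) {
            deleteRecursive(zk, path + "/" + child);
        }
        zk.delete(path, -1);
    }
}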