Got it. Thanks for the input, Todd!

Chen

On Mon, Aug 11, 2014 at 9:31 PM, Todd Palino <tpal...@linkedin.com.invalid> wrote:

> As I noted, we have a cluster right now with 70k partitions. It’s running
> on over 30 brokers, partly to cover the number of partitions and partly
> to cover the amount of data that we push through it. If you can have at
> least 4 or 5 brokers, I wouldn’t anticipate any problems with the number
> of partitions. You may need more than that depending on the throughput
> you want to handle.
>
> -Todd
>
> On 8/11/14, 9:20 PM, "Chen Wang" <chen.apache.s...@gmail.com> wrote:
>
> >Todd,
> >Yes, I actually thought about that. My concern is that even a week’s
> >worth of topic partitions (240*7*3 = 5040) is too many. Does LinkedIn
> >have good experience with using this many topics in your system? :-)
> >Thanks,
> >Chen
> >
> >On Mon, Aug 11, 2014 at 9:02 PM, Todd Palino
> ><tpal...@linkedin.com.invalid> wrote:
> >
> >> In order to delete topics, you need to shut down the entire cluster
> >> (all brokers), delete the topics from Zookeeper, and delete the log
> >> files and partition directory from the disk on the brokers. Then you
> >> can restart the cluster. Assuming that you can take a periodic outage
> >> on your cluster, you can do it this way.
> >>
> >> Reading what you’re intending to do in other parts of this thread,
> >> have you considered setting up 1 week’s worth of topics with 3 day
> >> retention, and having your producer and consumer rotate between them?
> >> That is, on Sunday at 12:00 AM, you start with topic1, then proceed to
> >> topic2 at 12:06, and so on. The next week, you loop around over
> >> exactly the same topics, knowing that the retention settings have
> >> cleared out the old data.
> >>
> >> -Todd
> >>
> >> On 8/11/14, 4:45 PM, "Chen Wang" <chen.apache.s...@gmail.com> wrote:
> >>
> >> >Todd,
> >> >I actually only intend to keep each topic valid for 3 days at most.
> >> >Each of our topics has 3 partitions, so it’s around 3*240*3 = 2160
> >> >partitions. Since there is no API for deleting topics, I guess I
> >> >could set up a cron job deleting the outdated topics (folders) from
> >> >Zookeeper.
> >> >Do you know when the delete topic API will be available in Kafka?
> >> >Chen
> >> >
> >> >On Mon, Aug 11, 2014 at 3:47 PM, Todd Palino
> >> ><tpal...@linkedin.com.invalid> wrote:
> >> >
> >> >> You need to consider your total partition count as you do this.
> >> >> After 30 days, assuming 1 partition per topic, you have 7200
> >> >> partitions. Depending on how many brokers you have, this can start
> >> >> to be a problem. We just found an issue on one of our clusters that
> >> >> has over 70k partitions: there’s now a problem with doing actions
> >> >> like a preferred replica election for all topics, because the JSON
> >> >> object that gets written to the Zookeeper node to trigger it is too
> >> >> large for Zookeeper’s default 1 MB data size.
> >> >>
> >> >> You also need to think about the number of open file handles. Even
> >> >> with no data, there will be open files for each topic.
> >> >>
> >> >> -Todd
> >> >>
> >> >> On 8/11/14, 2:19 PM, "Chen Wang" <chen.apache.s...@gmail.com> wrote:
> >> >>
> >> >> >Folks,
> >> >> >Is there any potential issue with creating 240 topics every day?
> >> >> >Although the retention of each topic is set to be 2 days, I am a
> >> >> >little concerned that since right now there is no delete topic
> >> >> >API, the Zookeepers might be overloaded.
> >> >> >Thanks,
> >> >> >Chen
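
For reference, the weekly rotation Todd describes (a fixed set of topics reused every week, with a new topic every 6 minutes starting Sunday at 12:00 AM) could be sketched roughly as below. This is only an illustration under stated assumptions: the function name `topic_for`, the `topic` name prefix, and the 1-based numbering are hypothetical, and the sketch only computes a topic name; it does not talk to Kafka.

```python
from datetime import datetime

TOPICS_PER_DAY = 240        # from the thread: 240 topics per day
PARTITIONS_PER_TOPIC = 3    # from the thread: 3 partitions per topic
SLOT_MINUTES = 24 * 60 // TOPICS_PER_DAY    # 6 minutes per topic
TOPICS_PER_WEEK = TOPICS_PER_DAY * 7        # 1680 topics, reused weekly
TOTAL_PARTITIONS = TOPICS_PER_WEEK * PARTITIONS_PER_TOPIC  # 240*7*3


def topic_for(ts: datetime, prefix: str = "topic") -> str:
    """Return the (hypothetical) topic name for a timestamp.

    The week is assumed to start on Sunday at 12:00 AM, and topics
    are numbered from 1, matching "topic1, then topic2 at 12:06".
    """
    # datetime.weekday() is Monday=0 .. Sunday=6; shift so Sunday=0.
    day_index = (ts.weekday() + 1) % 7
    minutes_into_week = day_index * 24 * 60 + ts.hour * 60 + ts.minute
    slot = minutes_into_week // SLOT_MINUTES
    return f"{prefix}{slot + 1}"
```

Producer and consumer would both call something like `topic_for(now)` so they agree on the current topic. Because each topic is only revisited 7 days later, a 3 day retention leaves a margin before reuse, as Todd notes.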