So as Alex noted, there’s no immediate problem with doing this. Kafka itself doesn’t know much about the underlying hardware, so it’s not going to care. At the same time, this means it has no native way to know that those systems have more storage capacity, so they’re not going to automatically get more partitions.
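If you do want partition counts to follow storage, the weighting has to be computed outside Kafka. Below is a minimal sketch of one way to do that, not a tested tool. The function name, the topic name, the broker ids, and the capacity weights are all made up for illustration. The only Kafka-specific part is the output shape, which follows the JSON layout that `kafka-reassign-partitions.sh` reads via `--reassignment-json-file`.

```python
import json


def weighted_assignment(topic, num_partitions, replication_factor, capacity):
    """Spread replicas across brokers in proportion to a storage weight.

    capacity maps broker id -> relative weight (hypothetical numbers,
    e.g. 2 for a broker with twice the disk of a weight-1 broker).
    Returns (plan, load): the reassignment plan and replicas per broker.
    """
    if replication_factor > len(capacity):
        raise ValueError("replication factor exceeds broker count")

    load = {broker: 0 for broker in capacity}
    partitions = []
    for partition in range(num_partitions):
        # Prefer brokers that would be least full relative to their weight
        # after taking one more replica; break ties by broker id.
        ranked = sorted(capacity, key=lambda b: ((load[b] + 1) / capacity[b], b))
        replicas = ranked[:replication_factor]
        for broker in replicas:
            load[broker] += 1
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}, load


if __name__ == "__main__":
    # Assumed cluster: brokers 1 and 2 are the small ones, 3 and 4 have 2x disk.
    plan, load = weighted_assignment("example-topic", 12, 2, {1: 1, 2: 1, 3: 2, 4: 2})
    print(json.dumps(plan, indent=2))
    print(load)
```

With these example weights the larger brokers end up holding twice as many replicas as the smaller ones. Note that the sketch only balances replica counts against disk. It ignores leader placement (the first broker in each `replicas` list is the preferred leader), rack awareness, and per-partition traffic, so network and CPU load on the larger brokers still has to be checked separately.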
You have some options here.

1. You could just ignore it, and treat everything like it’s the smaller brokers. This is easy, but you’ll waste your extra storage.
2. You could manually assign more partitions to the larger brokers. This requires a little bit of work, but it will more effectively use the hardware.

The gotcha with #2 is that you have to make sure you’re not sending too much network traffic to the larger brokers, and you need to make sure that you’re not exhausting the CPU as well. And, of course, you’re going to have to keep an eye on new topics or anything like that to make sure that your weighted cluster balance is still where you want it to be, and manually fix it if not.

-Todd

On Fri, Jun 10, 2016 at 5:26 AM, Alex Loddengaard <a...@confluent.io> wrote:

> Hi Kevin,
>
> If you keep the same configs on the new brokers with more storage capacity,
> I don't foresee any issues. Although I haven't tried it myself.
>
> What may introduce headaches is if you have different configuration options
> per broker. Or if you try to assign more partitions to the newer brokers to
> use more of their disk space.
>
> Let's see if others notice anything I'm missing (again, I've never tried
> this before). Hope this helps.
>
> Alex
>
> On Thu, Jun 9, 2016 at 10:27 AM, Kevin A <k4m...@gmail.com> wrote:
>
> > Hi there,
> >
> > I have a couple of Kafka brokers and thinking about adding a few more. The
> > new broker machines would have a lot more storage available to them than
> > the existing brokers. Am I setting myself up for operational headaches by
> > deploying a heterogeneous (in terms of storage capacity) cluster?
> >
> > (Asked on IRC but thought I'd try here too.)
> >
> > Thanks!
> > -Kevin
> >
>

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming
linkedin.com/in/toddpalino