To answer my own question to an extent, I guess one thing I could do is have a supplementary topic with 1/16th the partitions. You use that one for the automatic partition rebalancing, and then explicitly assign partitions [p*16, p*16 + 16) of the main topic for each supplementary partition p you're given. That way we can move blocks of partitions around automatically but still keep the IO scale-out. I'm sure I can even think of a use for the supplementary topic!
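Something like this, in case it helps anyone else (an untested sketch; the topic names "events"/"events-coord", the group id and the bootstrap server are just placeholders). Since a single consumer can't mix subscribe() with assign(), one consumer subscribes to the supplementary topic, and its rebalance callback maps each supplementary partition p to main-topic partitions [p*16, p*16 + 16) on a second, manually-assigned consumer:

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class BlockedConsumer {
    private static final int BLOCK = 16;

    public static void main(String[] args) {
        Properties common = new Properties();
        common.put("bootstrap.servers", "localhost:9092");
        common.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        common.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // Coordinator: group-managed subscription to the 4-partition supplementary topic.
        Properties coordProps = new Properties();
        coordProps.putAll(common);
        coordProps.put("group.id", "day-block-group");
        KafkaConsumer<byte[], byte[]> coordinator = new KafkaConsumer<>(coordProps);

        // Worker: no group; partitions on the main topic are assigned manually.
        final KafkaConsumer<byte[], byte[]> worker = new KafkaConsumer<>(common);

        coordinator.subscribe(Collections.singletonList("events-coord"),
                new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> revoked) {
                // Nothing to do: the full block set is re-derived on assignment.
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> assigned) {
                // Each supplementary partition p owns main-topic partitions
                // [p*16, p*16 + 16), so whole blocks always move together.
                List<TopicPartition> block = new ArrayList<>();
                for (TopicPartition tp : assigned) {
                    for (int i = 0; i < BLOCK; i++) {
                        block.add(new TopicPartition("events",
                                tp.partition() * BLOCK + i));
                    }
                }
                worker.assign(block);
            }
        });

        while (true) {
            coordinator.poll(100); // drives group rebalancing of the blocks
            for (ConsumerRecord<byte[], byte[]> rec : worker.poll(100)) {
                // process rec...
            }
        }
    }
}

The nice part is that the group coordinator only ever moves whole supplementary partitions, so a block of 16 can never be split across two machines.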
-----Original Message-----
From: Young, Ben [mailto:ben.yo...@fisglobal.com]
Sent: 17 May 2017 19:09
To: users@kafka.apache.org
Subject: Partition groups

Hi,

I was wondering if something like this is possible. I'd like to use partitions to gain some IO parallelism, but certain sets of partitions should not be distributed across different machines. Say I have data that can be processed by time bucket, and I'd like each day's data to go to a single machine. With 4x 16-core servers and 64 partitions (for example), each server would get a block of 16 partitions. This can be arranged by making the key hash to the hashed date plus a random last 4 bits, and with the range assignor it works fine. However, if one server dies, a block of 16 ends up split across two servers, whereas I'd like the whole group of 16 to move to one of the remaining servers.

Is this kind of thing possible at all? It can't be unusual to want some kind of affinity between related partitions. I know I can do this with manual assignment, but is that my only option? The other option is just to have 4 partitions and thread internally, but then I won't get the IO performance.

Thanks,
Ben Young
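For completeness, the "hashed date plus a random last 4 bits" scheme above can also be written as a custom producer Partitioner rather than by crafting the keys by hand. A rough, untested sketch (the block size of 16 and the day-string key format are assumptions):

import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Keys are the day bucket (e.g. "2017-05-17" as bytes): the day's hash picks
// a block, and a random low nibble spreads records across the 16 partitions
// inside that block.
public class DayBlockPartitioner implements Partitioner {
    private static final int BLOCK = 16;

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numBlocks = cluster.partitionsForTopic(topic).size() / BLOCK;
        int block = Utils.toPositive(Utils.murmur2(keyBytes)) % numBlocks;
        return block * BLOCK + ThreadLocalRandom.current().nextInt(BLOCK);
    }

    @Override
    public void close() {}

    @Override
    public void configure(Map<String, ?> configs) {}
}

It would be registered on the producer via the partitioner.class config, which keeps the day-to-block mapping in one place instead of spread across every producer's key-building code.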