Hey Dotan,

Another way to create topics with a specific number of partitions is to use the bin/kafka-topics.sh tool in Kafka's binary distribution. This lets you set the partition count for each topic when you create it.
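
As a rough sketch of what that looks like, assuming a Kafka 0.8.1-era distribution where the tool talks to ZooKeeper: the ZooKeeper address, topic name, partition count, and replication factor below are all made-up placeholders, so adjust them for your cluster. The snippet only prints the command (a dry run) so you can check it before running it from the Kafka directory.

```shell
#!/bin/sh
# Hypothetical values for illustration; none of these come from your setup.
ZOOKEEPER="localhost:2181"   # assumed ZooKeeper connect string
TOPIC="my-input-topic"       # assumed topic name
PARTITIONS=8                 # desired partition count for this topic
REPLICATION=1                # assumed replication factor

# Build the create-topic command. Newer Kafka releases may expect
# different connection flags, so check the tool's help for your version.
CMD="bin/kafka-topics.sh --create --zookeeper $ZOOKEEPER --replication-factor $REPLICATION --partitions $PARTITIONS --topic $TOPIC"

# Dry run: print the command instead of executing it.
echo "$CMD"
```

Once the printed command looks right, run it as-is from the root of the Kafka distribution.
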

> Our concern is how it affects the metrics and checkpoint topics.

Good question. In 0.7.0, the checkpoint topic must have the same number of partitions as the input topic(s). When there are multiple input topics, Samza uses the maximum partition count (e.g. input topics with 4 and 8 partitions would result in an 8-partition checkpoint topic). Samza automatically creates a checkpoint topic of the appropriate size when the job first executes. If the partition count of the input topic(s) changes after the job has been started, the checkpoint topic will have to be resized. In 0.8.0, the checkpoint topic is a single partition, and this problem goes away.

Keep in mind that resizing a topic (checkpoint or otherwise) has a big impact if you depend on keyed messages, since it changes the partition that a key gets mapped to (key.hashCode % partitionCount = partition; the partition changes when partitionCount changes).

As for the metrics topic, there shouldn't be any issue.

Cheers,
Chris

On Mon, Feb 2, 2015 at 10:00 PM, Dotan Patrich <[email protected]> wrote:

> Hi,
>
> We're using Samza with Kafka and we would like to use multiple partitions
> in our topics.
>
> We've noticed that the number of partitions is defined in
> server.properties.
> According to the Kafka documentation
> <http://kafka.apache.org/07/configuration.html>, there are 2 options to
> define the number of partitions:
>
> - num.partitions - Specifies the default number of partitions per topic.
> - topic.partition.count.map - Override parameter to control the number
>   of partitions for selected topics. E.g., topic1:10,topic2:20
>
> We thought about using the first option (num.partitions) in order to avoid
> the overhead of adding every new topic to a map. Our concern is how it
> affects the metrics and checkpoint topics.
>
> Can anyone share what the best practice is for using multiple partitions in
> Kafka?
>
>
> Thanks,
> Dotan
>
