> I think this should be 'pick a number of partitions that matches the max number
> of possible keys in the stream to be partitioned'.
> At least in my use case, in which I am trying to partition the stream by key
> and make windowed aggregations, if there are fewer topic
> partitions than possible ke
Thanks, I got the point. That solves my problem.
On Wed, Oct 5, 2016 at 10:58 PM, Matthias J. Sax
wrote:
>
> Hi,
>
> Even if you have more distinct keys than partitions (i.e., different keys
> go to the same partition), if you do "aggregate by k
Hi,
Even if you have more distinct keys than partitions (i.e., different keys
go to the same partition), if you do "aggregate by key", Streams will
automatically separate the keys and compute an aggregate per key.
Thus, you do not need to worry about wh
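
The per-key separation described above can be sketched in plain Python. This is only an illustration of the idea, not the Kafka Streams API: `aggregate_by_key` and the sample records are hypothetical, and a simple count stands in for whatever aggregate is actually computed.

```python
from collections import defaultdict

def aggregate_by_key(records):
    """Count records per key; keys stay separate even if they share a partition."""
    counts = defaultdict(int)
    for key, _value in records:
        counts[key] += 1
    return dict(counts)

# Hypothetical contents of a single partition that holds two distinct keys.
partition_0 = [("a", 1), ("b", 5), ("a", 2), ("b", 7), ("a", 3)]
result = aggregate_by_key(partition_0)
```

Both keys arrive through the same partition, yet each gets its own aggregate, so the number of partitions does not have to match the number of keys.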
Hi,
@Ali IMO, yes. It is the job of the Kafka server to assign partition(s) to Kafka
instances for processing. Each instance can process more than one
partition, but one partition cannot be processed by more than one instance.
@Michael, thanks for the reply.
> Rather, pick the number of partitions in a way
> It's often a good
> idea to over-partition your topics. For example, even if today 10 machines
> (and thus 10 partitions) would be sufficient, pick a higher number of
> partitions (say, 50) so you have some wiggle room to add more machines
> (11...50) later if need be.
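
To make the over-partitioning advice concrete, here is a rough sketch of spreading partitions over machines. It assumes a simple round-robin rule, which is a stand-in and not Kafka's actual assignment strategy; `assign_partitions` and the machine names are hypothetical.

```python
def assign_partitions(num_partitions, instances):
    """Round-robin sketch: each partition goes to exactly one instance."""
    assignment = {name: [] for name in instances}
    for p in range(num_partitions):
        owner = instances[p % len(instances)]
        assignment[owner].append(p)
    return assignment

# 50 partitions shared by 10 machines today, 25 machines later.
today = assign_partitions(50, [f"machine-{i}" for i in range(10)])
later = assign_partitions(50, [f"machine-{i}" for i in range(25)])
# More machines than partitions: the extra machines get nothing to do.
too_many = assign_partitions(50, [f"machine-{i}" for i in range(60)])
```

With 50 partitions there is room to grow from 10 to 50 machines; any machine beyond the 50th would sit idle, because a partition is never split across instances.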
If you create e.g. 30 partitions,
> So, in this case I should know the max number of possible keys so that
> I can create that number of partitions.
Assuming I understand your original question correctly, then you would not
need to do/know this. Rather, pick the number of partitions in a way that
matches your needs to process the
Hi Guozhang,
So, in this case I should know the max number of possible keys so that I
can create that number of partitions.
Thanks
Adrienne
On Wed, Oct 5, 2016 at 1:00 AM, Guozhang Wang wrote:
> By default the partitioner will use a murmur hash of the key, modulo the
> current num.partitions, to
By default the partitioner will use a murmur hash of the key, modulo the
current num.partitions, to determine which partition a record goes to, so records
with the same key will be assigned to the same partition. Would that be OK
for your case?
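
A minimal sketch of the hash-then-modulo idea, for illustration only. Kafka's default partitioner uses a murmur2 hash; `zlib.crc32` is swapped in here purely so the example runs with the standard library, and `pick_partition` is a hypothetical helper, not a Kafka API.

```python
import zlib

def pick_partition(key: bytes, num_partitions: int) -> int:
    """Hash the key, then take the result modulo the partition count.

    zlib.crc32 is a stand-in hash (assumption); Kafka itself uses murmur2.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions

keys = [b"user-1", b"user-2", b"user-3", b"user-1"]
partitions = [pick_partition(k, 3) for k in keys]
# The first and last entries agree because the key is identical.
```

Because the result depends on the partition count, changing the number of partitions later can move a key to a different partition.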
Guozhang
On Tue, Oct 4, 2016 at 3:00 PM, Adrienne Kole
wrot
Hi,
From the Streams documentation, I can see that each Streams instance
processes data independently (from other instances), reads from topic
partition(s), and writes to a specified topic.
So here, the partitions of the topic should be determined beforehand and should
remain static.
In my usecase I w