OK, thanks Matthias for all the help.
-Sameer.
On Tue, Dec 19, 2017 at 11:14 PM, Matthias J. Sax
wrote:
> Processor API does not do any automatic repartitioning. If you need to
> repartition data, you always need to do it "manually" by writing to and
> reading back from a topic.
>
>
> -Matthia
Processor API does not do any automatic repartitioning. If you need to
repartition data, you always need to do it "manually" by writing to and
reading back from a topic.
-Matthias
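A rough illustration of why the round trip through a topic achieves repartitioning. This is a toy, in-memory model and does not use the Kafka API: the class name, the `partitionFor` helper and the hash-modulo rule are assumptions made for the sketch (Kafka's real default partitioner hashes the serialized key bytes). In an actual Processor API topology the equivalent would roughly be a sink node writing to an intermediate topic and a source node reading it back.

```java
import java.util.ArrayList;
import java.util.List;

public class ManualRepartitionSketch {
    // Toy partitioner (assumption): NOT Kafka's real default partitioner.
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // "Write" re-keyed records to an in-memory topic: one list per partition.
    public static List<List<String>> writeToTopic(List<String> keys, int numPartitions) {
        List<List<String>> topic = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            topic.add(new ArrayList<>());
        }
        for (String key : keys) {
            topic.get(partitionFor(key, numPartitions)).add(key);
        }
        return topic;
    }

    public static void main(String[] args) {
        // Keys as they might look after a processor has changed them.
        List<String> rekeyed = List.of("user-1", "user-2", "user-1", "user-3", "user-2");
        List<List<String>> topic = writeToTopic(rekeyed, 3);
        // "Read back": each partition now holds every record for the keys assigned to it.
        for (int p = 0; p < topic.size(); p++) {
            System.out.println("partition " + p + ": " + topic.get(p));
        }
    }
}
```

After the write, all records that share a key sit in the same partition, which is the property a downstream join or aggregation relies on.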
On 12/19/17 3:49 AM, Sameer Kumar wrote:
> Could you please clarify, if i just choose to use low level processor a
Could you please clarify: if I choose to use only the low-level Processor API,
what triggers repartitioning? I am not using it in conjunction
with the DSL; I plan to use only the Processor API.
Apart from sink processors, are there any conditions under which
repartitioning occurs?
-Sameer.
On Tue, Dec 19, 2017
I understand it now: even if we were able to attach a custom partitioner, the
data would still travel from the stream nodes to the broker for the join topic,
so the network transfer would still be there.
-Sameer.
On Tue, Dec 19, 2017 at 1:17 AM, Matthias J. Sax
wrote:
> need to map the keys, modify them
>> and then do a join.
This will always trigger a repartitioning. There is no API atm to tell Kafka Streams
that partitioning is preserved.
Custom partitioner won't help for your case as far as I understand it.
-Matthias
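The rule at work here (a key-changing step followed by a join forces the data through a repartition topic, while steps that leave the key alone do not) can be sketched as a small planner. This is only a model of the behaviour discussed in this thread, not Kafka Streams code: the class, the method and the operation labels are hypothetical and merely borrow DSL method names.

```java
import java.util.List;
import java.util.Set;

public class LazyRepartitionSketch {
    // Steps that may change the key (labels borrowed from DSL method names).
    private static final Set<String> KEY_CHANGING = Set.of("map", "selectKey", "flatMap");
    // Steps that need all records for a key in the same partition.
    private static final Set<String> KEY_DEPENDENT = Set.of("join", "aggregate");

    // Counts how many times the data would have to go through a repartition topic.
    public static int countRepartitions(List<String> ops) {
        boolean keyMayHaveChanged = false;
        int repartitions = 0;
        for (String op : ops) {
            if (KEY_CHANGING.contains(op)) {
                keyMayHaveChanged = true;      // only remember it; nothing is shuffled yet
            } else if (KEY_DEPENDENT.contains(op) && keyMayHaveChanged) {
                repartitions++;                // shuffled only when actually required
                keyMayHaveChanged = false;     // data is partitioned by the new key again
            }
        }
        return repartitions;
    }

    public static void main(String[] args) {
        System.out.println(countRepartitions(List.of("filter", "map", "filter"))); // prints 0
        System.out.println(countRepartitions(List.of("map", "join")));             // prints 1
        System.out.println(countRepartitions(List.of("mapValues", "join")));       // prints 0
    }
}
```

In this model a chain of filters and maps costs nothing on its own, and a value-only step such as `mapValues` before a join avoids the shuffle entirely.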
On 12/17/17 9:48 PM, Sameer Kumar wrote:
> Actually,
Actually, I am doing joining after map. I need to map the keys, modify them
and then do a join.
I was thinking of always passing a partition key on which the
partitioning is based.
Step by step flow is:
1. Data is already partitioned by userid.
2. I do a map to join impressions tied to a u
Two comments:
1) As long as you don't do an aggregation/join after a map(), there
will be no repartitioning. Streams does repartitioning lazily, i.e., only
if it's required. As long as you only chain filter/map etc., no
repartitioning will be done.
2) Can't you use mapValues() instead of map()? If
I have multiple map and filter phases in my application DAG, and though I am
generating different keys at different points, the data is still local.
Repartitioning here adds unnecessary network shuffling, which I want
to minimize.
-Sameer.
On Friday, December 15, 2017, Matthias J. Sax w
It's not recommended to write a custom partitioner because it's pretty
difficult to write a correct one. There are many dependencies and you
need deep knowledge of Kafka Streams internals to get it right.
Otherwise, your custom partitioner breaks Kafka Streams.
That is the reason why it's not docu
Hi,
I want to use a custom partitioner in Streams, but I couldn't find it in
the documentation. I want to make sure that during the map phase, the keys
produced adhere to the custom partitioner.
-Sameer.