It's better to base the partition key on some f(user).
This way a partition will always hold the same set of users, and any new user
will get assigned to one of these partitions.
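
As a rough sketch of what f(user) could look like: hash the user id with a
stable hash and take it modulo the partition count. The names
partition_for_user, user_id and num_partitions are made up for illustration,
and MD5 is used only because it is stable across processes. Kafka's default
partitioner hashes the key bytes with murmur2, so these numbers will not
match what a real producer picks.

```python
import hashlib


def partition_for_user(user_id: str, num_partitions: int) -> int:
    """Map a user id to a partition index in [0, num_partitions).

    Hypothetical helper: uses MD5 as a stable hash (Python's built-in
    hash() is salted per process, so it is not suitable here).
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    # Use the first 4 bytes as an unsigned integer, then wrap to the range.
    return int.from_bytes(digest[:4], "big") % num_partitions


if __name__ == "__main__":
    for user in ["alice", "bob", "anonymous-7f3a"]:
        print(user, "->", partition_for_user(user, 12))
```

The same user always lands on the same partition as long as the partition
count does not change. If partitions are added later, the mapping shifts for
most users.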
You can probably check
https://spark.apache.org/docs/2.2.0/streaming-kafka-0-10-integration.html
For Kafka to Spark integr
And can you share the patch...
On Sun, Dec 22, 2019 at 10:34 PM Vishal Santoshi
wrote:
> We also have a large number of topics (1500 plus), in cross-DC
> replication. How do we increase the default timeouts?
>
>
> On Wed, Dec 11, 2019 at 2:26 PM Ryanne Dolan
> wrote:
>
>> Hey Peter. Do you
We also have a large number of topics (1500 plus), in cross-DC
replication. How do we increase the default timeouts?
On Wed, Dec 11, 2019 at 2:26 PM Ryanne Dolan wrote:
> Hey Peter. Do you see any timeouts in the logs? The internal scheduler will
> time out each task after 60 seconds by defa
Hi All
I have recently subscribed and am fairly new to Kafka, so please pardon me if
the question sounds too naive!
I'm trying to build a POC on clickstream analysis for logged-in and
anonymous users of our e-commerce application. I am coming here after visiting
this thread -
https://stackoverflow.com/qu