Re: Clickstream partition design question

2019-12-22 Thread Sachin Mittal
It's better to base the partition key on some f(user). That way a partition always holds the same set of users, and any new user gets assigned to one of the existing partitions. You can probably check https://spark.apache.org/docs/2.2.0/streaming-kafka-0-10-integration.html For Kafka to Spark integr
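
To illustrate the f(user) idea above, here is a minimal sketch in plain Python. It is not Kafka's own partitioner, only a stand-in that shows the property being described: the same user always maps to the same partition. The names `NUM_PARTITIONS`, `key_for`, `partition_for`, and the `user_id` / `session_id` event fields are assumptions made for the example, not anything from the thread.

```python
import hashlib

# Hypothetical partition count for the clickstream topic (an assumption).
NUM_PARTITIONS = 6


def key_for(event: dict) -> str:
    """Pick the partition key: the user id when logged in, otherwise
    a session id for anonymous traffic (both field names are assumed)."""
    return event.get("user_id") or event["session_id"]


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash.

    Python's built-in hash() is salted per process, so a digest is used
    to keep the mapping identical across runs and machines.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


if __name__ == "__main__":
    events = [
        {"user_id": "user-42", "page": "/cart"},
        {"session_id": "anon-7f3a", "page": "/home"},
        {"user_id": "user-42", "page": "/checkout"},
    ]
    for e in events:
        k = key_for(e)
        print(k, "->", partition_for(k))
```

In a real producer you would usually just set the record key to the user (or session) id and let the client's partitioner do the hashing. Either way the mapping depends on the partition count, so adding partitions later will move some users to a different partition.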

Clickstream partition design question

2019-12-22 Thread Girish Vasmatkar
Hi All, I have recently subscribed and am fairly new to Kafka, so please pardon me if the question sounds too naive! I'm trying to build a POC on clickstream analysis for logged-in and anonymous users of our e-commerce application. I am coming here after visiting this thread - https://stackoverflow.com/qu