Hi All

I have recently subscribed and am fairly new to Kafka, so please pardon me
if the question sounds naive!

I'm trying to build a POC on clickstream analysis for logged-in and
anonymous users of our e-commerce application. I came here after reading
this thread:
https://stackoverflow.com/questions/32761598/is-it-possible-to-create-a-kafka-topic-with-dynamic-partition-count

I have the following questions in this regard:

   1. I am planning to have as many partitions as we have users in the
   system, so that the clickstream events for each user go to a partition
   dedicated to that user. The partition id would be derived from the user
   id. Does this sound like a decent approach? If not, what is the
   suggested way to go about this?
   2. If a partition-per-user strategy is good enough, what happens when a
   new user signs up and gets a new, unique user id? I am not sure whether
   we can add a new partition to an existing topic.
   3. This Kafka stream is going to be consumed by a Spark Streaming job
   (the Kafka consumer). How do I set it up so that it gets clickstream
   events from the Kafka topic for all users (irrespective of the partition
   id)? In other words, can we have a one-for-all consumer for a topic?
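
To make question 1 concrete, here is a rough sketch of what I mean by deriving the partition id from the user id. It is only an illustration: the function name `partition_for_user`, the sample user ids, and the partition count of 12 are all hypothetical, and CRC32 is used as a stand-in for a stable hash (it is not Kafka's own partitioning logic). With one partition per user, `num_partitions` would have to equal the user count; with a fixed count, several users would share a partition.

```python
import zlib


def partition_for_user(user_id: str, num_partitions: int) -> int:
    """Map a user id to a partition index in [0, num_partitions).

    Hypothetical helper for illustration only. CRC32 is a stand-in for a
    stable hash and is NOT the hash Kafka's own partitioner uses.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # A stable hash means the same user id always maps to the same partition,
    # as long as num_partitions does not change.
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # Made-up user ids and an assumed partition count of 12.
    for uid in ["user-1001", "user-1002", "anonymous-7f3a"]:
        print(uid, "->", partition_for_user(uid, 12))
```

Note that in this sketch the mapping for existing users changes if `num_partitions` changes, which is part of why I am asking about adding partitions in question 2.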

Best,
Girish
