Hi Guozhang,
Thanks, and sorry for the late reply. I'm overriding only
GROUP_INSTANCE_ID_CONFIG and APPLICATION_SERVER_CONFIG.
Everything else is left at its default. Even so, I see more than one
partition being assigned to the same stream task.
Also, I have an additional question regarding the replicas. The default
On Dataproc, the kafka-python package is not installed as standard.
Use sudo su - to become root and install it as above.
As root:
pip list|grep kafka
root@ctpcluster-m:~# pip install kafka-python
Collecting kafka-python
Downloading kafka_python-2.0.2-py2.py3-none-any.whl (246 kB)
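If it helps, here is a minimal sketch for checking the install from Python itself rather than through pip list, since pip and the interpreter that runs the job may not be the same environment. The helper name package_status is made up for illustration. It assumes Python 3.8 or later (for importlib.metadata) and relies on the kafka-python distribution being imported as the module kafka.

```python
import importlib.util
from importlib import metadata


def package_status(dist_name, module_name):
    """Return (importable, version) for a pip distribution and its import name.

    Hypothetical helper, not part of kafka-python.
    """
    # True if this interpreter can locate the top-level module.
    importable = importlib.util.find_spec(module_name) is not None
    try:
        # Version recorded in the installed distribution's metadata.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


if __name__ == "__main__":
    # The pip name is "kafka-python", the import name is "kafka".
    print(package_status("kafka-python", "kafka"))
```

Run it with the same python binary that the job uses. If the first value comes back False, the package was installed into a different environment than the one running the code.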
Have you installed the correct package kafka-python?
*pip install kafka-python*
Collecting kafka-python
Downloading kafka_python-2.0.2-py2.py3-none-any.whl (246 kB)
|| 246 kB 1.9 MB/s
Installing collected packages: kafka-python
Successfully installed kafka-p