Hello,

I am trying to connect to Kafka using this command:
orderRawData = spark.readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "18.211.252.152:9092") \
    .option("startingOffsets","earliest") \
    .option("failOnDataLoss", "false") \
    .option("subscribe", "real-time-project") \
    .load()

It gives me the following error:

'Failed to find data source: kafka. Please deploy the application as
per the deployment section of "Structured Streaming + Kafka
Integration Guide".;'
Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 400, in load
    return self._df(self._jreader.load())
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: 'Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;'
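
The deployment section named in the error is about making the Kafka connector package available when the job is launched. Below is a minimal sketch of how that package coordinate is usually put together. The helper name kafka_connector_coordinate, the Spark version 2.4.5, the Scala suffix 2.11, and the script name are illustrative assumptions, not values taken from the setup above; the version and suffix would need to match the installed Spark build.

```python
def kafka_connector_coordinate(spark_version, scala_version="2.11"):
    """Return the Maven coordinate of the Structured Streaming Kafka connector.

    The Scala suffix and the version must match the Spark build in use;
    "2.11" is only an assumed default, not something the traceback states.
    """
    return "org.apache.spark:spark-sql-kafka-0-10_{}:{}".format(
        scala_version, spark_version
    )


# Hypothetical example: a Spark 2.4.5 build compiled against Scala 2.11.
package = kafka_connector_coordinate("2.4.5")
print(package)

# The coordinate can then be supplied when the application is launched, e.g.
#   spark-submit --packages <package> my_streaming_job.py
# or set before the SparkSession is created:
#   SparkSession.builder.config("spark.jars.packages", package).getOrCreate()
```

The installed version can be checked with spark-submit --version, which also reports the Scala version the build uses.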




Could you please help me with this?
