Hello, I am trying to connect to kafka using this command: orderRawData = spark.readStream \ .format("kafka") \ .option("kafka.bootstrap.servers", "18.211.252.152:9092") \ .option("startingOffsets","earliest") \ .option("failOnDataLoss", "false") \ .option("subscribe", "real-time-project") \ .load()
It is giving me error as: 'Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;' Traceback (most recent call last): File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 400, in load return self._df(self._jreader.load()) File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__ answer, self.gateway_client, self.target_id, self.name) File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco raise AnalysisException(s.split(': ', 1)[1], stackTrace) pyspark.sql.utils.AnalysisException: 'Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;' could you please help me with this.