Hi
I am trying to stream a Kafka topic using createDirectStream(). The Kafka
cluster is SSL enabled. The code for the same is:
***************************
import findspark
findspark.init('/u01/idp/spark')
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
kafkaParams = {"metadata.broker.list":"kfk-bro2.mydomain.com:9093",
"security.protocol":"ssl", "ssl.key.password":"Password123",
"ssl.keystore.location":"/tmp/keystore.jks",
"ssl.keystore.password":"Password123",
"ssl.truststore.location":"/tmp/truststore.jks",
"ssl.truststore.password":"Password123",
"ssl.endpoint.identification.algorithm":""}
if __name__ == "__main__":
sc = SparkContext(appName="PythonStreamingReciever")
ssc = StreamingContext(sc, 30)
message = KafkaUtils.createDirectStream(ssc, ["test1_topic"],
kafkaParams)
lines = message.map(lambda x: x[1])
lines.pprint()
ssc.start()
ssc.awaitTermination()
***************************
Submitting the python script to the cluster using spark-submit
# spark-submit --master yarn --deploy-mode client --files
/u01/idp/spark/conf/log4j.properties --conf
"spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'"
--driver-java-options
"-Dlog4j.configuration=file:/u01/idp/spark/conf/log4j.properties"
--packages
org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.3,org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.3
streamingKafka.py
But during the execution of the above script, i am getting the following
error.
File "/home/spark/streamingKafka.py", line 23, in <module>
message = KafkaUtils.createDirectStream(ssc, ["test1_topic"],
kafkaParams)
.........
.........
File "/u01/idp/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py",
line 1257, in __call__
File "/u01/idp/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line
328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
o43.createDirectStreamWithoutMessageHandler.
........
........
What could be the possible causes of the error ?
I can stream Kafka topic using console consumer and can reach any one of the
broker.
Kafka version: 2.12
Spark version: 2.4.6
Thanks
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]