Hi everyone,
I am encountering an annoying issue when running Spark with an external jar
dependency downloaded from Maven. This is how we run it:

    spark-shell --repositories <repo> --packages <package>

When we release a new version with a significant change in the API,
things start to randomly break for some users.
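For concreteness, a hypothetical form of that invocation, with made-up
coordinates standing in for our real repository and artifact:

    spark-shell \
      --repositories https://repo.example.com/releases \
      --packages com.example:ourlib_2.11:1.2.0
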
Dear Spark dev,
I am trying to run an IPython notebook with Kafka structured streaming
support. I couldn't find a way to load the Kafka package by adding "--packages
org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0"
to PYSPARK_DRIVER_PYTHON_OPTS, and I even changed my local pyspark script to
"exec "${SPARK_H
Thank you for replying, Sean. The error is as follows:
Py4JJavaError: An error occurred while calling o49.load.
: org.apache.spark.sql.AnalysisException: Failed to find data source:
kafka. Please deploy the application as per the deployment section of
"Structured Streaming + Kafka Integration Guide".;
Hi Xiangrui,
Thank you for the quick reply and the great questions.
“How does mmlspark handle dynamic allocation? Do you have a watch thread on the
driver to restart the job if there are more workers? And when the number of
workers decreases, can training continue without the driver involved?”
Curren
Hi,
I am trying to use the DataSourceV2 API to implement a Spark connector for
Apache Phoenix. I am not using JDBCRelation because I want to optimize how
partitions are created during reads and provide support for more
complicated filter pushdown.
For reading I am using JdbcUtils.resultSetToSpark
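To show the shape of the read path I mean, here is a minimal sketch against
the Spark 2.4 DataSourceV2 interfaces; the Phoenix* class names, the
hard-coded schema, and the single empty partition are illustrative
placeholders, not the actual connector code:

    import java.util

    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.sources.v2.{DataSourceOptions, DataSourceV2, ReadSupport}
    import org.apache.spark.sql.sources.v2.reader.{DataSourceReader, InputPartition, InputPartitionReader}
    import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

    // Entry point Spark instantiates; ReadSupport exposes the reader.
    class PhoenixDataSource extends DataSourceV2 with ReadSupport {
      override def createReader(options: DataSourceOptions): DataSourceReader =
        new PhoenixDataSourceReader(options)
    }

    class PhoenixDataSourceReader(options: DataSourceOptions) extends DataSourceReader {
      // A real reader would derive this from Phoenix table metadata.
      override def readSchema(): StructType =
        StructType(Seq(StructField("id", LongType), StructField("value", StringType)))

      // This is where the custom partitioning goes, e.g. one partition per
      // Phoenix region instead of JDBCRelation's column-range strides.
      override def planInputPartitions(): util.List[InputPartition[InternalRow]] = {
        val partitions = new util.ArrayList[InputPartition[InternalRow]]()
        partitions.add(new PhoenixInputPartition)
        partitions
      }
    }

    // Serializable description of one split; shipped to executors.
    class PhoenixInputPartition extends InputPartition[InternalRow] {
      override def createPartitionReader(): InputPartitionReader[InternalRow] =
        new PhoenixPartitionReader
    }

    // Runs on the executor; a real one would iterate a Phoenix ResultSet
    // and convert each row to InternalRow.
    class PhoenixPartitionReader extends InputPartitionReader[InternalRow] {
      override def next(): Boolean = false // placeholder: yields no rows
      override def get(): InternalRow = throw new NoSuchElementException
      override def close(): Unit = ()
    }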