Hi all,
We have a Spark Streaming job which reads from two Kafka topics with 10
partitions each, and we run the streaming job with 3 concurrent
microbatches (so 20 partitions in total and a concurrency of 3).
We have the following question:
in our processing DAG, we call rdd.persist() at one stage,
Hi,
I was trying to create a custom query execution listener by extending
org.apache.spark.sql.util.QueryExecutionListener. My custom
listener just contains some logging statements, but I do not see those
logging statements when I run a Spark job.
Here are the steps that I did:
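For reference, a minimal listener along these lines might look like the sketch below. The method signatures follow the QueryExecutionListener trait; the class name, log lines, and registration snippets are illustrative, and the configuration-based registration assumes a newer Spark release than the one discussed here:

```scala
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// A minimal sketch of a logging QueryExecutionListener.
// The println calls stand in for real logging statements.
class LoggingQueryListener extends QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
    println(s"[listener] $funcName succeeded in ${durationNs / 1e6} ms")

  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit =
    println(s"[listener] $funcName failed: ${exception.getMessage}")
}

// Registration, either programmatically on the session:
//   spark.listenerManager.register(new LoggingQueryListener)
// or (in later Spark releases) via configuration:
//   spark.sql.queryExecutionListeners=com.example.LoggingQueryListener
```

One common reason the log lines never appear: these listeners fire only for actions executed through the DataFrame/Dataset API; plain RDD actions do not trigger them.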
1.
Does Spark JDBC thrift server allow connections over HTTP?
http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#running-the-thrift-jdbc-server
doesn't seem to indicate this feature.
If the feature isn't there, is it planned? Is there a tracking JIRA?
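For what it's worth, later Spark releases let the Thrift server honor HiveServer2's HTTP transport settings; a hive-site.xml fragment along these lines (the property names come from HiveServer2, and the port and path shown are the usual defaults) would switch it to HTTP mode:

```
<!-- hive-site.xml: HiveServer2-style HTTP transport settings,
     honored by the Spark Thrift server in later releases -->
<property>
  <name>hive.server2.transport.mode</name>
  <value>http</value>
</property>
<property>
  <name>hive.server2.thrift.http.port</name>
  <value>10001</value>
</property>
<property>
  <name>hive.server2.thrift.http.path</name>
  <value>cliservice</value>
</property>
```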
Thank you,
Vinay
The Hortonworks Tech Preview of Spark is for Spark on YARN. It does not
require Spark to be installed manually on all nodes. When you submit the
Spark assembly jar, it carries all of Spark's dependencies, and YARN will
instantiate the Spark Application Master and containers from this jar.
Konstantin,
Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you can try
from
http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
Let me know if you see issues with the tech preview.
"Spark Pi example on HDP 2.0
I downloaded the Spark 1.0 pre-built package from http://s