Which version do you use?

Best Regards,
Jeff Zhang

From: Jhon Anderson Cardenas Diaz <jhonderson2...@gmail.com>
Reply-To: "us...@zeppelin.apache.org" <us...@zeppelin.apache.org>
Date: Friday, June 8, 2018 at 11:08 PM
To: "us...@zeppelin.apache.org" <us...@zeppelin.apache.org>, "dev@zeppelin.apache.org" <dev@zeppelin.apache.org>
Subject: All PySpark jobs are canceled when one user cancel his PySpark paragraph (job)

Dear community,

Currently we are having problems with multiple users running paragraphs associated with PySpark jobs. The problem is that if a user aborts/cancels his PySpark paragraph (job), the active PySpark jobs of the other users are canceled too.

Going into detail, I've seen that when you cancel a user's job, this method is invoked (which is fine):

sc.cancelJobGroup("zeppelin-[notebook-id]-[paragraph-id]")

But for some reason unknown to me, this method is also invoked:

sc.cancelAllJobs()

I infer this from the following trace, which appears in the logs of the other users' jobs:

Py4JJavaError: An error occurred while calling o885.count.
: org.apache.spark.SparkException: Job 461 cancelled as part of cancellation of all jobs
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
    at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:1375)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply$mcVI$sp(DAGScheduler.scala:721)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler.scala:721)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$doCancelAllJobs$1.apply(DAGScheduler.scala:721)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
    at org.apache.spark.scheduler.DAGScheduler.doCancelAllJobs(DAGScheduler.scala:721)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1628)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1925)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1938)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1951)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1965)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:935)
    at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:275)
    at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2386)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
    at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
    at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2385)
    at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2392)
    at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2420)
    at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2419)
    at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2801)
    at org.apache.spark.sql.Dataset.count(Dataset.scala:2419)
    at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:280)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError('An error occurred while calling o885.count.\n', JavaObject id=o886), <traceback object at 0x7f9e669ae588>)

Any idea why this could be happening? (I have version 0.8.0, from September 2017.)

Thank you!
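
For readers following the thread, the difference between the two cancellation calls named in the message can be sketched with a toy model. This is plain Python, not Spark or Zeppelin code: the `ToyScheduler` class, its methods, job id 460, and the group names are all hypothetical and exist only for illustration (job id 461 is borrowed from the trace). In real PySpark the corresponding calls are `SparkContext.setJobGroup`, `SparkContext.cancelJobGroup`, and `SparkContext.cancelAllJobs`. The sketch only shows why a group-scoped cancel should leave other users' jobs running, while a global cancel produces the symptom described; it says nothing about why Zeppelin issues the second call.

```python
# Toy model only: NOT Spark's scheduler. All names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    # Maps job id -> job group label for every job still running.
    active: dict = field(default_factory=dict)

    def submit(self, job_id, group):
        """Register a running job under a group label."""
        self.active[job_id] = group

    def cancel_job_group(self, group):
        """Cancel only jobs tagged with `group` (analogous to sc.cancelJobGroup)."""
        cancelled = [j for j, g in self.active.items() if g == group]
        for j in cancelled:
            del self.active[j]
        return cancelled

    def cancel_all_jobs(self):
        """Cancel every job regardless of group (analogous to sc.cancelAllJobs)."""
        cancelled = list(self.active)
        self.active.clear()
        return cancelled


def demo():
    s = ToyScheduler()
    s.submit(460, "zeppelin-noteA-para1")  # user A's paragraph (hypothetical)
    s.submit(461, "zeppelin-noteB-para7")  # user B's paragraph (hypothetical)

    # User A cancels: only A's group should be affected.
    scoped = s.cancel_job_group("zeppelin-noteA-para1")
    survivors = sorted(s.active)

    # A global cancel then also kills user B's job, as in the report.
    everything = s.cancel_all_jobs()
    return scoped, survivors, everything
```

Calling `demo()` returns the jobs removed by the scoped cancel, the jobs that survive it, and the jobs removed by the subsequent global cancel.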