Re: How to Spawn Child Thread or Sub-jobs in a Spark Session

2020-12-04 Thread Raghavendra Ganesh
There should not be any need to explicitly make the DF-2 and DF-3 computations parallel. Spark generates execution plans and can decide what to run in parallel (ideally you should see them running in parallel in the Spark UI). You need to cache DF-1 if possible (either in memory or on disk), otherwise computation
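One common pattern for this (a minimal sketch, not the poster's actual code — `proc1Result`, the column names, and the output paths are all placeholders) is to persist DF-1 and then trigger the two downstream actions from separate threads, e.g. with Scala Futures; actions submitted from different threads become separate Spark jobs that the scheduler can run concurrently:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import org.apache.spark.storage.StorageLevel

// df1 is the output of Proc-1; persist it so Proc-2 and Proc-3
// read the cached data instead of recomputing the whole pipeline.
val df1 = proc1Result.persist(StorageLevel.MEMORY_AND_DISK)
df1.count() // materialize the cache once

// Each action submitted from its own thread becomes a separate Spark job.
val f2 = Future { df1.filter("score > 0.5").write.parquet("/tmp/df2") }
val f3 = Future { df1.groupBy("key").count().write.parquet("/tmp/df3") }

Await.result(Future.sequence(Seq(f2, f3)), Duration.Inf)
```

Without the `persist`, each action would re-run Proc-1 from the source, which is usually the real cost being avoided here.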

Re: Typed dataset from Avro generated classes?

2020-12-04 Thread Nads
Same problem here. A Google search shows a few related JIRA tickets in "Resolved" state, but I am getting the same error in Spark 3.0.1. I'm pasting my `spark-shell` output below: scala> import org.apache.spark.sql.Encoders import org.apache.spark.sql.Encoders scala> val linkageBean = Encoders.b
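For context, the failing pattern being described is roughly the following sketch (`MyAvroRecord` stands in for an Avro-generated class; the names are illustrative, not from the thread):

```scala
import org.apache.spark.sql.Encoders

// Encoders.bean inspects the class as a Java bean. Avro-generated classes
// expose extra properties (e.g. getSchema returning org.apache.avro.Schema),
// which is the kind of field that typically trips up the bean encoder.
val enc = Encoders.bean(classOf[MyAvroRecord])

// records: Seq[MyAvroRecord] built elsewhere
val ds = spark.createDataset(records)(enc)
```

The resolved JIRAs mentioned above targeted specific cases, so hitting the same message on 3.0.1 may simply mean this shape of class is still unsupported by the bean encoder.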

RE: Spark UI Storage Memory

2020-12-04 Thread Jack Yang
unsubscribe

Re: Spark UI Storage Memory

2020-12-04 Thread Amit Sharma
Is there any memory leak in Spark 2.3.3 as mentioned in the JIRA below? https://issues.apache.org/jira/browse/SPARK-29055. Please let me know how to solve it. Thanks Amit On Fri, Dec 4, 2020 at 1:55 PM Amit Sharma wrote: > Can someone help me on this please. > > > Thanks > Amit > > On Wed,

How to Spawn Child Thread or Sub-jobs in a Spark Session

2020-12-04 Thread Artemis User
We have a Spark job that produces a result data frame, say DF-1, at the end of the pipeline (i.e. Proc-1). From DF-1, we need to create two or more data frames, say DF-2 and DF-3, via additional SQL or ML processes, i.e. Proc-2 and Proc-3. Ideally, we would like to perform Proc-2 and Proc-3 in
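One way to run Proc-2 and Proc-3 concurrently inside a single SparkSession is to submit their actions from separate threads and, optionally, enable the FAIR scheduler so neither job starves the other. A sketch under those assumptions (the pool names and the `/* ... */` bodies are placeholders; FAIR mode is set via `spark.scheduler.mode=FAIR` in the Spark config):

```scala
// Assumes spark.scheduler.mode=FAIR in spark-defaults.conf or SparkConf.
val sc = spark.sparkContext

// setLocalProperty is thread-local, so each thread can be pinned to its own pool.
def runInPool(pool: String)(body: => Unit): Thread = {
  val t = new Thread(() => {
    sc.setLocalProperty("spark.scheduler.pool", pool)
    body
  })
  t.start()
  t
}

val t2 = runInPool("proc2") { /* Proc-2 actions on DF-1 */ }
val t3 = runInPool("proc3") { /* Proc-3 actions on DF-1 */ }
Seq(t2, t3).foreach(_.join())
```

As the reply in this thread notes, this only pays off if DF-1 is cached first; otherwise both threads recompute the Proc-1 lineage independently.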

Re: Spark UI Storage Memory

2020-12-04 Thread Amit Sharma
Can someone help me on this please. Thanks Amit On Wed, Dec 2, 2020 at 11:52 AM Amit Sharma wrote: > Hi , I have a spark streaming job. When I am checking the Executors tab, > there is a Storage Memory column. It displays used memory /total memory. > What is used memory. Is it memory in use

Spark thrift server ldap

2020-12-04 Thread mickymiek
Hi everyone. We're using the Spark Thrift Server with Spark 3.0.1. We're using it to query Hive with JDBC queries using LDAP authentication, and it seems that the LdapAuthenticationProviderImpl.java provided by the Spark Thrift Server is way outdated (https://github.com/apache/spark/blob/v3.0.1/sql/hi
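For readers hitting the same setup: the Thrift Server picks up HiveServer2-style authentication settings from `hive-site.xml`. A minimal illustrative fragment (the URL and baseDN are placeholder values, not from this thread):

```xml
<!-- hive-site.xml on the Thrift Server host; values are illustrative -->
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldap://ldap.example.com:389</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>ou=people,dc=example,dc=com</value>
</property>
```

The poster's point stands: the LDAP provider bundled with Spark's Thrift Server lags behind the one in upstream Hive, so newer Hive LDAP options may not be honored.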

Broadcast size increases with subsequent iterations

2020-12-04 Thread Kalin Stoyanov
Hi all, I have an iterative algorithm in spark that uses each iteration as the input for the following one, but the size of the data does not change. I am using localCheckpoint to cut the data's lineage (and also facilitate some computations that reuse df-s). However, this runs slower and slower a
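A variant worth comparing against (a sketch, with `initialDf`, `step`, and `numIterations` as placeholders for the poster's actual algorithm) is a reliable checkpoint instead of `localCheckpoint`: it writes the data to stable storage and returns a DataFrame whose plan no longer references the prior iterations, which can behave differently from `localCheckpoint` with respect to state accumulating across iterations:

```scala
// Reliable checkpoints need a checkpoint directory; the path is illustrative.
spark.sparkContext.setCheckpointDir("hdfs:///tmp/chk")

var df = initialDf
for (i <- 1 to numIterations) {
  df = step(df)        // one iteration of the algorithm
  df = df.checkpoint() // eager by default: materializes to the checkpoint
                       // dir and truncates the logical plan/lineage
}
```

If the broadcast size still grows with a reliable checkpoint, that would point at something other than lineage (e.g. accumulated session state) as the cause.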

Re: In windows 10, accessing Hive from PySpark with PyCharm throws error

2020-12-04 Thread Mich Talebzadeh
OK, with PyCharm itself, I am getting this error: pyspark.sql.utils.AnalysisException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\Users\admin\PycharmProjects\pythonProject\hive-scratchdir I gat
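The "(null) entry in command string: null ls -F" pattern on Windows commonly indicates that Hadoop's native Windows shim (`winutils.exe`) is missing or that `HADOOP_HOME` is unset, so the permission check has no executable to run. A typical setup (the `C:\hadoop` path is illustrative; the scratch dir path is the one from the error above):

```bat
:: Windows cmd; set before launching PyCharm/PySpark
set HADOOP_HOME=C:\hadoop
set PATH=%HADOOP_HOME%\bin;%PATH%

:: Place a winutils.exe matching your Hadoop version in C:\hadoop\bin,
:: then grant permissions on the Hive scratch directory:
winutils.exe chmod -R 777 C:\Users\admin\PycharmProjects\pythonProject\hive-scratchdir
```

In PyCharm specifically, environment variables set in the run configuration must include these, since the IDE does not inherit a shell profile automatically.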