Hi Battini,
The limit is a k8s construct that tells k8s how many CPU cores your driver
*can* consume.
When you set the same value for 'spark.driver.cores' and
'spark.kubernetes.driver.limit.cores', your driver then runs in the
'Guaranteed' k8s quality-of-service class, which can make your driver
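For illustration, a minimal spark-submit sketch with the two settings
aligned (the API server address, image name, and application file are
placeholders, not from this thread):

    spark-submit \
      --master k8s://https://<api-server>:6443 \
      --deploy-mode cluster \
      --conf spark.kubernetes.container.image=<your-image> \
      --conf spark.driver.cores=1 \
      --conf spark.kubernetes.driver.limit.cores=1 \
      local:///opt/spark/app.py

Setting both to the same value is what lets the driver pod qualify for the
'Guaranteed' QoS class described above.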
Consider the following *intended* sql:
select row_number()
over (partition by Origin order by OnTimeDepPct desc) OnTimeDepRank,*
from flights
This will *not* work in *structured streaming*: the culprit is:
partition by Origin
The requirement is to use a timestamp-typed field such as
par
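For reference, a minimal PySpark sketch of the timestamp-based grouping the
streaming engine does accept (the source, schema, and the `depTime`
timestamp column are assumptions for illustration; Origin and OnTimeDepPct
are taken from the query above):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical streaming source and schema; adjust to your input.
    flights = (
        spark.readStream
        .schema("Origin STRING, OnTimeDepPct DOUBLE, depTime TIMESTAMP")
        .json("/data/flights")
    )

    # Instead of row_number() over (partition by Origin ...), aggregate
    # per event-time window so the engine can bound its state with a
    # watermark.
    best_per_window = (
        flights
        .withWatermark("depTime", "1 hour")
        .groupBy(F.window("depTime", "1 hour"), "Origin")
        .agg(F.max("OnTimeDepPct").alias("bestOnTimeDepPct"))
    )

    query = (
        best_per_window.writeStream
        .outputMode("append")
        .format("console")
        .start()
    )

This is not a drop-in replacement for the ranking query, only an
illustration of the event-time grouping the engine requires.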
Hi Spark Users,
repartition and partitionBy seem to be very similar on a DataFrame.
In which scenario do we use each one?
As per my understanding, repartition is a very expensive operation since it
needs a full shuffle, so when do we use repartition?
Thanks
Rajat
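As a quick illustration of the difference (not an answer from the thread;
the output path is a placeholder):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.range(1000000).withColumn("bucket", F.col("id") % 10)

    # repartition: triggers a full shuffle and changes how the data is
    # split across in-memory partitions (affects downstream parallelism).
    repartitioned = df.repartition(200, "bucket")

    # DataFrameWriter.partitionBy: no shuffle of its own; it controls the
    # on-disk directory layout of the output (one sub-directory per
    # distinct value of the column).
    df.write.partitionBy("bucket").parquet("/tmp/df_by_bucket")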
Kanchan,
the `toDebugString` output looks unformatted because in PySpark it is
returned as a bytes object, so you need to decode it before printing. I
suggest you print the RDD lineage using
`print(rdd.toDebugString().decode("utf-8"))` instead (note: this only
applies to PySpark).
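For example (a small self-contained sketch):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    sc = spark.sparkContext

    # A tiny RDD with a shuffle so the lineage has more than one stage.
    pairs = sc.parallelize(range(100), 4).map(lambda x: (x % 10, x))
    counts = pairs.reduceByKey(lambda a, b: a + b)

    # In PySpark, toDebugString() returns bytes; decoding it prints the
    # indented, multi-line lineage instead of one escaped string.
    print(counts.toDebugString().decode("utf-8"))

    # The partition count can also be read directly:
    print(counts.getNumPartitions())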
About the other question,
Dear All,
Greetings!
I am new to Apache Spark and working on RDDs using PySpark. I am trying to
understand the logical plan provided by the toDebugString function, but I
find two issues: a) the output is not formatted when I print the result,
and b) I do not see the number of partitions shown.
Can anyone dire
Hi,
I have a series of queries to extract data from multiple tables in Hive and
then do feature engineering on the extracted data. I can run the queries
using Spark SQL and use MLlib to perform the feature transformations I need.
The question is do you guys use any kind of tool to perform this workfl
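For what it's worth, a minimal sketch of that workflow in PySpark (database,
table, and column names are placeholders):

    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler, StandardScaler

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Hypothetical extraction query against Hive tables.
    extracted = spark.sql("""
        SELECT a.id, a.f1, b.f2
        FROM db.table_a a
        JOIN db.table_b b ON a.id = b.id
    """)

    # Chain the feature transformations into one ML Pipeline so the whole
    # extract-then-transform step can be fit, saved, and re-run as a unit.
    assembler = VectorAssembler(inputCols=["f1", "f2"],
                                outputCol="raw_features")
    scaler = StandardScaler(inputCol="raw_features", outputCol="features")
    pipeline = Pipeline(stages=[assembler, scaler])

    features = pipeline.fit(extracted).transform(extracted)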
Hi Rajat,
A little more color:
The executor classpath will be used by the Spark workers/slaves. For
example, all JVMs that are started with $SPARK_HOME/sbin/start-slave.sh. If
you run with --deploy-mode cluster, then the driver itself will be run on
the cluster (with the executor classpath).
If
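To make the distinction concrete, a hedged sketch (jar paths and the
application file are hypothetical):

    # --jars: Spark ships the jar and includes it on both the driver and
    # executor classpaths.
    spark-submit --deploy-mode cluster --jars /path/to/dep.jar my_app.py

    # extraClassPath: nothing is shipped; the entries are prepended to the
    # JVM classpath and must already exist at that path on every node.
    spark-submit \
      --conf spark.driver.extraClassPath=/opt/libs/dep.jar \
      --conf spark.executor.extraClassPath=/opt/libs/dep.jar \
      my_app.py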
Hey Rajat,
The documentation page is self-explanatory.
You can refer to this for more configs:
https://spark.apache.org/docs/2.0.0/configuration.html
or the same page in any other version of the Spark documentation.
Thanks.
Subash
On Sat, 20 Apr 2019 at 16:04, rajat kumar wrote:
> Hi,
>
> Can anyone pls explain ?
>
>
> O
Hi,
Can anyone pls explain ?
On Mon, 15 Apr 2019, 09:31 rajat kumar wrote:
> Hi All,
>
> I came across different parameters in spark submit
>
> --jars , --spark.executor.extraClassPath , --spark.driver.extraClassPath
>
> What are the differences between them? When to use which one? Will it
> differ if