I do not think we intentionally dropped it. Could you open a ticket in
Spark JIRA with your query?
Cheers,
Xiao
On Thu, Mar 12, 2020 at 8:24 PM 马阳阳 wrote:
> Hi,
> I wonder why the changes made in
> "[SPARK-9241][SQL] Supporting
> multiple DISTINCT columns (2) -
> Rewriting Rule" are not presen
Hi,
I wonder why the changes made in
"[SPARK-9241][SQL] Supporting
multiple DISTINCT columns (2) -
Rewriting Rule" are not present in
Spark (version 2.4) now. This makes
execution of count distinct in Spark
much slower than in Spark 1.6 and Hive
(Spark 2.4.4: more than 18 minutes;
Hive: about 80 s, spar
Hey Dodgy Bob, Linux & C programmers, conscientious non-objector,
I have a great idea I want to share with you.
In Linux I am familiar with wc (wc = word count; Linux users don't like
long-winded typing).
The wc flags are:
-c, --bytes   print the byte counts
-m, --chars   print th
I've noticed that Dataset.sqlContext is public in Scala but the equivalent
(DataFrame._sc) in PySpark is named as if it should be treated as private.
Is this intentional? If so, what's the rationale? If not, then it feels
like a bug and DataFrame should have some form of public access back to th
Hi community,
I'm getting this error in some Kafka streams:
Caused by: java.io.FileNotFoundException: File
file:/efs/.../kafka/checkpoint/state/0/0/1.delta does not exist
Because of this, some of my streams are down. How can I fix this?
Thank you.
--
Miguel Silvestre
This is where the exception occurs:
myAppDes.coalesce(1)
.write
.format("com.databricks.spark.redshift")
.option("url", redshiftURL)
.option("dbtable", redshiftTableName)
.option("forward_spark_s3_credentials", "true")
.option("tempdir", "s3a://zest-