Hi all,
If I try joining the table with itself using join columns, I get the
following error:
"Join condition is missing or trivial. Use the CROSS JOIN syntax to
allow cartesian products between these relations.;"
This is not true: my join is not trivial, and it is not a real cross
join. I
Hi Michael,
-dev +user
What's the query? How do you "fool spark"?
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/ma
Hey,
We use a custom receiver to receive data from our MQ. We have been using def
store(dataItem: T) to store data, but I found that block sizes vary widely,
from 0.5 KB to 5 MB, so partition processing times vary just as widely.
A shuffle is an option, but I want to avoid it.
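(The Receiver API is Scala, but the rebalancing idea can be sketched language-agnostically: instead of one store() call per incoming message, accumulate items into batches of roughly equal byte size and store each batch, so block sizes stay even without a shuffle. A Python sketch — `rebatch` and `target_bytes` are hypothetical names; in the real receiver you would call the buffer-taking store() overload once per batch:)

```python
def rebatch(payloads, target_bytes):
    """Group byte payloads into batches of roughly target_bytes each.

    An oversized single payload becomes its own batch; nothing is dropped,
    so concatenating all batches reproduces the input stream.
    """
    batch, size = [], 0
    for p in payloads:
        # Close the current batch before it would overflow the target.
        if batch and size + len(p) > target_bytes:
            yield batch
            batch, size = [], 0
        batch.append(p)
        size += len(p)
    if batch:
        yield batch  # flush the final partial batch
```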
Hey Marco,
A Cartesian product is an inner join by definition :). The current
cartesian product operator does not support outer joins, so we use the only
operator that does: BroadcastNestedLoopJoinExec. This is far from great,
and it does have the potential to OOM; there are some safety nets in th
Hi,
today I updated my test cluster to the current Spark master; after that, my SQL
Visualization page started to crash with the following JS error:
[cid:part1.DB2FB812.D25D60D1@outlook.com]
The screenshot was cropped for readability and to hide internal server names ;)
It may be caused by the upgrade or b
Did you include any picture ?
Looks like the picture didn't go thru.
Please use third party site.
Thanks
Original message
From: Tomasz Gawęda
Date: 1/15/18 2:07 PM (GMT-08:00)
To: dev@spark.apache.org, u...@spark.apache.org
Subject: Broken SQL Visualization?
Hi,
today I hav
Hi, thanks for reporting, can you include the steps to reproduce this bug?
On Tue, Jan 16, 2018 at 7:07 AM, Ted Yu wrote:
> Did you include any picture ?
>
> Looks like the picture didn't go thru.
>
> Please use third party site.
>
> Thanks
>
> Original message
> From: Tomasz G
Hi All,
I've seen a couple of issues lately related to cloudpickle, notably
https://issues.apache.org/jira/browse/SPARK-22674, and would like to get
some feedback on updating the version in PySpark, which should fix these
issues and allow us to remove some workarounds. Spark is currently using a
fork
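(Background on why PySpark carries a pickler fork at all — a stdlib-only sketch, not tied to SPARK-22674's specifics: standard pickle serializes functions by qualified name, so lambdas and interactively defined closures fail; cloudpickle fills that gap by serializing the code objects themselves:)

```python
import pickle

def top_level(x):          # named, importable by qualified name:
    return x * 2           # stdlib pickle handles this fine

square = lambda x: x * x   # __qualname__ is "<lambda>": name lookup fails

roundtrip = pickle.loads(pickle.dumps(top_level))  # works

try:
    pickle.dumps(square)   # stdlib pickle cannot serialize the lambda
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False
```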
Hi Bryan,
Yup, I support matching the version. I have pushed changes a few times
before to keep Spark's copy in sync with
https://github.com/cloudpipe/cloudpickle, and have also contributed a few
fixes to cloudpickle itself. I believe our copy is closest to 0.4.1.
I have been trying to follow up the changes in c