Hi all,
this thread got a bit stuck. Hence, if there are no objections, I'd go
ahead with a design doc describing the solution/workaround I mentioned
before. Any concerns?
Thanks,
Marco
On Thu, Dec 13, 2018 at 18:15, Ryan Blue wrote:
Thanks for the extra context, Marco. I thought you were trying to propose a
solution.
On Thu, Dec 13, 2018 at 2:45 AM Marco Gaido wrote:
Hi Ryan,
My goal with this email thread is to discuss with the community if there
are better ideas (as I was told many other people tried to address this).
I'd consider this as a brainstorming email thread. Once we have a good
proposal, then we can go ahead with a SPIP.
Thanks,
Marco
Marco,
I'm actually asking for a design doc that clearly states the problem and
proposes a solution. This is a substantial change and probably should be an
SPIP.
I think that would be more likely to generate discussion than referring to
PRs or a quick paragraph on the dev list, because the only p
Thank you all for your answers.
@Ryan Blue sure, let me state the problem more clearly:
imagine you have 2 dataframes with a common lineage (for instance one is
derived from the other by some filtering or anything you prefer). And
imagine you want to join these 2 dataframes. Currently, there is a
I don’t know your exact underlying business problem, but maybe a graph
solution, such as Spark GraphX, would better meet your requirements. Usually
self-joins are done to address some kind of graph problem (even if you would
not describe it as such) and is for these kinds of problems much more efficie
Marco,
Thanks for starting the discussion! I think it would be great to have a
clear description of the problem and a proposed solution. Do you have
anything like that? It would help bring the rest of us up to speed without
reading different pull requests.
Thanks!
rb
On Tue, Dec 11, 2018 at 3:5
Hi all,
I'd like to bring to the attention of more people a problem that has
been around for a long time, i.e., self joins. Currently, we have a lot of
trouble with them. This has been reported several times to the community and
seems to affect many people, but as of now no solution has been accepted for it
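
For readers new to the thread, here is a rough sketch of the self-join scenario, written in plain Python rather than against Spark's API. It does not reproduce Spark's internals or the specific bugs reported. It only illustrates, under the assumption of made-up rows and the hypothetical helpers `qualify` and `inner_join`, why two datasets with a common lineage expose the same column names and why each side of the join needs its own qualifier.

```python
from typing import Dict, List

Row = Dict[str, int]


def qualify(rows: List[Row], alias: str) -> List[Row]:
    """Prefix every column name with an alias, e.g. 'id' -> 'a.id'."""
    return [{f"{alias}.{name}": value for name, value in row.items()} for row in rows]


def inner_join(left: List[Row], right: List[Row], left_key: str, right_key: str) -> List[Row]:
    """Nested-loop inner join; on a column-name clash the right value overwrites the left."""
    return [{**l, **r} for l in left for r in right if l[left_key] == r[right_key]]


# Hypothetical data: `derived` is obtained from `base` by filtering,
# so both share the same column names ("id" and "value").
base = [{"id": 1, "value": 10}, {"id": 2, "value": 20}, {"id": 3, "value": 30}]
derived = [row for row in base if row["value"] > 10]

# Unqualified self join: the column names collide, so each output row
# keeps only one "id" and one "value" instead of one per side.
ambiguous = inner_join(base, derived, "id", "id")

# Qualified self join: each side gets its own alias, so all four columns survive.
qualified = inner_join(qualify(base, "a"), qualify(derived, "b"), "a.id", "b.id")

print(ambiguous)
print(qualified)
```

In this toy model the fix is simply to alias both sides before joining. Whether an equivalent approach is sufficient in Spark is exactly what the thread is trying to settle.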
Should I file a JIRA for this?
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/SQL-Self-join-with-ArrayType-columns-problems-tp10269p10322.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com
Using Spark 1.2.0, we are facing some weird behaviour when performing a self
join on a table with an ArrayType field
(potential bug?).
I have set up a minimal non-working example here:
https://gist.github.com/pierre-borckmans/4853cd6d0b2f2388bf4f