This shouldn't be required anymore since Spark 2.0.
On Tue, Oct 25, 2016 at 6:16 AM, Matt Smith wrote:
> Is there an alternative function or design pattern for the collect_list
> UDAF that can be used without taking a dependency on HiveContext? How does
> one typically roll things up into an array when outputting JSON?
What version of Spark are you using? We introduced a Spark native
collect_list in 2.0.
It still has the usual caveats, but it should be quite a bit faster.
On Tue, Oct 25, 2016 at 6:16 AM, Matt Smith wrote:
> Is there an alternative function or design pattern for the collect_list
> UDAF that can be used without taking a dependency on HiveContext? How does
> one typically roll things up into an array when outputting JSON?
Is there an alternative function or design pattern for the collect_list
UDAF that can be used without taking a dependency on HiveContext? How does
one typically roll things up into an array when outputting JSON?
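Since Spark 2.0, collect_list is available natively via org.apache.spark.sql.functions, e.g. `df.groupBy("user").agg(collect_list("item"))`, with no HiveContext needed. The rollup semantics it provides — gathering a column's values into one array per group, which can then be serialized to JSON — can be sketched in plain Python (hypothetical `user`/`item` column names; this only illustrates the pattern, not Spark's implementation):

```python
import json
from collections import defaultdict

# Sample rows: (user, item) pairs, as they might come out of a DataFrame.
rows = [("alice", "a"), ("bob", "x"), ("alice", "b"), ("alice", "c")]

# Roll items up into one list per user, analogous to
# df.groupBy("user").agg(collect_list("item")).
grouped = defaultdict(list)
for user, item in rows:
    grouped[user].append(item)

# Emit one JSON object per group, with the items as an array.
out = [json.dumps({"user": u, "items": items}) for u, items in grouped.items()]
print(out[0])  # {"user": "alice", "items": ["a", "b", "c"]}
```

As in Spark, the order of elements within each array depends on the order rows are encountered, so it is not deterministic after a shuffle unless you sort first.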
The advice to avoid idioms that may not be universally understood is good.
My further issue with the misuse of "straw-man" (which really is not, or
should not be, separable from "straw-man argument") is that a "straw-man"
in the established usage is something that is always intended to be a
failure
Well, it's more of a reference to the fallacy than anything. Writing down a
proposed action implicitly claims it's what others are arguing for. It's
self-deprecating to call it a "straw man", suggesting that it may not at
all be what others are arguing for, and is done to openly invite criticism
an
Alright, that does it! Who is responsible for this "straw-man" abuse
that is becoming too commonplace in the Spark community? "Straw-man" does
not mean something like "trial balloon" or "run it up the flagpole and see
if anyone salutes", and I would really appreciate it if Spark developers
would
What kind of partitioning are you exploring? GraphX actually has some built-in
partitioning algorithms, but if you are interested in spectral or hierarchical
methods you might want to look at Metis/Zoltan. There was some interest in
integrating Metis-style algorithms into Spark (GraphX or GraphF
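For reference, GraphX's built-in strategies (PartitionStrategy) include a 2D grid hash, EdgePartition2D, which bounds each vertex's replication to about 2*sqrt(P) of the P partitions. A simplified Python sketch of that idea (assuming a perfect-square partition count; the real implementation also mixes with a large prime and handles non-square counts):

```python
import math

def edge_partition_2d(src, dst, num_parts):
    """Assign edge (src, dst) to a partition using a 2D grid hash,
    in the spirit of GraphX's EdgePartition2D (simplified sketch:
    assumes num_parts is a perfect square, omits the mixing prime)."""
    side = int(math.sqrt(num_parts))
    col = src % side   # column chosen by source vertex
    row = dst % side   # row chosen by destination vertex
    return col * side + row

# All edges incident to a fixed vertex (here, destination 7) fall into
# at most sqrt(P) partitions per direction, so 2*sqrt(P) overall.
parts = {edge_partition_2d(s, 7, 9) for s in range(100)}
```

This bounded replication is what makes the vertex-cut model scale; spectral or multilevel methods like Metis optimize the cut more aggressively but are harder to run distributed.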
Maybe TitanDB? It uses HBase to store graphs and Solr (on HDFS) to index
graphs. I am not 100% sure it supports it, but probably.
It can also integrate with Spark, but only for analytics on a given graph.
Otherwise you need to go for a dedicated graph system.
> On 24 Oct 2016, at 16:41, Marco wrote:
Hi,
I'm a student in Computer Science and I'm working on my master's thesis
on the graph partitioning problem, focusing on dynamic graphs.
I'm searching for a framework to manage dynamic graphs, where
edges/nodes may disappear. Now the problem is: GraphX alone cannot
provide a solution
BTW I wrote up a straw-man proposal for migrating the wiki content:
https://issues.apache.org/jira/browse/SPARK-18073
On Tue, Oct 18, 2016 at 12:25 PM Holden Karau wrote:
> Right now the wiki isn't particularly accessible to updates by external
> contributors. We've already got a contributing t