…needed. Arriving at the answer through experimentation isn't a good approach,
because it assumes (a chicken-and-egg problem) that we have already arrived at
an optimal configuration.
- Does GraphX connected-components performance degrade slowly or
catastrophically when that memory limit is exceeded?
On … 2015 at 00:13, John Lilley wrote:
Greetings,
We are looking into using the GraphX connected-components algorithm on Hadoop
for grouping operations. Our typical data is on the order of 50-200M vertices
with an edge:vertex ratio between 2 and 30. While there are pathological cases
with very large groups, the groups tend to be small. I…
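For reference, the basic GraphX call for this kind of grouping looks roughly
like the following (a minimal sketch; the path, app name, and partition count
are illustrative, not taken from this thread):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.GraphLoader

    val sc = new SparkContext(new SparkConf().setAppName("cc-grouping"))
    // Load an edge list with one "srcId dstId" pair per line.
    val graph = GraphLoader.edgeListFile(sc, "hdfs:///data/edges",
      numEdgePartitions = 400)
    // connectedComponents labels every vertex with the smallest vertex id in
    // its component; that label can serve as the group key.
    val groups = graph.connectedComponents().vertices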
…should have been (6,0)
(3,3) - OK
(7,7) - this should have been (7,3)
(5,3) - OK
(2,0) - OK
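Note that GraphX's connectedComponents treats edges as undirected and labels
each vertex with the smallest vertex id in its component, so a vertex 7
connected to vertex 3 should come out as (7,3). A toy reproduction (the edges
below are illustrative guesses, not the poster's data):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph}

    val sc = new SparkContext(
      new SparkConf().setAppName("cc-toy").setMaster("local[*]"))
    // Toy edges 7-3, 5-3, 2-0; edge attributes are unused here.
    val edges = sc.parallelize(
      Seq(Edge(7L, 3L, 1), Edge(5L, 3L, 1), Edge(2L, 0L, 1)))
    val g = Graph.fromEdges(edges, defaultValue = 1)
    // Expected output: (7,3), (5,3), (3,3), (2,0), (0,0)
    g.connectedComponents().vertices.collect().foreach(println)

If the components were instead computed with stronglyConnectedComponents on
directed edges (as in the code fragment later in this digest), a lone 7 -> 3
edge would leave vertex 7 in its own component, which would match the (7,7)
line above.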
Manning Publications Co.
http://www.manning.com/malak/
val scc: Graph[VertexId, Int] = graph.stronglyConnectedComponents(10)

Help needed in completing the code. I do not know how to get the strongly
connected nodes from here. Please help complete this code.
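One way to finish it (a sketch, assuming the goal is to list the member
vertices of each strongly connected component; variable names are
illustrative):

    // scc.vertices pairs each vertex id with its component label (the
    // smallest vertex id in that strongly connected component). Invert the
    // pair and group to collect the members of each component.
    val membersByComponent = scc.vertices
      .map { case (vertexId, componentId) => (componentId, vertexId) }
      .groupByKey()

    membersByComponent.collect().foreach { case (componentId, members) =>
      println(s"component $componentId: ${members.mkString(", ")}")
    }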
Hey all,
I'm trying to run connected components in GraphX on about 400GB of data on 50
m3.xlarge nodes on EMR. I keep getting java.nio.channels.CancelledKeyException
when it gets to "mapPartitions at VertexRDD.scala:347". I haven't been able to
find much about this online, and nothing that seems relevant.
On Wed, Jul 30, 2014 at 11:32 PM, Jeffrey Picard wrote:
> That worked! The entire thing ran in about an hour and a half, thanks!
Great!
> Is there by chance an easy way to build Spark apps using the master-branch
> build of Spark? I've been having to use the spark-shell.
The easiest way is probably…
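One common approach (an assumption on my part, not necessarily what this
truncated reply went on to suggest) is to publish the master build to the
local Ivy repository and depend on the snapshot version from sbt:

    // build.sbt sketch: assumes Spark master was published locally (e.g. with
    // sbt's publish-local task) as 1.1.0-SNAPSHOT; the version string is an
    // assumption, not from this thread.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"   % "1.1.0-SNAPSHOT" % "provided",
      "org.apache.spark" %% "spark-graphx" % "1.1.0-SNAPSHOT" % "provided"
    )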
Jeffrey Picard writes:
> I tried unpersisting the edges and vertices of the graph by hand, then
> persisting the graph with persist(StorageLevel.MEMORY_AND_DISK). I still see
> the same behavior in connected components, however, and the same thing you
> described in the storage page.
Unfortunately…
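A related aside (not necessarily the advice in the truncated reply above): a
graph's RDDs generally cannot have their storage level changed once they are
cached, but GraphLoader.edgeListFile accepts the target storage levels up
front (parameters available from Spark 1.1; the path is illustrative):

    import org.apache.spark.graphx.GraphLoader
    import org.apache.spark.storage.StorageLevel

    // Load edges and vertices straight into MEMORY_AND_DISK rather than
    // re-persisting an already-cached graph.
    val graph = GraphLoader.edgeListFile(
      sc, "hdfs:///data/edges",
      edgeStorageLevel = StorageLevel.MEMORY_AND_DISK,
      vertexStorageLevel = StorageLevel.MEMORY_AND_DISK)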
Jeffrey Picard writes:
> As the program runs I'm seeing each iteration take longer and longer to
> complete, which seems counterintuitive to me, especially since I am seeing
> the shuffle read/write amounts decrease with each iteration. I would think
> that as more and more vertices converged, the iterations would get faster.
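A frequent cause of iterations slowing down in iterative GraphX jobs (offered
as a general note, not necessarily the answer given in this truncated reply)
is the RDD lineage that grows with every Pregel superstep. Later Spark
releases can checkpoint periodically to keep the lineage short (this option
did not exist when this thread was written; the paths and interval are
illustrative):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.GraphLoader

    val conf = new SparkConf()
      .setAppName("cc-with-checkpointing")
      // Checkpoint the Pregel graph and messages every 10 iterations.
      .set("spark.graphx.pregel.checkpointInterval", "10")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("hdfs:///tmp/cc-checkpoints")

    val graph = GraphLoader.edgeListFile(sc, "hdfs:///data/edges")
    val cc = graph.connectedComponents().vertices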
Hey all,
I'm currently trying to run connected components using GraphX on a large graph
(~1.8B vertices and ~3B edges, most of them self edges where the only edge
that exists for vertex v is v->v) on EMR using 50 m3.xlarge nodes. As the
program runs I'm seeing each iteration take longer and longer to complete,
which seems counterintuitive to me, especially since I am seeing the shuffle
read/write amounts decrease with each iteration. I would think that as more
and more vertices converged, the iterations would get faster.
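Since most edges here are self-loops, one possible trim (an aside, not advice
from this thread) is to drop them before running the algorithm: a vertex whose
only edge was a self-loop remains in the graph as a singleton component and
keeps its own id as its label, so the result is unchanged while each iteration
touches fewer edges:

    import org.apache.spark.graphx.{Graph, VertexId, VertexRDD}
    import scala.reflect.ClassTag

    // Remove self-loops; subgraph keeps all vertices by default, so former
    // self-loop-only vertices survive as singleton components.
    def ccWithoutSelfEdges[VD: ClassTag, ED: ClassTag](
        graph: Graph[VD, ED]): VertexRDD[VertexId] = {
      val noSelfLoops = graph.subgraph(epred = t => t.srcId != t.dstId)
      noSelfLoops.connectedComponents().vertices
    }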