I'm fine with moving GraphX to GraphFrames, and it looks like we have
almost reached a consensus about it among the GraphFrames maintainers.

Quick question: what should I put in the NOTICE file of GraphFrames? Is
it enough to add just the following:

"""
This project contains the code of Apache Spark GraphX
Copyright 2014-2025 The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
"""

?

Best regards,
Sem

On Tue, 2025-09-09 at 22:04 -0700, Russell Jurney wrote:
> Yeah, GraphFrames ingesting GraphX sounds like a good idea. There are,
> if I recall, zero issues relating to GraphX in JIRA, so there is not a
> lot of demand for it there, and it's already deprecated.
> 
> To ask another question... Sem has been adding property graph support
> to GraphFrames. One way to bring graphs to Spark users could be to
> implement a Spark SQL extension for the SQL:2023 SQL/PGQ standard for
> property graphs. Would that be a feasible integration strategy should
> we attempt to bring GraphFrames back into Spark, without GraphX? It
> would be nice to have the, I guess, 50-100x as many users that the
> exposure would get us, but we noted the Cypher SPIP was never
> accepted. So I wanted to test the waters...
> 
> Thanks!
> Russell
> 
> On Tue, Sep 9, 2025 at 1:10 PM Mich Talebzadeh
> <[email protected]> wrote:
> > Agreed. That will be good.
> > 
> > HTH
> > Dr Mich Talebzadeh,
> > Architect | Data Science | Financial Crime | Forensic Analysis |
> > GDPR
> > 
> > 
> > On Tue, 9 Sept 2025 at 14:38, Enrico Minack
> > <[email protected]> wrote:
> > > Hi all,
> > > 
> > > maybe this is the right moment to move GraphX into GraphFrames to
> > > maintain it there.
> > > 
> > > Cheers,
> > > Enrico
> > > 
> > > On 09.09.25 at 13:17, Sem wrote:
> > > > Hello!
> > > > 
> > > > I have a question related to the deprecation of GraphX in
> > > > Spark 4.x. While working on performance improvements in
> > > > GraphFrames, which uses GraphX under the hood, I found a way to
> > > > improve the performance of the LabelPropagation algorithm in
> > > > GraphX.
> > > > 
> > > > In my tests (LDBC graph "wiki-Talk", 2.3M vertices, 5M edges),
> > > > it reduces the run time from ~3500 seconds to ~50 seconds. The
> > > > new solution slightly increases the average memory usage per
> > > > iteration, but it also decreases the overall peak memory usage
> > > > (reached in the 1st iteration of the current implementation).
> > > > 
> > > > I'm ready to provide all the details and explanations, file the
> > > > Jira ticket, etc. But my main question is: does GraphX still
> > > > accept patches, or is it no longer considered because of the
> > > > deprecation?
> > > > 
> > > > Thanks in advance!
> > > > Best regards,
> > > > Sem
> > > > 
> > > > ---------------------------------------------------------------
> > > > ------
> > > > To unsubscribe e-mail: [email protected]
> > > > 
> > > 
> > > 
> > > 

