I'm fine with moving GraphX into GraphFrames, and it looks like we have almost reached a consensus on it among the GraphFrames maintainers.

Quick question: what should I put in the NOTICE file of GraphFrames? Is it enough to add just the following?

"""
This project contains the code of Apache Spark GraphX
Copyright 2014-2025 The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
"""

Best regards,
Sem

On Tue, 2025-09-09 at 22:04 -0700, Russell Jurney wrote:
> Yeah, GraphFrames ingesting GraphX sounds like a good idea. There are,
> if I recall, zero issues relating to GraphX in JIRA, so there is not a
> lot of demand for it there, and it's already deprecated.
>
> To ask another question... Sem has been adding property graph support
> to GraphFrames. One way to bring graphs to Spark users could be to
> implement a Spark SQL extension for the SQL 2023 PGQ standard for
> property graphs. Would that be a feasible integration strategy should
> we attempt to bring GraphFrames back into Spark, without GraphX? It
> would be nice to have something like, I guess, 50-100x as many users,
> which the exposure would get us, but we noted the Cypher SPIP was
> never accepted. So I wanted to test the waters...
>
> Thanks!
> Russell
>
> On Tue, Sep 9, 2025 at 1:10 PM Mich Talebzadeh
> <[email protected]> wrote:
> > Agreed. Will be good.
> >
> > HTH
> > Dr Mich Talebzadeh,
> > Architect | Data Science | Financial Crime | Forensic Analysis |
> > GDPR
> >
> > On Tue, 9 Sept 2025 at 14:38, Enrico Minack
> > <[email protected]> wrote:
> > > Hi all,
> > >
> > > maybe this is the right moment to move GraphX into GraphFrames to
> > > maintain it there.
> > >
> > > Cheers,
> > > Enrico
> > >
> > > On 09.09.25 at 13:17, Sem wrote:
> > > > Hello!
> > > >
> > > > Because of the deprecation of GraphX in Spark 4.x, I have a
> > > > question.
> > > >
> > > > While working on performance improvements in GraphFrames, which
> > > > uses GraphX under the hood, I found a way to improve the
> > > > performance of the LabelPropagation algorithm in GraphX.
> > > >
> > > > In my tests (LDBC graph "wiki-Talk", 2.3M vertices, 5M edges)
> > > > it improves the running time from ~3500 seconds to ~50 seconds.
> > > > The new solution slightly increases the average memory usage
> > > > per iteration, but it also decreases the overall peak memory
> > > > usage (which occurs in the 1st iteration of the current
> > > > implementation).
> > > >
> > > > I'm ready to provide all the details and explanations, file the
> > > > Jira ticket, etc. But my main question is: does GraphX still
> > > > accept patches, or is it no longer considered because of the
> > > > deprecation?
> > > >
> > > > Thanks in advance!
> > > > Best regards,
> > > > Sem
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe e-mail: [email protected]
