Re: Spark 3.5.4 gpg validation

2025-01-14 Thread Rozov, Vlad
Open https://issues.apache.org/jira/browse/SPARK-50816 Thank you, Vlad On Jan 10, 2025, at 9:35 AM, Rozov, Vlad wrote: It sounds like KEYS were not updated. The best option is for PMC members to update KEYS in https://downloads.apache.org/spark/KEYS; meanwhile, try to use KEYS from https://dist.

Re: GraphFrames' ConnectedComponentSuite test 'two components and two dangling vertices' fails with OutOfMemoryError: Java heap space

2025-01-14 Thread Russell Jurney
Can you please share the code? It doesn't seem an ideal solution, but if AQE is confused, disabling it makes sense. I can't figure out why a low partition count for an 8 node, 6 edge network would require a lot of partitions... users may have different numbers... do you suggest we enforce some mini

Re: Storing a JDBC-based table in a catalog for direct use in Spark SQL

2025-01-14 Thread Aaron Grubb
In case anyone comes across this, I managed to accomplish this using the Hive Standalone Metastore, with help from an article [1]. Required configurations:
spark.sql.warehouse.dir=s3a:///
spark.sql.catalogImplementation=hive
hive.metastore.uris=thrift://:9083
This enables me to run the example s
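
The three settings listed above can be collected in one place before a session is built. This is a minimal sketch, assuming Python: the function names `metastore_settings` and `as_properties` and the parameters `warehouse_dir` and `metastore_host` are hypothetical stand-ins for the values elided in the message. Only the configuration keys and port 9083 come from the post.

```python
def metastore_settings(warehouse_dir, metastore_host, port=9083):
    """Return the settings from the post as a key/value mapping.

    warehouse_dir and metastore_host are hypothetical parameters: the
    original message elides the bucket path and the metastore host.
    """
    return {
        "spark.sql.warehouse.dir": warehouse_dir,
        "spark.sql.catalogImplementation": "hive",
        "hive.metastore.uris": f"thrift://{metastore_host}:{port}",
    }


def as_properties(settings):
    """Render the mapping as key=value lines, one setting per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())
```

In PySpark, each pair could then be passed to `SparkSession.builder.config(key, value)`. Whether `hive.metastore.uris` is better supplied through the session or through `hive-site.xml` depends on the deployment.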

Re: GraphFrames' ConnectedComponentSuite test 'two components and two dangling vertices' fails with OutOfMemoryError: Java heap space

2025-01-14 Thread Ángel
Are you sure that temporarily disabling a global setting like AQE is the best approach to fix this issue? I increased the number of shuffle partitions in the Spark session configuration in GraphFrameTestSparkContext.scala from 4 to 10, and the "checkpoint interval" test ran perfectly without throwi
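
The two options weighed in this thread differ only in which setting is overridden for the test session. This is a minimal sketch, assuming Python dictionaries stand in for the session configuration: `BASE_TEST_CONF`, `DISABLE_AQE`, `MORE_SHUFFLE_PARTITIONS`, and `with_overrides` are hypothetical names. `spark.sql.shuffle.partitions` and `spark.sql.adaptive.enabled` are Spark's own configuration keys, and the values 4 and 10 come from the message.

```python
# Hypothetical stand-in for the test session settings; the message only
# states that the shuffle partition count was 4.
BASE_TEST_CONF = {"spark.sql.shuffle.partitions": "4"}

# Option questioned in this message: turn AQE off.
DISABLE_AQE = {"spark.sql.adaptive.enabled": "false"}

# Option reported in this message: raise shuffle partitions from 4 to 10.
MORE_SHUFFLE_PARTITIONS = {"spark.sql.shuffle.partitions": "10"}


def with_overrides(base, overrides):
    """Return a new mapping with overrides applied; base is left unchanged."""
    merged = dict(base)
    merged.update(overrides)
    return merged
```

Here `with_overrides(BASE_TEST_CONF, MORE_SHUFFLE_PARTITIONS)` changes only the partition count and adds no AQE setting, whereas `with_overrides(BASE_TEST_CONF, DISABLE_AQE)` keeps the partition count at 4 and switches AQE off.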