efaracci018 opened a new pull request, #50989:
URL: https://github.com/apache/spark/pull/50989

   ### What changes were proposed in this pull request?
   
   Fix the version parsing logic in `HiveExternalCatalogVersionsSuite` to 
properly handle new artifact paths in 
https://dist.apache.org/repos/dist/release/spark/ so that the "backward 
compatibility" test can run.
   
   This change introduces a constant `val SparkVersionPattern = """<a 
href="spark-(\d.\d.\d)/">""".r` for more precise version matching and removes 
the `.filterNot(_.contains("preview"))` call, which is now redundant.
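   
   As a rough illustration of how such a pattern behaves, the sketch below 
applies the regex quoted above to a small, made-up directory listing. The 
object name `VersionParsingSketch`, the helper `parseVersions`, and the sample 
HTML are assumptions for illustration only; they are not the suite's actual 
code.
   
   ```scala
   object VersionParsingSketch {
     // Pattern as quoted in this PR description.
     val SparkVersionPattern = """<a href="spark-(\d.\d.\d)/">""".r

     // Hypothetical helper: collect every version captured by the pattern.
     def parseVersions(html: String): List[String] =
       SparkVersionPattern.findAllMatchIn(html).map(_.group(1)).toList

     def main(args: Array[String]): Unit = {
       // Made-up listing that mixes a release, the new artifact paths,
       // and a preview directory.
       val listing =
         """<a href="spark-3.5.5/">spark-3.5.5/</a>
           |<a href="spark-connect-swift-0.1.0/">spark-connect-swift-0.1.0/</a>
           |<a href="spark-kubernetes-operator-0.1.0/">spark-kubernetes-operator-0.1.0/</a>
           |<a href="spark-4.0.0-preview2/">spark-4.0.0-preview2/</a>""".stripMargin

       // Only the plain release directory matches; the other links are skipped.
       println(parseVersions(listing)) // prints List(3.5.5)
     }
   }
   ```
   
   In this sketch the preview directory is skipped because the pattern 
requires `/">` immediately after the three version components, so no separate 
"preview" filter is needed. Note that the pattern as written matches only 
single-digit components, and its unescaped `.` matches any character.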
   
   ### Why are the changes needed?
   
   The suite fails to run the "backward compatibility" test because of 
errors parsing the testing versions. The current implementation cannot parse 
versions when it encounters new paths such as `spark-connect-swift-0.1.0/` and 
`spark-kubernetes-operator-0.1.0/` in 
https://dist.apache.org/repos/dist/release/spark/. This leaves 
`PROCESS_TABLES.testingVersions` empty and in turn produces a logError: 
"Exception encountered when invoking run on a nested suite - Fail to get the 
latest Spark versions to test". As a result, the condition for running the 
test is not met.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   
   Executed local build and test for `HiveExternalCatalogVersionsSuite`:
   
   `build/mvn -pl sql/hive -Dtest=none 
-DwildcardSuites=org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite 
test-compile scalatest:test`
   
   Verified that the reported error no longer appears, that the "backward 
compatibility" test runs successfully, and that 
`PROCESS_TABLES.testingVersions`, which was previously empty, now contains 
"3.5.5" when printed.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   
   @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

