Re: [PR] [SPARK-XXXX][ML][CONNECT] Avoiding instance creation in ServiceLoader [spark]

via GitHub Mon, 20 Jan 2025 17:05:35 -0800


HyukjinKwon commented on code in PR #49577:
URL: https://github.com/apache/spark/pull/49577#discussion_r1922946621



##########
sql/connect/server/src/main/scala/org/apache/spark/sql/connect/ml/MLUtils.scala:
##########
@@ -50,8 +51,18 @@ private[ml] object MLUtils {
   private def loadOperators(mlCls: Class[_]): Map[String, Class[_]] = {
     val loader = Utils.getContextOrSparkClassLoader
     val serviceLoader = ServiceLoader.load(mlCls, loader)
-    val providers = serviceLoader.asScala.toList
-    providers.map(est => est.getClass.getName -> est.getClass).toMap
+    // Instead of using the iterator, we use the "stream()" method that allows
+    // to iterate over a collection of providers that do not instantiate the class
+    // directly. Since there is no good way to convert a Java stream to a Scala stream,
+    // we collect the Java stream to a Java map and then convert it to a Scala map.
+    serviceLoader
+      .stream()
+      .collect(
+        Collectors.toMap(
+          (est: ServiceLoader.Provider[_]) => est.`type`().getName,
+          (est: ServiceLoader.Provider[_]) => est.`type`()))
+      .asScala

Review Comment:
   This will only be useful if we do some filtering, etc. between `stream()` 
and `collect()`.
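
   For illustration only, here is a minimal sketch of what such filtering could
   look like, assuming Scala 2.13 and Java 9+. The names `LazyProviders`,
   `loadFiltered`, and `prefix` are hypothetical and not part of the PR. The
   predicate inspects `Provider.type()` only, so `Provider.get()` is never
   called and providers that are filtered out are never instantiated.

```scala
import java.util.ServiceLoader
import scala.jdk.CollectionConverters._

object LazyProviders {
  // Hypothetical helper (not in the PR): keep only providers whose class name
  // starts with `prefix`. Only Provider.type() is consulted; Provider.get(),
  // which would instantiate the provider, is never invoked.
  def loadFiltered[S](
      service: Class[S],
      loader: ClassLoader,
      prefix: String): Map[String, Class[_]] = {
    ServiceLoader
      .load(service, loader)
      .stream()
      // Filtering step between stream() and the final collection.
      .filter((p: ServiceLoader.Provider[S]) => p.`type`().getName.startsWith(prefix))
      .iterator()
      .asScala
      .map(p => p.`type`().getName -> (p.`type`(): Class[_]))
      .toMap
  }
}
```

   A call such as `LazyProviders.loadFiltered(mlCls, loader, "org.apache.spark.ml.")`
   would then return only the matching classes; the prefix shown here is an
   assumption chosen for the example.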



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org
