Hi,

After SPARK-12588 Remove HTTPBroadcast [1], the one and only implementation of BroadcastFactory is TorrentBroadcastFactory. Nothing in Spark 2 uses BroadcastFactory other than TorrentBroadcastFactory itself; however, the scaladoc says [2]:
/**
 * An interface for all the broadcast implementations in Spark (to allow
 * multiple broadcast implementations). SparkContext uses a user-specified
 * BroadcastFactory implementation to instantiate a particular broadcast for the
 * entire Spark job.
 */

which is not correct, since there is no way to plug in a custom user-specified BroadcastFactory.

My first impulse was to remove the seemingly pluggable BroadcastFactory interface completely, since it is no longer pluggable and its presence may suggest that it still is. But then I thought you, Spark devs, could argue it's just about fixing the scaladoc (and leaving the interface intact).

I'm for one of the following:

1. Removing the BroadcastFactory interface completely and leaving TorrentBroadcastFactory alone (without extending something that's not extendable despite being an interface), or
2. Bringing the spark.broadcast.factory Spark property back to life in BroadcastManager, so it is indeed possible to plug in a custom BroadcastFactory (and hence a custom Broadcast).

WDYT?

[1] https://issues.apache.org/jira/browse/SPARK-12588
[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/broadcast/BroadcastFactory.scala#L25-L30

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
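
P.S. To make the "bring spark.broadcast.factory back" option a bit more concrete, here is a rough sketch of property-driven factory loading. It is not the actual Spark code: every Sketch* name is a hypothetical stand-in for the real class, a plain Map stands in for SparkConf, and the trait is cut down to a single member. It also assumes the definitions are compiled as ordinary top-level classes, so that reflection can find a no-argument constructor.

```scala
// Hypothetical, simplified stand-in for BroadcastFactory (NOT the real trait).
trait SketchBroadcastFactory {
  def name: String
}

// Stand-in for the default, torrent-based implementation.
class SketchTorrentBroadcastFactory extends SketchBroadcastFactory {
  val name: String = "torrent"
}

// Stand-in for a user-supplied implementation.
class SketchCustomBroadcastFactory extends SketchBroadcastFactory {
  val name: String = "custom"
}

// Stand-in for the part of BroadcastManager that would pick the factory.
object SketchBroadcastManager {
  val FactoryKey = "spark.broadcast.factory"

  // conf is a plain Map here; the real code would read a SparkConf instead.
  def loadFactory(conf: Map[String, String]): SketchBroadcastFactory = {
    // Fall back to the torrent factory when the property is not set.
    val className =
      conf.getOrElse(FactoryKey, classOf[SketchTorrentBroadcastFactory].getName)
    // Instantiate the configured class reflectively via its no-arg constructor.
    Class.forName(className)
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[SketchBroadcastFactory]
  }
}
```

With something along these lines, leaving the property unset would keep today's behaviour, while setting it to the fully qualified name of another implementation would swap the factory, which is what the current scaladoc already promises.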