Unfortunately this is a known issue: https://issues.apache.org/jira/browse/SPARK-1476
As Sean suggested, you need to think of some other way of doing the same thing, even if it's just breaking your one big broadcast variable into a few smaller ones. Rough sketches of both ideas are at the end of this message.

On Fri, Feb 13, 2015 at 12:30 PM, Sean Owen <so...@cloudera.com> wrote:

> I think you've hit the nail on the head. Since the serialization
> ultimately creates a byte array, and arrays can have at most ~2
> billion elements in the JVM, the broadcast can be at most ~2GB.
>
> At that scale, you might consider whether you really have to broadcast
> these values, or whether you want to handle them as RDDs, join them, and so on.
>
> Or consider whether you can break it up into several broadcasts?
>
> On Fri, Feb 13, 2015 at 6:24 PM, soila <skavu...@gmail.com> wrote:
>
> > I am trying to broadcast a large 5GB variable using Spark 1.2.0. I get the
> > following exception when the size of the broadcast variable exceeds 2GB.
> > Any ideas on how I can resolve this issue?
> >
> > java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
> >     at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:829)
> >     at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:123)
> >     at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:132)
> >     at org.apache.spark.storage.DiskStore.putIterator(DiskStore.scala:99)
> >     at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:147)
> >     at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:114)
> >     at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:787)
> >     at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
> >     at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:992)
> >     at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:98)
> >     at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:84)
> >     at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
> >     at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
> >     at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
> >     at org.apache.spark.SparkContext.broadcast(SparkContext.scala:945)
> >
> > --
> > View this message in context:
> > http://apache-spark-user-list.1001560.n3.nabble.com/Size-exceeds-Integer-MAX-VALUE-exception-when-broadcasting-large-variable-tp21648.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
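To make the "several smaller broadcasts" idea concrete, here is a minimal sketch. It assumes the 5GB value is a big lookup map with non-negative Long keys; the names loadBigMap, bigMap and numChunks are made up for illustration. The point is that each chunk serializes to its own byte array, so no single array comes near the ~2GB limit:

import org.apache.spark.{SparkConf, SparkContext}

object ChunkedBroadcastSketch {

  // Placeholder: stands in for however the 5GB value is actually built.
  def loadBigMap(): Map[Long, Array[Byte]] = Map.empty

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("chunked-broadcast"))

    val bigMap = loadBigMap()

    // Split the one huge map into several smaller maps by key, so that each
    // piece serializes to well under the ~2GB byte-array limit.
    val numChunks = 8
    val chunks = Array.tabulate(numChunks) { i =>
      bigMap.filter { case (k, _) => (k % numChunks) == i }
    }

    // Broadcast each chunk separately instead of one giant variable.
    val broadcasts = chunks.map(sc.broadcast(_))

    // On the executors, route each key to the chunk that owns it.
    val keys = sc.parallelize(1L to 1000000L)
    val looked = keys.map { k =>
      val chunk = broadcasts((k % numChunks).toInt).value
      (k, chunk.get(k))
    }
    looked.count()

    sc.stop()
  }
}

The downside is that every executor still ends up holding all the chunks if every task touches every chunk, so this mainly buys you headroom on the serialization limit, not on memory.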
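And if the broadcast can be avoided altogether, the same lookup can be written as a plain RDD join, roughly like this. The paths and the record format are invented for the example, and on Spark 1.2 you still need the SparkContext._ import to get the pair-RDD functions:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD functions on Spark 1.2

object JoinInsteadOfBroadcast {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("join-instead-of-broadcast"))

    // The former broadcast value, kept distributed as a keyed RDD instead of
    // being collected to the driver and broadcast.
    val lookup = sc.objectFile[(Long, Array[Byte])]("hdfs:///path/to/lookup")

    // The data that previously called bigBroadcast.value(key) inside a map().
    val events = sc.textFile("hdfs:///path/to/events")
      .map(line => (line.split(",")(0).toLong, line))

    // The join replaces the per-record lookup into the broadcast variable.
    val joined = events.join(lookup)

    joined.saveAsObjectFile("hdfs:///path/to/output")
    sc.stop()
  }
}

If the lookup side is reused across several joins, it is usually worth hash-partitioning it once (partitionBy with a HashPartitioner) and caching it, so it isn't reshuffled on every join.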