Hi VG,

I believe the error message is misleading. I had a similar one with PySpark yesterday after calling a count on a DataFrame, where the real error was an incorrect user-defined function being applied. Please send me some sample code with a trimmed-down version of the data and I'll see if I can reproduce it.

Kind regards
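For what it's worth, the "error shows up at the action, not at the bad function" behaviour is easy to reproduce, because Spark transformations are lazy and only run when an action such as count() or collectAsMap() forces evaluation. A minimal Java sketch (the class name and data are made up for illustration, not taken from the thread):

    import java.util.Arrays;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class LazyFailureDemo {
        public static void main(String[] args) {
            JavaSparkContext jsc = new JavaSparkContext("local[2]", "lazy-failure-demo");

            JavaRDD<String> raw = jsc.parallelize(Arrays.asList("1", "2", "oops"));

            // The parsing function is only recorded here; nothing executes yet,
            // because map() is a lazy transformation.
            JavaRDD<Integer> parsed = raw.map(s -> Integer.parseInt(s));

            // The NumberFormatException caused by "oops" only surfaces here, at
            // the action, so the failure is reported against count() even though
            // the real problem is the function passed to map().
            System.out.println(parsed.count());

            jsc.stop();
        }
    }

The same thing happens with a PySpark UDF: a bad UDF applied to a DataFrame typically fails only when count() or another action triggers execution.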
On 23 Jul 2016 4:57 pm, "Pedro Rodriguez" <ski.rodrig...@gmail.com> wrote:

Have you changed spark-env.sh or spark-defaults.conf from the default? It looks like Spark is trying to address local workers based on a network address (e.g. 192.168……) instead of on localhost (localhost, 127.0.0.1, 0.0.0.0, …). Additionally, that network address doesn't resolve correctly. You might also check /etc/hosts to make sure that you don't have anything weird going on.

One last thing to check: are you running Spark within a VM and/or Docker? If networking isn't set up correctly on those, you may also run into trouble. What would be helpful is to know everything about your setup that might affect networking.

—
Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
UC Berkeley AMPLab Alumni

pedrorodriguez.io | 909-353-4423
github.com/EntilZha | LinkedIn
<https://www.linkedin.com/in/pedrorodriguezscience>

On July 23, 2016 at 9:10:31 AM, VG (vlin...@gmail.com) wrote:

Hi Pedro,

Apologies for not adding this earlier. This is running on a local cluster set up as follows:

    JavaSparkContext jsc = new JavaSparkContext("local[2]", "DR");

Any suggestions based on this? The ports are not blocked by a firewall.

Regards,

On Sat, Jul 23, 2016 at 8:35 PM, Pedro Rodriguez <ski.rodrig...@gmail.com> wrote:

> Make sure that you don't have ports firewalled. You don't really give much
> information to work from, but it looks like the master can't access the
> worker nodes for some reason. If you give more information on the cluster,
> networking, etc., it would help.
>
> For example, on AWS you can create a security group which allows all
> traffic to/from itself to itself. If you are using something like ufw on
> Ubuntu then you probably need to know the IP addresses of the worker nodes
> beforehand.
>
> —
> Pedro Rodriguez
> PhD Student in Large-Scale Machine Learning | CU Boulder
> Systems Oriented Data Scientist
> UC Berkeley AMPLab Alumni
>
> pedrorodriguez.io | 909-353-4423
> github.com/EntilZha | LinkedIn
> <https://www.linkedin.com/in/pedrorodriguezscience>
>
> On July 23, 2016 at 7:38:01 AM, VG (vlin...@gmail.com) wrote:
>
> Please suggest if I am doing something wrong or an alternative way of
> doing this.
>
> I have an RDD with two values as follows:
>
>     JavaPairRDD<String, Long> rdd
>
> When I execute rdd.collectAsMap() it always fails with IO exceptions.
> 16/07/23 19:03:58 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
> java.io.IOException: Failed to connect to /192.168.1.3:58179
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
>     at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
>     at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:96)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>     at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
>     at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:105)
>     at org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:92)
>     at org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:546)
>     at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:76)
>     at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
>     at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
>     at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: java.net.ConnectException: Connection timed out: no further information: /192.168.1.3:58179
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
>     at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
>     at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>     ... 1 more
> 16/07/23 19:03:58 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
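Following up on Pedro's spark-env.sh / localhost suggestion above: on a single machine it is sometimes enough to pin the driver to the loopback address so nothing tries to reach the LAN IP shown in the trace (192.168.1.3). A minimal, untested sketch based on VG's local[2] setup; spark.driver.host and SPARK_LOCAL_IP are standard Spark settings, but the 127.0.0.1 value is an assumption about this particular environment:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class LoopbackDriverSketch {
        public static void main(String[] args) {
            // Cluster-wide alternative: add
            //   export SPARK_LOCAL_IP=127.0.0.1
            // to conf/spark-env.sh instead of setting it per application.
            SparkConf conf = new SparkConf()
                    .setMaster("local[2]")
                    .setAppName("DR")
                    .set("spark.driver.host", "127.0.0.1"); // advertise loopback, not 192.168.1.3
            JavaSparkContext jsc = new JavaSparkContext(conf);

            // ... build the JavaPairRDD<String, Long> and call collectAsMap() as before ...

            jsc.stop();
        }
    }

If the 192.168.x.x address still shows up in the logs after this, the /etc/hosts and VM/Docker checks above would be the next things to look at.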