This issue got resolved.
I was able to trace it to the fact that the driver program's pom.xml was
pulling in Spark 2.1.1, which in turn was pulling in Hadoop 2.2.0.
Explicitly adding dependencies on the Hadoop 2.7.3 libraries resolves it.
The failing call was to the following API in HDFS:
DatanodeManager.getDatanodeStorageInfos(), which is where the mismatch between
the Hadoop 2.2.0 client libraries and the cluster's Hadoop version 2.7.3 was
surfacing.
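In case it helps someone hitting the same mismatch, the fix amounts to pinning
the Hadoop client artifacts in the driver's pom.xml to the cluster's version.
A minimal sketch (hadoop-client here is an assumption -- override whichever
Hadoop modules your build actually pulls in transitively):

    <!-- Pin the Hadoop client libraries to the cluster's version (2.7.3) so the
         transitive Hadoop 2.2.0 dependency from Spark 2.1.1 does not win. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.7.3</version>
    </dependency>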
On Tue, Jun 20, 2017 at 11:12 PM, yohann jardin wrote:
> Which version of Hadoop are you running on?
>
> Yohann Jardin
> On 6/21/2017 at 1:06 AM, N B wrote:
>
> Ok, some more info about this issue to see if someone can shine a light on
> what could be going on. I turned on debug logging for
> org.apache.spark.streaming.scheduler [...]
Ok, some more info about this issue to see if someone can shine a light on
what could be going on. I turned on debug logging for
org.apache.spark.streaming.scheduler in the driver process; this is what gets
thrown in the logs, and it keeps getting thrown even after the downed HDFS
node is restarted. [...]
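For reference, one way to turn that logging on (a sketch, assuming the stock
log4j.properties that ships with Spark 2.1.1 is used on the driver):

    # conf/log4j.properties on the driver: DEBUG for the streaming scheduler only
    log4j.logger.org.apache.spark.streaming.scheduler=DEBUG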
BTW, this is running on Spark 2.1.1.
I have been trying to debug this issue, and what I have found so far is that
it is somehow related to the Spark WAL (write-ahead log). The directory named
/receivedBlockMetadata seems to stop getting written to after the point where
an HDFS node is killed and restarted. I have [...]
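For context, the receivedBlockMetadata directory is written by the receiver
write-ahead log, which lives under the streaming checkpoint directory. A
minimal sketch of how that gets enabled (the class name, app name, batch
interval, and checkpoint path below are placeholders, not our actual values):

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class WalConfigSketch {
        public static JavaStreamingContext create() {
            SparkConf conf = new SparkConf()
                    .setAppName("flume-streaming-app")  // placeholder
                    // Persist received-block metadata to the WAL on HDFS
                    .set("spark.streaming.receiver.writeAheadLog.enable", "true");

            JavaStreamingContext streamingContext =
                    new JavaStreamingContext(conf, Durations.seconds(10));  // placeholder interval

            // The WAL directories, including receivedBlockMetadata/, are created under here.
            streamingContext.checkpoint("hdfs://namenode:8020/spark/checkpoints/flume-app");
            return streamingContext;
        }
    }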
Hi all,
We are running a standalone Spark cluster for a streaming application. The
application consumes data from Flume using a Flume polling stream created
like this:
flumeStream = FlumeUtils.createPollingStream(streamingContext,
    socketAddress.toArray(new InetSocketAddress[socketAddress.size()]), ...)
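For reference, a self-contained sketch of that kind of setup looks roughly
like the following (the class name, Flume agent host/port, batch interval, and
storage level are placeholders, not the actual values from our application):

    import java.net.InetSocketAddress;
    import java.util.Arrays;
    import java.util.List;

    import org.apache.spark.SparkConf;
    import org.apache.spark.storage.StorageLevel;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.flume.FlumeUtils;
    import org.apache.spark.streaming.flume.SparkFlumeEvent;

    public class FlumePollingSketch {
        public static void main(String[] args) throws InterruptedException {
            // The master URL is supplied by spark-submit on the standalone cluster.
            SparkConf conf = new SparkConf().setAppName("flume-polling-sketch");
            JavaStreamingContext streamingContext =
                    new JavaStreamingContext(conf, Durations.seconds(10));  // placeholder interval

            // Addresses of the Flume agents running the Spark sink (placeholders).
            List<InetSocketAddress> socketAddress =
                    Arrays.asList(new InetSocketAddress("flume-host", 9999));

            JavaReceiverInputDStream<SparkFlumeEvent> flumeStream =
                    FlumeUtils.createPollingStream(
                            streamingContext,
                            socketAddress.toArray(new InetSocketAddress[socketAddress.size()]),
                            StorageLevel.MEMORY_AND_DISK_SER_2());

            // Keep the sketch minimal: just count events per batch.
            flumeStream.count().print();

            streamingContext.start();
            streamingContext.awaitTermination();
        }
    }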