I get two types of error:
- org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
  location for shuffle 0
- FetchFailedException: Adjusted frame length exceeds 2147483647: 12716268407 - discarded

Spark keeps retrying the job and keeps hitting the same errors.

The file I am extracting sliding-window strings from is 500 MB, and I
am using a window length of 150.
It works fine up to a length of 100.
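
To make the sliding step concrete, here is a minimal plain-Scala sketch (no Spark) of what `sliding` yields for a single line. The object name `SlidingSketch` and the helper `kmers` are hypothetical, made up for illustration; only `String.sliding` is taken from the job's code.

```scala
// Hypothetical helper for illustration; only String.sliding comes from the job's code.
object SlidingSketch {
  // All windows of length k over one input line, in order.
  def kmers(line: String, k: Int): List[String] =
    line.sliding(k).toList

  def main(args: Array[String]): Unit = {
    // A 6-character line with k = 4 gives 6 - 4 + 1 = 3 windows.
    println(kmers("ACGTAC", 4))
    // A non-empty line shorter than k gives one window holding the whole line.
    println(kmers("ACG", 4))
  }
}
```

A line of n characters with n >= k produces n - k + 1 strings of k characters each, so the data fed into reduceByKey can be far larger than the input file when lines are long.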

This is my code:

    val hgfasta = sc.textFile(args(0)) // read the FASTA file
    // every substring of length args(2) from each line
    val kCount = hgfasta.flatMap(r => r.sliding(args(2).toInt))
    // count each substring, then sort by count, descending
    val kmerCount = kCount.map(x => (x, 1)).reduceByKey(_ + _)
      .map { case (x, y) => (y, x) }.sortByKey(false)
      .map { case (i, j) => (j, i) }

    // keep substrings seen fewer than 5 times and write "kmer, count" lines
    val filtered = kmerCount.filter(kv => kv._2 < 5)
    filtered.map(kv => kv._1 + ", " + kv._2.toLong).saveAsTextFile(args(1))

  }
It gets stuck at flatMap and saveAsTextFile, and throws
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output
location for shuffle 0, and

org.apache.spark.shuffle.FetchFailedException: Adjusted frame length exceeds 2147483647: 12716268407 - discarded
        at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.org$apache$spark$shuffle$hash$BlockStoreShuffleFetcher$$unpackBlock$1(BlockStoreShuffleFetcher.scala:67)
        at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$$anonfun$3.apply(BlockStoreShuffleFetcher.scala:83)
        at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$$anonfun$3.apply(BlockStoreShuffleFetcher.scala:83)
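
For scale, the two numbers in the FetchFailedException message can be compared with a few lines of plain Scala. This is only arithmetic on the values in the log; the object name `FrameCheck` and its members are hypothetical, and nothing here calls Spark.

```scala
// Hypothetical names for illustration; the two constants are copied from the error message.
object FrameCheck {
  val limit: Long = 2147483647L     // equals Int.MaxValue, i.e. 2^31 - 1
  val reported: Long = 12716268407L // frame length reported in the exception

  // True when a frame of the given length is over the limit in the message.
  def exceedsLimit(frameLength: Long): Boolean = frameLength > limit

  def main(args: Array[String]): Unit = {
    println(limit == Int.MaxValue.toLong)
    println(exceedsLimit(reported))
    // Size of the reported frame in GiB (2^30 bytes).
    println(reported.toDouble / (1L << 30))
  }
}
```

The reported frame is about 11.8 GiB, roughly six times the limit, which suggests a single fetched frame has grown past what fits in a signed 32-bit length.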
