Compression and Hive with Spark 1.3

Ophir Cohen Tue, 21 Apr 2015 03:54:08 -0700

Sadly, I'm encountering too many issues migrating my code to Spark 1.3.

I described one problem in another mail, but my main problem is that I can't
set the right compression type.
In Spark 1.2.1, setting the following values was enough:
    hc.setConf("hive.exec.compress.output", "true")
    hc.setConf("mapreduce.output.fileoutputformat.compress.codec",
      "org.apache.hadoop.io.compress.SnappyCodec")
    hc.setConf("mapreduce.output.fileoutputformat.compress.type", "BLOCK")


Running it on the new cluster, I see the following:
1. The files are written uncompressed and named *.parquet.
2. When I try to explore them using the Hive CLI, I get the following exception:


Failed with exception java.io.IOException:java.io.IOException:
hdfs://10.166.157.97:9000/user/hive/warehouse/core_equity_corp_splits_divs/part-r-00001.parquet
not a SequenceFile
Time taken: 0.538 seconds
3. Running the same query from the Spark shell yields empty results.

Please advise.
