Sadly, I'm encountering too many issues migrating my code to Spark 1.3. I described one problem in another mail, but my main problem is that I can't set the right compression type.

In Spark 1.2.1, setting the following values was enough:

    hc.setConf("hive.exec.compress.output", "true")
    hc.setConf("mapreduce.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.SnappyCodec")
    hc.setConf("mapreduce.output.fileoutputformat.compress.type", "BLOCK")

Running it on the new cluster, I:

1. Get the files uncompressed and named *.parquet.
2. Get the following exception when trying to explore the table using the Hive CLI:

    Failed with exception java.io.IOException:java.io.IOException: hdfs://10.166.157.97:9000/user/hive/warehouse/core_equity_corp_splits_divs/part-r-00001.parquet not a SequenceFile
    Time taken: 0.538 seconds

3. Get empty results when running the same query from the Spark shell.

Please advise.
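
A sketch of what might be worth checking, under assumptions: since the output files are named *.parquet, the table appears to be written as Parquet, in which case the Hadoop output-compression keys above may not be the ones consulted. Spark SQL documents a separate `spark.sql.parquet.compression.codec` setting for Parquet output. Whether it explains the behaviour described here is an assumption, and `sc` is assumed to be the SparkContext provided by spark-shell:

```scala
import org.apache.spark.sql.hive.HiveContext

// Assumption: `sc` is the SparkContext that spark-shell provides.
val hc = new HiveContext(sc)

// Assumption: the table is being written as Parquet, so the Parquet
// codec setting is the relevant one rather than the mapreduce.* keys.
hc.setConf("spark.sql.parquet.compression.codec", "snappy")
```

This would not by itself address the "not a SequenceFile" error in the Hive CLI, which points at a mismatch between the table's declared storage format and the Parquet files on disk.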