Hi all, I'm testing the Spark SQL module, and I found a problem with one of the test cases.
I think the main problem is that the "add file" command in Spark SQL (Hive?) doesn't work, since an additional test that gives the path to the file directly returns the right answer. The tests are as follows:

1. Original test case:

    set hive.map.aggr.hash.percentmemory = 0.3;
    set hive.mapred.local.mem = 384;
    add file ../../data/scripts/dumpdata_script.py;
    select count(distinct subq.key) from (FROM src MAP src.key USING 'python dumpdata_script.py' AS key WHERE src.key = 10) subq;

   returned result: 0

2. Additional test: replace the last statement as below (adding a path to the file):

    select count(distinct subq.key) from (FROM src MAP src.key USING 'python ../../data/scripts/dumpdata_script.py' AS key WHERE src.key = 10) subq;

   returned result: 1000022

best regards,
zhenhua
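
P.S. To illustrate what I suspect is happening (this is only a guess, and it does not use Spark at all): if "add file" never copies the script into the directory where the transform command runs, then 'python dumpdata_script.py' cannot find the script, while the version with the relative path still can. The small Python sketch below reproduces just that effect outside Spark. The directory names and the script echo_script.py are hypothetical stand-ins, not the real test data.

```python
import os
import subprocess
import sys
import tempfile


def run_script(command, cwd):
    """Run `command` (a list of arguments) inside `cwd`; return (exit_code, stdout)."""
    proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def demo():
    """Compare invoking a script by bare name vs. by relative path."""
    with tempfile.TemporaryDirectory() as root:
        scripts_dir = os.path.join(root, "scripts")  # where the script really lives
        work_dir = os.path.join(root, "work")        # assumed stand-in for the task's working directory
        os.makedirs(scripts_dir)
        os.makedirs(work_dir)

        # Hypothetical script; it only prints a marker so we can see that it ran.
        with open(os.path.join(scripts_dir, "echo_script.py"), "w") as f:
            f.write("print('ok')\n")

        # Case 1: bare file name. This only works if something copied the
        # script into work_dir first, which is what I expect "add file" to do.
        bare = run_script([sys.executable, "echo_script.py"], work_dir)

        # Case 2: relative path to the script, as in my additional test.
        rel_path = os.path.join("..", "scripts", "echo_script.py")
        with_path = run_script([sys.executable, rel_path], work_dir)

        return bare, with_path


if __name__ == "__main__":
    bare, with_path = demo()
    print("bare name:", bare)
    print("with path:", with_path)
```

In this sketch the bare-name call fails with a non-zero exit code and produces no output, while the call with the path runs normally. That is the same pattern as my two test cases, but whether it is really the cause inside Spark SQL still needs to be confirmed.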
