Hi all,

I'm testing the Spark SQL module, and I found a problem with one of the test
cases.

I think the main problem is that the "add file" command in Spark SQL (Hive?)
doesn't work, because an additional test that gives the path to the file
directly returns the right answer.

The tests are as follows:
1. Original test case:

set hive.map.aggr.hash.percentmemory = 0.3;
set hive.mapred.local.mem = 384;
add file ../../data/scripts/dumpdata_script.py;
select count(distinct subq.key) from
(FROM src MAP src.key USING 'python dumpdata_script.py' AS key WHERE src.key = 10) subq;

returned result: 0

2. Additional test: replace the last statement with the one below (giving the
path to the file):
select count(distinct subq.key) from
(FROM src MAP src.key USING 'python ../../data/scripts/dumpdata_script.py' AS key WHERE src.key = 10) subq;

returned result: 1000022
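
To illustrate what I suspect is happening, here is a minimal Python sketch
outside of Spark. It is only a guess at the mechanism, not a confirmed
diagnosis: it assumes the transform script is launched as a child process
whose working directory does not contain the added file. The helper names
(run_script, demo) and the one-line stand-in script are hypothetical; the
stand-in is not the real dumpdata_script.py.

```python
import os
import subprocess
import sys
import tempfile


def run_script(command_path, cwd):
    """Run `python <command_path>` from `cwd`; return (exit code, stdout)."""
    proc = subprocess.run(
        [sys.executable, command_path],
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def demo():
    # scripts_dir plays the role of ../../data/scripts, work_dir the role of
    # the directory the child process is started in (an assumption).
    with tempfile.TemporaryDirectory() as scripts_dir, \
            tempfile.TemporaryDirectory() as work_dir:
        script = os.path.join(scripts_dir, "dumpdata_script.py")
        with open(script, "w") as f:
            f.write("print(10)\n")  # stand-in body, not the real script

        # Bare name: resolved against work_dir, where the file does not exist.
        bare = run_script("dumpdata_script.py", work_dir)
        # Explicit path: found regardless of the working directory.
        full = run_script(script, work_dir)
        return bare, full


if __name__ == "__main__":
    bare, full = demo()
    print("bare name     ->", bare)
    print("explicit path ->", full)
```

With the bare name the child process exits with an error and writes nothing
to stdout, which would look like an empty input to the outer query. If "add
file" worked as I expect, the file would be available in the working
directory and the bare name would resolve.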


________________________________
best regards,
zhenhua
