[spark sql] "add file" doesn't work

wangzhenhua (G) Tue, 10 Feb 2015 04:41:06 -0800

Hi all,

I'm testing the Spark SQL module, and I found a problem with one of the test 
cases.

I think the main problem is that the "add file" command in Spark SQL (Hive?) 
does not work: an additional test that gives the path to the file directly 
returns the expected answer.

The tests are as follows:
1. Original test case:

set hive.map.aggr.hash.percentmemory = 0.3;
set hive.mapred.local.mem = 384;
add file ../../data/scripts/dumpdata_script.py;
select count(distinct subq.key) from
(FROM src MAP src.key USING 'python dumpdata_script.py' AS key WHERE src.key = 10) subq;

returned result: 0

2. Additional test: replace the last statement with the following (giving the 
path to the file in the command itself):
select count(distinct subq.key) from
(FROM src MAP src.key USING 'python ../../data/scripts/dumpdata_script.py' AS key WHERE src.key = 10) subq;

returned result: 1000022
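
One possible reading of the two results, offered as an assumption rather than a 
confirmed diagnosis: if "add file" does not place the script in the directory 
where the transform process runs, the bare name 'dumpdata_script.py' cannot be 
resolved, the child process emits no rows, and the count is 0. The sketch below 
is plain Python, not Spark code. It only illustrates that difference between a 
bare file name and an explicit path. The names demo_script.py, scripts_dir and 
work_dir are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_script(command_path, cwd):
    """Run a Python script in a child process; return (exit code, stdout)."""
    proc = subprocess.run(
        [sys.executable, command_path],
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


# Hypothetical layout: the script lives in scripts_dir, while the child
# process runs in work_dir (standing in for the task's working directory).
scripts_dir = tempfile.mkdtemp()
work_dir = tempfile.mkdtemp()
script_path = os.path.join(scripts_dir, "demo_script.py")
with open(script_path, "w") as f:
    f.write("print('ok')\n")

# Bare name: resolved relative to work_dir, where the file does not exist.
bare_rc, bare_out = run_script("demo_script.py", work_dir)

# Explicit path: found regardless of the working directory.
full_rc, full_out = run_script(script_path, work_dir)

print("bare name:", bare_rc, repr(bare_out))
print("full path:", full_rc, repr(full_out))
```

With the bare name the child exits with a non-zero status and writes nothing to 
stdout, which would look like an empty input to the outer query. Whether this is 
what actually happens inside Spark SQL here is not verified.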


________________________________
best regards,
zhenhua
