Re: [spark sql] "add file" doesn't work

wangzhenhua (G) Tue, 10 Feb 2015 17:42:07 -0800

[Additional info] I was using the master branch as of 9 Feb 2015; the latest 
commit shown by "git info" is:


commit 0793ee1b4dea1f4b0df749e8ad7c1ab70b512faf
Author: Sandy Ryza <[email protected]>
Date:   Mon Feb 9 10:12:12 2015 +0000

    SPARK-2149. [MLLIB] Univariate kernel density estimation

    Author: Sandy Ryza <[email protected]>

    Closes #1093 from sryza/sandy-spark-2149 and squashes the following com

    5f06b33 [Sandy Ryza] More review comments
    0f73060 [Sandy Ryza] Respond to Sean's review comments
    0dfa005 [Sandy Ryza] SPARK-2149. Univariate kernel density estimation



________________________________
best regards,
zhenhua

From: wangzhenhua (G)<mailto:[email protected]>
Date: 2015-02-10 20:39
To: user<mailto:[email protected]>
Subject: [spark sql] "add file" doesn't work
Hi all,

I'm testing the spark sql module, and I found a problem with one of the test 
cases.

I think the main problem is that the "add file" command in spark sql (hive?) 
doesn't work: in an additional test, giving the path to the file directly 
returns the right answer.

The tests are as follows:
1. Original test case:

set hive.map.aggr.hash.percentmemory = 0.3;
set hive.mapred.local.mem = 384;
add file ../../data/scripts/dumpdata_script.py;
select count(distinct subq.key) from
(FROM src MAP src.key USING 'python dumpdata_script.py' AS key
 WHERE src.key = 10) subq;

returned result: 0

2. Additional test: replace the last statement with the one below (adding the 
path to the file):
select count(distinct subq.key) from
(FROM src MAP src.key USING 'python ../../data/scripts/dumpdata_script.py' AS key
 WHERE src.key = 10) subq;

returned result: 1000022
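
The difference between the two tests is what one would expect if the script 
were never copied into the directory where the 'python' child process starts. 
That is only an assumption about the cause; nothing in this message confirms 
it. The sketch below is independent of Spark and Hive and only illustrates 
the underlying behaviour: a bare script name is resolved against the child's 
working directory, while a full path works from anywhere. The helpers 
run_script and demo, and the one-line script body, are hypothetical stand-ins 
(the real dumpdata_script.py is not shown here).

```python
import os
import subprocess
import sys
import tempfile


def run_script(script_arg, cwd):
    """Run `python <script_arg>` from `cwd`; return (exit_code, stdout)."""
    proc = subprocess.run(
        [sys.executable, script_arg],
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()


def demo():
    """Compare a bare script name with a full path, run from another directory."""
    with tempfile.TemporaryDirectory() as scripts_dir, \
            tempfile.TemporaryDirectory() as work_dir:
        # Hypothetical stand-in for dumpdata_script.py (assumption: any script works).
        script_path = os.path.join(scripts_dir, "dumpdata_script.py")
        with open(script_path, "w") as f:
            f.write("print('ok')\n")

        # Bare name: looked up relative to work_dir, where the file does not exist.
        bare = run_script("dumpdata_script.py", work_dir)
        # Full path: found regardless of the working directory.
        full = run_script(script_path, work_dir)
        return bare, full


if __name__ == "__main__":
    bare, full = demo()
    print("bare name :", bare)
    print("full path :", full)
```

If "add file" had placed the script in the working directory of the 
transform process, the bare-name call would be expected to succeed as well.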


________________________________
best regards,
zhenhua
