Re: PySpark elasticsearch question

2014-12-09 Thread Mohamed Lrhazi
I found a format that worked, more or less by accident: "es.query" : """{"query":{"match_all":{}},"fields":["title","_source"]}""" Thanks, Mohamed. On Tue, Dec 9, 2014 at 11:27 AM, Mohamed Lrhazi < mohamed.lrh...@georgetown.edu> wrote: > Thanks Nick... still no luck. > > If I use "?q=somerandomchar
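For anyone landing on this thread later, here is a minimal sketch of how that "es.query" value might be passed to sc.newAPIHadoopRDD. The input format and key class are the ones from the original question. Everything else is an assumption: the helper names build_es_conf and read_es_rdd are hypothetical, "myindex/mytype" and "localhost" are placeholders, and the value class org.elasticsearch.hadoop.mr.LinkedMapWritable is a guess at what the original call used. It also assumes the elasticsearch-hadoop jar is on the Spark classpath.

```python
import json

# Query body reported to work in the message above: a JSON query DSL string
# that asks for the "title" field and the full "_source".
ES_QUERY = """{"query":{"match_all":{}},"fields":["title","_source"]}"""


def build_es_conf(resource, nodes="localhost", query=ES_QUERY):
    """Build the conf dict for EsInputFormat (hypothetical helper).

    resource and nodes are placeholders; substitute your own index/type
    and Elasticsearch host.
    """
    # Fail early on a malformed query string (raises ValueError).
    json.loads(query)
    return {
        "es.resource": resource,
        "es.nodes": nodes,
        "es.query": query,
    }


def read_es_rdd(sc, conf):
    """Read from Elasticsearch through an existing SparkContext `sc`."""
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        # Assumption: value class as commonly used with EsInputFormat.
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )


conf = build_es_conf("myindex/mytype")  # placeholder index/type
```

With a live SparkContext, read_es_rdd(sc, conf).take(1) would then be the way to check whether the fields come back alongside the document id.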

Re: PySpark elasticsearch question

2014-12-09 Thread Mohamed Lrhazi
Thanks Nick... still no luck. If I use "?q=somerandomchars&fields=title,_source" I get an exception about an empty collection, which seems to indicate that it is actually using the supplied es.query. But when I do rdd.take(1) or take(10), all I get is the id and an empty dict, apparently... maybe

PySpark elasticsearch question

2014-12-09 Thread Mohamed Lrhazi
Hello, Following a couple of tutorials, I can't seem to get PySpark to return any "fields" from ES other than the document id. I tried like so: es_rdd = sc.newAPIHadoopRDD(inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",keyClass="org.apache.hadoop.io.NullWritable",valueClass="org.elasti