On Wed, Feb 16, 2011 at 5:07 PM, Vijay <tec...@gmail.com> wrote:
> Hi,
>
> I'm trying this use case: do a simple select from an existing table
> and pass the results through a reduce script to do some analysis. The
> table has web logs so the select uses a pseudo user ID as the key and
> the rest of the data as values. My expectation is that a single reduce
> script should receive all logs for a given user so that I can do some
> path based analysis. Are there any issues with this idea so far?
>
> When I try it though, hive is not doing what I'd expect. The
> particular query is not generating any reduce tasks at all. Here's a
> sample query:
>
> FROM(
>   SELECT userid, time, url
>   FROM weblogs
> ) weblogs
> reduce weblogs.userid, weblogs.time, weblogs.url
> using 'counter.pl'
> as user, count;
>
> Thanks,
> Vijay
>
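It is hard to tell without seeing the script. Does your Perl script work on pipes, i.e. does it read rows from standard input and write its results to standard output? Something like: while (<STDIN>) { print $_; }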
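For reference, here is a minimal sketch of what a pipe-based counting script could look like. It is written in Python purely for illustration (the same stdin/stdout pattern applies to counter.pl), and the function name count_per_user is hypothetical. It assumes Hive's default behaviour of passing columns to the script as tab-separated text, one row per line, and it assumes that all rows for a given user arrive consecutively.

```python
import sys
from itertools import groupby


def count_per_user(lines):
    """Yield 'user<TAB>count' for each run of consecutive rows sharing a user ID.

    Assumes tab-separated input with the user ID in the first column,
    and that rows for the same user are adjacent in the stream.
    """
    rows = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    for user, group in groupby(rows, key=lambda cols: cols[0]):
        yield f"{user}\t{sum(1 for _ in group)}"


if __name__ == "__main__":
    # Read rows from stdin and write one result row per user to stdout.
    for out in count_per_user(sys.stdin):
        print(out)
```

The "rows arrive consecutively" assumption only holds if the query itself groups the rows. As far as I know, the REDUCE keyword in Hive is just an alternative spelling of SELECT TRANSFORM and does not force a reduce phase on its own; the inner select would likely need a CLUSTER BY userid (or DISTRIBUTE BY userid SORT BY userid, time) for the rows to be shuffled to reducers by user.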