A transform script can only write text to stdout, so Hive has to convert
each output string to the target column's data type (boolean in this case).
So if your script emits an empty string "", Hive converts it to boolean
FALSE.

FYI, on the way in to a transform script, booleans come through as strings
"true" and "false".
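
A rough sketch of how a script might handle this, in case it helps. It
assumes rows arrive and leave as tab-separated lines and that the boolean
sits at index 3, as `folder` does in the TRANSFORM(...) call quoted below.
The function names are made up for illustration and are not part of Hive
or of reduce.py. It follows the behaviour described above: read
"true"/"false" on the way in, write an empty string for FALSE on the way
out. What any other non-empty string converts to may depend on your Hive
version, so check that before relying on it.

```python
import sys


def parse_hive_bool(field):
    """Interpret a boolean column as it arrives on stdin ("true"/"false")."""
    return field.strip().lower() == "true"


def format_hive_bool(value):
    """Emit "true" for TRUE and an empty string for FALSE (assumed convention)."""
    return "true" if value else ""


def rewrite_bool_column(line, index):
    """Parse the boolean at `index` and write it back in the output convention."""
    # Strip only the newline so that empty trailing fields are preserved.
    fields = line.rstrip("\n").split("\t")
    fields[index] = format_hive_bool(parse_hive_bool(fields[index]))
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Index 3 is an assumption taken from the column order in the quoted query.
    for line in stdin:
        stdout.write(rewrite_bool_column(line, 3) + "\n")
```

To use it as a transform script you would call main() at the bottom of the
file. Keeping the parsing and formatting in two small helpers makes it easy
to log the parsed value and see exactly what the script received.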

On Tue, Oct 12, 2010 at 12:17 PM, Luke Crouch <lcro...@geek.net> wrote:

> I'm trying to pass a FALSE value thru a custom transform script to another
> table, like so:
>
>         FROM (
>             FROM downloads
>             SELECT project, file, os, FALSE as folder, country, dt
>             WHERE dt='2010-05-14'
>             DISTRIBUTE BY project
>             SORT BY project asc, file asc
>         ) b
>         INSERT OVERWRITE TABLE dl_day PARTITION (dt='2010-05-14', project)
>         SELECT TRANSFORM(file, os, country, folder, dt, project)
>             USING 'transformwrap reduce.py  --verbose'
>             AS (file, downloads, os, folder, country, project)
>
> > describe dl_day
> ['file', 'string', '']
> ['downloads', 'int', '']
> ['os', 'string', '']
> ['country', 'string', '']
> ['folder', 'boolean', '']
> ['dt', 'string', '']
> ['project', 'string', '']
>
> When I log the 'folder' value from inside reduce.py, it shows:
>
> 2010-10-12 15:32:10,914 - dstat - INFO - reduce to stdout, h[folder]:
>
> i.e., an empty string. But when the INSERT executes, it seems to treat the
> value as TRUE (or string 'true')?
>
> > select folder from dl_day
> ['true']
> ['true']
> ['true']
> ['true']
> ...
>
> How can I preserve the FALSE value thru the transform script?
>
> Thanks,
> -L
>



-- 
Dave Brondsema
Software Engineer
Geeknet

www.geek.net
