I'm fine with the variable placeholder not being removed in cases where the variable is not defined (until I change my mind). When I define var2 and var3, though, their placeholders aren't swapped for their values.
My reasoning for this was that I'm moving from one execution script that defines var1, to a newer script that defines var2 and var3. The goal for this was that I could use the same hive script with either execution script until we've fully migrated. I don't disagree with the placeholders remaining if they're not defined, I'm just unsure that having 1 variable reference that's not defined should prevent the others from working. On Mar 6, 2013, at 4:41 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote: > It was done like this in hive because that is what hadoops variable > substitution does, namely if it does not understand the variable it does not > replace it. > > On Wed, Mar 6, 2013 at 4:30 PM, Dean Wampler > <dean.wamp...@thinkbiganalytics.com> wrote: >> Even newer versions of Hive do this. Any reason you don't want to provide a >> definition for all of them? You could argue that an undefined variable is a >> bug and leaving the literal text in place makes it easier to notice. >> Although, Unix shells would insert an empty string, so never mind ;) >> >> On Wed, Mar 6, 2013 at 3:13 PM, Matt Tucker <matthewt...@gmail.com> wrote: >>> Using CDH3u3 (Hive 0.7.1), it appears that variable substitution becomes >>> disabled when I use a variable in a query that hasn't been defined. >>> >>> For instance, using the following script: >>> >>> set var2=2013-02-01; >>> set var3=2013-02-10; >>> >>> SELECT clndr_dt FROM calendar WHERE clndr_dt LIKE "${hiveconf:var1}" OR >>> (clndr_dt >= "${hiveconf:var2}" AND clndr_dt <= "${hiveconf:var3}") LIMIT 1; >>> >>> I would expect the results job configuration to list the hive.query.string >>> as: >>> >>> SELECT clndr_dt FROM calendar WHERE clndr_dt LIKE "${hiveconf:var1}" OR >>> (clndr_dt >= "2013-02-01" AND clndr_dt <= "2013-02-10") LIMIT 1; >>> >>> (or with ${hiveconf:var1} removed, leaving an empty string). >>> >>> Instead, it prints the query as-is (with the variable placeholders). If I >>> set var1=2012-01-01, it properly substitutes the variables, but it only >>> returns the '2012-01-01' record (not the 2013 records). >>> >>> SELECT clndr_dt FROM calendar WHERE clndr_dt LIKE "2012-01-01" OR ( >>> clndr_dt >= "2013-02-01" AND clndr_dt <= "2013-02-10" ) LIMIT 1 >>> >>> I was originally planning to use this for partition pruning, but it doesn't >>> appear to be the cause as the calendar table is not partitioned. >>> >>> Is there something that I've overlooked? >>> >>> Thanks! >> >> >> >> -- >> Dean Wampler, Ph.D. >> thinkbiganalytics.com >> +1-312-339-1330 >