If supporting an arbitrary where condition is too complex, selecting specific columns still seems simple enough and useful.

On Saturday, July 28, 2012, Namit Jain <nj...@fb.com> wrote:
> Currently, hive does not launch map-reduce jobs for the following queries:
>
> select * from <T> where <condition on partition columns> (limit <n>)?
>
> This behavior is not configurable, and cannot be altered.
>
> HIVE-2925 wants to extend this behavior. The goal is not to spawn map-reduce jobs for the following queries:
>
> Select <expr> from <T> where <any condition> (limit <n>)?
>
> It is currently controlled by one parameter: hive.aggressive.fetch.task.conversion, based on which it is decided, whether to spawn
> map-reduce jobs or not for the queries of the above type. Note that this can be beneficial for certain types of queries, since it is
> avoiding the expensive step of spawning map-reduce. However, it can be pretty expensive for certain types of queries: selecting
> a very large number of rows, the query having a very selective filter (which is satisfied by a very small number of rows, and therefore involves
> scanning a very large table) etc. The user does not have any control on this. Note that it cannot be done by hooks, since the pre-semantic
> hooks do not have enough information: type of the query, inputs etc. and it is too late to do anything in the post-semantic hook (the
> query plan has already been altered).
>
> I would like to propose the following configuration parameters to control this behavior.
>
> hive.fetch.task.conversion: true, false, auto
>
> If the value is true, then all queries with only selects and filters will be converted
> If the value is false, then no query will be converted
> If the value is auto (which should be the default behavior), there should be additional parameters to control the semantics.
>
> hive.fetch.task.auto.limit.threshold ---> integer value X1
> hive.fetch.task.auto.inputsize.threshold ---> integer value X2
>
> If either the query has a limit lower than X1, or the input size is smaller than X2, the queries containing only filters and selects will be converted to not use
> map-reduce jobs.
>
> Comments…
>
> -namit
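
To make sure I am reading the proposed semantics correctly, here is a rough sketch in Python of how the three values of hive.fetch.task.conversion and the two thresholds would combine. This is not Hive code: the SimpleQuery class, the should_convert function and the field names are made up for illustration, and treating the input size as bytes is my assumption, since the proposal only says "integer value".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimpleQuery:
    """Hypothetical summary of a query (not a real Hive class)."""
    only_selects_and_filters: bool  # True if the query has nothing but selects and filters
    limit: Optional[int]            # None when the query has no LIMIT clause
    input_size: int                 # total input size; bytes is an assumption


def should_convert(query: SimpleQuery, mode: str,
                   limit_threshold: int, inputsize_threshold: int) -> bool:
    """Return True if the query would skip map-reduce under the proposal.

    mode                -> proposed hive.fetch.task.conversion (true/false/auto)
    limit_threshold     -> proposed hive.fetch.task.auto.limit.threshold (X1)
    inputsize_threshold -> proposed hive.fetch.task.auto.inputsize.threshold (X2)
    """
    # Only queries made of selects and filters are candidates at all.
    if not query.only_selects_and_filters:
        return False
    if mode == "true":
        return True
    if mode == "false":
        return False
    if mode == "auto":
        # Convert if EITHER the limit is below X1 OR the input is smaller than X2.
        small_limit = query.limit is not None and query.limit < limit_threshold
        small_input = query.input_size < inputsize_threshold
        return small_limit or small_input
    raise ValueError(f"unknown mode: {mode!r}")
```

With X1 = 100 and X2 = 1000000, a select with "limit 10" over a huge table would be converted in auto mode, while the same select without a limit would still go through map-reduce.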