I was thinking of a timeout for fetching, 2000 ms for example. How about that?

On Sunday, July 29, 2012, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> If the where condition is too complex, selecting specific columns seems
> simple enough and useful.
>
> On Saturday, July 28, 2012, Namit Jain <nj...@fb.com> wrote:
>> Currently, hive does not launch map-reduce jobs for the following queries:
>>
>> select * from <T> where <condition on partition columns> (limit <n>)?
>>
>> This behavior is not configurable, and cannot be altered.
>>
>> HIVE-2925 wants to extend this behavior. The goal is not to spawn
>> map-reduce jobs for the following queries:
>>
>> Select <expr> from <T> where <any condition> (limit <n>)?
>>
>> It is currently controlled by one parameter:
>> hive.aggressive.fetch.task.conversion, based on which it is decided
>> whether or not to spawn map-reduce jobs for queries of the above type.
>> Note that this can be beneficial for certain types of queries, since it
>> avoids the expensive step of spawning map-reduce. However, it can be
>> pretty expensive for certain types of queries: selecting a very large
>> number of rows, the query having a very selective filter (which is
>> satisfied by a very small number of rows, and therefore involves
>> scanning a very large table) etc. The user does not have any control
>> over this. Note that it cannot be done by hooks, since the pre-semantic
>> hooks do not have enough information (type of the query, inputs etc.),
>> and it is too late to do anything in the post-semantic hook (the query
>> plan has already been altered).
>>
>> I would like to propose the following configuration parameters to
>> control this behavior.
>>
>> hive.fetch.task.conversion: true, false, auto
>>
>> If the value is true, then all queries with only selects and filters
>> will be converted.
>> If the value is false, then no query will be converted.
>> If the value is auto (which should be the default behavior), there
>> should be additional parameters to control the semantics.
>>
>> hive.fetch.task.auto.limit.threshold ---> integer value X1
>> hive.fetch.task.auto.inputsize.threshold ---> integer value X2
>>
>> If either the query has a limit lower than X1, or the input size is
>> smaller than X2, the queries containing only filters and selects will
>> be converted to not use map-reduce jobs.
>>
>> Comments…
>>
>> -namit
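
For reference, the proposed true/false/auto semantics could be sketched roughly as below. This is only an illustration of the rule as described in the proposal, not Hive code: the function name and argument names are hypothetical, the input size is assumed to be in bytes (the proposal does not state a unit), and the query is assumed to already contain only selects and filters.

```python
from typing import Optional


def should_convert_to_fetch(mode: str,
                            query_limit: Optional[int],
                            input_size: int,
                            limit_threshold: int,
                            inputsize_threshold: int) -> bool:
    """Decide whether a select/filter-only query skips map-reduce.

    mode                 -- value of hive.fetch.task.conversion
    query_limit          -- the query's LIMIT, or None if it has none
    input_size           -- estimated input size (bytes assumed)
    limit_threshold      -- X1, hive.fetch.task.auto.limit.threshold
    inputsize_threshold  -- X2, hive.fetch.task.auto.inputsize.threshold
    """
    if mode == "true":
        # Always convert queries with only selects and filters.
        return True
    if mode == "false":
        # Never convert.
        return False
    if mode == "auto":
        # Convert if the limit is below X1 OR the input is smaller than X2.
        small_limit = query_limit is not None and query_limit < limit_threshold
        small_input = input_size < inputsize_threshold
        return small_limit or small_input
    raise ValueError(f"unknown hive.fetch.task.conversion value: {mode!r}")
```

With X1 = 1000 and X2 = 1 GB, for example, a `limit 10` query over a 50 GB table would be converted in auto mode, while an unlimited query over the same table would still launch a map-reduce job.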