I have created simple table as follow *CREATE TABLE events_details( id int, event_id int, user_id BIGINT, event_date string, intval_1 int , intval_2 int, intval_3 int, intval_4 int, intval_5 int, intval_6 int, intval_7 int, intval_8 int, intval_9 int, intval_10 int, intval_11 int, intval_12 int, intval_13 int, intval_14 int, intval_15 int, intval_16 int, intval_17 int, intval_18 int, intval_19 int, intval_20 int, intval_21 int, intval_22 int, intval_23 int, intval_24 int, intval_25 int, intval_26 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;* * * and populate it using sqoop. Then i create index on it which is as follow
* create INDEX idx_event_date ON TABLE events_details (event_date) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD; * * *and *ALTER INDEX idx_event_date ON events_details REBUILD; * after index created successfully i executed following command according to a document *INSERT OVERWRITE DIRECTORY "/tmp/idx_event_date" SELECT `_bucketname` , `_offsets` FROM nydus.nydus__events_details_idx_event_date__ where to_date(event_date) >='2012-06-24' AND to_date(event_date) <= '2012-06-29' ; * * *when this executes completely, i executes following commands *SET hive.index.compact.file=/tmp/idx_event_date; SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat; * now i run same old query ****SELECT COUNT(DISTINCT ed.user_id) as Value,min(to_date(event_date)) as event_date FROM nydus.events_details ed where intval_14 = 3 AND event_id = 1 AND ( to_date(event_date) >='2012-06-24' AND to_date(event_date) <= '2012-06-29' ) GROUP BY weekofyear(event_date) **** but its execution time have not optimized (same 770 sec as before). What am i doing wrong? Please help me out -- *Muhammad Hamza Asad*