[ https://issues.apache.org/jira/browse/HIVE-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-24911: -------------------------------- Description: While investigating HIVE-24870, we found that during a long incremental replication, an SDS.CD_ID can improve the performance. It was tested by postgres like below: {code} CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID"); EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE; DROP INDEX IF EXISTS "SDS_N50"; EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE; {code} Further results can be found in: [^command-output.txt] After some investigation, I found that this index is also part of the schemas for a very long time: Oracle: HIVE-2928 Mysql: HIVE-2246 ...except Postgres. was: While investigating HIVE-24870, we found that during a long incremental replication, an SDS.CD_ID can improve the performance. It was tested by postgres like below: {code} CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID"); EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE; DROP INDEX IF EXISTS "SDS_N50"; EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE; {code} Further results can be found in: [^command-output.txt] > Metastore: Create index on SDS.CD_ID for Postgres > ------------------------------------------------- > > Key: HIVE-24911 > URL: https://issues.apache.org/jira/browse/HIVE-24911 > Project: Hive > Issue Type: Improvement > Reporter: László Bodor > Assignee: László Bodor > Priority: Major > Attachments: command-output.txt > > > While investigating HIVE-24870, we found that during a long incremental > replication, an SDS.CD_ID can improve the performance. > It was tested by postgres like below: > {code} > CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID"); > EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where > "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE; > DROP INDEX IF EXISTS "SDS_N50"; > EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where > "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE; > {code} > Further results can be found in: [^command-output.txt] > After some investigation, I found that this index is also part of the schemas > for a very long time: > Oracle: HIVE-2928 > Mysql: HIVE-2246 > ...except Postgres. -- This message was sent by Atlassian Jira (v8.3.4#803005)