[jira] [Updated] (HIVE-24911) Metastore: Create index on SDS.CD_ID for Postgres

Jira Fri, 19 Mar 2021 07:05:07 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


László Bodor updated HIVE-24911:
--------------------------------
    Description: 
While investigating HIVE-24870, we found that during a long incremental 
replication, an SDS.CD_ID can improve the performance.
It was tested by postgres like below:
{code}
CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID");
EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where 
"CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
DROP INDEX IF EXISTS "SDS_N50";
EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where 
"CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
{code}

Further results can be found in:  [^command-output.txt] 

After some investigation, I found that this index is also part of the schemas 
for a very long time:
Oracle: HIVE-2928
Mysql: HIVE-2246

...except Postgres.


  was:
While investigating HIVE-24870, we found that during a long incremental 
replication, an SDS.CD_ID can improve the performance.
It was tested by postgres like below:
{code}
CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID");
EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where 
"CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
DROP INDEX IF EXISTS "SDS_N50";
EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where 
"CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
{code}

Further results can be found in:  [^command-output.txt] 



> Metastore: Create index on SDS.CD_ID for Postgres
> -------------------------------------------------
>
>                 Key: HIVE-24911
>                 URL: https://issues.apache.org/jira/browse/HIVE-24911
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: command-output.txt
>
>
> While investigating HIVE-24870, we found that during a long incremental 
> replication, an SDS.CD_ID can improve the performance.
> It was tested by postgres like below:
> {code}
> CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID");
> EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where 
> "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
> DROP INDEX IF EXISTS "SDS_N50";
> EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where 
> "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
> {code}
> Further results can be found in:  [^command-output.txt] 
> After some investigation, I found that this index is also part of the schemas 
> for a very long time:
> Oracle: HIVE-2928
> Mysql: HIVE-2246
> ...except Postgres.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-24911) Metastore: Create index on SDS.CD_ID for Postgres

Reply via email to