[
https://issues.apache.org/jira/browse/HIVE-28972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raghav Aggarwal updated HIVE-28972:
-----------------------------------
Description:
ENV: Hive master branch (26e6880c9053717c17e5e6416451750e832d46c9) + JAVA 8 +
Datanucleus 5.x, HMS mem: 8GB
Description: I have a table with 800 columns and 5000 partitions. I ran
{code:sql}
alter table test_tbl add columns (col801 string) cascade; {code}
Without HIVE-28909 it took 117.42 sec, but with HIVE-28909 it takes
*986.065 sec*.
*Steps to reproduce:*
# Create partitioned table with 800 columns (attaching the create table sql)
# Create 5000 or so empty partitions in HDFS (I used
[https://github.com/Aggarwal-Raghav/Concurrent-Partition-Gen] )
# Run msck repair table test_tbl to load the partitions into HMS
# alter table test_tbl add columns (col801 string) cascade;
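The SQL side of the steps above can be sketched with a small generator script. This is only an illustration: the column names col1..col800 are assumed from the col801 used in the alter statement, the partition column name part_col is hypothetical, and the attached create_tbl.sql remains the authoritative DDL. Creating the empty partition directories in HDFS (step 2) happens outside this script, between the create and the msck statements.

```python
from typing import List


def build_create_table(table: str, n_cols: int, part_col: str = "part_col") -> str:
    """Return a CREATE TABLE statement with n_cols string columns.

    Column names (col1..colN) and the partition column name are
    assumptions for illustration, not taken from create_tbl.sql.
    """
    cols = ",\n  ".join(f"col{i} string" for i in range(1, n_cols + 1))
    return (
        f"CREATE TABLE {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({part_col} string);"
    )


def build_repro_script(table: str = "test_tbl", n_cols: int = 800) -> List[str]:
    """Return the HiveQL statements for steps 1, 3 and 4 of the reproduction."""
    return [
        # Step 1: wide partitioned table.
        build_create_table(table, n_cols),
        # Step 2 (creating the partition directories in HDFS) is done separately.
        # Step 3: register the partitions in HMS.
        f"MSCK REPAIR TABLE {table};",
        # Step 4: the statement whose runtime regressed.
        f"ALTER TABLE {table} ADD COLUMNS (col{n_cols + 1} string) CASCADE;",
    ]


if __name__ == "__main__":
    print("\n\n".join(build_repro_script()))
```

The printed statements can be saved to a file and run through beeline once the partition directories exist.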
Attaching all the captured info, i.e. HMS logs, HMS2, and beeline screenshots with
and without HIVE-28909.
The following log message appears *4006601 times*:
{code:java}
2025-05-27T23:08:02,690 INFO [Metastore-Handler-Pool: Thread-64]
DataNucleus.Persistence: Object
"org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a
collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields"
yet element "org.apache.hadoop.hive.metastore.model.MColumn@43adb9db" doesnt
have the owner set. Managing the relation and setting the owner.
2025-05-27T23:08:02,690 INFO [Metastore-Handler-Pool: Thread-64]
DataNucleus.Persistence: Object
"org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a
collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields"
yet element "org.apache.hadoop.hive.metastore.model.MColumn@29276cc5" doesnt
have the owner set. Managing the relation and setting the owner.
2025-05-27T23:08:02,690 INFO [Metastore-Handler-Pool: Thread-64]
DataNucleus.Persistence: Object
"org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a
collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields"
yet element "org.apache.hadoop.hive.metastore.model.MColumn@109992db" doesnt
have the owner set. Managing the relation and setting the owner.
2025-05-27T23:08:02,691 INFO [Metastore-Handler-Pool: Thread-64]
DataNucleus.Persistence: Object
"org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a
collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields"
yet element "org.apache.hadoop.hive.metastore.model.MColumn@69748f31" doesnt
have the owner set. Managing the relation and setting the owner.
2025-05-27T23:08:02,691 INFO [Metastore-Handler-Pool: Thread-64]
DataNucleus.Persistence: Object
"org.apache.hadoop.hive.metastore.model.MColumnDescriptor@68097e71" has a
collection "org.apache.hadoop.hive.metastore.model.MColumnDescriptor.fields"
yet element "org.apache.hadoop.hive.metastore.model.MColumn@6ae74875" doesnt
have the owner set. Managing the relation and setting the owner. {code}
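A rough way to check the numbers above is sketched below. It counts the DataNucleus message in an HMS log and compares the count with columns x partitions. The idea that one message is logged per column per partition is an assumption made here, not something confirmed in this issue, and the log path passed on the command line is whatever file holds the HMS output.

```python
import sys
from typing import Iterable

# Tail of the DataNucleus message quoted above; it stays on one line
# even when the rest of the message is wrapped.
MARKER = "Managing the relation and setting the owner"


def count_owner_messages(lines: Iterable[str]) -> int:
    """Count log lines that contain the 'setting the owner' message."""
    return sum(1 for line in lines if MARKER in line)


def expected_element_updates(columns: int, partitions: int) -> int:
    """Estimate the message count, assuming one message per column per partition."""
    return columns * partitions


def slowdown(before_sec: float, after_sec: float) -> float:
    """Return how many times slower the second run is."""
    return after_sec / before_sec


if __name__ == "__main__":
    # 801 columns after the alter, 5000 partitions (figures from this issue).
    print("estimated messages:", expected_element_updates(801, 5000))
    print("slowdown factor:", round(slowdown(117.42, 986.065), 1))
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as log_file:
            print("messages in log:", count_owner_messages(log_file))
```

Under that assumption the estimate is 801 x 5000 = 4005000, close to the 4006601 occurrences reported, and the runtime ratio 986.065 / 117.42 is roughly 8.4x.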
> HMS performance degradation post HIVE-28909 for alter query
> -----------------------------------------------------------
>
> Key: HIVE-28972
> URL: https://issues.apache.org/jira/browse/HIVE-28972
> Project: Hive
> Issue Type: Bug
> Reporter: Raghav Aggarwal
> Assignee: Raghav Aggarwal
> Priority: Major
> Attachments: create_tbl.sql, hivemetastore_without_HIVE-28909.log
>