[ https://issues.apache.org/jira/browse/HIVE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karthik Manamcheri updated HIVE-21028:
--------------------------------------
    Summary: get_table_meta should use a fetch plan to avoid race conditions ending up in JDOObjectNotFoundException  (was: get_table_meta should use a fetch plan)

> get_table_meta should use a fetch plan to avoid race conditions ending up in JDOObjectNotFoundException
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21028
>                 URL: https://issues.apache.org/jira/browse/HIVE-21028
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karthik Manamcheri
>            Assignee: Karthik Manamcheri
>            Priority: Major
>
> The {{getTableMeta}} call retrieves the tables, then loops over them and, for each table, retrieves the database object to get the containing database name. DataNucleus retrieves related objects lazily, so the first call, which fetches all the tables, does not retrieve the database objects.
> When this query is executed
> {code}
> query = pm.newQuery(MTable.class, filterBuilder.toString());
> {code}
> it loads all the tables, and when you do
> {code}
> table.getDatabase().getName()
> {code}
> it then goes and retrieves the database object.
> *However*, another thread could have deleted the database in the meantime. If this happens, we end up with exceptions such as
> {code}
> 2018-12-04 22:25:06,525 INFO DataNucleus.Datastore.Retrieve: [pool-7-thread-191]: Object with id "6930391[OID]org.apache.hadoop.hive.metastore.model.MTable" not found !
> 2018-12-04 22:25:06,527 WARN DataNucleus.Persistence: [pool-7-thread-191]: Exception thrown by StateManager.isLoaded
> No such database row
> org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row
> {code}
> We see this happen especially with calls which retrieve all the tables in all the databases (basically a call to get_table_meta with dbNames="\*" and tableNames="\*").
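
The sequence described above (load the tables, have the database dropped by someone else, then resolve the database lazily) can be sketched in plain Java. This is only an illustration under stated assumptions: `LazyLoadRaceSketch`, `LazyTable`, and the in-memory `databases` map are hypothetical stand-ins, not DataNucleus or Hive metastore classes, and the concurrent delete is simulated sequentially so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical in-memory stand-in for the backing database; not DataNucleus code.
class LazyLoadRaceSketch {
    // database id -> database name
    static final Map<Long, String> databases = new HashMap<>();

    // A table object that holds only the id of its database.
    // The database name is looked up on demand, mimicking lazy retrieval.
    static final class LazyTable {
        final String name;
        final long dbId;

        LazyTable(String name, long dbId) {
            this.name = name;
            this.dbId = dbId;
        }

        String getDatabaseName() {
            String db = databases.get(dbId);
            if (db == null) {
                throw new NoSuchElementException("No such database row: " + dbId);
            }
            return db;
        }
    }

    // Returns true if the lazy lookup fails after a simulated concurrent delete.
    static boolean lazyLookupFailsAfterDelete() {
        databases.clear();
        databases.put(1L, "sales");

        // Step 1: the query loads the tables; the database is not read yet.
        List<LazyTable> tables = new ArrayList<>();
        tables.add(new LazyTable("orders", 1L));

        // Step 2: another thread drops the database (simulated inline).
        databases.remove(1L);

        // Step 3: the loop resolves the database lazily and hits a missing row.
        try {
            tables.get(0).getDatabaseName();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("lazy lookup failed: " + lazyLookupFailsAfterDelete());
    }
}
```

The failure comes from the gap between step 1 and step 3: anything that changes the database row in that window is visible to the later lookup.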
> To avoid this, we can define a custom fetch group and activate it only for the get_table_meta query. With this group active, the fetch plan fetches the database object along with the MTable object.
> We would first define the fetch group on the pmf
> {code}
> pmf.getFetchGroup(MTable.class, "mtable_db_fetch_group").addMember("database");
> {code}
> Then we add it to the fetch plan just before running the query
> {code}
> pm.getFetchPlan().addGroup("mtable_db_fetch_group");
> query = pm.newQuery(MTable.class, filterBuilder.toString());
> Collection<MTable> tables = (Collection<MTable>) query.executeWithArray(...);
> ...
> {code}
> Before the API call ends, we can remove the group from the fetch plan by
> {code}
> pm.getFetchPlan().removeGroup("mtable_db_fetch_group");
> {code}
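
The add-group, query, remove-group sequence proposed above can be sketched as a small self-contained model. Everything except the group name `mtable_db_fetch_group` is an assumption made for illustration: `FetchGroupSketch`, `TableMeta`, `queryTables`, and the `activeFetchGroups` set are hypothetical and only imitate the behaviour of a fetch plan; they are not JDO or DataNucleus APIs. The `try`/`finally` is one possible way to make sure the group is removed even if the query throws.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of a fetch plan; none of these types are JDO classes.
class FetchGroupSketch {
    static final String DB_GROUP = "mtable_db_fetch_group";

    static final Map<Long, String> databases = new HashMap<>();   // db id -> db name
    static final Map<String, Long> tableToDb = new HashMap<>();   // table name -> db id
    static final Set<String> activeFetchGroups = new HashSet<>(); // stands in for the fetch plan

    static final class TableMeta {
        final String tableName;
        final String dbName; // null when the database was not fetched with the table

        TableMeta(String tableName, String dbName) {
            this.tableName = tableName;
            this.dbName = dbName;
        }
    }

    // Loads the tables; copies the database name at query time only if the group is active.
    static List<TableMeta> queryTables() {
        boolean eager = activeFetchGroups.contains(DB_GROUP);
        List<TableMeta> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : tableToDb.entrySet()) {
            String dbName = eager ? databases.get(e.getValue()) : null;
            result.add(new TableMeta(e.getKey(), dbName));
        }
        return result;
    }

    // Activates the group for this call only and always deactivates it afterwards.
    static List<TableMeta> getTableMeta() {
        activeFetchGroups.add(DB_GROUP);
        try {
            return queryTables();
        } finally {
            activeFetchGroups.remove(DB_GROUP);
        }
    }

    // Runs the scenario: query with the group active, then drop the database.
    static String demo() {
        databases.clear();
        tableToDb.clear();
        activeFetchGroups.clear();
        databases.put(1L, "sales");
        tableToDb.put("orders", 1L);

        List<TableMeta> metas = getTableMeta();
        databases.remove(1L); // simulated drop after the query has returned

        TableMeta m = metas.get(0);
        return m.dbName + "." + m.tableName;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the database name is captured together with the table, so a later drop no longer affects the loop over the results. Whether the real fetch group closes the window completely depends on how DataNucleus issues the underlying query, which this sketch does not model.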