[ https://issues.apache.org/jira/browse/HIVE-23363?focusedWorklogId=453611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-453611 ]
ASF GitHub Bot logged work on HIVE-23363:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jul/20 19:11
            Start Date: 01/Jul/20 19:11
    Worklog Time Spent: 10m
      Work Description: belugabehr edited a comment on pull request #1118:
URL: https://github.com/apache/hive/pull/1118#issuecomment-652546609

@ashutoshc Let me see if I can address all of your questions with some background and context. It took me a long time to get these changes to pass the unit tests.

So, these mappings, in some respects, don't really matter. When HMS is started, users run the `schema-tool` to create the HMS schema for real. Some of these mappings in the `jdo` file (like indexes) are only applied when unit testing, because the unit tests build the schema via DN and `datanucleus.schema.autoCreateAll`.

For unit testing, the database backend is Apache Derby. I changed the name of the index to match the Derby schema more closely. While debugging these various errors, I was very confused at first by the complaints about "COLUMNS_PK".

https://github.com/apache/hive/blob/4942a7c0b4be3a5b0c889a89b903e9a70c57d494/standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql#L364

With that said, when I upgraded to DN 5.x, the unit tests would not pass. I narrowed the issue down to this one table definition. I tried several iterations, and this is the one that worked. I derived this solution by closely examining the docs on this topic, which have an example that very closely aligns with this use case:

http://www.datanucleus.org/products/accessplatform/jdo/mapping.html#embedded_collection

Looking at the existing JDO definition, it is a bit of a wonder how this ever worked.

```
<primary-key name="COLUMNS_PK">
  <column name="COLUMN_NAME"/>
</primary-key>
```

This is not correct. It should be a compound primary key of CD_ID *and* COLUMN_NAME. This is enforced by `SQL110922153006740` in the full schema.

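To illustrate why a key on COLUMN_NAME alone is too narrow, here is a minimal sketch. It rests on several assumptions: it uses Python's built-in `sqlite3` as a stand-in for Derby, the table is a heavily simplified version of COLUMNS_V2, and the helper `try_insert` is hypothetical. It only shows the uniqueness behavior of the two key shapes and does not reproduce Derby's rule about nullable primary key columns.

```python
import sqlite3


def try_insert(pk_columns):
    """Create a simplified COLUMNS_V2 table with the given primary key and
    insert a column named "id" for two different column descriptors."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE COLUMNS_V2 ("
        " CD_ID INTEGER NOT NULL,"
        " COLUMN_NAME VARCHAR(767) NOT NULL,"
        " TYPE_NAME TEXT,"
        f" PRIMARY KEY ({', '.join(pk_columns)}))"
    )
    try:
        conn.executemany(
            "INSERT INTO COLUMNS_V2 (CD_ID, COLUMN_NAME, TYPE_NAME)"
            " VALUES (?, ?, ?)",
            [(1, "id", "int"), (2, "id", "bigint")],
        )
        return "ok"
    except sqlite3.IntegrityError:
        # The second row collides with the first on the primary key.
        return "duplicate key"
    finally:
        conn.close()


# Key on COLUMN_NAME only: two tables cannot both have a column named "id".
single_key_result = try_insert(["COLUMN_NAME"])
# Compound key on (CD_ID, COLUMN_NAME): the same name is fine per descriptor.
compound_key_result = try_insert(["CD_ID", "COLUMN_NAME"])
print(single_key_result, compound_key_result)
```

With the single-column key, the second descriptor's `id` column is rejected; with the compound key, both rows are accepted.
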
As things currently stand, the `jdo` file does not declare COLUMN_NAME as non-null. This caused an error with Derby, which does not allow creating a PRIMARY KEY on a column that could be null.

So, putting it all together, I came to the current solution.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 453611)
    Time Spent: 1.5h  (was: 1h 20m)

> Upgrade DataNucleus dependency to 5.2
> -------------------------------------
>
>                 Key: HIVE-23363
>                 URL: https://issues.apache.org/jira/browse/HIVE-23363
>             Project: Hive
>          Issue Type: Improvement
>   Affects Versions: 4.0.0
>            Reporter: Zoltan Chovan
>            Assignee: Zoltan Chovan
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: HIVE-23363.2.patch, HIVE-23363.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Upgrade DataNucleus from 4.2 to 5.2 because, according to its docs, 4.2 has been
> retired:
> [http://www.datanucleus.org/documentation/products.html]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)