[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769572 ]
ASF GitHub Bot logged work on HIVE-24484:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 12/May/22 12:07
Start Date: 12/May/22 12:07
Worklog Time Spent: 10m

Work Description: kgyrtkirk commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r871272145


########## common/pom.xml: ##########
@@ -195,6 +194,11 @@
       <artifactId>tez-api</artifactId>
       <version>${tez.version}</version>
     </dependency>
+    <dependency>
+      <groupId>org.fusesource.jansi</groupId>
+      <artifactId>jansi</artifactId>
+      <version>2.3.4</version>

Review Comment: move the version to the root pom


########## itests/pom.xml: ##########
@@ -352,6 +352,12 @@
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-yarn-client</artifactId>
       <version>${hadoop.version}</version>
+      <exclusions>
+        <exclusion>
+          <groupId>org.jline</groupId>
+          <artifactId>jline</artifactId>
+        </exclusion>

Review Comment: I'm not sure this fix will work; it may work for the tests, but you have only excluded the dependency here, and I don't think that will prevent it from appearing on the classpath at runtime... have you tested a dist build as well?


########## ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java: ##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, HiveInputFormat.HiveInputSpl
       JobConf jobConf, Reporter reporter) throws IOException {
     int headerCount = Utilities.getHeaderCount(tableDesc);
     int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-    RecordReader innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+    RecordReader innerReader = null;
+    try {
+      innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+    } catch (InterruptedIOException iioe) {
+      // If reading from the underlying record reader is interrupted, return a no-op record reader

Review Comment: why not simply propagate the `Exception`? This will hide away the exception.
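
A minimal sketch of the common/pom.xml suggestion above: declare the jansi version once in the root pom's dependencyManagement so modules can omit it. The `jansi.version` property name is an assumption for illustration, not something from the patch:

    <!-- root pom.xml: single source of truth for the jansi version -->
    <properties>
      <jansi.version>2.3.4</jansi.version>
    </properties>

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.fusesource.jansi</groupId>
          <artifactId>jansi</artifactId>
          <version>${jansi.version}</version>
        </dependency>
      </dependencies>
    </dependencyManagement>

    <!-- common/pom.xml: the <version> element can then be dropped -->
    <dependency>
      <groupId>org.fusesource.jansi</groupId>
      <artifactId>jansi</artifactId>
    </dependency>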
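
On the RecordReaderWrapper comment above, a rough sketch of the propagation alternative the reviewer suggests - restore the thread's interrupt flag and rethrow rather than swallowing the exception behind a no-op reader (an illustration, not the patch's code; since InterruptedIOException extends IOException, the rethrow compiles against the method's existing throws clause):

    RecordReader innerReader;
    try {
      innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
    } catch (InterruptedIOException iioe) {
      // Restore the interrupt status so callers can still observe it,
      // then let the exception surface instead of hiding the failure.
      Thread.currentThread().interrupt();
      throw iioe;
    }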
########## ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java: ##########
@@ -178,8 +178,7 @@ public void authorize(Database db, Privilege[] readRequiredPriv, Privilege[] wri

   private static boolean userHasProxyPrivilege(String user, Configuration conf) {
     try {
-      if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf,
-          HMSHandler.getIPAddress())) {
+      if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf, HMSHandler.getIPAddress())) {

Review Comment: I think max_linelength should be <= 100; are you using the `dev-support/eclipse-styles.xml`?


########## streaming/src/test/org/apache/hive/streaming/TestStreaming.java: ##########
@@ -1317,6 +1318,11 @@ public void testTransactionBatchEmptyCommit() throws Exception {
     connection.close();
   }

+  /**
+   * Starting with HDFS 3.3.1, the underlying system NOW SUPPORTS hflush so this
+   * test fails.

Review Comment: ok; then I think this test could probably be converted into a test which checks that it works


########## hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java: ##########
@@ -315,18 +320,19 @@ public void testOutputFormat() throws Throwable {

     // Check permisssion on partition dirs and files created
     for (int i = 0; i < tableNames.length; i++) {
-      Path partitionFile = new Path(warehousedir + "/" + tableNames[i]
-          + "/ds=1/cluster=ag/part-m-00000");
-      FileSystem fs = partitionFile.getFileSystem(mrConf);
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-          fs.getFileStatus(partitionFile).getPermission(),
-          new FsPermission(tablePerms[i]));
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-          fs.getFileStatus(partitionFile.getParent()).getPermission(),
-          new FsPermission(tablePerms[i]));
-      Assert.assertEquals("File permissions of table " + tableNames[i] + " is not correct",
-          fs.getFileStatus(partitionFile.getParent().getParent()).getPermission(),
-          new FsPermission(tablePerms[i]));
+      final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] + "/ds=1/cluster=ag/part-m-00000");
+      final Path grandParentOfPartitionFile = partitionFile.getParent();

Review Comment: I would expect `grandParent` to be the parent-of-parent; I think this change could be reverted - it was more readable earlier. The last assert now checks the `parent` dir and not `parent.parent`, and the second assert was also clobbered....
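
For the TestHCatMultiOutputFormat comment directly above, a sketch of keeping all three original assertions while still tightening the code - names here are illustrative:

    final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] + "/ds=1/cluster=ag/part-m-00000");
    final Path parentDir = partitionFile.getParent();        // .../ds=1/cluster=ag
    final Path grandParentDir = parentDir.getParent();       // .../ds=1
    final FileSystem fs = partitionFile.getFileSystem(mrConf);
    final FsPermission expected = new FsPermission(tablePerms[i]);
    final String msg = "File permissions of table " + tableNames[i] + " is not correct";
    // Preserves the original coverage: file, parent dir, and grandparent dir
    Assert.assertEquals(msg, fs.getFileStatus(partitionFile).getPermission(), expected);
    Assert.assertEquals(msg, fs.getFileStatus(parentDir).getPermission(), expected);
    Assert.assertEquals(msg, fs.getFileStatus(grandParentDir).getPermission(), expected);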
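
And for the TestStreaming comment above (converting the now-failing test into one that verifies hflush works on HDFS 3.3.1+), a rough sketch, assuming the test class has a MiniDFSCluster-backed FileSystem handle named `fs` and the usual JUnit imports:

    @Test
    public void testHflushIsSupported() throws Exception {
      Path probe = new Path("/tmp/hflush-probe");
      try (FSDataOutputStream out = fs.create(probe)) {
        out.writeBytes("row1\n");
        out.hflush(); // should no longer throw on HDFS 3.3.1+
        // hflush guarantees the bytes are visible to new readers even before close()
        try (FSDataInputStream in = fs.open(probe)) {
          byte[] buf = new byte[5];
          in.readFully(0, buf);
          Assert.assertEquals("row1\n", new String(buf, StandardCharsets.UTF_8));
        }
      }
    }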
########## itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java: ##########
@@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
       put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
     }}, "test_key123");

-    List<String> dumpWithClause = Arrays.asList(
-        "'hive.repl.add.raw.reserved.namespace'='true'",
-        "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
-            + replica.externalTableWarehouseRoot + "'",
-        "'distcp.options.skipcrccheck'=''",
-        "'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname + "'='false'",
-        "'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
-            + UserGroupInformation.getCurrentUser().getUserName() +"'");
-    WarehouseInstance.Tuple tuple =
-        primary.run("use " + primaryDbName)
-            .run("create table encrypted_table (id int, value string)")
-            .run("insert into table encrypted_table values (1,'value1')")
-            .run("insert into table encrypted_table values (2,'value2')")
-            .dump(primaryDbName, dumpWithClause);
-
-    replica
-        .run("repl load " + primaryDbName + " into " + replicatedDbName
-            + " with('hive.repl.add.raw.reserved.namespace'='true', "
-            + "'hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-            + "'hive.exec.copyfile.maxsize'='0', 'distcp.options.skipcrccheck'='')")
-        .run("use " + replicatedDbName)
-        .run("repl status " + replicatedDbName)
-        .verifyResult(tuple.lastReplicationId);
-
-    try {
-      replica
-          .run("select value from encrypted_table")
-          .verifyResults(new String[] { "value1", "value2" });
-      Assert.fail("Src EZKey shouldn't be present on target");
-    } catch (IOException e) {
-      Assert.assertTrue(e.getCause().getMessage().contains("KeyVersion name 'test_key@0' does not exist"));
-    }
-    //read should pass without raw-byte distcp
-    dumpWithClause = Arrays.asList( "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='" +
+    List<String> dumpWithClause =
+        Arrays.asList( "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='" +
         replica.externalTableWarehouseRoot + "'");
-    tuple = primary.run("use " + primaryDbName)
+    WarehouseInstance.Tuple tuple =
+        primary.run("use " + primaryDbName)
         .run("create external table encrypted_table2 (id int, value string)")
         .run("insert into table encrypted_table2 values (1,'value1')")
         .run("insert into table encrypted_table2 values (2,'value2')")
         .dump(primaryDbName, dumpWithClause);
     replica
-        .run("repl load " + primaryDbName + " into " + replicatedDbName
-            + " with('hive.repl.replica.external.table.base.dir'='" + replica.externalTableWarehouseRoot + "', "
-            + "'hive.exec.copyfile.maxsize'='0', 'distcp.options.skipcrccheck'='')")
-        .run("use " + replicatedDbName)
-        .run("repl status " + replicatedDbName)
-        .verifyResult(tuple.lastReplicationId)

Review Comment: wasn't this the expected behaviour?


########## storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java: ##########
@@ -18,10 +18,10 @@
 package org.apache.hadoop.hive.common;

-import org.apache.commons.lang.StringUtils;
+import org.apache.commons.lang3.StringUtils;

Review Comment: these lang/lang3 changes seem unrelated to me; I think they could be done in a separate jira to reduce the amount of work. If you are moving away from the usage of `org.apache.commons.lang`, could you please also ban it in the root pom.xml?


########## standalone-metastore/pom.xml: ##########
@@ -227,6 +227,10 @@
       <artifactId>hadoop-mapreduce-client-core</artifactId>
       <version>${hadoop.version}</version>
       <exclusions>
+        <exclusion>
+          <groupId>org.jline</groupId>
+          <artifactId>jline</artifactId>
+        </exclusion>

Review Comment: this jline dep just creeps in from multiple hadoop artifacts; the best option would be to upgrade jline rather than risk our chances with exclusions


########## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java: ##########
@@ -9361,7 +9362,8 @@ public NotificationEventsCountResponse get_notification_events_count(Notificatio
   private void authorizeProxyPrivilege() throws TException {
     // Skip the auth in embedded mode or if the auth is disabled
     if (!HiveMetaStore.isMetaStoreRemote() ||
-        !MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH)) {
+        !MetastoreConf.getBoolVar(conf, ConfVars.EVENT_DB_NOTIFICATION_API_AUTH) || conf.getBoolean(HIVE_IN_TEST.getVarname(),
+        false)) {

Review Comment: you are turning a feature off based on this `HIVE_IN_TEST` boolean, which means the feature will not be exercised during regular Hive tests; please find another way. And since it seems this is being turned off in multiple places - can you cover it with a test?


########## itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java: ##########
@@ -123,57 +122,24 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
       put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
     }}, "test_key123");

-    List<String> dumpWithClause = Arrays.asList(

Review Comment: seems like a testcase was removed; I wonder if it is not supported anymore? ...and why are we removing this case in the scope of a hadoop upgrade?
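
On the ValidReadTxnList comment above, banning the old commons-lang 2.x coordinates would typically go through the maven-enforcer-plugin in the root pom. A sketch - the execution id is illustrative, and this assumes there is no existing bannedDependencies rule to extend:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-enforcer-plugin</artifactId>
      <executions>
        <execution>
          <id>ban-commons-lang2</id>
          <goals>
            <goal>enforce</goal>
          </goals>
          <configuration>
            <rules>
              <bannedDependencies>
                <excludes>
                  <!-- commons-lang 2.x; use org.apache.commons:commons-lang3 instead -->
                  <exclude>commons-lang:commons-lang</exclude>
                </excludes>
              </bannedDependencies>
            </rules>
            <fail>true</fail>
          </configuration>
        </execution>
      </executions>
    </plugin>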
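
And on the two jline comments (itests/pom.xml and standalone-metastore/pom.xml), the upgrade route the reviewer prefers would pin a single jline version in the root dependencyManagement instead of scattering per-module exclusions. The `jline.version` property is a placeholder, not a tested value:

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.jline</groupId>
          <artifactId>jline</artifactId>
          <!-- one version chosen to be compatible with Hadoop 3.3.1 -->
          <version>${jline.version}</version>
        </dependency>
      </dependencies>
    </dependencyManagement>

Whether the managed version actually wins everywhere, including the dist build the reviewer asks about, can be checked with `mvn dependency:tree -Dincludes=org.jline`.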
########## ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java: ##########
@@ -18,20 +18,8 @@
 package org.apache.hadoop.hive.ql.exec;

-import java.io.FileNotFoundException;

Review Comment: the import order in your IDE differs from the existing code; can you configure it not to reorder the imports in every file? I wonder if we have an agreement on what order we want to use....

Issue Time Tracking
-------------------

Worklog Id: (was: 769572)
Time Spent: 8h 43m (was: 8.55h)

> Upgrade Hadoop to 3.3.1
> -----------------------
>
>                 Key: HIVE-24484
>                 URL: https://issues.apache.org/jira/browse/HIVE-24484
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 8h 43m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)