codope commented on a change in pull request #5004:
URL: https://github.com/apache/hudi/pull/5004#discussion_r828887365



##########
File path: hudi-client/hudi-spark-client/pom.xml
##########
@@ -110,6 +110,12 @@
         </exclusion>
       </exclusions>
     </dependency>
+    <dependency>
+      <groupId>org.apache.zookeeper</groupId>

Review comment:
       Why is this needed?

##########
File path: hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java
##########
@@ -80,45 +85,49 @@
  public HoodieHFileReader(Configuration configuration, Path path, CacheConfig cacheConfig) throws IOException {
     this.conf = configuration;
     this.path = path;
-    this.reader = HFile.createReader(FSUtils.getFs(path.toString(), configuration), path, cacheConfig, conf);
+    this.reader = HFile.createReader(FSUtils.getFs(path.toString(), configuration), path, cacheConfig, true, conf);
   }
 
  public HoodieHFileReader(Configuration configuration, Path path, CacheConfig cacheConfig, FileSystem fs) throws IOException {
     this.conf = configuration;
     this.path = path;
     this.fsDataInputStream = fs.open(path);
-    this.reader = HFile.createReader(fs, path, cacheConfig, configuration);
+    this.reader = HFile.createReader(fs, path, cacheConfig, true, configuration);
   }
 
   public HoodieHFileReader(byte[] content) throws IOException {
     Configuration conf = new Configuration();
     Path path = new Path("hoodie");
    SeekableByteArrayInputStream bis = new SeekableByteArrayInputStream(content);
     FSDataInputStream fsdis = new FSDataInputStream(bis);
-    this.reader = HFile.createReader(FSUtils.getFs("hoodie", conf), path, new FSDataInputStreamWrapper(fsdis),
-        content.length, new CacheConfig(conf), conf);
+    FSDataInputStreamWrapper stream = new FSDataInputStreamWrapper(fsdis);
+    FileSystem fs = FSUtils.getFs("hoodie", conf);

Review comment:
       I understand that fs is needed to build the context. Perhaps we can provide an actual file or basePath instead of the dummy "hoodie" path.

##########
File path: hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java
##########
@@ -250,7 +259,7 @@ public BloomFilter readBloomFilter() {
    */
  public List<Pair<String, R>> readRecords(List<String> keys, Schema schema) throws IOException {
     this.schema = schema;
-    reader.loadFileInfo();
+    reader.getHFileInfo();

Review comment:
       I think it's just there to keep the behavior consistent. We can probably remove it. Also, this method does not exist in HBase 2.2.3.

##########
File path: hudi-common/src/test/java/org/apache/hudi/common/fs/inline/TestInLineFileSystem.java
##########
@@ -369,7 +369,8 @@ private Path getRandomInlinePath() {
  private void verifyFileStatus(FileStatus expected, Path inlinePath, long expectedLength, FileStatus actual) {
     assertEquals(inlinePath, actual.getPath());
     assertEquals(expectedLength, actual.getLen());
-    assertEquals(expected.getAccessTime(), actual.getAccessTime());
+    // removing the below assertion as it is flaky on rare occasions (the difference is single-digit ms)

Review comment:
       I think we can remove it, as asserting the access time does not serve any purpose for this test. All we care about is length and block size. Anyway, we have the modification time assertion at the end.
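
A self-contained sketch (not Hudi code) of why exact-millisecond timestamp equality is flaky: any work between capturing the expected and actual values can push the second read into a later millisecond, so only stable fields (length, block size) or a tolerance should be asserted.

```java
// Sketch: two reads of a millisecond clock separated by even trivial work
// can differ, so asserting exact equality on access times is racy.
public class TimestampFlakiness {
    public static void main(String[] args) throws InterruptedException {
        long expected = System.currentTimeMillis();
        Thread.sleep(5); // simulates work done between the two timestamp reads
        long actual = System.currentTimeMillis();
        // An exact-equality assertion here would fail intermittently:
        //   assert expected == actual;  // flaky by a few ms
        // A tolerance-based check (or dropping the assertion) is robust:
        if (Math.abs(actual - expected) > 1000) {
            throw new AssertionError("timestamps diverged unexpectedly");
        }
        System.out.println("tolerance check passed");
    }
}
```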

##########
File path: docker/compose/docker-compose_hadoop284_hive233_spark244.yml
##########
@@ -184,7 +184,7 @@ services:
   presto-coordinator-1:
     container_name: presto-coordinator-1
     hostname: presto-coordinator-1
-    image: apachehudi/hudi-hadoop_2.8.4-prestobase_0.268:latest
+    image: apachehudi/hudi-hadoop_2.8.4-prestobase_0.271:latest

Review comment:
       Can we do the Presto upgrade for Docker in a separate PR? Is it needed for the HBase upgrade?

##########
File path: packaging/hudi-flink-bundle/pom.xml
##########
@@ -147,10 +148,18 @@
 
                   <include>org.apache.hbase:hbase-common</include>
                   <include>org.apache.hbase:hbase-client</include>
+                  <include>org.apache.hbase:hbase-hadoop-compat</include>

Review comment:
       That's a good idea. In fact, I've tried it out, but it's a very manual, time-consuming process to verify, and I gave up after a few failures. We also need to keep future upgrades in mind. That said, I would be very happy to reduce the bundle size in any way we can, and we should take another stab at this idea in the future.
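
One hedged option for shrinking the bundle, if this is revisited, is the maven-shade-plugin's `minimizeJar` switch, which drops classes unreachable from the project's own code. The sketch below is an assumption, not the project's current configuration; reflectively loaded classes (common in HBase and Hadoop) must be kept alive via explicit filters, which is likely what makes manual verification so painful.

```xml
<!-- Hypothetical sketch only: minimizeJar prunes unreferenced classes.
     Classes reached via reflection must be pinned with filters. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <minimizeJar>true</minimizeJar>
    <filters>
      <filter>
        <!-- keep an artifact whole when its classes are loaded reflectively -->
        <artifact>org.apache.hbase:hbase-common</artifact>
        <includes>
          <include>**</include>
        </includes>
      </filter>
    </filters>
  </configuration>
</plugin>
```

Verifying such a configuration still requires exercising every code path that loads classes by name, which matches the experience described above.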




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

