Hi HBase Community,

We are currently facing an issue with HBase replication in our production environment, and I would greatly appreciate any guidance or suggestions the community may have.

We are running HBase version 2.5.8, and in the logs, we consistently encounter the following warning:

2024-09-11T15:51:11,468 WARN [RS_CLAIM_REPLICATION_QUEUE-regionserver/rzv-db09-hd:16020-0.replicationSource,replicav3-rzv-db13-hd.xxxx,16020,1684871532555-rzv-db09-hd.xxxx,16020,1696832789107-rzv-db09-hd.xxxx,16020,1696833033289-rzv-db13-hd.xxxx,16020,1722636062425-rzv-db13-hd.xxxx,16020,1722636803794-rzv-db12-hd.xxxx,16020,1722636800268.replicationSource.wal-reader.rzv-db13-hd.xxxx%2C16020%2C1684871532555,replicav3-rzv-db13-hd.xxxx,16020,1684871532555-rzv-db09-hd.xxxx,16020,1696832789107-rzv-db09-hd.xxxx,16020,1696833033289-rzv-db13-hd.xxxx,16020,1722636062425-rzv-db13-hd.xxxx,16020,1722636803794-rzv-db12-hd.xxxx,16020,1722636800268] regionserver.ReplicationSourceWALReader: Failed to read stream of replication entries
java.io.EOFException: Cannot seek after EOF
    at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1682) ~[hadoop-hdfs-client-2.10.2.jar:?]
    at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66) ~[hadoop-common-2.10.2.jar:?]
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.seekOnFs(ProtobufLogReader.java:527) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.seek(ReaderBase.java:130) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.seek(WALEntryStream.java:408) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:339) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:308) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:298) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:102) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.tryAdvanceStreamAndCreateWALBatch(ReplicationSourceWALReader.java:258) ~[hbase-server-2.5.8.jar:2.5.8]
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:145) ~[hbase-server-2.5.8.jar:2.5.8]

This error appears to stem from the replication WAL reader, and the "Cannot seek after EOF" message suggests a failure to read the replication entries. We suspect this may be affecting the replication flow between our region servers.

Has anyone encountered this problem before, or does anyone have insights into potential causes and solutions?

Thank you in advance for your assistance!

Hamado Dene