tballison commented on code in PR #2878:
URL: https://github.com/apache/tika/pull/2878#discussion_r3367367601
##########
tika-ml/tika-ml-junkdetect/src/main/java/org/apache/tika/ml/junkdetect/BigramTables.java:
##########
@@ -167,6 +173,28 @@ public static BigramTables readFrom(DataInputStream dis)
throws IOException {
bMin, bMax, uMin, uMax, uFallback, backoffAlpha);
}
+ /** Writes a non-negative long as an unsigned LEB128 varint. */
+ private static void writeVarLong(DataOutputStream dos, long v) throws IOException {
+ while ((v & ~0x7FL) != 0) {
+ dos.writeByte((int) ((v & 0x7F) | 0x80));
+ v >>>= 7;
+ }
+ dos.writeByte((int) v);
+ }
+
+ /** Reads an unsigned LEB128 varint written by {@link #writeVarLong}. */
+ private static long readVarLong(DataInputStream dis) throws IOException {
+ long v = 0;
+ int shift = 0;
+ int b;
+ do {
+ b = dis.readUnsignedByte();
+ v |= (long) (b & 0x7F) << shift;
+ shift += 7;
+ } while ((b & 0x80) != 0);
+ return v;
+ }
Review Comment:
Generally, we only read what we write, but the correctness guards are still
worth having. Added in the most recent commit.
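
For readers following along, here is a minimal sketch of what such guards could look like. It is not the code from the commit mentioned above: the class name `VarLongSketch`, the `encode`/`decode` helpers, the exception types, and the 9-byte limit are all assumptions made for illustration. The limit follows from the writer only emitting non-negative longs, which fit in 63 bits, i.e. at most nine 7-bit groups.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical standalone class; in the PR these methods are private to BigramTables.
public class VarLongSketch {

    /** Writes a non-negative long as an unsigned LEB128 varint. */
    static void writeVarLong(DataOutputStream dos, long v) throws IOException {
        // Assumed guard: reject values the documented contract does not cover.
        if (v < 0) {
            throw new IllegalArgumentException("varint value must be non-negative: " + v);
        }
        while ((v & ~0x7FL) != 0) {
            dos.writeByte((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        dos.writeByte((int) v);
    }

    /** Reads an unsigned LEB128 varint, failing fast on over-long input. */
    static long readVarLong(DataInputStream dis) throws IOException {
        long v = 0;
        int shift = 0;
        int b;
        do {
            // Assumed guard: a non-negative long needs at most 9 groups (shifts 0..56).
            // Without this, corrupt input could shift past 63 bits and wrap silently.
            if (shift > 56) {
                throw new IOException("malformed varint: more than 9 bytes");
            }
            b = dis.readUnsignedByte(); // throws EOFException on truncated input
            v |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return v;
    }

    /** Illustration helper: encodes one value to a byte array. */
    static byte[] encode(long v) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bos)) {
            writeVarLong(dos, v);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bos.toByteArray();
    }

    /** Illustration helper: decodes one value from a byte array. */
    static long decode(byte[] bytes) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return readVarLong(dis);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] samples = {0L, 127L, 128L, 300L, Long.MAX_VALUE};
        for (long s : samples) {
            byte[] enc = encode(s);
            System.out.println(s + " -> " + enc.length + " byte(s), round-trips: "
                    + (decode(enc) == s));
        }
    }
}
```

With the read-side limit in place, a stream of continuation bytes that never terminates fails with an `IOException` instead of producing a wrapped value, and truncated input still surfaces as the `EOFException` that `readUnsignedByte` already throws.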