drse opened a new issue, #3624: URL: https://github.com/apache/fory/issues/3624
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/fory/issues) and found no similar issues.

### Version

- Apache Fory 0.17.0
- JDK 21
- Linux / macOS

### Component(s)

Java

### Minimal reproduce step

When **all** of the following are true, Fory either throws `IndexOutOfBoundsException` from the generated codec or, worse, silently decodes a different long value with no exception:

- pooled Fory (`buildThreadSafeForyPool`)
- codegen on (`withCodegen(true)`, the default)
- number compression on (`withNumberCompressed(true)`)
- a field of type boxed `Long` (not primitive `long`)

The bug requires two pool instances. Single-Fory round-trip works. Disabling any one of `withNumberCompressed`, `withCodegen`, the boxing, or the second pool eliminates the bug.

```java
import org.apache.fory.Fory;
import org.apache.fory.ThreadSafeFory;
import org.apache.fory.config.CompatibleMode;
import org.apache.fory.config.ForyBuilder;
import org.apache.fory.config.Language;

public class ForyPoolNumberCompressedBug {

  public record Payload(Long longValue, String stringValue) {}

  static ForyBuilder builder() {
    return Fory.builder()
        .withLanguage(Language.JAVA)
        .withCodegen(true)
        .withAsyncCompilation(true)
        .requireClassRegistration(false)
        .suppressClassRegistrationWarnings(true)
        .withDeserializeUnknownClass(true)
        .withRefTracking(true)
        .withCompatibleMode(CompatibleMode.COMPATIBLE)
        .withStringCompressed(true)
        .withNumberCompressed(true) // <-- removing this fixes the bug
        .withRefCopy(true);
  }

  public static void main(String[] args) {
    // Two pools built from the same config, e.g. producer/consumer on
    // different nodes, or process-restart snapshot restore.
    ThreadSafeFory writer = builder().buildThreadSafeForyPool(4);
    ThreadSafeFory reader = builder().buildThreadSafeForyPool(4);

    Payload original = new Payload(
        123_456_789L, "longer string with multibyte: \u00ff\u00fe");

    byte[] bytes = writer.serialize(original);
    Payload roundTrip = (Payload) reader.deserialize(bytes);

    System.out.println("original  = " + original);
    System.out.println("roundTrip = " + roundTrip);

    if (!original.equals(roundTrip)) {
      throw new AssertionError("CORRUPTION: " + original + " != " + roundTrip);
    }
  }
}
```

### What did you expect to see?

A round-trip across two pools built from the same `ForyBuilder` should produce a value equal to the original.

### What did you see instead?

```
original  = Payload[longValue=123456789, stringValue=longer string with multibyte: ÿþ]
roundTrip = Payload[longValue=988764757, stringValue=longer string with multibyte: ÿþ]
Exception in thread "main" java.lang.AssertionError: CORRUPTION: ...
```

Alternate failure mode (thrown from generated codec)

With other payload shapes (concurrent calls on a single pool, multiple records in sequence) the same configuration throws:

```
java.lang.IndexOutOfBoundsException: readerIndex(78) + length(18) exceeds size(84)
    at org.apache.fory.memory.MemoryBuffer$BoundChecker.fillBuffer(MemoryBuffer.java:189)
    at org.apache.fory.serializer.StringSerializer.readBytesUnCompressedUTF16(StringSerializer.java:565)
    at org.apache.fory.serializer.StringSerializer.readCompressedBytesString(StringSerializer.java:259)
    at <Pkg>$PayloadForyRefCodecMetaShared0_0.read(<Pkg>$PayloadForyRefCodecMetaShared0_0.java:47)
    at org.apache.fory.context.ReadContext.readDataInternal(ReadContext.java:666)
    at org.apache.fory.context.ReadContext.readNonRef(ReadContext.java:580)
    at org.apache.fory.context.ReadContext.readRef(ReadContext.java:518)
    at org.apache.fory.Fory.deserialize(Fory.java:476)
```

The generated codec name is `*ForyRefCodecMetaShared0_0`, suggesting the writer and reader codecs disagree on the encoded layout for boxed numeric fields.

### Bisection (each toggle applied in isolation against the failing baseline)

| Toggle | Result |
|---|---|
| `withNumberCompressed(false)` | works |
| `withCodegen(false)` (interpreted path) | works |
| `Payload(long longValue, String stringValue)` (primitive long) | works |
| single Fory (no pool) | works |
| `withStringCompressed(false)` | **still broken** |
| `withRefTracking(false)` + `withRefCopy(false)` | **still broken** |
| `CompatibleMode.SCHEMA_CONSISTENT` | **still broken** |

So the trigger is codegen + number compression + boxed numeric field + cross-pool deserialization. The bug exists in both `COMPATIBLE` and `SCHEMA_CONSISTENT` modes.

### Anything Else?

_No response_

### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!
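
### Illustration: how a layout disagreement can produce both symptoms

The hypothesis above is that the writer and reader codecs disagree on how the boxed `Long` is laid out. The sketch below is a stdlib-only illustration of that general failure class, not Fory's actual wire format or code path: the class `LayoutMismatchSketch`, the zigzag-varint "compressed" layout, the 8-byte little-endian "fixed" layout, and the one-byte string length are all assumptions made for the example. It shows that a reader which assumes a different width for the long can return a wrong value without any exception, and then run past the end of the buffer on the next field.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LayoutMismatchSketch {

  // Hypothetical "compressed" layout: zigzag-encode, then 7 bits per byte.
  public static void writeVarLong(ByteBuffer buf, long value) {
    long zz = (value << 1) ^ (value >> 63);
    while ((zz & ~0x7FL) != 0) {
      buf.put((byte) ((zz & 0x7F) | 0x80));
      zz >>>= 7;
    }
    buf.put((byte) zz);
  }

  public static long readVarLong(ByteBuffer buf) {
    long zz = 0;
    int shift = 0;
    byte b;
    do {
      b = buf.get();
      zz |= (long) (b & 0x7F) << shift;
      shift += 7;
    } while ((b & 0x80) != 0);
    return (zz >>> 1) ^ -(zz & 1);
  }

  // Writer always uses the compressed layout for the long, followed by a
  // one-byte length and the UTF-8 bytes of the string (short strings only).
  public static byte[] encode(long value, String text) {
    byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(10 + 1 + utf8.length)
        .order(ByteOrder.LITTLE_ENDIAN);
    writeVarLong(buf, value);
    buf.put((byte) utf8.length);
    buf.put(utf8);
    return Arrays.copyOf(buf.array(), buf.position());
  }

  // Reader that either agrees with the writer (compressedLong = true) or
  // wrongly assumes a fixed 8-byte long (compressedLong = false).
  public static String decode(byte[] bytes, boolean compressedLong) {
    ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
    long value = compressedLong ? readVarLong(buf) : buf.getLong();
    byte[] utf8 = new byte[buf.get() & 0xFF];
    buf.get(utf8); // throws BufferUnderflowException if too few bytes remain
    return value + "|" + new String(utf8, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] bytes = encode(123_456_789L, "longer string");

    // Layouts agree: prints 123456789|longer string
    System.out.println(decode(bytes, true));

    // Layouts disagree: the long is read without error but has the wrong value.
    long misread = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    System.out.println("misread long = " + misread);

    // The reader is now misaligned, so the string read overruns the buffer.
    try {
      decode(bytes, false);
    } catch (BufferUnderflowException e) {
      System.out.println("reader ran past the end of the buffer");
    }
  }
}
```

Whether the second read throws or quietly succeeds depends only on which bytes happen to follow the misread field, which would be consistent with seeing silent corruption for one payload shape and an out-of-bounds exception for another.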
