drse opened a new issue, #3624:
URL: https://github.com/apache/fory/issues/3624

   ### Search before asking
   
   - [x] I had searched in the [issues](https://github.com/apache/fory/issues) 
and found no similar issues.
   
   
   ### Version
   
   - Apache Fory 0.17.0
   - JDK 21
   - Linux / macOS
   
   ### Component(s)
   
   Java
   
   ### Minimal reproduce step
   
   When **all** of the following are true, Fory either throws 
`IndexOutOfBoundsException` from the generated codec or, worse, silently 
decodes a different long value with no exception:
   
   - pooled Fory (`buildThreadSafeForyPool`)
   - codegen on (`withCodegen(true)` — the default)
   - number compression on (`withNumberCompressed(true)`)
   - a field of type boxed `Long` (not primitive `long`)
   
   The bug requires two pool instances; a round-trip through a single `Fory` 
instance works. Disabling `withNumberCompressed` or `withCodegen`, switching 
the field to primitive `long`, or removing the second pool each eliminates the bug.
   
   ```java
   import org.apache.fory.Fory;
   import org.apache.fory.ThreadSafeFory;
   import org.apache.fory.config.CompatibleMode;
   import org.apache.fory.config.ForyBuilder;
   import org.apache.fory.config.Language;
   
   public class ForyPoolNumberCompressedBug {
   
       public record Payload(Long longValue, String stringValue) {}
   
       static ForyBuilder builder() {
           return Fory.builder()
                   .withLanguage(Language.JAVA)
                   .withCodegen(true)
                   .withAsyncCompilation(true)
                   .requireClassRegistration(false)
                   .suppressClassRegistrationWarnings(true)
                   .withDeserializeUnknownClass(true)
                   .withRefTracking(true)
                   .withCompatibleMode(CompatibleMode.COMPATIBLE)
                   .withStringCompressed(true)
                   .withNumberCompressed(true) // <-- removing this fixes the bug
                   .withRefCopy(true);
       }
   
       public static void main(String[] args) {
           // Two pools built from the same config — e.g. producer/consumer on
           // different nodes, or process-restart snapshot restore.
           ThreadSafeFory writer = builder().buildThreadSafeForyPool(4);
           ThreadSafeFory reader = builder().buildThreadSafeForyPool(4);
   
           Payload original = new Payload(
                   123_456_789L,
                   "longer string with multibyte: \u00ff\u00fe");
   
           byte[] bytes = writer.serialize(original);
           Payload roundTrip = (Payload) reader.deserialize(bytes);
   
           System.out.println("original  = " + original);
           System.out.println("roundTrip = " + roundTrip);
   
           if (!original.equals(roundTrip)) {
               throw new AssertionError("CORRUPTION: " + original + " != " + roundTrip);
           }
       }
   }
   ```
   
   ### What did you expect to see?
   
   A round-trip across two pools built from the same `ForyBuilder` should 
produce a value equal to the original.
   
   
   ### What did you see instead?
   
   
   ```
   original  = Payload[longValue=123456789, stringValue=longer string with multibyte: ÿþ]
   roundTrip = Payload[longValue=988764757, stringValue=longer string with multibyte: ÿþ]
   Exception in thread "main" java.lang.AssertionError: CORRUPTION: ...
   ```
   
   Alternate failure mode (thrown from the generated codec):
   
   With other payloads or call patterns (concurrent calls on a single pool, 
multiple records in sequence), the same configuration throws:
   
   ```
   java.lang.IndexOutOfBoundsException: readerIndex(78) + length(18) exceeds size(84)
       at org.apache.fory.memory.MemoryBuffer$BoundChecker.fillBuffer(MemoryBuffer.java:189)
       at org.apache.fory.serializer.StringSerializer.readBytesUnCompressedUTF16(StringSerializer.java:565)
       at org.apache.fory.serializer.StringSerializer.readCompressedBytesString(StringSerializer.java:259)
       at <Pkg>$PayloadForyRefCodecMetaShared0_0.read(<Pkg>$PayloadForyRefCodecMetaShared0_0.java:47)
       at org.apache.fory.context.ReadContext.readDataInternal(ReadContext.java:666)
       at org.apache.fory.context.ReadContext.readNonRef(ReadContext.java:580)
       at org.apache.fory.context.ReadContext.readRef(ReadContext.java:518)
       at org.apache.fory.Fory.deserialize(Fory.java:476)
   ```
   
   The generated codec name is `*ForyRefCodecMetaShared0_0`, suggesting the 
writer and reader codecs disagree on the encoded layout for boxed numeric 
fields.
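
One possible way to make that disagreement concrete: the corrupted value in the first failure mode is arithmetically consistent with the writer emitting the boxed `Long` as a zigzag varint while the reader decodes the same four bytes as a "small long as int" (a little-endian int32 holding `value << 1`). The sketch below uses no Fory APIs. The class name and both helper methods are illustrative, and the two encodings are assumptions about what the writer and reader codecs might be doing, not something confirmed from the Fory source.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: plain arithmetic only, no Fory classes involved.
public class LongEncodingMismatchSketch {

    // Assumed writer-side encoding: zigzag, then base-128 varint (low group first).
    static byte[] writeZigZagVarLong(long value) {
        long zz = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((zz & ~0x7FL) != 0) {
            out.write((int) ((zz & 0x7F) | 0x80)); // continuation bit set
            zz >>>= 7;
        }
        out.write((int) zz);
        return out.toByteArray();
    }

    // Assumed reader-side encoding: "small long as int".
    // A little-endian int32 with low bit 0 carries the value shifted left by one.
    static long readSmallLongAsInt(byte[] bytes) {
        int i = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
        if ((i & 1) == 0) {
            return i >> 1;
        }
        // The large-value branch (flag bit set, 8 raw bytes follow) is not modeled.
        throw new IllegalStateException("large-value branch not modeled in this sketch");
    }

    public static void main(String[] args) {
        byte[] wire = writeZigZagVarLong(123_456_789L);
        StringBuilder hex = new StringBuilder();
        for (byte b : wire) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println("wire    = " + hex.toString().trim()); // prints "wire    = aa b4 de 75"
        System.out.println("decoded = " + readSmallLongAsInt(wire)); // prints "decoded = 988764757"
    }
}
```

The decoded value matches the `roundTrip` output above exactly. If this is the mismatch, a value whose varint form is not exactly four bytes would leave the reader index misaligned for the following string field, which might account for the `IndexOutOfBoundsException` variant, though that is untested here.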
   
   ### Bisection (each toggle applied in isolation against the failing baseline)
   
   | Toggle | Result |
   |---|---|
   | `withNumberCompressed(false)` | works |
   | `withCodegen(false)` (interpreted path) | works |
   | `Payload(long longValue, String stringValue)` (primitive long) | works |
   | single Fory (no pool) | works |
   | `withStringCompressed(false)` | **still broken** |
   | `withRefTracking(false)` + `withRefCopy(false)` | **still broken** |
   | `CompatibleMode.SCHEMA_CONSISTENT` | **still broken** |
   
   So the trigger is codegen + number compression + boxed numeric field + 
cross-pool deserialization. The bug exists in both `COMPATIBLE` and 
`SCHEMA_CONSISTENT` modes.
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

