emrul opened a new pull request, #3603:
URL: https://github.com/apache/fory/pull/3603

   
   ## Why?
   
   `@apache-fory/core`'s `NAMED_COMPATIBLE_STRUCT` TypeMeta preamble is not 
byte-compatible with pyfory, fory-java, fory-rust, or fory-go. For the same 
logical struct, the JavaScript binding emits an 8-byte int64 header that no 
other binding can read. I noticed this as part of issue #3602.
   
   With this patch my cross-language tests pass, but I don't know if this is 
entirely correct — I'd appreciate a deeper review from someone who knows the 
TypeMeta spec better than I do (especially around the signed-vs-unsigned hash 
interpretation in `prependHeader`).
   
   ## What does this PR do?
   
   Aligns four constants / behaviours in 
`javascript/packages/core/lib/meta/TypeMeta.ts` with what every other xlang 
binding does at 0.17:
   
   | Constant / behaviour | JS before | python / java / rust / go | Reference |
   |----------------------|-----------|---------------------------|-----------|
   | `NUM_HASH_BITS` | `41` | `50` | `python/pyfory/meta/typedef.py:37`, `java/.../meta/TypeDef.java:77`, `rust/fory-core/src/meta/type_meta.rs:76`, `go/fory/type_def.go:35` |
   | `COMPRESS_META_FLAG` | `1n << 63n` | `1 << 9` | same files |
   | `HAS_FIELDS_META_FLAG` | `1n << 62n` | `1 << 8` | same files |
   | hash read in `prependHeader` | unsigned `BigInt` built from two `uint32` halves via `getUint32(0, false) << 32n \| getUint32(4, false)` | signed `int64` | pyfory unpacks `int64_t[0]`, fory-java `murmurhash3_x64_128(...)[0]` returns `long`, rust `.0 as i64` |
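
   To make the bit positions in the table easier to reason about, here is a rough sketch that checks whether a set of flag bits stays clear of the hash field. It assumes the hash occupies the top `NUM_HASH_BITS` bits of the 64-bit header; that layout is my reading and is not spelled out in this description. `hashFieldMask` and `flagsClearOfHash` are hypothetical helpers, not functions from `TypeMeta.ts`.

   ```typescript
   // Assumption (not confirmed by this PR text): the hash occupies the top
   // `numHashBits` bits of the 64-bit header; flags live in the low bits.
   function hashFieldMask(numHashBits: bigint): bigint {
     return ((1n << numHashBits) - 1n) << (64n - numHashBits);
   }

   // True when none of the given flag bits fall inside the hash field.
   function flagsClearOfHash(numHashBits: bigint, flags: bigint[]): boolean {
     const mask = hashFieldMask(numHashBits);
     return flags.every((flag) => (flag & mask) === 0n);
   }

   // Values from the "python / java / rust / go" column.
   const alignedOk = flagsClearOfHash(50n, [1n << 9n, 1n << 8n]);
   // Values from the "JS before" column.
   const beforeOk = flagsClearOfHash(41n, [1n << 63n, 1n << 62n]);
   console.log({ alignedOk, beforeOk });
   ```

   Under that assumed layout, the aligned values keep the flags below the hash field, while the pre-patch values would place both flags inside it.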
   
   On the hash read specifically: reading the same 8 bytes as unsigned `BigInt` 
never produces a negative value, so the subsequent `abs()` is effectively a 
no-op. Whenever the hash's high bit is set, the resulting header diverges from 
what the other bindings emit for the same struct. The patch uses 
`hash.getBigInt64(0, false)` (signed read) followed by explicit 
arbitrary-precision `abs()` + 63-bit mask, mirroring pyfory's `abs(hash) & 
0x7FFFFFFFFFFFFFFF`.
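
   The difference between the two reads can be seen in isolation with a small sketch. The helper names and the sample bytes are made up for illustration, and this is not the code in `prependHeader`; it only contrasts an unsigned two-halves read with a signed `getBigInt64` read followed by `abs()` and a 63-bit mask.

   ```typescript
   // Hypothetical helpers; `prependHeader` itself is not reproduced here.

   // Pre-patch style: two big-endian uint32 halves joined into an unsigned BigInt.
   function readHashUnsigned(bytes: Uint8Array): bigint {
     const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
     return (BigInt(view.getUint32(0, false)) << 32n) | BigInt(view.getUint32(4, false));
   }

   // Patched style: signed big-endian int64, then abs() and a 63-bit mask.
   function readHashSignedAbs(bytes: Uint8Array): bigint {
     const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
     const signed = view.getBigInt64(0, false);
     const magnitude = signed < 0n ? -signed : signed;
     return magnitude & 0x7fffffffffffffffn;
   }

   // Made-up input with the high bit set: the int64 value -2.
   const sample = new Uint8Array([0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe]);
   console.log(readHashUnsigned(sample)); // 18446744073709551614n
   console.log(readHashSignedAbs(sample)); // 2n
   ```

   For inputs whose high bit is clear, both helpers return the same value, which is consistent with the divergence only showing up for some structs.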
   
   Empirical reproduction (fory-core 0.17.0 on every binding, matching config 
`xlang=true, ref=true, compatible=true`, NAMED_COMPATIBLE_STRUCT via 
`(namespace, typename)` registration):
   
   ```python
   # python
   import pyfory, dataclasses
   @dataclasses.dataclass
   class Point:
       x: pyfory.int32 = 0
       y: pyfory.int32 = 0
   f = pyfory.Fory(xlang=True, ref=True, compatible=True)
   f.register_type(Point, namespace='demo', typename='Point')
   print(f.serialize(Point(x=10, y=20)).hex(' '))
   ```
   
   ```java
   // java
   Fory fory = Fory.builder()
       .withLanguage(Language.XLANG)
       .withRefTracking(true)
       .withCompatibleMode(CompatibleMode.COMPATIBLE)
       .build();
   fory.register(Point.class, "demo", "Point");
   byte[] out = fory.serialize(new Point(10, 20));
   ```
   
   ```typescript
   // javascript
   const fory = new Fory({ ref: true, compatible: true });
   const ti = Type.struct({ namespace: 'demo', typeName: 'Point' },
     { x: Type.varInt32(), y: Type.varInt32() },
     { withConstructor: true });
   ti.initMeta(Point);
   const reg = fory.register(Point);
   console.log(Array.from(reg.serialize({ x: 10, y: 20 })));
   ```
   
   Before this PR:
   - python / java / rust / go all produce  
     `02 00 1e 00 10 01 d2 92 ce 5f 2b 73 22 0d 0c 8c 70 13 bd c8 6c c0 40 05 5c 40 05 60 14 28` (30 bytes)
   - javascript produces  
     `02 ff 1e 00 10 00 00 ad 86 c0 98 d5 23 15 31 12 92 1c d0 2d f6 53 04 e9 2e c4 92 7b 9b 22 00 58 07 …`  
     The field-descriptor and value bytes align once you get past the preamble, 
but the 8-byte int64 header and the byte-1 reference flag diverge. 
`pyfory.deserialize(jsBytes)` silently returns `Point(x=0, y=0)` (every field 
unmatched, falls through to defaults); `fory.deserialize(jsBytes)` in Java 
throws `DeserializationException: read objects are: [null]`.
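
   For anyone repeating the comparison, a small helper along these lines can list where two dumps diverge. `parseHex` and `differingOffsets` are hypothetical names introduced for this sketch; the inputs are just the first ten bytes of each dump quoted above.

   ```typescript
   // Parse a space-separated hex dump such as "02 00 1e" into bytes.
   function parseHex(dump: string): Uint8Array {
     const tokens = dump.trim().split(/\s+/);
     return Uint8Array.from(tokens.map((t) => parseInt(t, 16)));
   }

   // Offsets within the shared prefix where the two buffers differ.
   function differingOffsets(a: Uint8Array, b: Uint8Array): number[] {
     const limit = Math.min(a.length, b.length);
     const offsets: number[] = [];
     for (let i = 0; i < limit; i++) {
       if (a[i] !== b[i]) offsets.push(i);
     }
     return offsets;
   }

   // First ten bytes of each dump quoted above.
   const others = parseHex("02 00 1e 00 10 01 d2 92 ce 5f");
   const jsBefore = parseHex("02 ff 1e 00 10 00 00 ad 86 c0");
   console.log(differingOffsets(others, jsBefore)); // [ 1, 5, 6, 7, 8, 9 ]
   ```

   Offset 1 is the first byte that differs in this prefix; the remaining offsets fall in the bytes that follow the first five.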
   
   After this PR: javascript produces byte-identical output to python / java / 
rust / go, and each binding can decode every other binding's bytes. I ran manual 
round-trips against both pyfory 0.17 and fory-java 0.17, once with a Point struct 
and once with a richer struct containing strings, a `list<string>`, and int/float 
fields; both succeed.
   
   ## Related issues
   
   - #3602
   
   ## AI Contribution Checklist
   
   - [x] Substantial AI assistance was used in this PR: **no** (a couple of 
lines of constant alignment; no architectural or API decisions)
   - [ ] If `yes`, I included a completed AI Contribution Checklist in this PR 
description and the required `AI Usage Disclosure`.
   - [ ] If `yes`, my PR description includes the required `ai_review` summary 
and screenshot evidence of the final clean AI review results from both fresh 
reviewers on the current PR diff or current HEAD after the latest code changes.
   
   ## Does this PR introduce any user-facing change?
   
   - [x] Does this PR introduce any public API change? — **No.**
   - [x] Does this PR introduce any binary protocol compatibility change? — 
**Yes:** this fixes the JavaScript binding's TypeMeta preamble so it matches 
the canonical wire format the other bindings have been producing. Existing 
`@apache-fory/core` clients communicating only with each other will continue to 
work (same-binding output still round-trips). Any persisted JS-produced bytes, 
or in-flight messages relying on the JS-specific preamble, will no longer be 
readable. Given that cross-binding interop was broken on 0.17 anyway, the 
practical impact should be small.
   
   ## Benchmark
   
   Not applicable — constant alignment with no hot-path change.
   

