emrul opened a new pull request, #3603:
URL: https://github.com/apache/fory/pull/3603
## Why?
`@apache-fory/core`'s `NAMED_COMPATIBLE_STRUCT` TypeMeta preamble is not
byte-compatible with pyfory, fory-java, fory-rust, or fory-go. For the same
logical struct, the JavaScript binding emits an 8-byte int64 header that no
other binding can read. I noticed this as part of issue #3602.
With this patch my cross-language tests pass, but I don't know if this is
entirely correct — I'd appreciate a deeper review from someone who knows the
TypeMeta spec better than I do (especially around the signed-vs-unsigned hash
interpretation in `prependHeader`).
## What does this PR do?
Aligns four constants / behaviours in
`javascript/packages/core/lib/meta/TypeMeta.ts` with what every other xlang
binding does at 0.17:
| Constant / behaviour | JS before | python / java / rust / go | Reference |
|----------------------|-----------|---------------------------|-----------|
| `NUM_HASH_BITS` | `41` | `50` | `python/pyfory/meta/typedef.py:37`, `java/.../meta/TypeDef.java:77`, `rust/fory-core/src/meta/type_meta.rs:76`, `go/fory/type_def.go:35` |
| `COMPRESS_META_FLAG` | `1n << 63n` | `1 << 9` | same files |
| `HAS_FIELDS_META_FLAG` | `1n << 62n` | `1 << 8` | same files |
| hash read in `prependHeader` | unsigned `BigInt` built from two `uint32` halves via `getUint32(0, false) << 32n \| getUint32(4, false)` | signed `int64` | pyfory unpacks `int64_t[0]`, fory-java `murmurhash3_x64_128(...)[0]` returns `long`, rust `.0 as i64` |
On the hash read specifically: reading the same 8 bytes as unsigned `BigInt`
never produces a negative value, so the subsequent `abs()` is effectively a
no-op. Whenever the hash's high bit is set, the resulting header diverges from
what the other bindings emit for the same struct. The patch uses
`hash.getBigInt64(0, false)` (signed read) followed by explicit
arbitrary-precision `abs()` + 63-bit mask, mirroring pyfory's `abs(hash) &
0x7FFFFFFFFFFFFFFF`.
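
To make the divergence concrete, here is a minimal standard-library sketch in Python (not the patched TypeScript) of the two read strategies described above. The helper names and the sample byte strings are made up for illustration, and the big-endian read mirrors the `false` argument passed to `getUint32` / `getBigInt64`; whether that matches every binding's byte order is an assumption of this sketch.

```python
import struct

MASK_63 = 0x7FFFFFFFFFFFFFFF  # 63-bit mask, as in pyfory's abs(hash) & 0x7FFF...


def read_unsigned(raw: bytes) -> int:
    """Old JS behaviour: combine two big-endian uint32 halves into one unsigned value."""
    hi, lo = struct.unpack(">II", raw[:8])
    return (hi << 32) | lo


def read_signed(raw: bytes) -> int:
    """Patched behaviour: interpret the same 8 bytes as a big-endian signed int64."""
    (value,) = struct.unpack(">q", raw[:8])
    return value


def normalize(h: int) -> int:
    """Arbitrary-precision abs() followed by the 63-bit mask."""
    return abs(h) & MASK_63


# Hypothetical hash bytes, chosen only to exercise both cases.
high_bit_set = bytes.fromhex("ff00000000000001")
high_bit_clear = bytes.fromhex("0100000000000001")

for raw in (high_bit_set, high_bit_clear):
    old = normalize(read_unsigned(raw))
    new = normalize(read_signed(raw))
    print(raw.hex(), hex(old), hex(new), "same" if old == new else "DIVERGES")
```

With the high bit clear both paths agree; with it set, the unsigned path only drops the top bit, while the signed path takes the magnitude of a negative number, so the two results differ.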
Empirical reproduction (fory-core 0.17.0 on every binding, matching config
`xlang=true, ref=true, compatible=true`, NAMED_COMPATIBLE_STRUCT via
`(namespace, typename)` registration):
```python
# python
import pyfory, dataclasses

@dataclasses.dataclass
class Point:
    x: pyfory.int32 = 0
    y: pyfory.int32 = 0

f = pyfory.Fory(xlang=True, ref=True, compatible=True)
f.register_type(Point, namespace='demo', typename='Point')
print(f.serialize(Point(x=10, y=20)).hex(' '))
```
```java
// java
Fory fory = Fory.builder()
    .withLanguage(Language.XLANG)
    .withRefTracking(true)
    .withCompatibleMode(CompatibleMode.COMPATIBLE)
    .build();
fory.register(Point.class, "demo", "Point");
byte[] out = fory.serialize(new Point(10, 20));
```
```typescript
// javascript
const fory = new Fory({ ref: true, compatible: true });
const ti = Type.struct(
  { namespace: 'demo', typeName: 'Point' },
  { x: Type.varInt32(), y: Type.varInt32() },
  { withConstructor: true },
);
ti.initMeta(Point);
const reg = fory.register(Point);
console.log(Array.from(reg.serialize({ x: 10, y: 20 })));
```
Before this PR:
- python / java / rust / go all produce `02 00 1e 00 10 01 d2 92 ce 5f 2b 73 22 0d 0c 8c 70 13 bd c8 6c c0 40 05 5c 40 05 60 14 28` (30 bytes)
- javascript produces `02 ff 1e 00 10 00 00 ad 86 c0 98 d5 23 15 31 12 92 1c d0 2d f6 53 04 e9 2e c4 92 7b 9b 22 00 58 07 …`
The field-descriptor and value bytes align once you get past the preamble,
but the 8-byte int64 header and the byte-1 reference flag diverge.
`pyfory.deserialize(jsBytes)` silently returns `Point(x=0, y=0)` (every field
unmatched, falls through to defaults); `fory.deserialize(jsBytes)` in Java
throws `DeserializationException: read objects are: [null]`.
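
For anyone repeating the comparison, a small standard-library Python sketch like the following can locate where two dumps first disagree. The hex strings are copied from the lists above; the JavaScript one is only the prefix shown there (the original dump is truncated), and the helper name `differing_offsets` is made up for this example. The comparison is purely positional, so it does not try to realign the payload bytes that follow the differently sized preamble.

```python
# Bytes produced by python / java / rust / go (30 bytes, from the description).
REFERENCE_HEX = (
    "02 00 1e 00 10 01 d2 92 ce 5f 2b 73 22 0d 0c 8c 70 13 bd c8 "
    "6c c0 40 05 5c 40 05 60 14 28"
)
# Prefix of the JavaScript output before the patch (truncated in the description).
JS_PREFIX_HEX = (
    "02 ff 1e 00 10 00 00 ad 86 c0 98 d5 23 15 31 12 92 1c d0 2d "
    "f6 53 04 e9 2e c4 92 7b 9b 22 00 58 07"
)


def differing_offsets(a: bytes, b: bytes) -> list:
    """Return the offsets, within the common length, where the two buffers differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


reference = bytes.fromhex(REFERENCE_HEX)
js_prefix = bytes.fromhex(JS_PREFIX_HEX)
offsets = differing_offsets(reference, js_prefix)

# Byte 1 is the first mismatch (0x00 vs 0xff).
print(f"reference length: {len(reference)} bytes")
print(f"first mismatch at byte {offsets[0]}: "
      f"{reference[offsets[0]]:#04x} vs {js_prefix[offsets[0]]:#04x}")
```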
After this PR: javascript produces byte-identical output to python / java /
rust / go, and each binding can decode every other binding's bytes. I ran manual
round trips against both pyfory 0.17 and fory-java 0.17, once with the Point
struct and once with a richer struct containing strings, a `list<string>`, and
int/float fields; both succeed.
## Related issues
- #3602
## AI Contribution Checklist
- [x] Substantial AI assistance was used in this PR: **no** (a couple of
lines of constant alignment; no architectural or API decisions)
- [ ] If `yes`, I included a completed AI Contribution Checklist in this PR
description and the required `AI Usage Disclosure`.
- [ ] If `yes`, my PR description includes the required `ai_review` summary
and screenshot evidence of the final clean AI review results from both fresh
reviewers on the current PR diff or current HEAD after the latest code changes.
## Does this PR introduce any user-facing change?
- [x] Does this PR introduce any public API change? — **No.**
- [x] Does this PR introduce any binary protocol compatibility change? —
**Yes:** this fixes the JavaScript binding's TypeMeta preamble so it matches
the canonical wire format the other bindings have been producing. Existing
`@apache-fory/core` clients communicating only with each other will continue to
work (same-binding output still round-trips). Any persisted JS-produced bytes,
or in-flight messages relying on JS-specific preamble, will no longer be
readable. Given cross-binding interop was broken on 0.17 anyway, practical
impact should be small.
## Benchmark
Not applicable — constant alignment with no hot-path change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]