This is an automated email from the ASF dual-hosted git repository. chaokunyang pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/fory-site.git
commit 0951a42fb9e0e9e7a437b25b819d8ff6321c1481 Author: chaokunyang <[email protected]> AuthorDate: Fri Jun 5 06:01:28 2026 +0000 🔄 synced local 'docs/specification/' with remote 'docs/specification/' --- docs/specification/java_serialization_spec.md | 72 +++++++++++++ docs/specification/xlang_implementation_guide.md | 21 ++++ docs/specification/xlang_serialization_spec.md | 124 +++++++++++++++++++++++ docs/specification/xlang_type_mapping.md | 6 +- 4 files changed, 222 insertions(+), 1 deletion(-) diff --git a/docs/specification/java_serialization_spec.md b/docs/specification/java_serialization_spec.md index 6ca5f9a7d5..2cd160ebaf 100644 --- a/docs/specification/java_serialization_spec.md +++ b/docs/specification/java_serialization_spec.md @@ -221,6 +221,78 @@ local fields against remote ClassDef fields by identifier, read matching fields, and skip unknown fields using the remote field type metadata. Compatible mode is the Java native schema-evolution path. +In compatible mode, a matched field may read between direct top-level scalar +ClassDef schemas when the remote value can be represented by the local scalar +schema without changing the logical value. This is a read adaptation only: +writers keep emitting their local canonical field schema and payload, and +ClassDef metadata, schema-consistent mode, dynamic value serialization, and +unknown-field skipping continue to treat the original field schemas as distinct. + +The rule applies only to the immediate schema of a matched field. It does not +apply to dynamic root values, map keys, map values, collection elements, array +elements, enum values, temporal values, binary values, structs, or nested +generic/container positions. + +The scalar domains are Java boolean/boxed boolean, `String`, Java primitive and +boxed numeric scalar fields, Fory scalar annotations whose ClassDef metadata +identifies a narrower numeric wire domain, and `BigDecimal` as the exact decimal +numeric scalar. 
Java native-only metadata outside the type IDs shared with xlang +must still use the Java ClassDef metadata to identify the scalar domain. +Compatible scalar conversion applies only when both the remote and local +top-level ClassDef field metadata have `trackingRef = false`; if either matched +field has `trackingRef = true`, scalar type changes are schema/type incompatible +during compatible layout construction. Same scalar ClassDef field types with +matching top-level `trackingRef` and null/optional framing are exact same-schema +direct reads, not compatible scalar conversion. Same scalar ClassDef field types +with different top-level `trackingRef` framing are schema/type incompatible +because the wire framing differs. Same scalar ClassDef field types with +different top-level null/optional framing may still use the nullable/optional +composition rule below when both fields have `trackingRef = false`. + +Compatible scalar conversion follows the xlang scalar conversion contract: + +- `String` to boolean accepts exactly `"0"`, `"1"`, `"false"`, and `"true"`. + Boolean to `String` produces `"false"` or `"true"`. +- numeric to boolean accepts only exact zero and one. Boolean to numeric + produces exact zero or one in the local numeric domain. +- numeric to numeric succeeds only when the local numeric domain represents the + same mathematical value, including range checks, signedness checks, exact + integer/floating round-trip checks, floating signed-zero preservation, and + rejection of `NaN` across different floating type IDs. +- `BigDecimal` participates as an exact numeric scalar. Converted decimal values + use canonical scale: zero and non-zero integers use scale `0`; finite + fractional values use the smallest non-negative scale that preserves the + mathematical value and leaves an unscaled value not divisible by `10`. + Compatible conversion rejects numeric strings longer than `320` bytes before + arbitrary precision parsing. 
It also rejects converted decimal values whose + canonical exponent or scale work exceeds the `256` digit bound before + constructing large powers of ten or formatting plain decimal text. Same-type + `BigDecimal` reads preserve the ordinary decimal payload. +- `String` to numeric accepts only the finite compatible numeric literal grammar + from the xlang serialization spec and then applies the same lossless + target-domain checks. `NaN`, infinities, whitespace, leading plus signs, + Unicode decimal digits, underscores, grouping separators, non-decimal radices, + and type suffixes fail. +- numeric to `String` emits canonical finite numeric text: integers use plain + decimal text, floating values use exact plain decimal text with a decimal point + and signed-zero preservation, and `BigDecimal` values use exact plain decimal + text without exponent notation or insignificant trailing fractional zeros. + +Nullable fields, boxed carriers, and primitive defaults compose with scalar +conversion when the matched top-level field schemas have `trackingRef = false`. +Readers first consume the remote null/optional framing described by the remote +ClassDef field metadata. Present values are converted and then assigned or +wrapped into the local carrier. Null or absent remote values use the same +compatible-mode missing/null behavior already defined for the local field. +Reference-tracked scalar conversion is not supported. + +Schema pairs outside the scalar conversion matrix remain schema/type +compatibility errors while building the compatible layout. Once a matched field +is accepted as a scalar conversion action, invalid payload values are +deserialization data errors and must be reported as +`org.apache.fory.exception.DeserializationException`, not as schema misses or +registration errors. 
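As a non-normative illustration of the scalar conversion rules above, the following Java sketch shows three of the checks: the exact `String`-to-boolean match, the compatible numeric literal grammar with the `320` length bound, and the canonical `BigDecimal` scale. The class `ScalarConversionSketch` and its method names are hypothetical and are not Fory APIs. `IllegalArgumentException` stands in for `org.apache.fory.exception.DeserializationException` so that the sketch stays self-contained. The literal patterns are copied from the numeric literal grammar in the xlang serialization spec.

```java
import java.math.BigDecimal;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Fory API.
final class ScalarConversionSketch {
  // Length bound applied before any arbitrary-precision parsing.
  private static final int MAX_LITERAL_LENGTH = 320;

  // Integer and decimal floating literal forms from the xlang spec grammar.
  private static final Pattern NUMERIC_LITERAL = Pattern.compile(
      "-?(0|[1-9][0-9]*)"
          + "|-?(0|[1-9][0-9]*)\\.[0-9]+([eE]-?(0|[1-9][0-9]*))?"
          + "|-?(0|[1-9][0-9]*)[eE]-?(0|[1-9][0-9]*)");

  private ScalarConversionSketch() {}

  // Accepts exactly "0", "1", "false", and "true"; no trimming or case folding.
  static boolean stringToBoolean(String text) {
    switch (text) {
      case "1":
      case "true":
        return true;
      case "0":
      case "false":
        return false;
      default:
        // Assumption: a real reader would raise DeserializationException here.
        throw new IllegalArgumentException("not a compatible boolean string");
    }
  }

  // Checks only the literal grammar; target-domain range checks are out of scope.
  static boolean isCompatibleNumericLiteral(String text) {
    // Valid literals are ASCII, so the char count equals the byte count.
    // Non-ASCII input fails the grammar check below regardless of length.
    if (text.length() > MAX_LITERAL_LENGTH) {
      return false;
    }
    return NUMERIC_LITERAL.matcher(text).matches();
  }

  // Canonical converted form: zero and integers use scale 0; fractional values
  // use the smallest non-negative scale with no trailing fractional zeros.
  static BigDecimal canonicalDecimal(BigDecimal value) {
    if (value.signum() == 0) {
      return BigDecimal.ZERO;
    }
    BigDecimal stripped = value.stripTrailingZeros();
    // stripTrailingZeros turns 100 into 1E+2 (scale -2); raise the scale to 0.
    return stripped.scale() < 0 ? stripped.setScale(0) : stripped;
  }
}
```

The sketch omits the `256` digit bound on converted decimals and the lossless target-domain checks. A full reader would apply both after the grammar check and before constructing large powers of ten.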
+ ## Field Order Java native object serializers use the same deterministic field-order diff --git a/docs/specification/xlang_implementation_guide.md b/docs/specification/xlang_implementation_guide.md index ba1ee0a9c3..c54eb7687e 100644 --- a/docs/specification/xlang_implementation_guide.md +++ b/docs/specification/xlang_implementation_guide.md @@ -306,6 +306,8 @@ In Dart that internal owner is `StructSerializer`. - caching compatible read layouts - skipping unknown compatible fields - passing compatible read layouts explicitly to generated serializers +- classifying matched compatible fields that require top-level scalar + conversion and routing those fields through cold conversion helpers When `Config.compatible` is enabled and the struct is marked evolving: @@ -314,12 +316,31 @@ When `Config.compatible` is enabled and the struct is marked evolving: - reads map incoming fields by identifier and skip unknown fields - generated serializers apply matched fields directly while preserving their own object construction and default-value rules +- matched scalar fields may use compatible scalar conversion only when the + layout has classified a remote/local top-level scalar pair as lossless + convertible and both field schemas have `trackingRef = false` When `compatible` is disabled and `checkStructVersion` is enabled: - the runtime writes the schema hash for struct payloads - the read side checks that hash before reading fields +Compatible scalar conversion is owned by the compatible struct field reader or +the generated compatible layout action. Root facades, read/write contexts, type +resolvers, class resolvers, xlang type resolvers, and raw buffer utilities must +not expose public conversion APIs or carry conversion state. Resolvers may +provide field schema metadata for layout classification, but the conversion +decision and value adaptation stay with the serializer-owned compatible field +layout. 
Layout classification must reject top-level scalar conversions when +either matched schema has `trackingRef = true` and must reject same scalar type +pairs whose top-level `trackingRef` framing differs; converters must not add a +reference-table path for scalar mismatches. Same-schema readers with matching +reference and null/optional framing and schema-consistent readers must keep +direct scalar read paths without conversion branches or per-field conversion +objects. Same raw scalar types with different null/optional framing may still +use the compatible nullable/optional composition path when both fields are not +reference-tracked. + ## Meta Strings And Shared Type Metadata Two explicit pieces of state back xlang type metadata: diff --git a/docs/specification/xlang_serialization_spec.md b/docs/specification/xlang_serialization_spec.md index 482de9bdf2..6328b43a41 100644 --- a/docs/specification/xlang_serialization_spec.md +++ b/docs/specification/xlang_serialization_spec.md @@ -205,6 +205,130 @@ as a dense array element value, the local `array<T>` field must raise a compatible-read error. Null list elements must not be coerced to dense-array default values. +In schema-compatible mode only, a matched struct/class field may also read +between direct top-level scalar schemas when the remote value can be represented +by the local scalar schema without changing the logical value. This is a +compatible read adaptation only: writers keep emitting their local canonical +schema and payload, and TypeDef/ClassDef encodings, fingerprints, dynamic root +serialization, schema-consistent mode, unknown-field skipping, and container +element schemas continue to treat the original scalar types as distinct. + +The scalar conversion rule applies only to the immediate schema of the matched +compatible field. 
It does not apply to dynamic root values, `any`, map keys, map +values, list elements, set elements, array elements, union alternatives, enum +values, time/date/duration values, binary values, structs, ext values, or nested +generic/container positions. It also applies only when both the remote and local +top-level field schemas have `trackingRef = false`; if either matched field +schema has `trackingRef = true`, scalar conversion is outside the compatible +layout matrix and scalar type changes remain schema/type incompatible. Same +scalar type IDs with matching top-level `trackingRef` and null/optional framing +are exact same-schema direct reads, not compatible scalar conversion. Same +scalar type IDs with different top-level `trackingRef` framing are schema/type +incompatible because the wire framing differs. Same scalar type IDs with +different top-level null/optional framing may still use the nullable/optional +composition rule below when both fields have `trackingRef = false`. + +The convertible scalar domains are `bool`, `string`, and numeric scalars. +Numeric scalars are signed integers (`int8`, `int16`, `int32`, `int64`), +unsigned integers (`uint8`, `uint16`, `uint32`, `uint64`), floating point +(`float16`, `bfloat16`, `float32`, `float64`), and `decimal`. Integer encoding +variants are the same semantic domain as their base width: fixed, variable, and +tagged integer encodings do not create additional conversion domains. + +Compatible scalar conversion MUST follow these rules: + +- `string` to `bool` accepts exactly `"0"`, `"1"`, `"false"`, and `"true"`. + The match is byte-for-byte ASCII; readers MUST NOT trim whitespace, accept a + leading sign, accept other letter case, or use locale-specific text. +- `bool` to `string` produces canonical lower-case `"false"` or `"true"`. +- numeric to `bool` accepts only exact numeric zero and exact numeric one. `NaN` + and infinities fail. Negative floating zero is zero. 
Decimal scale does not + affect the zero/one check. +- `bool` to numeric produces exact zero or one in the local numeric domain. +- numeric to numeric succeeds only when the local numeric domain represents the + same mathematical value. Integer conversions check target range and signedness; + integer-to-floating conversions check exact representability in the target + floating domain; floating-to-integer conversions require a finite integral + value within range; floating-to-floating conversions require exact + preservation after converting to the target and back to the source, including + the sign of zero. Floating infinities may convert only when the target floating + domain preserves the same infinity. `NaN` is not convertible across different + floating type IDs. +- decimal is an exact numeric scalar. Integer-to-decimal conversion uses scale + `0`; decimal-to-integer conversion requires an integral value in range; + floating-to-decimal conversion requires a finite value and converts the exact + binary floating value to canonical decimal form; decimal-to-floating + conversion requires exact representability in the target floating domain. + Same-type decimal reads preserve the ordinary decimal payload. Decimal values + produced by conversion use the canonical converted decimal form below. +- `string` to numeric accepts only the compatible numeric literal grammar below + and then applies the same lossless target-domain checks. `"NaN"`, + `"Infinity"`, `"-Infinity"`, and spelling variants fail because numeric + strings are finite-only. +- numeric to `string` emits canonical finite numeric text. Integer sources emit + decimal text with no leading zeros except `"0"`. Floating sources emit exact + plain decimal text that equals the source value and parses back to the same + source floating type; it includes a decimal point and at least one fractional + digit, preserves negative zero as `"-0.0"`, never uses exponent notation, and + fails for `NaN` and infinities. 
Decimal sources emit exact plain decimal text + with no exponent and no insignificant trailing fractional zeros; decimal zero + is `"0"`. + +The compatible numeric literal grammar is deliberately stricter than host +language parsers: + +- no leading or trailing whitespace; +- no leading plus sign; +- ASCII grammar only: signs, digits, decimal points, and exponent markers are + the ASCII bytes `-`, `0` through `9`, `.`, `e`, and `E`; +- no Unicode decimal digits, underscores, grouping separators, locale-specific + digits, hexadecimal, octal, binary, or type suffixes; +- integer literal: `-?(0|[1-9][0-9]*)`; +- decimal floating literal: + `-?(0|[1-9][0-9]*)\.[0-9]+([eE]-?(0|[1-9][0-9]*))?` or + `-?(0|[1-9][0-9]*)[eE]-?(0|[1-9][0-9]*)`. + +Readers MUST parse numeric strings with exact decimal, rational, or equivalent +checked algorithms. Parsing through a host floating type and then casting is not +valid unless the implementation also proves exactness against the original +literal. + +Canonical converted decimal form is: + +- zero: `unscaled = 0`, `scale = 0`; +- non-zero integers: `scale = 0` and the integer as `unscaled`; +- finite fractional values: the smallest non-negative scale whose + `unscaled * 10^-scale` equals the value and whose `unscaled` is not divisible + by `10`. + +Compatible scalar conversion MUST reject a numeric string before arbitrary +precision parsing when the raw string length is greater than `320`. It MUST also +reject a converted decimal before constructing large powers of ten or formatting +plain decimal text when its canonical converted form would require an exponent or +scale outside `[-256, 256]`, a positive scale greater than `256`, an unscaled +decimal magnitude with more than `256` significant digits, or a negative scale +whose formatted integer digit count would exceed `256`. These bounds apply only +to values produced by compatible scalar conversion, including string-to-decimal, +decimal-to-string, and floating-to-decimal conversion. 
Same-type decimal reads +preserve the ordinary decimal payload. A bounded public decimal carrier may +reject smaller values when it cannot represent the value exactly. + +Nullable, boxed, optional, and nullable-field composition is supported for +matched scalar pairs whose top-level field schemas have `trackingRef = false`. +Readers first consume the remote null/optional framing described by the remote +field metadata. If a value is present, the reader converts the unwrapped scalar +value and then assigns or wraps it into the local carrier. If the remote value +is null or absent, the runtime uses the same missing/null compatible-field rule +it already applies for that local field; this feature does not introduce a +second null policy. Reference-tracked scalar conversion is not supported. + +Conversion failures are data errors, not schema misses. A schema pair outside +the conversion matrix remains a schema/type compatibility error when building +the compatible layout. Once a matched field is accepted as a scalar conversion +action, an invalid payload value MUST be reported through the runtime's +data-error owner with enough context to identify the remote type, local type, +and field when that owner has the information. + Users can also provide meta hints for fields of a type, or the type whole. Here is an example in java which use annotation to provide such information. diff --git a/docs/specification/xlang_type_mapping.md b/docs/specification/xlang_type_mapping.md index ba29e27d6c..64ed25ea3d 100644 --- a/docs/specification/xlang_type_mapping.md +++ b/docs/specification/xlang_type_mapping.md @@ -93,7 +93,7 @@ FDL spells them as an encoding modifier plus a semantic integer type. 
| duration | 37 | Duration | timedelta | Number | duration | Duration | Duration | TimeSpan | Duration | Duration | java.time.Duration | kotlin.time.Duration | | timestamp | 38 | Instant | datetime | Number | std::chrono::nanoseconds | Time | Timestamp | DateTime/DateTimeOffset | Date | Timestamp | java.time.Instant | java.time.Instant | | date | 39 | LocalDate | datetime.date | Date | fory::serialization::Date | fory.Date | Date | DateOnly | LocalDate | LocalDate | java.time.LocalDate | java.time.LocalDate | -| decimal | 40 | BigDecimal | Decimal | Decimal | / | fory.Decimal | fory::Decimal | decimal | Decimal | Decimal | java.math.BigDecimal | java.math.BigDecimal | +| decimal | 40 | BigDecimal | Decimal | Decimal | fory::serialization::Decimal | fory.Decimal | fory::Decimal | decimal | Decimal | Decimal | java.math.BigDecimal | java.math.BigDecimal | | binary | 41 | byte[] | bytes | / | `uint8_t[n]/vector<T>` | `[n]uint8/[]T` | `Vec<u8>` | byte[] | Data | Uint8List | Array[Byte] | ByteArray | | `array<bool>` (bool_array) | 43 | bool[] | BoolArray / ndarray(np.bool\_) | BoolArray / Type.boolArray() | `bool[n]` | `[n]bool/[]T` | `Vec<bool>` | bool[] | [Bool] + @ArrayField | BoolList | Array[Boolean] | BooleanArray | | `array<int8>` (int8_array) | 44 | `@Int8Type byte[]` | Int8Array / ndarray(int8) | Type.int8Array() | `int8_t[n]/vector<T>` | `[n]int8/[]T` | `Vec<i8>` | sbyte[] | [Int8] + @ArrayField | Int8List | Array[Byte] + metadata | ByteArray + @ArrayType | @@ -144,6 +144,10 @@ Notes: not apply inside nested collection, map, array, union, or generic positions. A peer `list<T>` payload that declares nullable or ref-tracked elements must raise a compatible-read error when the local matched field is `array<T>`. +- The table above remains the canonical xlang schema mapping. Compatible readers may apply the + scalar field adaptation rules defined by `xlang_serialization_spec.md` during schema-compatible + struct/class field matching. 
Those rules do not change TypeDef metadata, dynamic root type + mapping, schema-consistent mode, or nested collection/map/array/union/generic positions. ### Scala IDL Mapping
