ayush00git opened a new issue, #3758:
URL: https://github.com/apache/fory/issues/3758

   ## Summary
   
   The C++ row encoder (`cpp/fory/encoder`) passes container element counts 
into the row writers without checking that they fit the integer types the 
writers use. For containers large enough to overflow those types, encoding 
produces out-of-bounds heap writes. These are encode-side bugs, reachable 
whenever an application encodes very large (or attacker-influenced) containers.
   
   ## 1. Container size truncated to `uint32_t` (heap overflow)
   
   `ArrayWriter::reset` takes a `uint32_t` element count, but the encoder 
passes the container's `size()`, which is a `size_t`. On a 64-bit platform, the 
count of a container with `>= 2^32` elements is silently truncated modulo 
`2^32`. For example, a `std::vector<bool>` of `2^32` elements needs only about 
512 MiB of RAM, so this is reachable.
   
   After truncation, the writer allocates space for the truncated (small) 
count, but the encoder still loops over **all** the real elements and writes 
each one into the buffer without a bounds check. The result is a heap write far 
past the allocation.
   
   The existing size check inside `ArrayWriter::reset` runs on the 
already-truncated value, so it does not catch this.
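   
   A minimal sketch of the truncation, not the actual Fory code: 
`count_seen_by_writer` is a hypothetical stand-in for any entry point that 
accepts a 32-bit count, and `std::uint64_t` models a 64-bit `size_t`.

```cpp
#include <cstdint>

// Hypothetical stand-in for a writer entry point taking a 32-bit count.
constexpr std::uint32_t count_seen_by_writer(std::uint32_t num_elements) {
  return num_elements;
}

// Models passing a 64-bit container size to that entry point. The explicit
// cast mirrors the implicit narrowing conversion: the value is reduced
// modulo 2^32, so the writer sees a much smaller count.
constexpr std::uint32_t truncated_count(std::uint64_t real_size) {
  return count_seen_by_writer(static_cast<std::uint32_t>(real_size));
}
```

   Under these assumptions a container of exactly `2^32` elements reaches the 
writer as a count of `0`, and `2^32 + 5` elements as a count of `5`, while the 
encoder loop still visits every real element.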
   
   ## 2. Signed overflow of the element loop counter
   
   The encoder iterates over container elements with a signed `int` counter. 
For containers with `2^31` or more elements (assuming a 32-bit `int`), 
incrementing that counter past `INT_MAX` is signed-overflow undefined behavior, 
and a wrapped negative index makes the writer compute an offset **before** the 
start of the buffer. This is the same class of memory-safety bug as (1), but it 
triggers at half the element count.
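   
   To illustrate why a negative index is dangerous, here is a sketch that 
assumes the writer computes element offsets as `base + index * width`. The 
function name and the formula are hypothetical, not taken from the Fory 
sources. The sketch also assumes the counter wraps to `INT32_MIN`, which is 
commonly observed on two's-complement hardware but is not guaranteed, since the 
overflow itself is undefined behavior.

```cpp
#include <cstdint>

// Hypothetical model of a byte offset for element `index`:
// `base` is where the element region starts, `width` is the element size.
constexpr std::int64_t element_offset(std::int64_t base, std::int32_t index,
                                      std::int64_t width) {
  // Widen before multiplying so the model itself has no overflow.
  return base + static_cast<std::int64_t>(index) * width;
}
```

   With `base = 64` and `width = 8`, an index of `INT32_MIN` yields a negative 
offset, meaning an address before the start of the buffer.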
   
   ## Suggested fix
   
   Validate the element count once at the encoder entry points, before any 
writes happen, so an oversized container fails cleanly instead of silently 
truncating. Use an unsigned, `size_t`-wide counter for the element loops (or 
bound the count up front).
   
   ## Environment
   
   - Module: `cpp/fory/encoder` (row encoder) driving `cpp/fory/row` writers.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

