On 5/9/2025 3:08 PM, Burakov, Anatoly wrote:
On 5/9/2025 3:02 PM, Burakov, Anatoly wrote:
On 5/8/2025 3:16 PM, Anatoly Burakov wrote:
Remove custom number parser and use C standard library instead. In order to
keep compatibility with earlier versions of the parser, we have to take
into account a couple of quirks:

- We did not consider "negative" numbers to be valid for anything other
   than base-10 numbers, whereas C standard library does. Adjust the tests
   to match the new behavior.
- We did not consider numbers such as "+4" to be valid, whereas C
   standard library does. Adjust the tests to match the new behavior.
- C standard library's strtoull does not do range checks on negative
   numbers, so we have to parse knowingly-negative numbers as signed.
- C standard library does not support binary numbers, so we keep the
   relevant parts of the custom parser to support binary numbers.

Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
---

Notes:
     v7 -> v8:
     - Added the commented-out out-of-bounds check back
     - Replaced debug print messages to ensure they don't attempt to
       index the num_help[] array (should fix compile errors)
     v5 -> v6:
     - Allowed more negative numbers (such as negative octals or hex)
     - Updated unit tests to check new cases
     - Small refactoring of code to reduce amount of noise
     - More verbose debug output
     v4 -> v5:
     - Added this commit

There is a unit test failure caused specifically by this commit, and it only happens on ARM. Log:

Error: parsing -0b0111010101 as INT16 succeeded!

That is, when confronted with a negative binary string, strtoll seems to report success on ARM, whereas it reports failure on other platforms. I'm confused, is libc strtoll different on ARM? I don't have an ARM platform available to test this, so I don't know why this is happening.


Correction: it seems that newer libc versions have added support for binary formats. I'll therefore amend the tests to account for that.


The specific announcement regarding binary formatted strings:

https://sourceware.org/pipermail/libc-alpha/2023-July/150524.html
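
A quick way to see which behavior a given build gets is a small runtime probe. This is only a sketch: the function name is hypothetical, and as far as I understand the outcome depends on both the libc version and the feature macros or language mode in effect at compile time.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical probe: returns 1 if strtoll() in this build consumes a
 * "0b"-prefixed string as binary when base is 0, and 0 otherwise.
 */
static int libc_strtoll_parses_binary(void)
{
	const char *s = "0b101";
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 0);
	/*
	 * Without binary support only the leading "0" is consumed, so the
	 * result is 0 and end is left pointing at the 'b'.
	 */
	return errno == 0 && *end == '\0' && v == 5;
}
```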

Since our binary parser will only be a fallback implementation starting from that version of libc, I think it would be easier to add negative binary support to our custom parser, and so support negative binary formats regardless of libc version, than to differentiate between libc versions.
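
A minimal sketch of what a sign-aware binary fallback could look like follows. The function name, the error codes and the choice to also accept a leading '+' are my assumptions for illustration, not the actual parser code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical fallback: parse "[+-]0b<digits>" (or "0B") into a signed
 * 64-bit value. Returns 0 on success, -EINVAL on a malformed string and
 * -ERANGE if the value does not fit.
 */
static int parse_binary(const char *s, int64_t *out)
{
	uint64_t acc = 0;
	int neg = 0;

	if (*s == '-' || *s == '+') {
		neg = (*s == '-');
		s++;
	}
	if (s[0] != '0' || (s[1] != 'b' && s[1] != 'B'))
		return -EINVAL;
	s += 2;
	if (*s != '0' && *s != '1')
		return -EINVAL; /* prefix with no digits */

	for (; *s == '0' || *s == '1'; s++) {
		if (acc >> 63)
			return -ERANGE; /* the next shift would drop a bit */
		acc = (acc << 1) | (uint64_t)(*s - '0');
	}
	if (*s != '\0')
		return -EINVAL; /* trailing garbage */

	/* Magnitude limit is 2^63 for negative values, 2^63 - 1 otherwise. */
	if (acc > (uint64_t)INT64_MAX + (neg ? 1u : 0u))
		return -ERANGE;
	if (neg)
		*out = (acc == (uint64_t)INT64_MAX + 1) ? INT64_MIN : -(int64_t)acc;
	else
		*out = (int64_t)acc;
	return 0;
}
```

With this sketch, the string from the failing test, -0b0111010101, parses to -469 on any libc; the caller would then apply the per-type range check (INT16 in that test) as it does for other bases.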

--
Thanks,
Anatoly
