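The IsDBCSLeadByte[Ex] functions take a BYTE argument, which is unsigned char,
so there should be no sign-extension issue: a char argument is converted to
unsigned char instead of being widened. _ismbblead takes unsigned int, so it
should be affected by sign extension just like isleadbyte. A small test: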

```
#include <stdio.h>

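/* Parameter is unsigned char, like BYTE in IsDBCSLeadByte. */
void as_byte (unsigned char c) {
  printf ("%X\n", c);
}

/* Parameter is unsigned int, like the argument of _ismbblead. */
void as_uint (unsigned c) {
  printf ("%X\n", c);
}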

int main (void) {
  char c = (char) 0x80;
  as_byte (c);
  as_uint (c);
  return 0;
}
```

When compiled and run on a target where plain char is signed, it prints:

```
80
FFFFFF80
```
________________________________
From: Pali Rohár <[email protected]>
Sent: Thursday, September 25, 2025 5:49 AM
To: [email protected] 
<[email protected]>
Cc: Martin Storsjö <[email protected]>; LIU Hao <[email protected]>; Kirill 
Makurin <[email protected]>
Subject: Re: [PATCH 1/2] crt: Replace _ismbblead() by IsDBCSLeadByte() in 
crtexewin.c

Now I'm thinking about this change. Isn't there a sign-extension issue here too?
Does IsDBCSLeadByte take signed char or unsigned char as its argument?

According to the MS docs:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/ismbblead-ismbblead-l
https://learn.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-isdbcsleadbyte

_ismbblead takes unsigned int
IsDBCSLeadByte takes BYTE

I have a feeling that an explicit cast to (unsigned char) should have been
used in both cases. But could it work without it?

On Wednesday 24 September 2025 18:10:30 Pali Rohár wrote:
> crtexewin.c in non-UNICODE mode parses the Windows command line string. This
> string is stored in the ACP. There is no CRT function that is guaranteed to
> work in the ACP and checks whether a character is a lead byte.
>
> The CRT function isleadbyte() uses the codepage from the CRT's current
> locale, which may differ from the ACP.
>
> The CRT function _ismbblead() uses the codepage set by the CRT function
> _setmbcp(), which may also differ from the ACP (but by default should be ACP).
>
> So when parsing Windows command line arguments, use the WinAPI function
> IsDBCSLeadByte() which always works according to ACP and hence should be
> the right one for this purpose.
> ---
>  mingw-w64-crt/crt/crtexewin.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/mingw-w64-crt/crt/crtexewin.c b/mingw-w64-crt/crt/crtexewin.c
> index af860f3e76da..52d12bd12029 100644
> --- a/mingw-w64-crt/crt/crtexewin.c
> +++ b/mingw-w64-crt/crt/crtexewin.c
> @@ -7,10 +7,6 @@
>  #include <tchar.h>
>  #include <corecrt_startup.h>
>
> -#ifndef _UNICODE
> -#include <mbctype.h>
> -#endif
> -
>  #define SPACECHAR _T(' ')
>  #define DQUOTECHAR _T('\"')
>
> @@ -40,7 +36,7 @@ int _tmain (int      __UNUSED_PARAM(argc),
>            if (*lpCmdLine == DQUOTECHAR)
>              inDoubleQuote = !inDoubleQuote;
>  #ifndef _UNICODE
> -          if (_ismbblead (*lpCmdLine))
> +          if (IsDBCSLeadByte (*lpCmdLine))
>              {
>                if (lpCmdLine[1])
>                  ++lpCmdLine;
> --
> 2.20.1
>

