On Sunday 10 November 2024 00:14:42 Pali Rohár wrote:
> On Saturday 09 November 2024 21:30:53 Lasse Collin wrote:
> > On 2024-11-09 LIU Hao wrote:
> > > Really, I don't think this should be fixed in the CRT. It should be
> > > fixed by sanitizing the result in `GetCommandLineA()`. Reverting the
> > > commit makes sense if Microsoft will fix it sooner or later.
> > 
> > My comment approving a revert was made in the context that a better
> > fix would then be done, for example one that includes a GUI dialog to
> > display the error in GUI apps.
> > 
> > > I have a crazy idea now. Does it make sense to overwrite `_acmdln`
> > > (for MSVCRT) or `*__p__acmdln()` (for UCRT) with a sanitized string,
> > > so the existing argument parsing can be reused?
> 
> It looks like both _acmdln and _wcmdln are initialized in the CRT DLL
> entry point, and all other calls use these variables;
> GetCommandLineA() and GetCommandLineW() are not called later.
> 
> So from this quick look, it should be enough to change _acmdln in
> mingw-w64 startup code as early as possible and then __getmainargs()
> should work fine (it also uses _acmdln and not GetCommandLineA(), at
> least in msvcrt.dll).

I also looked at the UCRT source code and it seems that this should work.
UCRT's _configure_narrow_argv() also takes the command line string from
_acmdln, and _acmdln is initialized in the UCRT DLL entry point via
GetCommandLineA().

So in my opinion, overwriting the content of _acmdln in the EXE entry
point should be enough, and __getmainargs() would then work correctly.

Note that both _acmdln and _wcmdln are always initialized, for both ANSI
and UNICODE builds. So as a simple sanitization, would something like
this be enough?

Iterate over the (multibyte) string _acmdln and, for every double quote
found, check that a double quote also exists at _wcmdln[iter] (where iter
is the character index of the iteration over the multibyte _acmdln
string). If a double quote is in *_acmdln_iter but not in _wcmdln[iter],
then change *_acmdln_iter to some other character.

This should ensure that any code which parses _acmdln sees only the
original double quotes and not best-fit substitutes for double quotes.
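
A minimal, untested sketch of that iteration, to make the idea concrete.
It is written against plain buffers instead of the real _acmdln and
_wcmdln so it can be compiled anywhere. The function name, the
lead_byte_fn callback (which on Windows could wrap IsDBCSLeadByteEx())
and the replacement parameter are all my own placeholders, not anything
from the CRT. It also assumes one wide code unit per narrow character,
so surrogate pairs are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical predicate: nonzero if `b` starts a two-byte character in
 * the active code page. Pass NULL for single-byte code pages. */
typedef int (*lead_byte_fn)(unsigned char b);

/* Walk the narrow and wide command lines in parallel, one character at a
 * time. Every narrow '"' whose wide counterpart is not L'"' is treated as
 * a best-fit substitute and overwritten with `replacement`.
 * Returns the number of bytes replaced. */
static size_t sanitize_best_fit_quotes(char *narrow, const wchar_t *wide,
                                       lead_byte_fn is_lead, char replacement)
{
    size_t replaced = 0;
    size_t ni = 0; /* byte index into narrow */
    size_t wi = 0; /* code unit index into wide */

    assert(narrow != NULL && wide != NULL);

    while (narrow[ni] != '\0' && wide[wi] != L'\0') {
        if (is_lead != NULL && is_lead((unsigned char)narrow[ni])
            && narrow[ni + 1] != '\0') {
            ni += 2; /* skip lead and trail byte together */
        } else {
            if (narrow[ni] == '"' && wide[wi] != L'"') {
                narrow[ni] = replacement;
                replaced++;
            }
            ni += 1;
        }
        wi += 1; /* assumption: one wide unit per narrow character */
    }
    return replaced;
}
```

In the startup code this would be called on _acmdln and _wcmdln before
__getmainargs(). Which replacement character is safe to use is still an
open question, which is why it is left as a parameter here.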

> > If wildcard expansion is enabled, things can still go wrong if a
> > wildcard matches a filename that cannot be converted losslessly to the
> > active code page.
> 
> Maybe a stupid question, but what happens when you try to list a folder
> containing files whose names are all the same in the active code page?
> Imagine an application which does not use argv[] at all; it lists the
> files in the current directory and prints, for example, the first byte
> of every file. What would happen in this case? Isn't this the same
> problem as with wildcard expansion?
> 
> > If wildcard expansion is disabled, then it should be enough to verify
> > that GetCommandLineW() can be losslessly converted to the active code
> > page. Then the existing narrow parser should be safe.
> > 
> > -- 
> > Lasse Collin
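
For the lossless-conversion check, a rough portable sketch of the idea
is a round trip: convert the wide string to narrow and back, then
compare with the original. Here wcstombs()/mbstowcs() with the current C
locale stand in for the active code page, which is an assumption made
only so that it compiles outside Windows. The function name is
hypothetical. On Windows the real check would more likely use
WideCharToMultiByte() with WC_NO_BEST_FIT_CHARS and look at
lpUsedDefaultChar.

```c
#include <stdlib.h>
#include <wchar.h>

/* Returns 1 if `wide` survives a wide -> narrow -> wide round trip in the
 * current locale unchanged, 0 otherwise (including allocation failure). */
static int converts_losslessly(const wchar_t *wide)
{
    size_t wide_len = wcslen(wide);
    size_t narrow_len;
    char *narrow;
    wchar_t *back;
    int ok = 0;

    /* With a NULL destination, wcstombs() only reports the needed size,
     * or (size_t)-1 if some character cannot be converted at all. */
    narrow_len = wcstombs(NULL, wide, 0);
    if (narrow_len == (size_t)-1)
        return 0;

    narrow = malloc(narrow_len + 1);
    back = malloc((wide_len + 1) * sizeof *back);
    if (narrow != NULL && back != NULL) {
        size_t back_len;

        wcstombs(narrow, wide, narrow_len + 1);
        back_len = mbstowcs(back, narrow, wide_len + 1);
        /* A best-fit or default-character substitution shows up here as
         * a different length or different content. */
        ok = back_len == wide_len && wmemcmp(back, wide, wide_len) == 0;
    }
    free(narrow);
    free(back);
    return ok;
}
```

If this returns 0 for the wide command line, the narrow parser cannot be
trusted and the startup code would have to report an error or fall back
to something else.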


_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
