On 07.07.2025 17:51, Frediano Ziglio wrote:
> On Mon, Jul 7, 2025 at 4:42 PM Jan Beulich <jbeul...@suse.com> wrote:
>>
>> On 07.07.2025 17:11, Frediano Ziglio wrote:
>>> The EFI code path splits the options from the EFI LoadOptions field into 2
>>> pieces: first the EFI options, second the Xen options.
>>> The "get_argv" function is called first to get the number of arguments
>>> in LoadOptions and then, after allocating enough space, to
>>> fill the "argc"/"argv" variables. However, the first parse could
>>> differ from the second, as the second is able to detect the "--" argument
>>> separator. So it was possible for "argc" to be bigger than the "argv"
>>> array, leading to potential buffer overflows. In particular,
>>> a string like "-- a b c" would lead to a buffer overflow in "argv",
>>> resulting in crashes.
>>> Using the EFI shell it is possible to pass any kind of string in
>>> LoadOptions.
>>>
>>> Fixes: 201f261e859e ("EFI: move x86 boot/runtime code to common/efi")
>>
>> This only moves the function, but doesn't really introduce any issue afaics.
>>
> 
> Okay, I'll follow the rename
> 
>>> --- a/xen/common/efi/boot.c
>>> +++ b/xen/common/efi/boot.c
>>> @@ -345,6 +345,7 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>>>                                      VOID *data, UINTN size, UINTN *offset,
>>>                                      CHAR16 **options)
>>>  {
>>> +    CHAR16 **const orig_argv = argv;
>>>      CHAR16 *ptr = (CHAR16 *)(argv + argc + 1), *prev = NULL, *cmdline = NULL;
>>>      bool prev_sep = true;
>>>
>>> @@ -384,7 +385,7 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>>>                  {
>>>                      cmdline = data + *offset;
>>>                      /* Cater for the image name as first component. */
>>> -                    ++argc;
>>> +                    ++argv;
>>
>> We're on the argc == 0 and argv == NULL path here. Incrementing NULL is UB,
>> if I'm not mistaken.
> 
> Not as far as I know. Why?

Increment and decrement operators are like additions. For additions the standard
says: "For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have integer
type." Neither of the alternatives is true for NULL.

> Some systems even can use NULL pointers as valid, like mmap.

Right, but that doesn't make the use of NULL C-compliant.
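
As a minimal sketch of the alternative (a hypothetical helper, not the Xen
code): keeping the running count in an integer and touching the output array
only when it is non-NULL avoids any arithmetic on a null pointer. The name
split_words and the use of plain char strings instead of CHAR16 are
assumptions made purely for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not Xen code: count space-separated words in s.
 * If out is non-NULL, also store a pointer to the start of each word.
 * The count lives in an integer, so no arithmetic is ever performed on
 * a null pointer, even on the counting-only pass.
 */
static size_t split_words(const char *s, const char **out)
{
    size_t n = 0;
    int prev_sep = 1;

    for ( ; *s; ++s )
    {
        int cur_sep = (*s == ' ');

        /* A word starts where a non-separator follows a separator. */
        if ( !cur_sep && prev_sep )
        {
            if ( out )
                out[n] = s;
            ++n;
        }
        prev_sep = cur_sep;
    }

    return n;
}
```

Calling split_words(s, NULL) gives the count only; calling it again with an
array of at least that many entries fills in the word starts.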

>>> @@ -402,7 +403,7 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>>>          {
>>>              if ( cur_sep )
>>>                  ++ptr;
>>> -            else if ( argv )
>>> +            else if ( orig_argv )
>>>              {
>>>                  *ptr = *cmdline;
>>>                  *++ptr = 0;
>>> @@ -410,8 +411,8 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>>>          }
>>>          else if ( !cur_sep )
>>>          {
>>> -            if ( !argv )
>>> -                ++argc;
>>> +            if ( !orig_argv )
>>> +                ++argv;
>>>              else if ( prev && wstrcmp(prev, L"--") == 0 )
>>>              {
>>>                  --argv;
>>
>> As per this, it looks like that on the 1st pass we may indeed overcount
>> arguments. But ...
>>
> 
> I can use argc again if you prefer; I don't feel strongly about it.
> 
>>> @@ -428,9 +429,9 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>>>          }
>>>          prev_sep = cur_sep;
>>>      }
>>> -    if ( argv )
>>> +    if ( orig_argv )
>>>          *argv = NULL;
>>> -    return argc;
>>> +    return argv - orig_argv;
>>>  }
>>>
>>>  static EFI_FILE_HANDLE __init get_parent_handle(const EFI_LOADED_IMAGE *loaded_image,
>>> @@ -1348,8 +1349,8 @@ void EFIAPI __init noreturn efi_start(EFI_HANDLE ImageHandle,
>>>                                    (argc + 1) * sizeof(*argv) +
>>>                                        loaded_image->LoadOptionsSize,
>>>                                    (void **)&argv) == EFI_SUCCESS )
>>> -            get_argv(argc, argv, loaded_image->LoadOptions,
>>> -                     loaded_image->LoadOptionsSize, &offset, &options);
>>> +            argc = get_argv(argc, argv, loaded_image->LoadOptions,
>>> +                            loaded_image->LoadOptionsSize, &offset, &options);
>>
>> ... wouldn't this change alone cure that problem? And even that I don't
>> follow. Below here we have
>>
>>         for ( i = 1; i < argc; ++i )
>>         {
>>             CHAR16 *ptr = argv[i];
>>
>>             if ( !ptr )
>>                 break;
>>
>> and the 2nd pass of get_argv() properly terminates the (possibly too large)
>> array with a NULL sentinel. So I wonder what it is that I'm overlooking and
>> that is broken.
> 
> I realized that because I got a crash, not just by looking at the code.
> 
> The string was something like "-- a b c d":

That's in the "plain command line" case or the LOAD_OPTIONS one? In the
former case the image name should come first, aiui. And in the latter case
the 2nd pass sets argv[0] to NULL very early, increments the pointer, and
hence at the bottom of the function argv[1] would also be set to NULL.
Aiui at least, i.e. ...

> - the first get_argv call produces an argc of 5;
> - you allocate space for 6 pointers plus the length of the entire string to copy;
> - the parser writes a single pointer in argv and still returns 5 as argc;
> - the returned argc is ignored;
> - the code "for (i = 1; i < argc; ++i)" starts accessing argv[1], which is
> not initialized; if it holds garbage, you dereference garbage.

... I don't see how argv[1] can hold garbage.
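
For illustration only, the general pattern under discussion (a counting pass,
an allocation, then a filling pass that knows about "--") can be sketched in
plain C as below. This is a simplified stand-in with hypothetical names,
using char rather than CHAR16 and ignoring the image-name handling; it is not
the Xen get_argv() and makes no claim about which of the two readings above
is right. It only shows that the two passes can disagree on the count, and
that a caller relying on the fill pass's return value and NULL sentinel stays
within the initialized entries.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified two-pass splitter; NOT the Xen get_argv().
 * With argv == NULL it only counts space-separated words.
 * With argv set it copies words into buf, stops at a "--" word, and
 * NULL-terminates the array. buf must hold at least strlen(s) + 1 bytes
 * and argv at least (count from the first pass) + 1 entries.
 * Returns the number of words counted or stored in that pass.
 */
static unsigned int split(const char *s, char **argv, char *buf)
{
    unsigned int n = 0;

    while ( *s )
    {
        size_t len;

        while ( *s == ' ' )
            ++s;
        if ( !*s )
            break;
        len = strcspn(s, " ");

        if ( argv )
        {
            /* Only the fill pass recognises the separator. */
            if ( len == 2 && !strncmp(s, "--", 2) )
                break;
            memcpy(buf, s, len);
            buf[len] = '\0';
            argv[n] = buf;
            buf += len + 1;
        }
        ++n;
        s += len;
    }

    if ( argv )
        argv[n] = NULL;     /* sentinel right after the filled entries */

    return n;
}
```

With the input "-- a b c d" the counting pass of this sketch returns 5, while
the fill pass returns 0 and writes only the sentinel into the first entry, so
a caller that keeps the first-pass count must stop at the sentinel.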

Jan
