mkreloc: fix obtaining PE image base address

Roger Pau Monné Thu, 10 Apr 2025 00:20:44 -0700

On Tue, Apr 08, 2025 at 02:34:48PM +0200, Jan Beulich wrote:
> On 08.04.2025 13:21, Roger Pau Monné wrote:
> > On Wed, Apr 02, 2025 at 09:46:53AM +0200, Jan Beulich wrote:
> >> x86/EFI: correct mkreloc header (field) reading
> >>
> >> With us now reading the full combined optional and NT headers, the
> >> subsequent reading of (and seeking to) NT header fields is wrong. Since
> >> PE32 and PE32+ NT headers are different anyway (beyond the image base
> >> oddity extending across both headers), switch to using a union. This
> >> allows to fetch the image base more directly then.
> >>
> >> Additionally add checking to map_section(), which would have caught at
> >> least the wrong (zero) image size that we previously used.
> >>
> >> Fixes: f7f42accbbbb ("x86/efi: Use generic PE/COFF structures")
> >> Reported-by: Roger Pau Monné <roger....@citrix.com>
> >> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> >> ---
> >> Of the two checks added to map_section(), the 1st ends up being largely
> >> redundant with the 2nd one. Should we use just the latter?
> >>
> >> Also sanity checking the image base would be possible, but more
> >> cumbersome if we wanted to check moret than just "is in high half of
> >> address space). Therefore I've left out doing so.
> > 
> > We could likely check that image_base >= XEN_VIRT_START?  However I'm
> > not sure how easy it is to make XEN_VIRT_START available to mkreloc.
> 
> This is precisely why I said "more cumbersome".
> 
> >> @@ -54,31 +56,40 @@ static unsigned int load(const char *nam
> >>  
> >>      if ( lseek(in, mz_hdr.peaddr, SEEK_SET) < 0 ||
> >>           read(in, &pe_hdr, sizeof(pe_hdr)) != sizeof(pe_hdr) ||
> >> -         read(in, &pe32_opt_hdr, sizeof(pe32_opt_hdr)) != 
> >> sizeof(pe32_opt_hdr) ||
> >> -         read(in, &base, sizeof(base)) != sizeof(base) ||
> >> -         /*
> >> -          * Luckily the image size field lives at the
> >> -          * same offset for both formats.
> >> -          */
> >> -         lseek(in, 24, SEEK_CUR) < 0 ||
> >> -         read(in, image_size, sizeof(*image_size)) != sizeof(*image_size) 
> >> )
> >> +         (read(in, &pe32_opt_hdr.pe, sizeof(pe32_opt_hdr.pe)) !=
> >> +          sizeof(pe32_opt_hdr.pe)) )
> >>      {
> >>          perror(name);
> >>          exit(3);
> >>      }
> >>  
> >>      switch ( (pe_hdr.magic == PE_MAGIC &&
> >> -              pe_hdr.opt_hdr_size > sizeof(pe32_opt_hdr)) *
> >> -              pe32_opt_hdr.magic )
> >> +              pe_hdr.opt_hdr_size > sizeof(pe32_opt_hdr.pe)) *
> >> +              pe32_opt_hdr.pe.magic )
> >>      {
> >>      case PE_OPT_MAGIC_PE32:
> >>          *width = 32;
> >> -        *image_base = base;
> >> +        *image_base = pe32_opt_hdr.pe.image_base;
> >> +        *image_size = pe32_opt_hdr.pe.image_size;
> >>          break;
> >>      case PE_OPT_MAGIC_PE32PLUS:
> >> -        *width = 64;
> >> -        *image_base = ((uint64_t)base << 32) | pe32_opt_hdr.data_base;
> >> -        break;
> >> +        if ( pe_hdr.opt_hdr_size > sizeof(pe32_opt_hdr.pep) )
> >> +        {
> >> +            if ( read(in,
> >> +                      &pe32_opt_hdr.pe + 1,
> >> +                      sizeof(pe32_opt_hdr.pep) - sizeof(pe32_opt_hdr.pe)) 
> >> !=
> >> +                 sizeof(pe32_opt_hdr.pep) - sizeof(pe32_opt_hdr.pe) )
> >> +            {
> >> +                perror(name);
> >> +                exit(3);
> >> +            }
> >> +
> >> +            *width = 64;
> >> +            *image_base = pe32_opt_hdr.pep.image_base;
> >> +            *image_size = pe32_opt_hdr.pep.image_size;
> >> +            break;
> >> +        }
> > 
> > Since you are already refactoring much of this code, won't it be
> > clearer to fetch the header inside of the switch cases.  So that
> > there's a single read call for each header type?
> 
> Except that the switch() itself uses not only pe_hdr, but also
> pe32_opt_hdr. That could be re-arranged, but I'm a little reluctant to
> do so.


Hm, I see, the magic field checked here is in the extended header, so
we would need to fetch it ahead of the switch in any case.  How
unhelpful.

One thing that I find weird about this code is the obfuscation of the
switch condition, won't it be easier to read as:

if ( pe_hdr.magic != PE_MAGIC ||
     pe_hdr.opt_hdr_size < sizeof(pe32_opt_hdr) )
    fprintf(stderr,
            "%s: Wrong PE magic or missing optional header\n", name);
    exit(3);
}

switch ( pe32_opt_hdr.magic )
{
...

I would assume the current arrangement is done as to reuse the
`default` error label, but IMO that switch condition is too hard to
parse.

> >> @@ -108,11 +119,28 @@ static unsigned int load(const char *nam
> >>  static long page_size;
> >>  
> >>  static const void *map_section(const struct section_header *sec, int in,
> >> -                               const char *name)
> >> +                               const char *name, uint_fast32_t image_size)
> >>  {
> >>      const char *ptr;
> >>      unsigned long offs;
> >>  
> >> +    if ( sec->rva > image_size )
> > 
> > Strictly, should this be >=, as rva is a position, and image_size is a
> > size, so the last allowed bit would be image_size - 1?
> 
> Yes and no. No in so far as this would be wrong for zero-size sections.
> Yet see also the first of the two post-commit-message remarks.

Hm, yes, don't have a strong opinion really, just leave it like that
I guess.

Thanks, Roger.

Re: [PATCH v2 02/11] x86/mkreloc: fix obtaining PE image base address

Reply via email to