On 3/15/2024 4:45 PM, Marvin Häuser wrote:
On 15. Mar 2024, at 23:57, Oliver Smith-Denny <o...@linux.microsoft.com> wrote:
I don't think this is what I'm saying. What I am trying to say is that
on MSVC, I see PE images getting created that have VirtualSize set to
the actual number of initialized bytes in that section (not padded to
the section alignment). On ElfConverted binaries, I see the VirtualSize
is padded to the section alignment. I've dropped an example below
Ah, mismatched terminology. Zero-initialized, as Ard and I used it, refers to
implicitly or explicitly 0-initialized global variables and such, which are not
stored in the file; it does not refer to the padding. So when you mentioned “real data”, I
assumed you meant strictly the non-0 data from the file. Same misunderstanding
with SizeOfImage, so that’s all fine. Whew. :)
Ah, gotcha, thanks. I figured I was using the wrong terminology :).
What you said makes sense and I see your concern based on what I said.
No, the specific case where I was researching this was explicitly
setting /ALIGN:0x10000 and /FILEALIGN:0x1000 for DXE_RUNTIME_DRIVERs
on ARM64 (a UEFI spec requirement). So I would see the SizeOfRawData
is aligned to the file alignment, as expected, but VirtualSize would
be the actual size of the data. Again, the troubling thing here for
me is that the same binary built with gcc has the VirtualSize aligned
to the section alignment. And I have seen other code that loads PE
images that relies on VirtualSize not including the padding. The spec
is vague here, it says VirtualSize is the size of the section as
loaded in memory (which would lead me to believe this should include
padding) but it does not explicitly say it should be a multiple of
the section alignment (as other fields do). But at a minimum I think
the different toolchains should behave the same way here.
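
A minimal sketch of how the two fields can be compared without dumpbin, assuming the header offsets from the PE/COFF specification (e_lfanew at 0x3C, SectionAlignment and FileAlignment at offsets 32 and 36 of the optional header, 40-byte section headers). The function names read_sections and virtual_size_is_padded are hypothetical, and nothing is validated beyond the PE signature:

```python
import struct

def read_sections(image: bytes):
    """Return (SectionAlignment, FileAlignment, [(name, VirtualSize, SizeOfRawData)])."""
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)          # e_lfanew
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    coff = pe_off + 4                                           # COFF file header
    (num_sections,) = struct.unpack_from("<H", image, coff + 2)
    (opt_size,) = struct.unpack_from("<H", image, coff + 16)    # SizeOfOptionalHeader
    opt = coff + 20                                             # optional header
    section_align, file_align = struct.unpack_from("<II", image, opt + 32)
    table = opt + opt_size                                      # section table
    sections = []
    for i in range(num_sections):
        name, vsize, _vaddr, raw_size = struct.unpack_from("<8sIII", image, table + 40 * i)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), vsize, raw_size))
    return section_align, file_align, sections

def virtual_size_is_padded(virtual_size: int, section_align: int) -> bool:
    """True if VirtualSize is a multiple of SectionAlignment."""
    return virtual_size % section_align == 0
```

A section whose real size happens to be an exact multiple of the section alignment would also report as padded, so this is only a heuristic for telling the two conventions apart.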
Well, not rounding to pad is somewhat superior in some scenarios. If you round,
you lose the information on what is section data and what is padding, so you
might end up treating padding as data for some reason (because it is
indistinguishable from the 0-initialized data mentioned above). This shouldn’t matter too
much for executables and libraries, but MSVC/PE have a lot less of a
distinction between object file and executable/library concepts (e.g. no
distinction between sections and segments). That might be why they do it this
way.
I agree, I've seen other environments that use VirtualSize to set the
attributes for that section and then set stricter attributes, like RP,
on the padding that follows (if it is larger than a page). The point being
that there are use cases for, and at least some folks relying on, that
definition of VirtualSize.
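
A rough sketch of that kind of loader logic, assuming a 4 KiB page size and a VirtualSize that is not padded. The helper split_section is hypothetical and only computes the address ranges; it does not set any memory attributes:

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)

def split_section(virtual_address, virtual_size, section_alignment, page_size=0x1000):
    """Split a loaded section into (data range, padding range), as [start, end) pairs.

    The data range covers VirtualSize rounded up to whole pages; the padding
    range is whatever remains up to the next section-alignment boundary.
    """
    data_end = virtual_address + align_up(virtual_size, page_size)
    section_end = virtual_address + align_up(virtual_size, section_alignment)
    return (virtual_address, data_end), (data_end, section_end)

# Unpadded VirtualSize with /ALIGN:0x10000: 0xF000 bytes of padding can get
# stricter attributes than the single data page.
data, padding = split_section(0x10000, 0x130, 0x10000)
# data == (0x10000, 0x11000), padding == (0x11000, 0x20000)

# VirtualSize already padded to the section alignment: the padding range is
# empty, so the loader can no longer tell padding from data.
data, padding = split_section(0x10000, 0x10000, 0x10000)
# data == (0x10000, 0x20000), padding == (0x20000, 0x20000)
```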
See below for the VirtualSize examples. I'm confused by your comment on
SizeOfImage. I agree that SizeOfImage covers everything as loaded into
memory and I have not seen any issues there.
See first comment.
Do you mind adding your RB to v2? And certainly if you have any other
comments that is greatly appreciated.
Will try to remember tomorrow. :)
Thanks!
Examples of the differences I see between MSVC and gcc binaries:
I originally noticed this on ARM64 on edk2, but wanted to make sure I
saw it on x64 too, so this is with binaries from Project Mu's QemuQ35Pkg
(edk2 doesn't have VS2022 support and I didn't feel like adding it
or reverting back to VS2019). For reference, this is building the
current top of tree at a4dd5d9785f48302c95fec852338d6bd26dd961a.
I dumped ReportStatusCodeRouterRuntimeDxe.efi from both using dumpbin
(from VS2022) to examine the PE headers.
MSVC selected header values:

  Main header:
    0x3200 size of code
    0x2400 size of initialized data
    0x0    size of uninitialized data
    0x1000 section alignment
    0x200  file alignment
    0xB000 size of image
    6 sections: .data, .pdata, .rdata, .reloc, .text, .xdata

  .text section:
    0x30DF virtual size
    0x3200 size of raw data

  .data section:
    0x130  virtual size
    0x200  size of raw data

GCC ElfConverted selected header values:

  Main header:
    0x4000 size of code
    0x1000 size of initialized data
    0x0    size of uninitialized data
    0x1000 section alignment
    0x1000 file alignment
    0x7000 size of image
    3 sections: .data, .text, .reloc

  .text section:
    0x4000 virtual size
    0x4000 size of raw data

  .data section:
    0x1000 virtual size
    0x1000 size of raw data

So my concern here is that ElfConvert takes a
different view of VirtualSize, that it should be
section aligned, whereas MSVC binaries take
VirtualSize to be the actual size without padding.
I think the correct thing to do would be to change
ElfConvert to do what MSVC does (although the spec
is vague and so I can understand the confusion).
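
To make the rounding concrete, here is a small sketch that rounds the MSVC VirtualSize values from the dump up to each alignment. The two toolchains emit different code, so this only illustrates the two conventions; it does not show that the GCC sections contain the same number of bytes. The helper names are hypothetical:

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)

# VirtualSize values taken from the MSVC dump.
MSVC_VIRTUAL_SIZE = {".text": 0x30DF, ".data": 0x130}

def rounded_sizes(virtual_sizes, alignment):
    """Round every section's VirtualSize up to the given alignment."""
    return {name: align_up(size, alignment) for name, size in virtual_sizes.items()}

for label, alignment in (("file alignment 0x200", 0x200),
                         ("section alignment 0x1000", 0x1000)):
    sizes = rounded_sizes(MSVC_VIRTUAL_SIZE, alignment)
    print(label, {name: hex(size) for name, size in sizes.items()})
```

Rounding to the file alignment gives 0x3200 and 0x200, the MSVC SizeOfRawData values. Rounding to the section alignment gives 0x4000 and 0x1000, which is the shape of the VirtualSize values in the ElfConverted binary.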
I don’t think it really matters, but it wouldn’t hurt either. Both kinds of
binaries are in the wild, so you cannot really leverage any of the choices’
advantages either way. Adjusting to MSVC’s behaviour would be right though, as
you can at least properly distinguish between padding and 0-data with new
binaries.
Yeah, I agree. I may take a look at this, but to your
point from an earlier email, it may be safer to leave
it as is, considering we have all these binaries in the
wild already and changes in a broken environment are
risky.
Thanks,
Oliver