> -----Original Message----- > From: Laszlo Ersek [mailto:ler...@redhat.com] > Sent: Thursday, April 18, 2019 5:39 PM > To: devel@edk2.groups.io; Justen, Jordan L <jordan.l.jus...@intel.com>; > Andrew Fish <af...@apple.com> > Cc: Kinney, Michael D <michael.d.kin...@intel.com>; Gao, Liming > <liming....@intel.com> > Subject: Re: [edk2-devel] [PATCH 02/10] MdePkg/PiFirmwareFile: fix undefined > behavior in SECTION_SIZE > > On 04/17/19 21:35, Jordan Justen wrote: > > On 2019-04-17 07:59:41, Laszlo Ersek wrote: > >> On 04/17/19 13:44, Andrew Fish wrote: > >> > >>> Sorry I digressed into the C specification discussion, and did not > >>> deal with the patch in general. My point is the original code is legal > >>> C code. If you lookup CWE-119 it is written as a restriction on what > >>> the C language allows. > >>> > >>> As I mentioned casting to specific alignment is legal, as is defining > >>> a structure that is pragma pack(1) that can make a UINT32 not be 4 > >>> byte aligned. Thus the cast created a legal UINT32 value. A cast to > >>> *(UINT32 *) is different that a cast to (UINT32 *). The rules you > >>> quote a triggered by the = and not the cast. > >>> > >>> Thus this is undefined behavior in C: > >>> UINT32 *Ub = (UINT32 *)gSection.Sec.Size; > >>> Size = *Ub & 0x00ffffff; > >>> > >>> And this is not Undefined behavior: > >>> UINT32 NotUb = *(UINT32 *)gSection.Sec.Size & 0x00ffffff; > >> > >> I agree the 2nd snippet may not be UB due to alignment violations > >> *alone*. > >> > >> It is still UB due to violating the effective type rules. > >> > >>> I also had a hard time interpreting what C spec was saying, but > >>> talking to the people who write the compiler and ubsan cleared it up > >>> for me. It also makes sense when you think about it. If you tell the > >>> compiler *(UINT32 *) it can know to generate byte reads if the > >>> hardware requires aligned access. If you do a (UINT32 *) that new > >>> pointer no longer carries the information about the alignment > >>> requirement. Thus the *(UINT32 *) cast is like making a packed > >>> structure. > >> > >> Yes, I think I'm clear on how the alignment information is carried > >> around, and when it is lost. In your first example above, due to us > >> forming a separate (standalone) pointer, we lose the alignment > >> information, and then the assignment is undefined due to an alignment > >> violation. While in the second example, the alignment information is not > >> lost, and the assignment is not undefined on an alignment basis *alone*. > >> > >> However: the second assignment is *still* undefined, because it violates > >> the effective type rules. Here's a more direct example for the same: > >> > >> STATIC UINT64 mUint64; > >> > >> int main(void) > >> { > >> UINT16 *Uint16Ptr; > >> > >> Uint16Ptr = (UINT16 *)&mUint64; > >> *Uint16Ptr = 1; > >> return 0; > >> } > >> > >> The assignment to (*Uint16Ptr) is fine from the alignment perspective, > >> but it is nevertheless undefined, because it breaks the effective type > >> rules. Namely, UINT16 (the type used for the access) is not compatible > >> with UINT64 (the effective type of mUint64). > >> > >> Normally, we don't care about this situation in edk2 -- in fact we write > >> loads of code like the above --, but we get away with that only because > >> we force the toolchains to ignore the effective type rules. For GCC in > >> particular, the option is "-fno-strict-aliasing". > >> > >> The only reason I've posted this patch is that "cppcheck" (invoked as a > >> part of "RH covscan") doesn't care about "-fno-strict-aliasing". And > >> while "cppcheck" happens to report the overrun, and not the type > >> punning, the way to remove the warning is to adhere to all the C rules > >> in the expression, even though we have "-fno-strict-aliasing" in place. > >> > >>> I agree the union is a good solution to CWE-119 and it better matches > >>> the alignment requirement in the PI spec. > >> > >> Thank you. > >> > >> I'll wait a bit longer to see if Jordan accepts this 02/10 patch based > >> on the most recent comments, and whether Liming or Mike accepts 04/10. > >> > >> If Jordan remains unconvinced on SECTION_SIZE (in this 02/10 patch), and > >> Liming or Mike are fine with 04/10, I can rework 02/10 to follow 04/10. > >> > >> If Jordan remains unconvinced but Mike or Liming prefers the union, then > >> we have a stalemate and I'll abandon the patch set. > > > > I did have a (slight) concern about adding a typedef to a public > > header that wasn't in the spec. It seemed like something that someone > > somewhere might not like in case it could interfere with future > > versions of the spec. According to Liming, we don't have to worry > > about that. > > > > Regarding the UINT32* discussion, I didn't think the union really > > would make a difference vs skipping the union and casting the original > > struct pointer directly to a UINT32*. > > The gcc docs on "-fstrict-aliasing" seem to suggest the same, using an > example -- but to me that example looks wrong (it seems well-defined, > based on the same rules I quoted from the standard); i.e. that looks > like a bug to me in the gcc docs. I've contacted a colleague at RH in > the toolchain team about it. > > > I can see Andrew's point that the union may add some alignment > > assumptions to the dereference, so I can see how that does potentially > > change something subtle. Maybe on some machines it will allow for more > > efficient reading of the data with the (valid) alignment assumption. > > > > I was also not seeing why you were saying it produced *undefined* > > results. I don't think it does in our case, but when you point out > > that we are aliasing data access, I can see how that quickly gets into > > *undefined* territory from a compiler's perspective. > > Right, I approached this purely from the standard's perspective. The > standard says, in "4. Conformance": > > 1 In this International Standard, ‘‘shall’’ is to be interpreted as a > requirement on an implementation or on a program; conversely, ‘‘shall > not’’ is to be interpreted as a prohibition. > > 2 If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a > constraint is violated, the behavior is undefined. Undefined behavior > is otherwise indicated in this International Standard by the words > ‘‘undefined behavior’’ or by the omission of any explicit definition > of behavior. There is no difference in emphasis among these three; > they all describe ‘‘behavior that is undefined’’. > > In this particular case, 6.5p7 goes "An object *shall* have its stored > value accessed *only* by [...]" (emphasis mine), and that paragraph > doesn't fall in a Constraints section, so 4p2 clearly applies. > > That's why I called it undefined -- it might work perfectly well in > practice on all the edk2 platforms, possibly due to our carefully > selected compiler switches, but that's not what static analyzers care > about. AIUI they only care about the standard, hence my focus on the > standard for the fix. > > ( > > Side note: the classification "outside of a constraint" in 4p2 above is > relevant because we also have: > > 5.1.1.3 Diagnostics > > 1 A conforming implementation shall produce at least one diagnostic > message (identified in an implementation-defined manner) if a > preprocessing translation unit or translation unit contains a > violation of any syntax rule or constraint, even if the behavior is > also explicitly specified as undefined or implementation-defined. > [...] > > IOW, if we break a "shall" or "shall not" that is under a Constraint, > then the compiler *must* yell at us; it cannot allow the UB to slip our > attention. > > ) > > > Anyway, given Liming's feedback that it is ok to add the union, I'm ok > > with this patch. > > Funnily enough, in parallel, Mike has expressed a preference for the > byte-wise access plus the shifting (without helper macros) :) > > I don't see why Liming would dislike that, and I gather you too would > still prefer that to the union, so I'm going to incorporate it in v2. > I am OK for both solution. I don't think union breaks something. I agree byte access way is the consistent solution in section header and ffs header. So, I am OK for them both.
> Thank you! > Laszlo > > > > > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#39304): https://edk2.groups.io/g/devel/message/39304 Mute This Topic: https://groups.io/mt/31070302/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-