> -----Original Message-----
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: Thursday, April 18, 2019 5:39 PM
> To: devel@edk2.groups.io; Justen, Jordan L <jordan.l.jus...@intel.com>; 
> Andrew Fish <af...@apple.com>
> Cc: Kinney, Michael D <michael.d.kin...@intel.com>; Gao, Liming 
> <liming....@intel.com>
> Subject: Re: [edk2-devel] [PATCH 02/10] MdePkg/PiFirmwareFile: fix undefined 
> behavior in SECTION_SIZE
> 
> On 04/17/19 21:35, Jordan Justen wrote:
> > On 2019-04-17 07:59:41, Laszlo Ersek wrote:
> >> On 04/17/19 13:44, Andrew Fish wrote:
> >>
> >>> Sorry I digressed into the C specification discussion, and did not
> >>> deal with the patch in general. My point is the original code is legal
> >>> C code. If you lookup CWE-119 it is written as a restriction on what
> >>> the C language allows.
> >>>
> >>> As I mentioned casting to specific alignment is legal, as is defining
> >>> a structure that is pragma pack(1) that can make a UINT32 not be 4
> >>> byte aligned. Thus the cast created a legal UINT32 value. A cast to
> >>> *(UINT32 *) is different that a cast to (UINT32 *). The rules you
> >>> quote a triggered by the = and not the cast.
> >>>
> >>> Thus this is undefined behavior in C:
> >>>   UINT32 *Ub = (UINT32 *)gSection.Sec.Size;
> >>>   Size = *Ub & 0x00ffffff;
> >>>
> >>> And this is not Undefined behavior:
> >>>   UINT32 NotUb = *(UINT32 *)gSection.Sec.Size & 0x00ffffff;
> >>
> >> I agree the 2nd snippet may not be UB due to alignment violations
> >> *alone*.
> >>
> >> It is still UB due to violating the effective type rules.
> >>
> >>> I also had a hard time interpreting what C spec was saying, but
> >>> talking to the people who write the compiler and ubsan cleared it up
> >>> for me. It also makes sense when you think about it. If you tell the
> >>> compiler *(UINT32 *) it can know to generate byte reads if the
> >>> hardware requires aligned access. If you do a (UINT32 *) that new
> >>> pointer no longer carries the information about the alignment
> >>> requirement. Thus the  *(UINT32 *) cast is like making a packed
> >>> structure.
> >>
> >> Yes, I think I'm clear on how the alignment information is carried
> >> around, and when it is lost. In your first example above, due to us
> >> forming a separate (standalone) pointer, we lose the alignment
> >> information, and then the assignment is undefined due to an alignment
> >> violation. While in the second example, the alignment information is not
> >> lost, and the assignment is not undefined on an alignment basis *alone*.
> >>
> >> However: the second assignment is *still* undefined, because it violates
> >> the effective type rules. Here's a more direct example for the same:
> >>
> >> STATIC UINT64 mUint64;
> >>
> >> int main(void)
> >> {
> >>   UINT16 *Uint16Ptr;
> >>
> >>   Uint16Ptr = (UINT16 *)&mUint64;
> >>   *Uint16Ptr = 1;
> >>   return 0;
> >> }
> >>
> >> The assignment to (*Uint16Ptr) is fine from the alignment perspective,
> >> but it is nevertheless undefined, because it breaks the effective type
> >> rules. Namely, UINT16 (the type used for the access) is not compatible
> >> with UINT64 (the effective type of mUint64).
> >>
> >> Normally, we don't care about this situation in edk2 -- in fact we write
> >> loads of code like the above --, but we get away with that only because
> >> we force the toolchains to ignore the effective type rules. For GCC in
> >> particular, the option is "-fno-strict-aliasing".
> >>
> >> The only reason I've posted this patch is that "cppcheck" (invoked as a
> >> part of "RH covscan") doesn't care about "-fno-strict-aliasing". And
> >> while "cppcheck" happens to report the overrun, and not the type
> >> punning, the way to remove the warning is to adhere to all the C rules
> >> in the expression, even though we have "-fno-strict-aliasing" in place.
> >>
> >>> I agree the union is a good solution to CWE-119 and it better matches
> >>> the alignment requirement in the PI spec.
> >>
> >> Thank you.
> >>
> >> I'll wait a bit longer to see if Jordan accepts this 02/10 patch based
> >> on the most recent comments, and whether Liming or Mike accepts 04/10.
> >>
> >> If Jordan remains unconvinced on SECTION_SIZE (in this 02/10 patch), and
> >> Liming or Mike are fine with 04/10, I can rework 02/10 to follow 04/10.
> >>
> >> If Jordan remains unconvinced but Mike or Liming prefers the union, then
> >> we have a stalemate and I'll abandon the patch set.
> >
> > I did have a (slight) concern about adding a typedef to a public
> > header that wasn't in the spec. It seemed like something that someone
> > somewhere might not like in case it could interfere with future
> > versions of the spec. According to Liming, we don't have to worry
> > about that.
> >
> > Regarding the UINT32* discussion, I didn't think the union really
> > would make a difference vs skipping the union and casting the original
> > struct pointer directly to a UINT32*.
> 
> The gcc docs on "-fstrict-aliasing" seem to suggest the same, using an
> example -- but to me that example looks wrong (it seems well-defined,
> based on the same rules I quoted from the standard); i.e. that looks
> like a bug to me in the gcc docs. I've contacted a colleague at RH in
> the toolchain team about it.
> 
> > I can see Andrew's point that the union may add some alignment
> > assumptions to the dereference, so I can see how that does potentially
> > change something subtle. Maybe on some machines it will allow for more
> > efficient reading of the data with the (valid) alignment assumption.
> >
> > I was also not seeing why you were saying it produced *undefined*
> > results. I don't think it does in our case, but when you point out
> > that we are aliasing data access, I can see how that quickly gets into
> > *undefined* territory from a compiler's perspective.
> 
> Right, I approached this purely from the standard's perspective. The
> standard says, in "4. Conformance":
> 
> 1 In this International Standard, ‘‘shall’’ is to be interpreted as a
>   requirement on an implementation or on a program; conversely, ‘‘shall
>   not’’ is to be interpreted as a prohibition.
> 
> 2 If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
>   constraint is violated, the behavior is undefined. Undefined behavior
>   is otherwise indicated in this International Standard by the words
>   ‘‘undefined behavior’’ or by the omission of any explicit definition
>   of behavior. There is no difference in emphasis among these three;
>   they all describe ‘‘behavior that is undefined’’.
> 
> In this particular case, 6.5p7 goes "An object *shall* have its stored
> value accessed *only* by [...]" (emphasis mine), and that paragraph
> doesn't fall in a Constraints section, so 4p2 clearly applies.
> 
> That's why I called it undefined -- it might work perfectly well in
> practice on all the edk2 platforms, possibly due to our carefully
> selected compiler switches, but that's not what static analyzers care
> about. AIUI they only care about the standard, hence my focus on the
> standard for the fix.
> 
> (
> 
> Side note: the classification "outside of a constraint" in 4p2 above is
> relevant because we also have:
> 
> 5.1.1.3 Diagnostics
> 
> 1 A conforming implementation shall produce at least one diagnostic
>   message (identified in an implementation-defined manner) if a
>   preprocessing translation unit or translation unit contains a
>   violation of any syntax rule or constraint, even if the behavior is
>   also explicitly specified as undefined or implementation-defined.
>   [...]
> 
> IOW, if we break a "shall" or "shall not" that is under a Constraint,
> then the compiler *must* yell at us; it cannot allow the UB to slip our
> attention.
> 
> )
> 
> > Anyway, given Liming's feedback that it is ok to add the union, I'm ok
> > with this patch.
> 
> Funnily enough, in parallel, Mike has expressed a preference for the
> byte-wise access plus the shifting (without helper macros) :)
> 
> I don't see why Liming would dislike that, and I gather you too would
> still prefer that to the union, so I'm going to incorporate it in v2.
> 
I am OK for both solution. I don't think union breaks something. I agree byte 
access way is the consistent solution in section header and ffs header. So, I 
am OK for them both.

> Thank you!
> Laszlo
> 
> > 
> >


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#39304): https://edk2.groups.io/g/devel/message/39304
Mute This Topic: https://groups.io/mt/31070302/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to