On June 21, 2021 5:11:30 PM GMT+02:00, Qing Zhao <qing.z...@oracle.com> wrote: >HI, Richard, > >> On Jun 21, 2021, at 2:53 AM, Richard Biener <rguent...@suse.de> >wrote: >> >>> >>> >>> This is for the compatibility with CLANG. -:). >(https://reviews.llvm.org/D54604) >> >> I don't care about functional 1:1 "compatibility" with CLANG. > >Okay. -:) > >> >>> 1. Pattern initialization >>> >>> This is the recommended initialization approach. Pattern >initialization's >> >> But elsewhere you said pattern initialization is only for debugging, >> not production … > >Yes. Pattern initialization is only for debugging purpose during >development phase. > >> >>> Use a pattern that fits them all. I mean memory allocation >hardening >>> fills allocated storage with a repeated (byte) pattern and people >are >>> happy with that. It also makes it easy to spot uninitialized >storage >>> from a debugger. So please, do not over-design this, it really >doesn't >>> make any sense and the common case you are inevitably chasing here >>> would already be fine with a random repeated pattern. >>> >>> So, My question is: >>> >>> If we want to pattern initialize with the single repeated pattern >for all types, with one is better to use: “0xAAAAAAAA” >>> or “0xFFFFFFFF” , or other pattern that our current glibc used? >What’s that pattern? >> >> It's set by the user. > >Yes, looks like that glibc uses a byte-repeated pattern that is set by >the user through environment variable. > >> >>> Will “0xAAAAAAAA” in a floating type auto variable crash the >program? >>> Will “0xFFFFFFFF” in a pointer type auto variable crash the program? >(Might crash?) >>> >>> >>> (thus also my suggestion to split out >>> padding handling - now we can also split out pattern init handling, >>> maybe somebody else feels like reviewing and approving this, who >knows). >>> >>> I am okay with further splitting pattern initialization part to a >separate patch. Then we will >>> have 4 independent patches in total: >>> >>> 1. -fauto-var-init=zero and all the handling in other passes to the >new added call to .DEFERRED_INIT. >>> 2. Add -fauto-var-init=pattern >>> 3. Add -fauto-var-init-padding >>> 4. Add -ftrivial-auto-var-init for CLANG compatibility. >>> >>> Are the above the correct understanding? >> >> I think we can drop -fauto-var-init=pattern and just go with block >> initializing which will cover padding as well which means we can >> stay with the odd -ftrivial-auto-var-init name used by CLANG and >> add no additional options. > >Yes, this is a good idea. > >block initializing will cover all paddings automatically. > >Shall we do block initializing for both “zero initialization” and >“pattern initialization”? > >Currently, for zero initialization, I used the following: > >>>> + case AUTO_INIT_ZERO: >>>> + init = build_zero_cst (TREE_TYPE (var)); >>>> + expand_assignment (var, init, false); >>>> + break; > >Looks like that the current “expand_assignment” does not initialize >paddings with zeroes. >Shall I also use “memset” for “zero initialization”?
I'd say so, yes. >> >>> As said, block-initializing with a repeated pattern is OK and I can >see >>> that being useful. Trying to produce "nicer" values for floats, >bools >>> and pointers on 32bit platforms is IMHO not going to fix anything >and >>> introduce as many problems as it will "fix". >>> >>> Yes, I agree, if we can find a good repeated pattern for all types’s > >>> pattern initialization, that will be much easier and simpler to >>> implement, I am happy to do that. (Honestly, the part of >implementation >>> that took me most of the time is pattern-initialization.. and I am >still >>> not very comfortable with this part Of the code myself. -:) >> >> There's no "safe" pattern besides all-zero for all "undefined" uses >> (note that uses do not necessarily use declared types). Which is why >> recommending pattern init is somewhat misguided. There's maybe >> some useful pattern that more readily produces crashes, those that >> produce a FP sNaN for all of the float types. > >So, pattern value as 0xFF might be better than 0xAA since 0xFFFFFFFF >will be a NaN value for floating type? I think for debugging NaNs are quite nice, yes. >> >>> And if you block-initialize stuff you then automagically cover >padding. >>> I call this a win-win, no? >>> >>> Yes, this will also initialize paddings with patterns (Not zeroes as >CLANG did). >>> Shall we compatible with CLANG on this? >> >> No, why? > >Okay. > >>> in my example code (untested) you then still need >>> >>> expand_assignment (var, ctor, false); >>> >>> it would be the easiest way to try pattern init with a pattern >that's >>> bigger than a byte (otherwise of course the memset path is optimal). >>> >>> If the pattern that is used to initialize all types is >byte-repeatable, for example, 0xA or 0xF, then >>> We can use memset to initialize all types, however, the potential >problem is, if later we decide >>> To change to another pattern that might not be byte-repeatable, then >the memset implementation >>> is not proper at that time. >>> >>> Is it possible that we might change the pattern later? >> >> The pattern should be documented as an implementation detail unless >> we want to expose it to the user via, say, -fpattern-init=0xdeadbeef. > >Not sure whether it’s necessary to expose this to user. > >One question that is important to the implementation is: > >Shall we use “byte-repeated” or “word-repeated” pattern? >Is “word-repeated” pattern better than “byte-repeated” pattern? > >For implementation, “byte-repeated” pattern will make the whole >implementation much simpler since both “zero initialization” >and “pattern initialization” can be implemented with “memset” with >different “value”. > >So, if “word-repeated” pattern will not have too much more benefit, I >will prefer “byte-repeated” pattern. > >Let me know your comments here. I have no strong opinion and prefer byte repetition for simplicity. But I would document this as implementation detail that can change. Richard. >> >>> >>> >>> As said, for example glibc allocator hardening with MALLOC_PERTURB_ >>> uses simple byte-init. >>> >>> What’s the pattern glibc used? >> >> The value of the MALLOC_PERTURB_ environment truncated to a byte. > >Okay. > >thanks. > >Qing >> >> Richard. >>