On Fri, Dec 12, 2008 at 3:05 PM, Martin Jambor <mjam...@suse.cz> wrote: > Hi, > > On Fri, Dec 12, 2008 at 12:26:38PM +0100, Richard Guenther wrote: >> On Fri, Dec 12, 2008 at 11:59 AM, Jan Hubicka <hubi...@ucw.cz> wrote: >> >> On Fri, Dec 12, 2008 at 12:29 AM, Martin Jambor <mjam...@suse.cz> wrote: >> >> > Hi, >> >> > >> >> > today I have encountered an unpleasant problem with the function >> >> > get_ref_base_and_extent() when it claimed a known and constant offset >> >> > for the expression insn_4(D)->u.fld[arg.82_3].rt_rtvec. (arg being a >> >> > default_def parameter of the function, insn is an rtx). Moreover, it >> >> > also returned constant size and max_size, all three equal to 64 >> >> > (bits). >> >> > >> >> > This is certainly wrong (I believe the function got confused by >> >> > unions) but after looking at the function and what it did in debugger, >> >> > I grew unsure what the expected behavior is. The two alternatives I >> >> > consider possibly correct are returned offset equal to -1 or, >> >> > alternatively, offset equal to the offset of the array (+ offset of >> >> > rt_rtvec which is zero) and max_size equal either to the size of the >> >> > array or -1 if it is unknown. >> >> > >> >> > At the moment, the function never returns offset equal to -1, the >> >> > comment above it does not even mention such possibility and, from the >> >> > limited research I did, its callers do not expect and check for it. >> >> > However, at the same time, the special handling of non-constants in >> >> > array indices after the label "done" does not trigger in this >> >> > particular case (here I suspect the unions come to play because it is >> >> > the last part of the conjunction that evaluates to false). >> >> > >> >> > Which (or what else) is the correct semantics of the function? Both >> >> > make sense to me (I would prefer the former but I suspect other users >> >> > rely on the latter). What would be the correct fix, when a union >> >> > field is itself a record ending with a variable-length array? How >> >> > much would I pessimize things if I just returned -1 max_size if both a >> >> > non-constant index and a union was encountered? >> >> > >> >> > Or did I miss something else? >> >> >> >> The function is supposed to return -1 for max_size in case the access >> >> accesses a variable array index of an array of unknown size. >> >> >> >> insn_4(D)->u.fld[arg.82_3].rt_rtvec accesses struct rtvec_def -- in which >> >> case the access _does_ have known size (an int plus one rtx pointer). >> >> But maybe I am missing context here? >> > >> > Hmm, isn't this the usual problem with arrays running past the end of >> > sturcture? >> >> No, get_ref_base_and_extent handles those fine (it is supposed to set >> max_size to -1 in this case). >> >> > I guess we need to special case these for needs of SRA as you can't >> > easilly know size of the array even if it is explicitely written in the >> > program. >> >> Of course, just bail out if max_size is -1. The problem Martin is seeing >> is that it isn't -1 for some reason. Martin, are you asking >> get_ref_base_and_extent >> for only a part of an access? > > I don't know what you mean by a partial access, the gimple statement > this expression comes from is: > > D.10261_5 = insn_4(D)->u.fld[arg.82_3].rt_rtvec; > > I simply grabbed the rhs and fed it to get_ref_base_and_extent(). > > I agree the function should return -1 max_size in the case of a > variable indexing into an array that might have unknown size. The > problem is how the function identifies such arrays. The idea is > simple, these are arrays at the end of the whole aggregate and thus > the hitherto calculated offset + max_size give exactly that, the size > of the whole aggregate. > > However, this does not work if we have unions within the structure. A > simple example would be: > > struct a > { > union > { > struct > { > int a; > int var_array[1]; > } x; > struct > { > int a,b,c,d,e,f,g,h,i,j; > } y; > } z; > } a; > > In this particular case, the ofset of var_array + its max_size (the > size of its element, as the function computes it, this is something I > also do not really understand) is not equal to the size of the whole > structure, because the other union component is way larger. Therefore > the final part of the following conjunction evaluates to false: > > if (seen_variable_array_ref > && maxsize != -1 > && host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1) > && bit_offset + maxsize > == (signed)TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp)))) > maxsize = -1; > > Note that exp TREE_TYPE (exp) is now the type of the all-encompassing > most-ouward aggregate. And thus maxsize is not set to -1. > > I believe a simple solution would be to ahve a nother flag to be set > when processing component_ref with a union as its zeroth argument and > always return maxsize of -1 if both seen_variable_array_ref and this > flag is set. It is not optimal because the union field might be a > structure with an array in it but not as the last field and so it > cannot have variable size but well... ...if you think this case really > matters I can try to handle it as well.
Hm, yeah - we probably need to handle all the weird cases somehow. But we can in this case do even better and simply enlarge the max-size of the fld[] access to its _union_ size and then come back and the end and see the access is touching the last bit of the struct. I see you posted a patch - lemme have a look there. Richard. > Thanks, > > Martin >