Problem with instrumenting GIMPLE for adding a global variable
Hi,

I am trying to implement a prototype pass that instruments a function to check for safe memory accesses. As a starting point I looked at the mudflap1 pass in tree-mudflap.c and decided to write a dummy pass (very simple, but similar to mudflap) that instruments the code to count the number of functions in the source file.

I am unable to add a global variable at file scope. The function that builds the global VAR_DECL looks like this:

    {
      tree decl = build_decl (VAR_DECL, get_identifier ("my_var"),
                              integer_type_node);
      TREE_PUBLIC (decl) = 0;
      DECL_EXTERNAL (decl) = 0;
      TREE_STATIC (decl) = 1;
      gimplify_stmt (my_variable);
      lang_hooks.decls.pushdecl (decl);
      return decl;
    }

It is called from within my callback function "execute_my_pass" (as specified by me in the tree_opt_pass structure).

I increment "my_var" in every function. But the instrumented object file does not have storage allocated for this variable in the .data section. The problem is that DECL_CONTEXT of this VAR_DECL is always getting assigned to a function, and not to global file scope. Why is that?

Any help would be appreciated.

Thanks,
Prateek.
Re: Problem with instrumenting GIMPLE for adding a global variable
Sorry, my mistake. I initialized it with a value, and it appeared in the .data. :)
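The resolution above (the variable shows up in .data once it has an initial value) matches how ordinary C statics are usually laid out: a static with no initializer, or a zero initializer, is normally placed in .bss, and only one with a nonzero initializer lands in .data. The following is a minimal standalone sketch of that behaviour, not GCC-internal code; the names counter_bss and counter_data are made up for illustration.

```c
/* Hypothetical names, for illustration only. */

/* No initializer: on typical ELF targets this is allocated in .bss,
   so it does not appear in the object file's .data section. */
static int counter_bss;

/* Nonzero initializer: this one is normally emitted into .data. */
static int counter_data = 1;

/* Both variables are still real storage and can be incremented. */
int bump_bss(void)
{
    return ++counter_bss;
}

int bump_data(void)
{
    return ++counter_data;
}
```

Compiling this with `gcc -c` and running `nm` on the object file typically lists counter_bss with type `b` and counter_data with type `d`, so a missing entry in .data does not by itself mean that no storage was allocated.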
question about pass management
Hi,

I have to run a pass which modifies the function types of all functions in a C file by adding an extra parameter to each function. If this pass runs like all the other optimization passes, then it runs when each function is being processed. So, for a program like this:

    int my_funct(int var, float hell)
    {
        printf("%d", var);
    }

    int main()
    {
        my_funct (4, 5.6);
    }

if I modify the type of my_funct to take 3 args (int, int, float), then the type checker (which runs before my pass for "main") bombs out saying that the call to "my_funct" has fewer parameters than required.

Where should I be running this pass? It looks like I need the pass manager to run my pass for all the functions, instead of running all the passes for one function and then processing the next function. Is it possible to specify this to the pass manager?

Any help would be appreciated.

Thanks,
Prateek.
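The ordering problem described above can be illustrated with a toy model. This is not GCC code and does not use the pass manager; the structs and function names below are hypothetical. It only shows why a signature change has to be applied to every function and every call site in the unit before any consistency check runs.

```c
#include <stddef.h>

/* Toy model (hypothetical types, not GCC's): a function has a
   parameter count, and a call site records how many arguments it
   passes to the function at index `callee`. */
struct func { const char *name; int nparams; };
struct call { size_t callee; int nargs; };

/* Stand-in for the type checker: count the call sites whose
   argument count disagrees with the callee's parameter count. */
static int count_mismatches(const struct func *funcs,
                            const struct call *calls, size_t ncalls)
{
    int bad = 0;
    for (size_t i = 0; i < ncalls; i++)
        if (calls[i].nargs != funcs[calls[i].callee].nparams)
            bad++;
    return bad;
}

/* Step 1 of a whole-unit transformation: add one parameter to
   every function definition. */
static void add_param_to_all(struct func *funcs, size_t nfuncs)
{
    for (size_t i = 0; i < nfuncs; i++)
        funcs[i].nparams++;
}

/* Step 2: add one argument to every call site. Only after both
   steps is the unit consistent again. */
static void add_arg_to_all(struct call *calls, size_t ncalls)
{
    for (size_t i = 0; i < ncalls; i++)
        calls[i].nargs++;
}
```

With my_funct modelled as a two-parameter function and one call site passing two arguments, running only the first step leaves one mismatch, which corresponds to the error reported for "main"; running both steps before checking leaves none.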
Code getting optimized away after instrumentation for memory analysis
Hello,

I am doing some code instrumentation for program memory access validation using gcc-4.1 head. For every assignment p=q, I pass the addresses of the operands, "&p" and "&q", to an external library function. This works fine at O0, but at O1 some legitimate code gets optimized away.

In the dumps generated by my pass (which runs immediately after t08.useless), there is a statement

    *D.2383 = __taint_addr10;

The "__taint_addr10" is an artificial variable created by my pass, which replaces the original operand, since the address of the original operand is taken. If this variable is used later in any expression with a binary operator or a COND_EXPR, then it must be replaced by a copy that does not have its address taken. So, in this prototype, I currently replace its occurrence in every rhs expression.

This statement gets optimized away in the dead code elimination (t26.dce1) pass, because in the "is_global_hidden_store" function there is a check:

    if (!ZERO_SSA_OPERANDS (stmt, SSA_OP_VIRTUAL_DEFS))

which fails, and the function returns false. Thus, this statement is not marked as necessary and gets removed.

The program which I am instrumenting is as follows:

    int main()
    {
        int i, k=1, n, *x;
        printf( "Enter n: ");
        scanf( "%d", &n );
        x = malloc( (n+1) * sizeof(int));
        x[1] = 1;
        while (k) {
            if ( x[k]== n) {
                k--;
                x[k]++;
            } else {
                k++;
                x[k] = x[k-1] + 1;
            }
        }
        return 0;
    }

The dump after my instrumentation for "x[k]++" in the first if block in the while loop looks like this:

    L1:
    ...
    ...
    /* Computes in __taint_addr.121 = k*4 */
    __taint_addr.115 = k;
    D.2381 = __taint_addr.115 * 4;
    __taint_addr.125 = D.2381;
    __taint_addr.128 = x;
    D.2383 = __taint_addr.125 + __taint_addr.128;
    library_call (D.2383);
    /* Computes the incremented value in D.2385 */
    D.2384 = *D.2383;
    __taint_addr.135 = D.2384;
    D.2385 = __taint_addr.135 + 1;
    __taint_addr.11 = D.2385;
    *D.2383 = __taint_addr.11; /* ==> gets optimized away */

I would be thankful if someone could give me pointers to what could be going wrong here.
Clearly, the reference to *D.2383 is a valid heap address, and hence the store must not be removed as dead. There is something which my instrumentation is messing up.

I looked at the alias analysis dump t20.alias1. What I noticed is that in the function update_alias_info () there is a place where gcc processes each operand use that is seen and, for pointers, determines whether they are dereferenced by a given statement or not. So I added some printfs there, as follows:

    FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, iter, SSA_OP_USE)
      {
        tree op, var;
        var_ann_t v_ann;
        struct ptr_info_def *pi;
        bool is_store, is_potential_deref;
        unsigned num_uses, num_derefs;

        op = USE_FROM_PTR (use_p);

        /* Debug statements added */
    ==> fprintf (stderr, "Stament and uses\n");
    ==> debug_generic_stmt (stmt);
    ==> debug_generic_stmt (op);

At this point, the stmt:

    # VUSE ;
    *D.2383 = __taint_addr.11D.3119_323;

does _not_ show any used op like D.2383. Why is this happening?

Thanks in advance,
Prateek.
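For reference, the instrumented sequence for "x[k]++" shown in the dump corresponds roughly to the hand-written C below. This is only a source-level sketch of the intended semantics, not what the pass emits; the body of library_call and the bookkeeping variables hook_calls and last_addr are hypothetical stand-ins for the external library.

```c
#include <stddef.h>

/* Hypothetical bookkeeping so the effect of the hook is observable. */
static size_t hook_calls;
static void *last_addr;

/* Stand-in for the external library function from the dump. */
static void library_call(void *addr)
{
    hook_calls++;
    last_addr = addr;
}

/* Source-level analogue of the instrumented "x[k]++". */
static void instrumented_increment(int *x, unsigned k)
{
    int *addr = x + k;            /* D.2383 = k*4 + x            */
    library_call(addr);           /* library_call (D.2383)       */
    int loaded = *addr;           /* D.2384 = *D.2383            */
    int incremented = loaded + 1; /* D.2385 = ... + 1            */
    *addr = incremented;          /* *D.2383 = ...: this store
                                     must survive optimization   */
}
```

Because x points to heap memory that is read again on later loop iterations, the final store is observable, so an optimizer that preserves the program's behaviour has to keep it.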
Re: Code getting optimized away after instrumentation for memory analysis
ue escapes, points-to anything
D.3207_250, points-to anything
__taint_addr.204_251, is dereferenced, its value escapes, points-to anything
D.3209_252, points-to anything
__taint_addr.205_253, is dereferenced, its value escapes, points-to anything
D.3211_254, points-to anything
__taint_addr.206_255, is dereferenced, its value escapes, points-to anything
D.3215_259, points-to anything
__taint_addr.209_260, is dereferenced, its value escapes, points-to anything
D.3057_261, is dereferenced, its value escapes, points-to anything
D.2388_11
D.2387_10
D.2386_9
D.2383_6
D.2382_5

Flow-insensitive alias information for main

Aliased symbols

HEAP.686, UID 4148, void *, is an alias tag, is addressable, is global, call clobbered
TMT.687, UID 4149, void, is addressable, is global, call clobbered, may aliases: { HEAP.686 k n x n.10 D.2375 D.2376 D.2377 D.2378 D.2379 k.11 D.2381 D.2382 D.2383 D.2384 D.2385 D.2386 D.2387 D.2388 D.2389 D.2390 }
NMT.688, UID 4150, int, is addressable, is global, call clobbered, may aliases: { HEAP.686 }
k, UID 2368, int, is an alias tag, is addressable, call clobbered, default def: k_17
n, UID 2369, int, is an alias tag, is addressable, call clobbered, default def: n_30
x, UID 2370, int *, is an alias tag, is addressable, call clobbered, default def: x_79
n.10, UID 2374, int, is an alias tag, is addressable, call clobbered, default def: n.10_35
D.2375, UID 2375, int, is an alias tag, is addressable, call clobbered, default def: D.2375_46
D.2376, UID 2376, int, is an alias tag, is addressable, call clobbered, default def: D.2376_57
D.2377, UID 2377, unsigned int, is an alias tag, is addressable, call clobbered, default def: D.2377_64
D.2378, UID 2378, void *, is an alias tag, is addressable, call clobbered, default def: D.2378_70
D.2379, UID 2379, int *, is an alias tag, is addressable, call clobbered, default def: D.2379_90
k.11, UID 2380, unsigned int, is an alias tag, is addressable, call clobbered, default def: k.11_105
D.2381, UID 2381, unsigned int, is an alias tag, is addressable, call clobbered, default def: D.2381_104
D.2382, UID 2382, int *, is an alias tag, is addressable, call clobbered, default def: D.2382_103
D.2383, UID 2383, int *, is an alias tag, is addressable, call clobbered, default def: D.2383_102
D.2384, UID 2384, int, is an alias tag, is addressable, call clobbered, default def: D.2384_101
D.2385, UID 2385, int, is an alias tag, is addressable, call clobbered, default def: D.2385_100
D.2386, UID 2386, int *, is an alias tag, is addressable, call clobbered, default def: D.2386_99
D.2387, UID 2387, int *, is an alias tag, is addressable, call clobbered, default def: D.2387_98
D.2388, UID 2388, int *, is an alias tag, is addressable, call clobbered, default def: D.2388_97
D.2389, UID 2389, int, is an alias tag, is addressable, call clobbered, default def: D.2389_96
D.2390, UID 2390, int, is an alias tag, is addressable, call clobbered, default def: D.2390_95

Dereferenced pointers

__taint_addr.97, UID 3056, void *, type memory tag: TMT.687
D.3057, UID 3057, int *, type memory tag: TMT.687
__taint_addr.108, UID 3071, void *, type memory tag: TMT.687
__taint_addr.109, UID 3073, void *, type memory tag: TMT.687
__taint_addr.127, UID 3097, void *, type memory tag: TMT.687
__taint_addr.130, UID 3101, void *, type memory tag: TMT.687
__taint_addr.131, UID 3103, void *, type memory tag: TMT.687
__taint_addr.133, UID 3106, void *, type memory tag: TMT.687
__taint_addr.137, UID 3111, void *, type memory tag: TMT.687
__taint_addr.138, UID 3113, void *, type memory tag: TMT.687
__taint_addr.139, UID 3115, void *, type memory tag: TMT.687
__taint_addr.142, UID 3119, void *, type memory tag: TMT.687

Type memory tags

TMT.687, UID 4149, void, is addressable, is global, call clobbered, may aliases: { HEAP.686 k n x n.10 D.2375 D.2376 D.2377 D.2378 D.2379 k.11 D.2381 D.2382 D.2383 D.2384 D.2385 D.2386 D.2387 D.2388 D.2389 D.2390 }

Flow-sensitive alias information for main

SSA_NAME pointers

D.3024_94, name memory tag: NMT.688, is dereferenced, its value escapes, points-to vars: { HEAP.686 }

Name memory tags

NMT.688, UID 4150, int, is addressable, is global, call clobbered, may aliases: { HEAP.686 }

Symbols to be put in SSA form

k n x n.10 D.2375 D.2376 D.2377 D.2378 D.2379 k.11 D.2381 D.2382 D.2383 D.2384 D.2385 D.2386 D.2387 D.2388 D.2389 D.2390 HEAP.686 TMT.687 NMT.688

Incremental SSA update started at block: -1

Number of blocks in CFG: 6
Number of blocks to update: 5 ( 83%)

On 11/18/05, Diego Novillo <[EMAIL PROTECTED]> wrote:
> On Friday 18 November 2005 04:13, Prateek Saxena wrote:
>
> > At this point, the stmt:
> >
> > # VUSE ;
> > *D.2383 = __taint_addr.11D.3119_323;
> >
> > _doesnot_ show any used op like,
> >
> > D.2383.
> >
> > why is this happening?
> >
> D.2383 is virtual (see the VUSE). Show me the points-to set for D.2383?
> That store is generating no V_MAY_DEFs and that is why DCE is removing it.
> I'll need to see the .alias1 and .dce1 dumps.
>
Accessing the file scope C++ data structs in GIMPLE
Hello,

I want to dump all the C++ classes/structs and global data structures defined in a given file scope or translation unit, as part of the debug dumps in C-like syntax, from the GIMPLE data structures. Is there a TREE_VEC or BIND_EXPR_VARS that I can traverse to get all of these?

Thanks a lot for the help provided in the past as well.

Thanks,
Prateek.
Re: Security vulnerability or security feature?
On Thu, Apr 24, 2008 at 2:20 PM, Ralph Loader <[EMAIL PROTECTED]> wrote:
> > I am very interested in seeing how this optimization can remove
> > arithmetic overflows.
>
> int foo (char * buf, int n)
> {
>     // buf+n may overflow if the programmer incorrectly passes
>     // a large value of n. But recent versions of gcc optimise
>     // to 'n < 100', removing the overflow.
>     return buf + n < buf + 100;
> }

This is clearly insecure coding. The optimization that replaces "buf + n < buf + 100" with "n < 100" assumes something about the value of buf. I assume that the location of "buf" could change arbitrarily across platforms, depending on the memory layout. I ran foo as follows and got different outputs:

    int main()
    {
        printf ("%d\n", foo (0xbffc, 0x4010));
        return 0;
    }

When "foo" is:

    int foo (char * buf, int n)
    {
        return buf + n < buf + 100;
    }

Result: 1

When "foo" is (due to optimization, let's say):

    int foo (char * buf, int n)
    {
        return n < 100;
    }

Result: 0

When such assumptions are made, the compiler may eliminate the bug in some cases, giving the programmer a false feeling that "Oh! My code is bug free". The problem is that when the code is compiled on a different platform, with different switches, the bug may reappear.

I just wanted to bring out the point about the assumption. Maybe a diagnostic should be issued, or the compiler could be made smart enough to figure out the values of "buf" and optimise only in cases which are safe.

> Compiled on i386, gcc-4.3.0 with -O2 gives:
>
> foo:
>         xorl    %eax, %eax
>         cmpl    $99, 8(%esp)
>         setle   %al
>         ret
>
> E.g., calling foo with:
>
> #include <stdio.h>
> int main()
> {
>     char buf[100];
>     printf ("%d\n", foo (buf, 15));
>     return 0;
> }
>
> on my PC (where the stack is just below the 3Gig position).
>
> > > Why is Cert advising people to avoid an optimisation that can ---
> > > realistically, although probably rarely --- remove security
> > > vulnerabilities?
> >
> > If you are referring to VU#694123, this refers to an optimization
>
> I'm talking about 162289.
>
> Ralph.
>
> > that removes checks pointer arithmetic wrapping. The optimization
> > doesn't actually eliminate the wrapping behavior; this still occurs.
> > It does, however, eliminate certain kinds of checks (that depend upon
> > undefined behavior).
> >
> > Thanks,
> > rCs
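The two results quoted in this message (1 for the pointer comparison, 0 for "n < 100") can be reproduced without relying on undefined pointer overflow by modelling addresses as unsigned integers that wrap. The sketch below assumes a 16-bit address space, which is an assumption made here for illustration; the message does not say which platform was used. The function names are hypothetical.

```c
#include <stdint.h>

/* Models "buf + n < buf + 100" with addresses that wrap modulo
   2^16. The 16-bit width is an assumption for illustration; the
   casts make the wraparound explicit and well defined. */
static int foo_wrapping16(uint16_t buf, uint16_t n)
{
    uint16_t lhs = (uint16_t)(buf + n);     /* may wrap past 0xffff */
    uint16_t rhs = (uint16_t)(buf + 100u);
    return lhs < rhs;
}

/* The form the optimizer reduces the comparison to. It never looks
   at buf, so it cannot be affected by address wraparound. */
static int foo_index_only(int n)
{
    return n < 100;
}
```

With buf = 0xbffc and n = 0x4010, the wrapped sum is 0x000c, which is below 0xc060, so foo_wrapping16 returns 1 while foo_index_only returns 0. For a small offset such as n = 15 both forms agree, which is why the difference only shows up for out-of-range values of n.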