Re: [PATCH, PowerPC] Fix PR57949 (ABI alignment issue)
Isn't mixing and matching and mismatching somewhat inevitable? Libffi & gcc don't always come along with each other? One must never change the ABI? - Jay On Sep 11, 2013, at 5:55 AM, Bill Schmidt wrote: > On Wed, 2013-09-11 at 21:08 +0930, Alan Modra wrote: >> On Wed, Aug 14, 2013 at 10:32:01AM -0500, Bill Schmidt wrote: >>> This fixes a long-standing problem with GCC's implementation of the >>> PPC64 ELF ABI. If a structure contains a member requiring 128-bit >>> alignment, and that structure is passed as a parameter, the parameter >>> currently receives only 64-bit alignment. This is an error, and is >>> incompatible with correct code generated by the IBM XL compilers. >> >> This caused multiple failures in the libffi testsuite: >> libffi.call/cls_align_longdouble.c >> libffi.call/cls_align_longdouble_split.c >> libffi.call/cls_align_longdouble_split2.c >> libffi.call/nested_struct5.c >> >> Fixed by making the same alignment adjustment in libffi to structures >> passed by value. Bill, I think your patch needs to go on all active >> gcc branches as otherwise we'll need different versions of libffi for >> the next gcc releases. > > Hm, the libffi case is unfortunate. :( > > The alternative is to leave libffi alone, and require code that calls > these interfaces with "bad" structs passed by value to be built using > -mcompat-align-parm, which was provided for such compatibility issues. > Hopefully there is a small number of cases where this can happen, and > this could be documented with libffi and gcc. What do you think? > > Thanks, > Bill > >> >> The following was bootstrapped and regression checked powerpc64-linux. >> OK for mainline, and the 4.7 and 4.8 branches when/if Bill's patch >> goes in there? >> >>* src/powerpc/ffi.c (ffi_prep_args64): Align FFI_TYPE_STRUCT. >>(ffi_closure_helper_LINUX64): Likewise. 
>> >> Index: libffi/src/powerpc/ffi.c >> === >> --- libffi/src/powerpc/ffi.c(revision 202428) >> +++ libffi/src/powerpc/ffi.c(working copy) >> @@ -462,6 +462,7 @@ ffi_prep_args64 (extended_cif *ecif, unsigned long >> double **d; >> } p_argv; >> unsigned long gprvalue; >> + unsigned long align; >> >> stacktop.c = (char *) stack + bytes; >> gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - >> NUM_GPR_ARG_REGISTERS64; >> @@ -532,6 +533,10 @@ ffi_prep_args64 (extended_cif *ecif, unsigned long >> #endif >> >>case FFI_TYPE_STRUCT: >> + align = (*ptr)->alignment; >> + if (align > 16) >> +align = 16; >> + next_arg.ul = ALIGN (next_arg.ul, align); >> words = ((*ptr)->size + 7) / 8; >> if (next_arg.ul >= gpr_base.ul && next_arg.ul + words > gpr_end.ul) >>{ >> @@ -1349,6 +1354,7 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure, >> long i, avn; >> ffi_cif *cif; >> ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64; >> + unsigned long align; >> >> cif = closure->cif; >> avalue = alloca (cif->nargs * sizeof (void *)); >> @@ -1399,6 +1405,10 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure, >> break; >> >>case FFI_TYPE_STRUCT: >> + align = arg_types[i]->alignment; >> + if (align > 16) >> +align = 16; >> + pst = ALIGN (pst, align); >> #ifndef __LITTLE_ENDIAN__ >> /* Structures with size less than eight bytes are passed >> left-padded. */ >
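The adjustment in the quoted patch amounts to capping the structure's alignment at 16 bytes and rounding the next argument address up to that boundary. A minimal sketch of that arithmetic, assuming power-of-two alignments; `align_up` and `align_struct_arg` are hypothetical names used only for illustration, not libffi's own `ALIGN` macro:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not libffi's ALIGN macro): round addr up to the
// next multiple of align. align must be a non-zero power of two.
inline std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}

// Same shape as the quoted change: cap the struct's alignment at 16
// bytes, then align the next-argument address to it.
inline std::uintptr_t align_struct_arg(std::uintptr_t next_arg,
                                       std::uintptr_t struct_align)
{
  if (struct_align > 16)
    struct_align = 16;
  return align_up(next_arg, struct_align);
}
```

With an 8-byte-aligned slot at 0x1008, a struct that asks for 16-byte (or larger) alignment moves to 0x1010, while a struct that asks for 8-byte alignment stays where it is.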
Re: [gofrontend-dev] Re: [PATCH 00/13] Go closures, libffi, and the static chain
I worked on what I suspect is similar stuff. I ran into the problem..pardon me if my terminology is wrong..PLT thunks for nested functions trashed registers that were in use. My solution was to mark them "hidden" or whatever is the term for not replaceable...also not exported but I recall not replaceable is more important. - Jay On Nov 6, 2014, at 11:38 PM, Richard Henderson wrote: > On 11/06/2014 06:45 PM, Ian Taylor wrote: >> On Thu, Nov 6, 2014 at 5:04 AM, Richard Henderson wrote: >>> >>> That said, this *may* not actually be a problem. It's not the direct >>> (possibly >>> lazy bound) call into libffi that needs a static chain, it's the indirect >>> call >>> that libffi produces. And the indirect calls that Go produces. >>> >>> I'm pretty sure that there are no dynamically linked Go calls that require >>> the >>> static chain. They're used for closures, which are either fully indirect >>> from >>> a different translation unit, or locally bound closures through which the >>> optimizer has seen the construction, and optimized to a direct call. >>> >>> Ian, have I missed a case where a closure could wind up with a direct call >>> to a >>> lazy bound function? >> >> I think you've covered all the cases. The closure value is only >> required when calling a nested function. There is no way to refer >> directly to a nested function defined in a different shared library. >> The only way you can get such a reference is if some function in that >> shared library returns it. > > Sorry, I wasn't clear. I know nested functions must be local. > > I'm asking about Go closures, supposing we go ahead with the change to > make them use the static chain register. > > I'm merely pretty sure that calling a closure is either fully indirect > or local direct. > > Certainly there are cases in the testsuite where -O3 is able to look > through the creation of a closure and have a direct call to the function. 
> > Given that closures are custom created for the data at the creation > site, it seems unlikely that the optimizer could look through that and > come up with a dynamically bound function. > > > r~
Re: [PATCH, libffi, alpha]: Use FFI_ASSERT in ffi_closure_osf_inner
On Sep 20, 2014, at 3:04 AM, Anthony Green wrote:
>
> Why not just pass an FFI_TYPE_STRUCT with zero members?

My information may be old or irrelevant, but I have used structs with no members (but with nonzero size and alignment) with the gcc backend, and ran into backend problems, particularly on sparc64, when passing them as parameters. Is that what is being used here? Maybe it is best to add some members to achieve the equivalent size/alignment?

 - Jay
Re: [PATCH] Do not mark pseudo-copies decomposable during first lower-subreg pass
On 26 September 2012 13:11, Ulrich Weigand wrote:
> ChangeLog:
>
> * lower-subreg.c (enum classify_move_insn): Rename
> SIMPLE_PSEUDO_REG_MOVE to DECOMPOSABLE_SIMPLE_MOVE.
> (find_decomposable_subregs): Update.
> (decompose_multiword_subregs): Add DECOMPOSE_COPIES parameter.
> Only mark pseudo-to-pseudo copies as DECOMPOSABLE_SIMPLE_MOVE
> if that parameter is true.
> (rest_of_handle_lower_subreg): Call decompose_multiword_subregs
> with DECOMPOSE_COPIES false.
> (rest_of_handle_lower_subreg2): Call decompose_multiword_subregs
> with DECOMPOSE_COPIES true.

This patch seems to have caused a slight regression in ARM register allocation:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58166

Jay.
Fwd: libcpp/C FE source range patch committed (r230331).
(Resending as plain text. Sorry for the HTML!) On 14 November 2015 at 14:50, David Edelsohn wrote: > > This patch causes numerous new testsuite failure on AIX caused by the > compiler crashing during compilation, e.g. r230331 also seems to be causing this on x86_64-pc-linux-gnu: $ cat x.c #define P(b) b&&4 int a[]=0; int f() { X||P(d); } $ ~/gcc/build/gcc/cc1 -quiet -Wall x.c [...] x.c:3:1: internal compiler error: in contains_point, at diagnostic-show-locus.c:335 int f() { X||P(d); } ^~~ 0x1268fc9 contains_point /home/jay/svn/gcc/trunk/gcc/diagnostic-show-locus.c:335 0x1268fc9 get_state_at_point /home/jay/svn/gcc/trunk/gcc/diagnostic-show-locus.c:612 0x12696e2 print_source_line /home/jay/svn/gcc/trunk/gcc/diagnostic-show-locus.c:533 0x12696e2 diagnostic_show_locus(diagnostic_context*, diagnostic_info const*) /home/jay/svn/gcc/trunk/gcc/diagnostic-show-locus.c:710 0x69b210 c_diagnostic_finalizer /home/jay/svn/gcc/trunk/gcc/c-family/c-opts.c:167 0x1267220 diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*) /home/jay/svn/gcc/trunk/gcc/diagnostic.c:800 0x1267b07 warning_at(unsigned int, int, char const*, ...) 
/home/jay/svn/gcc/trunk/gcc/diagnostic.c:1029 0x607e58 parser_build_binary_op(unsigned int, tree_code, c_expr, c_expr) /home/jay/svn/gcc/trunk/gcc/c/c-typeck.c:3514 0x61855a c_parser_binary_expression /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:6539 0x618a18 c_parser_conditional_expression /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:6182 0x619100 c_parser_expr_no_commas /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:6099 0x6198d2 c_parser_expression /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:8230 0x61a3a9 c_parser_expression_conv /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:8263 0x633431 c_parser_statement_after_labels /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:5172 0x635065 c_parser_compound_statement_nostart /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:4757 0x6358ae c_parser_compound_statement /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:4594 0x6314e3 c_parser_declaration_or_fndef /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:2015 0x63d31d c_parser_external_declaration /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:1459 0x63dbe9 c_parser_translation_unit /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:1346 0x63dbe9 c_parse_file() /home/jay/svn/gcc/trunk/gcc/c/c-parser.c:17622 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. Jay.
Re: Fwd: libcpp/C FE source range patch committed (r230331).
On 24 November 2015 at 11:34, Marek Polacek wrote:
> On Tue, Nov 24, 2015 at 11:24:38AM +0000, Jay Foad wrote:
>> r230331 also seems to be causing this on x86_64-pc-linux-gnu:
>>
>> $ cat x.c
>> #define P(b) b&&4
>> int a[]=0;
>> int f() { X||P(d); }
>> $ ~/gcc/build/gcc/cc1 -quiet -Wall x.c
>> [...]
>> x.c:3:1: internal compiler error: in contains_point, at
>> diagnostic-show-locus.c:335
>> int f() { X||P(d); }
>
> Could you please open a PR?

I see now that it's already reported as PR c/68473.

Jay.
Re: [RFC][ARM] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
On 2 May 2014 10:04, Kugan wrote:
> Thanks for spotting it. Here is the updated patch that changes it to
> ARM_FE_*.

> +2014-05-02 Kugan Vivekanandarajah
> +
> + * config/arm/arm.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New define.
> + (arm_builtins) : Add ARM_BUILTIN_GET_FPSCR and ARM_BUILTIN_SET_FPSCR.
> + (bdesc_2arg) : Add description for builtins __builtins_arm_set_fpscr
> + and __builtins_arm_get_fpscr.

s/__builtins/__builtin/g

> + (arm_init_builtins) : Initialize builtins __builtins_arm_set_fpscr and
> + __builtins_arm_get_fpscr.

s/__builtins/__builtin/g

This doesn't match the code, which initializes builtins "...ldfscr" and "...stfscr" (with no "p" in "fscr").

> + (arm_expand_builtin) : Expand builtins __builtins_arm_set_fpscr and
> + __builtins_arm_ldfpscr.

s/__builtins/__builtin/g

Did you mean "and __builtin_arm_get_fpscr"?

> +#define FP_BUILTIN(L, U) \
> + {0, CODE_FOR_##L, "__builtin_arm_"#L, ARM_BUILTIN_##U, \
> + UNKNOWN, 0},
> +
> + FP_BUILTIN (set_fpscr, GET_FPSCR)
> + FP_BUILTIN (get_fpscr, SET_FPSCR)
> +#undef FP_BUILTIN

This looks like a typo: you have mapped set->GET and get->SET.

Jay.
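To make the set/get remark concrete, here is a small sketch of how a table macro of this shape pairs a stringified name with a token-pasted enumerator. Everything in it (`builtin_code`, `builtin_desc`, `fp_builtins`, `lookup`) is a simplified, hypothetical stand-in, not the arm.c table; it only shows the pairing the review comment appears to be asking for, with `get` mapped to `GET` and `set` to `SET`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-ins for the real builtin codes.
enum builtin_code { BUILTIN_GET_FPSCR, BUILTIN_SET_FPSCR };

struct builtin_desc
{
  const char *name;   // user-visible builtin name
  builtin_code code;  // internal code the name expands to
};

// #L stringifies the lower-case name; ##U pastes the upper-case suffix
// onto the enumerator prefix.
#define FP_BUILTIN(L, U) { "__builtin_arm_" #L, BUILTIN_##U },

static const builtin_desc fp_builtins[] = {
  FP_BUILTIN (get_fpscr, GET_FPSCR)
  FP_BUILTIN (set_fpscr, SET_FPSCR)
};
#undef FP_BUILTIN

// Linear lookup by name; returns nullptr if the name is not in the table.
inline const builtin_desc *lookup(const char *name)
{
  for (const builtin_desc &d : fp_builtins)
    if (std::strcmp(d.name, name) == 0)
      return &d;
  return nullptr;
}
```

Because the two macro arguments are independent tokens, nothing in the preprocessor catches a swapped pair; a lookup-based check like the one above is one way to notice it.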
Re: [RFC 0/6] Flags outputs for asms
On 8 May 2015 at 16:23, Richard Henderson wrote: > Yes, the i386 backend has not implemented conditional sibcalls. See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60159 Jay.
Re: [PATCH] Make IPA-CP propagate alignment information of pointers
> Index: src/gcc/ipa-cp.c > === > --- src.orig/gcc/ipa-cp.c > +++ src/gcc/ipa-cp.c > @@ -262,6 +262,9 @@ public: >ipcp_lattice ctxlat; >/* Lattices describing aggregate parts. */ >ipcp_agg_lattice *aggs; > + /* Alignment information. Very basic one value lattice where !known means > + TOP and zero alignment bottom. */ > + ipa_alignment alignment; >/* Number of aggregate lattices */ >int aggs_count; >/* True if aggregate data were passed by reference (as opposed to by > @@ -444,6 +447,13 @@ print_all_lattices (FILE * f, bool dump_ > plats->itself.print (f, dump_sources, dump_benefits); > fprintf (f, " ctxs: "); > plats->ctxlat.print (f, dump_sources, dump_benefits); > + if (plats->alignment.known && plats->alignment.align > 0) > + fprintf (f, " Alignment %u, misaglignment %u\n", "misalignment" > +plats->alignment.align, plats->alignment.misalign); > + else if (plats->alignment.known) > + fprintf (f, " Alignment unusable\n"); > + else > + fprintf (f, " Alignment unknown\n"); > if (plats->virt_call) > fprintf (f, "virt_call flag set\n"); > > @@ -761,6 +771,27 @@ set_agg_lats_contain_variable (struct ip >return ret; > } > > +/* Return true if alignemnt informatin in PLATS is known to be unusable. */ "alignment", "information" > + > +static inline bool > +alignment_bottom_p (ipcp_param_lattices *plats) > +{ > + return plats->alignment.known && (plats->alignment.align == 0); > +} > + > +/* Set alignment information in PLATS to unusable. Return true if it > + previously was usable or unknown. */ > + > +static inline bool > +set_alignment_to_bottom (ipcp_param_lattices *plats) > +{ > + if (alignment_bottom_p (plats)) > +return false; > + plats->alignment.known = true; > + plats->alignment.align = 0; > + return true; > +} > + > /* Mark bot aggregate and scalar lattices as containing an unknown variable, > return true is any of them has not been marked as such so far. 
*/ > > @@ -771,6 +802,7 @@ set_all_contains_variable (struct ipcp_p >ret = plats->itself.set_contains_variable (); >ret |= plats->ctxlat.set_contains_variable (); >ret |= set_agg_lats_contain_variable (plats); > + ret |= set_alignment_to_bottom (plats); >return ret; > } > > @@ -807,6 +839,7 @@ initialize_node_lattices (struct cgraph_ > plats->itself.set_to_bottom (); > plats->ctxlat.set_to_bottom (); > set_agg_lats_to_bottom (plats); > + set_alignment_to_bottom (plats); > } > else > set_all_contains_variable (plats); > @@ -1369,6 +1402,77 @@ propagate_context_accross_jump_function >return ret; > } > > +/* Propagate alignments accross jump function JFUNC that is associated with "across" > + edge CS and update DEST_LAT accordingly. */ > + > +static bool > +propagate_alignment_accross_jump_function (struct cgraph_edge *cs, > + struct ipa_jump_func *jfunc, > + struct ipcp_param_lattices > *dest_lat) > +{ > + if (alignment_bottom_p (dest_lat)) > +return false; > + > + ipa_alignment cur; > + cur.known = false; > + if (jfunc->alignment.known) > +cur = jfunc->alignment; > + else if (jfunc->type == IPA_JF_PASS_THROUGH > + || jfunc->type == IPA_JF_ANCESTOR) > +{ > + struct ipa_node_params *caller_info = IPA_NODE_REF (cs->caller); > + struct ipcp_param_lattices *src_lats; > + HOST_WIDE_INT offset = 0; > + int src_idx; > + > + if (jfunc->type == IPA_JF_PASS_THROUGH) > + { > + enum tree_code op = ipa_get_jf_pass_through_operation (jfunc); > + if (op != NOP_EXPR) > + { > + if (op != POINTER_PLUS_EXPR > + && op != PLUS_EXPR > + && op != MINUS_EXPR) > + goto prop_fail; > + tree operand = ipa_get_jf_pass_through_operand (jfunc); > + if (!tree_fits_shwi_p (operand)) > + goto prop_fail; > + offset = tree_to_shwi (operand); > + } > + src_idx = ipa_get_jf_pass_through_formal_id (jfunc); > + } > + else > + { > + src_idx = ipa_get_jf_ancestor_formal_id (jfunc); > + offset = ipa_get_jf_ancestor_offset (jfunc); > + } > + > + src_lats = ipa_get_parm_lattices (caller_info, src_idx); > + if 
(!src_lats->alignment.known > + || alignment_bottom_p (src_lats)) > + goto prop_fail; > + > + cur = src_lats->alignment; > + cur.misalign = (cur.misalign + offset) % cur.align; > +} > + > + if (cur.known) > +{ > + if (!dest_lat->alignment.known) > + { > + dest_lat->alignment = cur; > + return true; >
Optimisation of std::binary_search of the <algorithm> header
Hey, I am Jay. I have written code for an optimised version of the binary_search algorithm of the algorithm header file of the standard template library. I have implemented it for the integer data type, but it can be implemented for any other data type without any changes in the algorithm as such. You are requested to give me feedback on my implementation. #include #include #include using namespace std; bool binary_search(int arr[], int size, int ele, bool *fun_ptr); bool binary_search(int arr[] , int size , int ele); /* This version of the binary_search finds an element ele in the array arr of length size. */ /* Assumptions: 1) Elements are sorted in increasing order in the array 2) The parameter size is the number of elements in the array not the index of the last element, i.e. the search space is arr[0,size) 3) Though , I have implemented this version with the int data type, it is also compatible with other data types like strings , or user defined data types. In those cases , only the comparison operation will change without any changes in the algorithm. */ /* If the specified element is in the array, then the function returns 1 else returns 0. */ bool binary_search(int arr[] , int size , int ele) { /* The element cannot be in the array if it is larger than the largest element or smaller than the smallest element */ if( ele < arr[0]||arr[size-1] < ele) { return bool(0); } /* After this statement , the element is definitely not outside the array, but it may or may not be in the array. */ int t=sqrt(size); int start =0,end=t*t; /* The algorithm uses the technique of square root decomposition. It breaks down the problem (the array of length size) to smaller problems each of length sqrt(size). It does a kind of binary_search on the first element of each subarray to find the subproblem to which the element belongs. */ /* The if part of the if else loop below finds which subproblem the element ele belongs to. 
The last subproblem won't be of length sqrt(size) if size is not a perfect square. So , as to handle this corner case , we need the else part of the loop. */ if(ele < arr[end]) { while(end-start>t) { int mid = (start+end)/2; mid -= (mid%t); if(arr[mid]==ele) { return bool(1); } else if(arr[mid] < ele) { start = mid; } else { end = mid; } } } else { start = end; end = size; } /* After this if else loop , [start , end] will have a portion of the array of maximum length sqrt(size) which may contain the element. */ /* The loop below does a normal binary_search in the range [start,end] */ while(end>=start) { int mid = start + ((end-start)/2); if(ele ==arr[mid]) { return bool(1); } else if(arr[mid] < ele) { start = mid+1; } else { end = mid-1; } } return bool(0); } // Same Code as above just with a comparator specified bool binary_search(int arr[] , int size , int ele, bool (*fun_ptr)(int ,int)) { /* The element cannot be in the array if it is larger than the largest element or smaller than the smallest element */ if(fun_ptr(ele,arr[0])||fun_ptr(arr[size-1],ele)) { return bool(0); } /* After this statement , the element is definitely not outside the array, but it may or may not be in the array. */ int t=sqrt(size); int start =0,end=t*t; /* The algorithm uses the technique of square root decomposition. It breaks down the problem (the array of length size) to smaller problems each of length sqrt(size). It does a kind of binary_search on the first element of each subarray to find the subproblem to which the element belongs. */ /* The if part of the if else loop below finds which subproblem the element ele belongs to. The last subproblem won't be of length sqrt(size) if size is not a perfect square. So , as to handle this corner case , we need the else part of the loop. 
*/ if(fun_ptr(ele,arr[end])) { while(end-start>t) { int mid = (start+end)/2; mid -= (mid%t); if(!fun_ptr(arr[mid],ele)&&!fun_ptr(ele,arr[mid])) /* This is equivalent to checking the condition ele == arr[mid] */ { return bool(1); } else if(fun_ptr(arr[mid],ele)) { start = mid; } else { end = mid; } } } else { start = end; end = size; } /* After this if else construct , [start , end] will have a portion of the array of maximum length sqrt(size) which may contain the element */ /* The loop below does a normal binary_search in the range [start,end] */ while(end>=start) { int mid = start + ((end-start)/2); if(!fun_ptr(arr[mid],ele)&&!fun_ptr(ele,arr[mid])) /* This is equivalent to checking the condition ele == arr[mid] */ { return bool(1); } else if(fun_ptr(arr
Re: Optimisation of std::binary_search of the <algorithm> header
Respected Sir, I am sorry , for the use of wrong language in the previous mail. I wanted to convey that c++ has generalised the algorithm on various data structures , which is not required due to low performance. Could you give me the contact of the standard committee? Regards, Jay Pokarna On Mon, May 29, 2017 at 1:13 PM, Tim Song wrote: > I'm not sure if you forgot to CC the lists or intended to direct the > email to me alone. > > On Mon, May 29, 2017 at 2:41 AM, jay pokarna wrote: >> I know that cpp wants to generalise its methods so that they can be >> used with various data structures. But the cost of generalisation is >> that we have to compromise a lot on performance. > > That's neither here nor there. binary_search's performance as applied > to forward iterators has nothing to do with its performance as to > random access iterators. > >> I would like to recommend cpp to allow the use of binary_search only >> on data structures that use random access models. > > C++ has an international standard and GCC/libstdc++ is an > implementation of that standard. Proposal for changes to the standard > should be directed to the standards committee, not GCC's mailing > lists. > >> The technique that I have used is square root decomposition . I think >> that it will be better than the one that is implemented. > > And here's the problem: you *think* it will be better. Just thinking > is not enough. You need to *prove* it with benchmarks that show that > your technique is in fact faster than the current one. > > If there is in fact substantial improvement, *then* it's the time to > consider generalization: when is this technique faster than the stock > one? Always? Only random access? Only pointers? Only built-in types? > Again, this needs to be shown with appropriate benchmark numbers. > > Then, finally, you can write a patch proposing that libstdc++'s > binary_search be modified to use this technique in situations when > it's shown to be faster. 
> > For a recent example of how something like this should look like, see > https://gcc.gnu.org/ml/libstdc++/2016-12/msg00051.html and its > LLVM/libc++ counterpart, https://reviews.llvm.org/D27068. Note the > copious benchmark numbers showing that the proposed change was indeed > (much) better than the previous. -- Regards, Jay Pokarna CS Sophomore Wordpress | Linkedin Birla Institute of Technology and Science, Pilani Pilani Campus Rajasthan - 333031.
Re: Optimisation of std::binary_search of the <algorithm> header
Respected Sir, Could you give me the contact of the standard committee which handles changes to the c++ standard. Regards, Jay Pokarna On Mon, May 29, 2017 at 2:17 PM, Tim Shen wrote: > On Mon, May 29, 2017 at 1:05 AM, jay pokarna wrote: >>>> The technique that I have used is square root decomposition . I think >>>> that it will be better than the one that is implemented. >>> >>> And here's the problem: you *think* it will be better. Just thinking >>> is not enough. You need to *prove* it with benchmarks that show that >>> your technique is in fact faster than the current one. > > Agreed. > > Jay, specifically, your algorithm has the roughly the same running time: > > T(n) = log (sqrt(n) + 1) + log sqrt(n) > > 2 log (n ^ 0.5) > = 2 * 0.5 * log n > = log n > > It's unclear to me whether it's better than the normal binary search > or not. Detailed and representative benchmarks may convince more > people. > > > -- > Regards, > Tim Shen -- Regards, Jay Pokarna CS Sophomore Wordpress | Linkedin Birla Institute of Technology and Science, Pilani Pilani Campus Rajasthan - 333031.
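The running-time estimate quoted in this message can be written out as follows. This is a sketch that assumes base-2 logarithms, a first phase that binary-searches roughly sqrt(n) + 1 block boundaries, and a second phase that binary-searches one block of sqrt(n) elements; treating the second line as an approximation is an assumption about the quoted text.

```latex
\begin{align*}
T(n) &= \log_2\!\left(\sqrt{n} + 1\right) + \log_2 \sqrt{n} \\
     &\approx 2 \log_2\!\left(n^{0.5}\right) \\
     &= 2 \cdot 0.5 \cdot \log_2 n \\
     &= \log_2 n
\end{align*}
```

On this reading the comparison count is of the same order as a plain binary search, so any difference would have to show up in measurements rather than in the asymptotics.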
Re: Optimisation of std::binary_search of the <algorithm> header
Hey,

How can I measure the time taken by my algorithm and compare it with the built-in functions? My algorithm works similarly to std::binary_search. Also, could you recommend some test data that would make for a useful comparison between my function and std::binary_search?

Thanks,
Jay Pokarna

On Wed, May 31, 2017 at 3:50 AM, Mike Stump wrote:
> On May 29, 2017, at 1:05 AM, jay pokarna wrote:
>>
>> Could you give me the contact of the standard committee?
>
> https://isocpp.org/std/the-committee
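One possible way to take such a measurement is sketched below, using std::chrono::steady_clock around a batch of lookups. The helper names (`make_sorted_data`, `time_searches`, `std_search`, `candidate_search`) are hypothetical, and `candidate_search` is only a plain hand-written binary search standing in for the algorithm under test; it is not the square-root variant posted earlier in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: n pseudo-random ints from a fixed seed, sorted.
inline std::vector<int> make_sorted_data(std::size_t n, unsigned seed)
{
  std::mt19937 gen(seed);
  std::uniform_int_distribution<int> dist(0, 1000000);
  std::vector<int> v(n);
  for (int &x : v)
    x = dist(gen);
  std::sort(v.begin(), v.end());
  return v;
}

// Runs search(data, q) for every query and returns
// {elapsed nanoseconds, number of hits}. Counting hits keeps the
// compiler from discarding the calls and lets two searches be
// cross-checked against each other.
template <typename Search>
std::pair<long long, std::size_t>
time_searches(Search search, const std::vector<int> &data,
              const std::vector<int> &queries)
{
  std::size_t hits = 0;
  auto t0 = std::chrono::steady_clock::now();
  for (int q : queries)
    if (search(data, q))
      ++hits;
  auto t1 = std::chrono::steady_clock::now();
  auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0);
  return {static_cast<long long>(ns.count()), hits};
}

// Baseline: the library algorithm.
inline bool std_search(const std::vector<int> &d, int q)
{
  return std::binary_search(d.begin(), d.end(), q);
}

// Stand-in candidate (assumption): replace the body with the
// algorithm being evaluated.
inline bool candidate_search(const std::vector<int> &d, int q)
{
  std::size_t lo = 0, hi = d.size();
  while (lo < hi)
    {
      std::size_t mid = lo + (hi - lo) / 2;
      if (d[mid] < q)
        lo = mid + 1;
      else if (q < d[mid])
        hi = mid;
      else
        return true;
    }
  return false;
}
```

A single run like this is noisy. Building with optimization enabled, repeating each measurement several times, and varying the array size and the ratio of present to absent keys would give numbers closer to what the earlier replies ask for.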
Re: [PATCH] Make IPA-CP propagate alignment information of pointers
On 3 December 2014 at 14:36, Martin Jambor wrote:
> On Wed, Dec 03, 2014 at 10:53:54AM +0000, Jay Foad wrote:
>> > Index: src/gcc/ipa-prop.h
>> > ===
>> > --- src.orig/gcc/ipa-prop.h
>> > +++ src/gcc/ipa-prop.h
>> > @@ -144,6 +144,17 @@ struct GTY(()) ipa_agg_jump_function
>> >
>> > typedef struct ipa_agg_jump_function *ipa_agg_jump_function_p;
>> >
>> > +/* Info about poiner alignments. */
>>
>> "pointer"
>>
>> > +struct GTY(()) ipa_alignment
>> > +{
>> > + /* The data fields below are valid only if known is true. */
>> > + bool known;
>>
>> Just curious: why is the "known" flag necessary? The comments for
>> ptr_info_def say that align=0 means unknown.
>
> It is necessary. In IPA-CP, when known is false, this means the
> lattice is in TOP state (i.e. once we learn something about the
> parameter, let's overwrite this), whereas when it is true and
> alignment is 0, it means it is in BOTTOM state (i.e. we know we cannot
> rely on this and never will be able to).

Can't you use align=1, misalign=0 for TOP? This means that we don't know anything useful about the pointer yet, just that it's a multiple of 1 (which is trivially true for all pointers, isn't it?).

When you have vectors of these structs they will pack MUCH more nicely without the "bool known" field.

Thanks,
Jay.
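The three states discussed in this exchange can be sketched as a small value lattice. The type and function names below are hypothetical and are not the GCC declarations. Two parts are assumptions because the quoted patch does not show them: dropping to BOTTOM when two known values disagree, and normalising a negative remainder in `shift_misalign`.

```cpp
#include <cassert>

// Hypothetical mirror of the encoding described in the thread.
struct alignment_lattice
{
  bool known = false;     // false: TOP, nothing learned yet
  unsigned align = 0;     // known && align == 0: BOTTOM, unusable
  unsigned misalign = 0;
};

inline bool is_top(const alignment_lattice &l) { return !l.known; }

inline bool is_bottom(const alignment_lattice &l)
{
  return l.known && l.align == 0;
}

// Returns true if the lattice changed, i.e. it was TOP or a usable value.
inline bool set_to_bottom(alignment_lattice &l)
{
  if (is_bottom(l))
    return false;
  l.known = true;
  l.align = 0;
  return true;
}

// Merge an incoming value into dest; returns true if dest changed.
// The disagreement rule (drop to BOTTOM) is an assumption of this sketch.
inline bool meet_with(alignment_lattice &dest, const alignment_lattice &in)
{
  if (is_bottom(dest) || is_top(in))
    return false;
  if (is_bottom(in))
    return set_to_bottom(dest);
  if (is_top(dest))
    {
      dest = in;          // first concrete value overwrites TOP
      return true;
    }
  if (dest.align != in.align || dest.misalign != in.misalign)
    return set_to_bottom(dest);
  return false;
}

// Misalignment after adding a constant byte offset to the pointer.
// Normalising negative remainders is this sketch's own choice.
inline unsigned shift_misalign(unsigned align, unsigned misalign,
                               long long offset)
{
  assert(align != 0);
  long long m = (static_cast<long long>(misalign) + offset)
                % static_cast<long long>(align);
  if (m < 0)
    m += align;
  return static_cast<unsigned>(m);
}
```

The separate `known` flag is what lets TOP ("overwrite me with the first real value") be told apart from BOTTOM ("never trust this again"); an encoding without the flag would need some other spare value to carry that distinction.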
Re: [PATCH][RFC] Move TREE_VEC length and SSA_NAME version into tree_base
On 21 August 2012 10:58, Richard Guenther wrote: > Index: trunk/gcc/tree.h > === > *** trunk.orig/gcc/tree.h 2012-08-20 12:47:47.0 +0200 > --- trunk/gcc/tree.h2012-08-21 10:32:47.717394657 +0200 > *** enum omp_clause_code > *** 417,423 > so all nodes have these fields. > > See the accessor macros, defined below, for documentation of the > !fields. */ > > struct GTY(()) tree_base { > ENUM_BITFIELD(tree_code) code : 16; > --- 417,424 > so all nodes have these fields. > > See the accessor macros, defined below, for documentation of the > !fields, and the table below which connects the fileds and the > !accessor macros. */ Typo "fileds". Jay.
Re: VxWorks Patches Back from the Dead!
On 23 August 2012 09:24, Paolo Bonzini wrote:
> On 23/08/2012 04:27, rbmj wrote:
>> + c_fix_arg = "%0\n"
>> + "#define ioctl(fd, func, arg) ((ioctl)((fd), (func),
>> ((int)(arg\n";
>
> This can be simply
>
> #define ioctl(fd, func, arg) ioctl(fd, func, (int)arg)

"(int)(arg)", surely.

Jay.
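The difference between `(int)arg` and `(int)(arg)` only shows up when the macro argument is an expression rather than a single token. A small sketch with hypothetical macros (not the fixincludes text itself) where only the cast differs:

```cpp
#include <cassert>

// Hypothetical macros for illustration; only the cast differs.
#define CAST_UNPARENTHESIZED(arg) ((int)arg)
#define CAST_PARENTHESIZED(arg)   ((int)(arg))

// Expands to ((int)a + b): the cast binds to `a` alone, and the sum
// is then computed in double.
inline double cast_unparenthesized(double a, double b)
{
  return CAST_UNPARENTHESIZED(a + b);
}

// Expands to ((int)(a + b)): the whole sum is converted to int.
inline double cast_parenthesized(double a, double b)
{
  return CAST_PARENTHESIZED(a + b);
}
```

With a = b = 1.5 the first form evaluates `(int)1.5 + 1.5` and the second `(int)(3.0)`, so the two macros disagree on the same argument.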
RE: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
> CC: ebotcazou gcc-patches gingold rth joseph jay.krell > From: mikestump > To: palves > > On Jun 15, 2012, at 2:22 AM, Pedro Alves wrote: > > It's not about example, but the fact that host compilers have been > > compiling that code as part of building gcc for years, without anyone > > complaining > > Yeah, I think we should just jump to c++ 11 and not look back... Fighting > against using a 10 year old language standard I think is silly; and I like > have the old obsolete ports in gcc. 64bit integer might not be called "long long", it could be "long" or "__int64", size_t/ptrdiff_t, etc.. I do find gcc's portability impressive, and one might suggest multiple precision arithmetic, a pair of longs, but indeed compilers lacking some 64bit integer by some name are rare, and one could always bootstrap via older gcc or take advantage of "biarch/multiarch" and first build "native 32bit" and then "native 64bit" with the native 32bit gcc as the bootstrap compiler. (I relatively recently bootstrapped hppa-hpux-gcc-4.x via K&R cc via gcc 3.x (3.3?). Obviously it is more time and work, but it does work, and frees mainline gcc from caring.) Heck, one could even automate this like how there is a multi-pass bootstrap, adding earlier stages that go via e.g. gcc 3.3. The earlier compiler stages could be stripped down, e.g. no optimizer, no debug info output, no LTO. - Jay
FW: long long availability in host compiler (Re: constant that doesn't fit in 32bits in alpha.c)
[this time as plain text, sorry]

> Date: Fri, 15 Jun 2012 19:58:23 +
> From: joseph
> To: tromey
> CC: ebotcazou palves gcc-patches gingold rth mikestump
> Subject: Re: long long availability in host compiler (Re: constant that
> doesn't fit in 32bits in alpha.c)
>
> On Fri, 15 Jun 2012, Tom Tromey wrote:
>
> > HOST_WIDE_INT is also not very persuasive to me. We did many things in
>
> Although HOST_WIDE_INT is used for too many different things (see Diego's
> and my architectural goals documents for more discussion, specifically
> "HOST_WIDE_INT, HOST_WIDEST_INT and associated concepts" at the bottom of
> the conventions document), I don't think we should use "long long"
> directly in the compiler (except in limited places such as hwint.h
> selecting a type to use for some abstraction) simply because it's not the
> right abstraction for saying what the requirements are on the type being
> used. If the requirement is "at least 64 bits", int_fast64_t would be
> better, for example (gnulib can generate a stdint.h where the host doesn't
> have it). If it's "big enough for the target address space" then
> HOST_WIDE_INT is what we have at present. If it's "fast on the host, but
> size doesn't matter", then HOST_WIDEST_FAST_INT.
>
> --
> Joseph S. Myers
> joseph@

> If it's "fast on the host, but size doesn't matter", then
> HOST_WIDEST_FAST_INT

That is int, right? I guess sometimes long, a 64bit integer, might be faster
on a 64bit host that has a 32bit int?

For local variables, the size difference rarely amounts to much, I think.
For data structures that you have many of, size optimizations become
interesting.

One can easily dream up many abstractions, too many:

  can at least hold host pointer
  can at least hold target pointer
  can hold the size of a target struct, 32 bits is ok with slightly
    degraded functionality if 64bits aren't available
  can be a loop index for a certain smallish constant number of
    iterations -- e.g. what to use for (pass = 0; pass < 2;)
    or for (i = 0; i < sizeof(integer type);)
  can hold source file size or offset, or seek delta (possibly negative?)
  can hold host object file file size or offset, or seek delta (possibly negative?)
  can hold target object file file size or offset, or seek delta (possibly negative?)
  can hold host executable file file size or offset, or seek delta (possibly negative?)
  can hold target executable file file size or offset, or seek delta (possibly negative?)
  can hold host library/archive file file size or offset, or seek delta (possibly negative?)
  can hold target library/archive file file size or offset, or seek delta (possibly negative?)
  can hold the number of members in a library/archive, or seek delta (possibly negative?)
  can hold the number of files in a directory (e.g. for #include search caching)
  can hold the number of files seen in preprocessor run
  number of instructions in a function (held in memory or in a file?)
  number of basic blocks in a function (held in memory or in a file?)
  number of that is held in a file (same as file size generally)
  number of that is held in memory (size_t)
  number of cycles measured or estimated
  number of bytes allocated
  number of bytes allocated minus number of bytes freed
  length of an in-memory string (size_t strlen(), but rarely does 32bits not suffice)

One can even imagine a 53bit-mantissa double being used...but after some
thought in my own code, I'd really rather depend on there being a 64bit
integer.

It is tempting to "throw up one's hands" in disgust and just smush all the
abstractions down to almost nothing. Otherwise you have to worry about
whether the types interoperate well, which one is larger/smaller than the
other, how do I safely convert? Are there symbols for the min/max of each
type?

  int is always at least 32bits on modern hosts and a "reasonable" if not
    theoretical max for many things.
  ditto long, but it is really definitely at least 32bits, and often larger
  similarly HOST_WIDE_INT is pretty fast, maybe slower, often 64bits, and
    64bits is usually vastly sufficient for vastly most things..unless
    manipulating floating point pieces

One can check for overflow so that if a 32bit integer proves too small,
there is a clear error instead of silent wraparound and crash or bug.
One could encode such overflow checks into a C++ integer-like class with
operator overloading. It's not that difficult..

Or just provide the stdint.h fast/least/exact types and let every "section"
of code make its own typedefs thereof. Establish a naming convention perhaps
such that when I see foo_t, I know at a glance that it is "just some integer
type". Maybe by always putting "size" or "count" in the name?

But some things are pervasive -- host/target address sizes/offsets.

I need to go read the document..

 - Jay
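For illustration only, the "check for overflow instead of silent wraparound" idea could look roughly like the sketch below in plain C. The helper name checked_add_i32 is hypothetical and is not anything in GCC; it only shows a 32bit add that reports failure rather than wrapping.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC function): add two 32-bit values.
   Returns 1 and stores the sum in *out on success; returns 0 and leaves
   *out untouched if the mathematical result does not fit in int32_t.
   The range test is done before the addition, so no signed overflow
   (undefined behavior in C) ever happens.  */
static int checked_add_i32(int32_t a, int32_t b, int32_t *out)
{
    if (b > 0 && a > INT32_MAX - b)
        return 0;               /* would exceed INT32_MAX */
    if (b < 0 && a < INT32_MIN - b)
        return 0;               /* would go below INT32_MIN */
    *out = a + b;
    return 1;
}
```

A caller would turn a 0 return into a clear diagnostic at the point of failure; a C++ integer-like class with overloaded operators, as suggested above, would just wrap checks of this kind.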
Re: Turn check macros into functions. (issue6188088)
On 18 May 2012 12:46, Diego Novillo wrote:
> On 12-05-18 06:14 , Richard Guenther wrote:
>
>> As you retain the macros anyway you can simply not return anything
>> from the C++ checking functions define to a stmt expression
>> ({ check_in_cxx (t); t; })
>
>
> Sure, but that takes us back to the original gdb issue: it does not
> understand statement expressions.

What's wrong with:

(check_in_cxx(t), t)

?

Jay.
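A minimal sketch of the comma-expression form, assuming a checking function that returns nothing. The struct node type, the body of check_in_cxx and the CHECKED macro are made up for illustration; only the (check_in_cxx(t), t) shape comes from the message above.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a tree node.  */
struct node { int code; };

/* Hypothetical checking function: returns nothing, aborts on failure.  */
static void check_in_cxx(const struct node *t)
{
    if (t == NULL)
        abort();
}

/* Comma expression: run the check, then yield the operand itself.
   This is standard C, with no GNU statement expression involved.
   Note that the macro argument is evaluated twice.  */
#define CHECKED(t) (check_in_cxx(t), (t))
```

With this, CHECKED(p)->code first runs the check on p and then reads the field through the same pointer.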
RE: constant that doesn't fit in 32bits in alpha.c
Thank you. I like it. May I have another?

book2:gcc jay$ grep -i epoch vms*
vmsdbgout.c:/* Difference in seconds between the VMS Epoch and the Unix Epoch */
vmsdbgout.c:static const long long vms_epoch_offset = 3506716800ll;
vmsdbgout.c:#define VMS_EPOCH_OFFSET 350671680
vmsdbgout.c: + VMS_EPOCH_OFFSET;

:)

 - Jay

> Date: Mon, 11 Jun 2012 16:06:03 -0700
> From: r...@redhat.com
> To: jay.kr...@cornell.edu
> CC: gcc-patches@gcc.gnu.org
> Subject: Re: constant that doesn't fit in 32bits in alpha.c
>
> Bah. Wrong patch.
>
>
> r~
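To make the arithmetic behind the grep output concrete, here is a small sketch. The two values are copied from the vmsdbgout.c lines quoted above; the helper fits_in_int32 is hypothetical. It shows that the long long constant does not fit in a signed 32bit integer, and that the macro value is exactly that constant divided by ten.

```c
#include <stdint.h>

/* Values as quoted from vmsdbgout.c in the grep output above.  */
static const int64_t vms_epoch_offset = INT64_C(3506716800);
#define VMS_EPOCH_OFFSET 350671680

/* Hypothetical helper: nonzero if v is representable as int32_t.  */
static int fits_in_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

INT32_MAX is 2147483647, so 3506716800 needs a wider type, while 350671680 fits comfortably in an int.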
RE: constant that doesn't fit in 32bits in alpha.c
Oops, agreed, shift missing. Also, I've been bitten, unable to find stuff
(grep) due to token pasting, so I am a little slower to use it. But I
understand it is useful in general for reuse.

 - Jay

> Subject: Re: constant that doesn't fit in 32bits in alpha.c
> From: mikest...@comcast.net
> Date: Mon, 11 Jun 2012 16:23:57 -0700
> CC: jay.kr...@cornell.edu; gcc-patches@gcc.gnu.org
> To: r...@redhat.com
>
> On Jun 11, 2012, at 4:06 PM, Richard Henderson wrote:
> > Bah. Wrong patch.
> >
> >
> > r~
> >
> >
> Hum, I'm trying to see how this patch works... I feel like there is something
> I'm missing, like a shift?
Re: [PATCH] Extend VRP BIT_IOR_EXPR to handle sign-bit
On 9 August 2011 13:23, Richard Guenther wrote:
> 2011-08-08 Richard Guenther
>
> * tree-vrp.c (zero_nonzero_bits_from_vr): Also return precise
> information for with only negative values.

"for *ranges* with" ?

Jay.
Re: [PATCH] Support for known unknown alignment
On 20 April 2012 16:54, Martin Jambor wrote:
> two days ago I talked to Richi on IRC about the functions to determine
> the expected alignment of objects and pointers we have and he
> suggested that get_object_alignment_1 and get_pointer_alignment_1
> should return whether the alignment is actually known

Can you explain how returning "unknown" is different from returning
some minimal known alignment? Comments like:

> ! /* Compute values M and N such that M divides (address of EXP - N) and
> ! such that N < M. Store N in *BITPOSP and return M.

suggest that M=1, N=0 is always a valid conservative thing to return.
If there is a difference, the comment should explain what it means.

Thanks,
Jay.
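The point about M=1, N=0 can be checked with a tiny sketch. The function below is hypothetical and works on plain integers rather than on GCC trees; it only restates the invariant from the quoted comment: M divides (address - N) with N < M, which for N < M is the same as address % M == N.

```c
#include <stdint.h>

/* Hypothetical check of the invariant in the quoted comment:
   M divides (addr - N) and N < M.  Since N < M, this is equivalent to
   addr % M == N, which also avoids any unsigned wraparound in addr - N.  */
static int alignment_claim_holds(uintptr_t addr, uintptr_t m, uintptr_t n)
{
    return m != 0 && n < m && addr % m == n;
}
```

For any address, the pair (1, 0) satisfies the check, which is why it reads as an always-valid conservative answer; a stronger claim such as (16, 4) holds only for addresses that really are 4 past a multiple of 16.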
Re: [PATCH] Support for known unknown alignment
On 23 April 2012 14:30, Richard Guenther wrote:
> Well, CCP simply tracks known-bits and derives the alignment
> value from that. If tem & -tem computes as zero that means
> val->mask.low is all zeros.

Doesn't that mean that all bits are known? So you could set:

pi->align = 1 << 32; // or some suitably large power of two
pi->misalign = val->value;

Jay.
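A rough sketch of the idea being discussed, under stated assumptions: the struct known_bits model (a set bit in mask means "unknown") and the function names are hypothetical simplifications, not the CCP lattice itself. The lowest set bit of the mask, tem & -tem, gives the alignment; when the mask is zero every bit is known, and the sketch falls back to a caller-supplied large power of two as suggested above.

```c
#include <stdint.h>

/* Hypothetical model of a known-bits value: a bit set in `mask` is
   unknown; a clear bit is known and equal to the same bit in `value`.  */
struct known_bits { uint64_t value; uint64_t mask; };

/* Lowest set bit of mask (tem & -tem), or 0 when mask is 0.  */
static uint64_t lowest_unknown_bit(uint64_t mask)
{
    return mask & (~mask + 1);
}

/* Derive (align, misalign).  `max_align` must be a power of two; it is
   used when all bits are known or when the derived value would exceed it.  */
static void derive_alignment(struct known_bits kb, uint64_t max_align,
                             uint64_t *align, uint64_t *misalign)
{
    uint64_t low = lowest_unknown_bit(kb.mask);
    if (low == 0 || low > max_align)
        low = max_align;
    *align = low;
    *misalign = kb.value & (low - 1);
}
```

Note that in real C the cap has to be written with a 64bit type, e.g. UINT64_C(1) << 32, since 1 << 32 overflows a 32bit int.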
Re: [PATCH][RFC][1/2] Bitfield lowering, add BIT_FIELD_EXPR
> BIT_FIELD_EXPR is equivalent to computing
> a & ~((1 << C1 - 1) << C2) | ((b << C2) & (1 << C1 = 1)),

a & ~(((1 << C1) - 1) << C2) | ((b & ((1 << C1) - 1)) << C2) ?

Jay.

> thus inserting b of width C1 at the bitfield position C2 in a, returning
> the new value. This allows translating
> BIT_FIELD_REF = b;
> to
> a = BIT_FIELD_EXPR ;
> which avoids partial definitions of a (thus, BIT_FIELD_EXPR is
> similar to COMPLEX_EXPR). BIT_FIELD_EXPR is supposed to work
> on registers only.
>
> Comments welcome, esp. on how to avoid introducing quaternary
> RHS on gimple stmts (or using a GIMPLE_SINGLE_RHS as the patch does).
>
> Folders/combiners are missing to handle some of the easy
> BIT_FIELD_REF / BIT_FIELD_EXPR cases, as well as eventually
> re-writing shift/mask operations to BIT_FIELD_REF/EXPR.
>
> Richard.
>
> 2011-06-16 Richard Guenther
>
> * expr.c (expand_expr_real_1): Handle BIT_FIELD_EXPR.
> * fold-const.c (operand_equal_p): Likewise.
> (build_bit_mask): New function.
> (fold_quaternary_loc): Likewise.
> (fold): Call it.
> (fold_build4_stat_loc): New function.
> * gimplify.c (gimplify_expr): Handle BIT_FIELD_EXPR.
> * tree-inline.c (estimate_operator_cost): Likewise.
> * tree-pretty-print.c (dump_generic_node): Likewise.
> * tree-ssa-operands.c (get_expr_operands): Likewise.
> * tree.def (BIT_FIELD_EXPR): New tree code.
> * tree.h (build_bit_mask): Declare.
> (fold_quaternary): Define.
> (fold_quaternary_loc): Declare.
> (fold_build4): Define.
> (fold_build4_loc): Likewise.
> (fold_build4_stat_loc): Declare.
> * gimple.c (gimple_rhs_class_table): Handle BIT_FIELD_EXPR.
>
> Index: trunk/gcc/expr.c
> ===
> *** trunk.orig/gcc/expr.c 2011-06-15 13:27:40.0 +0200
> --- trunk/gcc/expr.c 2011-06-15 15:08:41.0 +0200
> *** expand_expr_real_1 (tree exp, rtx target
> *** 8680,8685
> --- 8680,8708
>
> return expand_constructor (exp, target, modifier, false);
>
> + case BIT_FIELD_EXPR:
> + {
> + unsigned bitpos = (unsigned) TREE_INT_CST_LOW (TREE_OPERAND (exp,
> 3));
> + unsigned bitsize = (unsigned) TREE_INT_CST_LOW (TREE_OPERAND (exp,
> 2));
> + tree bits, mask;
> + if (BYTES_BIG_ENDIAN)
> + bitpos = TYPE_PRECISION (type) - bitsize - bitpos;
> + /* build a mask to mask/clear the bits in the word. */
> + mask = build_bit_mask (type, bitsize, bitpos);
> + /* extend the bits to the word type, shift them to the right
> + place and mask the bits. */
> + bits = fold_convert (type, TREE_OPERAND (exp, 1));
> + bits = fold_build2 (BIT_AND_EXPR, type,
> + fold_build2 (LSHIFT_EXPR, type,
> + bits, size_int (bitpos)), mask);
> + /* switch to clear mask and do the composition. */
> + mask = fold_build1 (BIT_NOT_EXPR, type, mask);
> + return expand_normal (fold_build2 (BIT_IOR_EXPR, type,
> + fold_build2 (BIT_AND_EXPR, type,
> + TREE_OPERAND (exp, 0), mask),
> + bits));
> + }
> +
> case TARGET_MEM_REF:
> {
> addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (exp));
> Index: trunk/gcc/fold-const.c
> ===
> *** trunk.orig/gcc/fold-const.c 2011-06-15 14:51:31.0 +0200
> --- trunk/gcc/fold-const.c 2011-06-15 15:33:04.0 +0200
> *** operand_equal_p (const_tree arg0, const_
> *** 2667,2672
> --- 2667,2675
> case DOT_PROD_EXPR:
> return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
>
> + case BIT_FIELD_EXPR:
> + return OP_SAME (0) && OP_SAME (1) && OP_SAME (2) && OP_SAME (3);
> +
> default:
> return 0;
> }
> *** contains_label_p (tree st)
> *** 13230,13235
> --- 13233,13251
> (walk_tree_without_duplicates (&st, contains_label_1 , NULL) !=
> NULL_TREE);
> }
>
> + /* Builds and returns a mask of integral type TYPE for masking out
> + BITSIZE bits at bit position BITPOS in a word of type TYPE.
> + The mask has the bits set from bit BITPOS to BITPOS + BITSIZE - 1. */
> +
> + tree
> + build_bit_mask (tree type, unsigned int bitsize, unsigned int bitp
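For reference, the corrected expression from the reply at the top of this message, a & ~(((1 << C1) - 1) << C2) | ((b & ((1 << C1) - 1)) << C2), can be written out on plain 32bit integers as the sketch below. The function names are hypothetical and this is not the patch's tree-building code; it follows the same mask, shift and compose steps as the quoted expand_expr_real_1 hunk, without the big-endian adjustment.

```c
#include <stdint.h>

/* Hypothetical helper: a mask with the low `width` bits set,
   for 0 <= width <= 32.  Width 32 is special-cased because shifting
   a 32-bit value by 32 is undefined in C.  */
static uint32_t low_mask(unsigned width)
{
    return width >= 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
}

/* Insert the low `width` bits of b into a at bit position `pos`
   (counted from the least significant bit) and return the new value.
   Requires pos < 32 and pos + width <= 32.  */
static uint32_t bit_field_insert(uint32_t a, uint32_t b,
                                 unsigned width, unsigned pos)
{
    uint32_t mask = low_mask(width) << pos;     /* bits pos .. pos+width-1 */
    return (a & ~mask) | ((b << pos) & mask);   /* clear field, then fill  */
}
```

Masking after the shift, as the quoted hunk does, and masking b before the shift, as in the reply, give the same result under the stated preconditions.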
Re: RFA: Fix bogus mode in choose_reload_regs
On 7 July 2011 09:09, Richard Sandiford wrote:
> gcc/
> * reload1.c (choose_reload_regs): Use mode sizes to check whether
> an old relaod register completely defines the required value.

s/relaod/reload/

Jay.
Re: [PATCH][1/n][C] Do not sign-extend sizetypes
On 11 April 2011 15:25, Richard Guenther wrote:
> ! set_min_and_max_values_for_integral_type (t, precision,
> ! /*is_unsinged=*/true);

s/ng/gn/

Jay.
[PATCH] print extended assertion failures to stderr
From: yfeldblum

The stdout stream is reserved for output intentionally produced by the
application. Assertion failures and other forms of logging must be
emitted to stderr, not to stdout.

It is common for testing and monitoring infrastructure to scan stderr
for errors, such as for assertion failures, and to collect or retain
them for analysis or observation. It is a norm that assertion failures
match this expectation in practice.

While `__builtin_fprintf` is available as a builtin, there is no
equivalent builtin for `stderr`. The only option in practice is to use
the macro `stderr`, which requires `#include `. It is desired not to add
such an include to `bits/c++config` so the solution is to write and
export a function which may be called by `bits/c++config`.

This is expected to be API-compatible and ABI-compatible with caveats.
Code compiled against an earlier libstdc++ will work when linked into a
later libstdc++ but the stream to which assertion failures are logged is
anybody's guess, and in practice will be determined by the link line and
the choice of linker. This fix targets builds for which all C++ code is
built against a libstdc++ with the fix.

Alternatives:
* This, which is the smallest change.
* This, but also defining symbols `std::__stdin` and `std::__stdout` for
  completeness.
* Define a symbol like `std::__printf_stderr` which prints any message
  with any formatting to stderr, just as `std::printf` does to stdout,
  and call that from `std::__replacement_assert` instead of calling
  `__builtin_printf`.
* Move `std::__replacement_assert` into libstdc++.so and no longer mark
  it as weak. This allows an application with some parts built against a
  previous libstdc++ to guarantee that the fix will be applied at least
  to the parts that are built against a libstdc++ containing the fix.

libstdc++-v3/ChangeLog:

include/bits/c++config (__glibcxx_assert): print to stderr.
---
 libstdc++-v3/include/bits/c++config | 8 --
 libstdc++-v3/src/c++98/Makefile.am | 1 +
 libstdc++-v3/src/c++98/Makefile.in | 2 +-
 libstdc++-v3/src/c++98/stdio.cc | 39 +
 4 files changed, 47 insertions(+), 3 deletions(-)
 create mode 100644 libstdc++-v3/src/c++98/stdio.cc

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index a64958096718126a49e8767694e913ed96108df2..d821ba09d88dc3e42ff1807200cfece71cc18bd9 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -523,6 +523,10 @@ namespace std
 # ifdef _GLIBCXX_VERBOSE_ASSERT
 namespace std
 {
+ // Avoid the use of stderr, because we're trying to keep the
+ // include out of the mix.
+ extern "C++" void* __stderr() _GLIBCXX_NOEXCEPT;
+
 // Avoid the use of assert, because we're trying to keep the
 // include out of the mix.
 extern "C++" _GLIBCXX_NORETURN
@@ -531,8 +535,8 @@ namespace std
 const char* __function, const char* __condition)
 _GLIBCXX_NOEXCEPT
 {
-__builtin_printf("%s:%d: %s: Assertion '%s' failed.\n", __file, __line,
- __function, __condition);
+__builtin_fprintf(__stderr(), "%s:%d: %s: Assertion '%s' failed.\n",
+ __file, __line, __function, __condition);
 __builtin_abort();
 }
 }
diff --git a/libstdc++-v3/src/c++98/Makefile.am b/libstdc++-v3/src/c++98/Makefile.am
index b48b57a2945780bb48496d3b5e76de4be61f836e..4032f914ea20344f51f2f219c5575d2a3858c44c 100644
--- a/libstdc++-v3/src/c++98/Makefile.am
+++ b/libstdc++-v3/src/c++98/Makefile.am
@@ -136,6 +136,7 @@ sources = \
 math_stubs_float.cc \
 math_stubs_long_double.cc \
 stdexcept.cc \
+ stdio.cc \
 strstream.cc \
 tree.cc \
 istream.cc \
diff --git a/libstdc++-v3/src/c++98/Makefile.in b/libstdc++-v3/src/c++98/Makefile.in
index f9ebb0ff4f4cb86cde7070b5ba6b8bf6a20515b3..e8aeb37d864a0ab7711d763fe8fbd3045db6e00d 100644
--- a/libstdc++-v3/src/c++98/Makefile.in
+++ b/libstdc++-v3/src/c++98/Makefile.in
@@ -142,7 +142,7 @@ am__objects_7 = bitmap_allocator.lo pool_allocator.lo mt_allocator.lo \
 list.lo list-aux.lo list-aux-2.lo list_associated.lo \
 list_associated-2.lo locale.lo locale_init.lo locale_facets.lo \
 localename.lo math_stubs_float.lo math_stubs_long_double.lo \
- stdexcept.lo strstream.lo tree.lo istream.lo istream-string.lo \
+ stdexcept.lo stdio.lo strstream.lo tree.lo istream.lo istream-string.lo \
 streambuf.lo valarray.lo $(am__objects_1) $(am__objects_3) \
 $(am__objects_6)
 am_libc__98convenience_la_OBJECTS = $(am__objects_7)
diff --git a/libstdc++-v3/src/c++98/stdio.cc b/libstdc++-v3/src/c++98/stdio.cc
new file mode 100644
index ..d0acb9117e1728f66f1a72ae3a9f471af72034ef
--- /dev/null
+++ b/libstdc++-v3/src/c++98/stdio.cc
@@ -0,0 +1,39 @@
+// Portability symbols for -*- C++ -*-
+
+// Copyright (C) 2021-2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or