Re: [PATCH] Enable SGX intrinsics
On Thu, Dec 29, 2016 at 10:50 AM, Koval, Julia wrote:
> Hi,
>
> This patch enables Intel SGX instructions (Reference:
> https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
> page 4478 in pdf and 3D 41-1 in page numbers) Ok for trunk?

I don't like asm macros, but since we can tolerate a similar
implementation of cpuid, we can also tolerate encls/enclu.

One general remark:

+#define macro_encls_bc(leaf, b, c, retval) \
+  __asm__ __volatile__ ("encls\n\t" \
+  : "=a" (retval) \
+  : "a" (leaf), "b" (b), "c" (c))

These internal macros are user-visible, so please uglify them with a
double underscore, like

__encls_bc

to put them into the internal namespace. IMO, there is no need to use
the "macro" prefix.

Uros.
Re: [PATCH] Enable SGX intrinsics
On Fri, Dec 30, 2016 at 09:25:49AM +0100, Uros Bizjak wrote:
> On Thu, Dec 29, 2016 at 10:50 AM, Koval, Julia wrote:
> > Hi,
> >
> > This patch enables Intel SGX instructions (Reference:
> > https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
> > page 4478 in pdf and 3D 41-1 in page numbers) Ok for trunk?
>
> I don't like asm macros, but since we can tolerate similar
> implementation of cpuid, we can also tolerate encls/enclu.
>
> One general remark:
>
> +#define macro_encls_bc(leaf, b, c, retval) \
> +  __asm__ __volatile__ ("encls\n\t" \
> +  : "=a" (retval) \
> +  : "a" (leaf), "b" (b), "c" (c))
>
> These internal macros are user-visible, so please uglify them with a
> double underscore, like
>
> __encls_bc
>
> to put them into internal namespace. IMO, there is no need to use
> "macro" prefix.

I see other issues:

+extern __inline int
+__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_encls_u32 (const int __leaf, size_t *data)
+{
+  enum encls_type type = (enum encls_type)__leaf;
+  int retval = 0;
+  if (!__builtin_constant_p (type))
+  {
+        macro_encls_generic (__leaf, data[0], data[1], data[2], retval);
+  }
+  else

1) some parameter and variable names are not uglified in inline functions
(data, type and retval in this case); using __data etc. or __L, __D, __T, __R
would be better; look at other intrin files for what identifiers they
are using

2) the indentation is wrong: the { after if/else should be indented 2 more
columns to the right, so column 4 in this case, and the body of the block
2 further positions.

3) For a single line body there should be no {}s around, so put the macro
without the {}s.

Is:

enum encls_type {ECREATE, EADD, EINIT, EREMOVE, EDBGRD, EDBGWR,
                 EEXTEND, ELDB, ELDU, EBLOCK, EPA, EWB, ETRACK,
                 EAUG, EMODPR, EMODT};
enum enclu_type {EREPORT, EGETKEY, EENTER, ERESUME, EEXIT, EACCEPT,
                 EMODPE, EACCEPTCOPY};

really part of the official sgxintrin.h ABI?

4) the enum names are not uglified

5) the enumerators use the namespace reserved for errno.h extensions, see
http://pubs.opengroup.org/onlinepubs/007904975/functions/xsh_chap02_02.html

Header          Prefix
<errno.h>       E[0-9], E[A-Z]

Especially because they aren't used for errors in this case, it is a
particularly bad choice; and, because the header is unconditionally included
in x86intrin.h, this won't affect just users that want SGX, but all users of
x86intrin.h, which means a huge amount of code in the wild.

Jakub
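Points 1), 4) and 5) above can be illustrated with a small sketch. The names `__demo_leaf_kind`, `__DEMO_LEAF_*` and `__demo_sum3` are hypothetical and only stand in for the SGX enums and intrinsics; no encls/enclu instruction is issued. The three user macros at the top are perfectly legal in user code and would break a header that spelled its parameters and locals `data`, `type` or `retval`.

```c
#include <stddef.h>

/* Legal user code: none of these names is reserved, so a user may
   define them as macros before including an intrinsics header.  */
#define data   user_buffer
#define type   3
#define retval (-1)

/* Hypothetical header content.  Every identifier starts with a double
   underscore, so it lives in the namespace reserved for the
   implementation, and no enumerator uses the E[0-9], E[A-Z] prefixes
   that POSIX reserves for errno.h.  */
enum __demo_leaf_kind
{
  __DEMO_LEAF_CREATE,
  __DEMO_LEAF_ADD,
  __DEMO_LEAF_INIT
};

/* Stand-in for an intrinsic wrapper: it adds the leaf number to three
   data words instead of executing a privileged instruction.  */
static inline int
__demo_sum3 (const int __leaf, const size_t *__data)
{
  int __retval = __leaf;
  __retval += (int) (__data[0] + __data[1] + __data[2]);
  return __retval;
}

/* End of the demonstration: drop the user macros again.  */
#undef data
#undef type
#undef retval
```

Because the header-style code only uses double-underscore names, it compiles unchanged no matter which of the three macros the user has defined.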
RE: [PATCH] Enable SGX intrinsics
Thank you for your comments, how about this patch? The enums are not part of
the intrinsic ABI; they are just meaningful names for constants, taken from
the reference doc.

gcc/
	* common/config/i386/i386-common.c
	(OPTION_MASK_ISA_SGX_UNSET, OPTION_MASK_ISA_SGX_SET): New.
	(ix86_handle_option): Handle OPT_msgx.
	* config.gcc: Added sgxintrin.h.
	* config/i386/cpuid.h (bit_SGX): New.
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
	* config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
	* config/i386/i386.c
	(ix86_target_string): Add -msgx.
	(PTA_SGX): New.
	(ix86_option_override_internal): Handle new options.
	(ix86_valid_target_attribute_inner_p): Add sgx.
	* config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
	* config/i386/i386.opt: Add msgx.
	* config/i386/sgxintrin.h: New file.
	* config/i386/x86intrin.h: Add sgxintrin.h.
	* testsuite/gcc.target/i386/sgx.c New test

libgcc/
	config/i386/cpuinfo.c (get_available_features): Handle FEATURE_SGX.
	config/i386/cpuinfo.h (FEATURE_SGX): New.

BR,
Julia

Attachment: 0001-Enable-SGX.PATCH

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com]
Sent: Friday, December 30, 2016 10:19 AM
To: Uros Bizjak
Cc: Koval, Julia ; gcc-patches@gcc.gnu.org; vaalfr...@gmail.com; Senkevich, Andrew
Subject: Re: [PATCH] Enable SGX intrinsics
Re: [PATCH] Enable SGX intrinsics
On Fri, Dec 30, 2016 at 3:17 PM, Koval, Julia wrote:
> Thank you for your comments, how about this patch? Enums are not part of the
> intrinsic ABI, they are just meaningful names for constants, taken from
> reference doc.
>
> gcc/
> * common/config/i386/i386-common.c
>   (OPTION_MASK_ISA_SGX_UNSET, OPTION_MASK_ISA_SGX_SET): New.
>   (ix86_handle_option): Handle OPT_msgx.
> * config.gcc: Added sgxintrin.h.
> * config/i386/cpuid.h (bit_SGX): New.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
> * config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
> * config/i386/i386.c
>   (ix86_target_string): Add -msgx.
>   (PTA_SGX): New.
>   (ix86_option_override_internal): Handle new options.
>   (ix86_valid_target_attribute_inner_p): Add sgx.
> * config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
> * config/i386/i386.opt: Add msgx.
> * config/i386/sgxintrin.h: New file.
> * config/i386/x86intrin.h: Add sgxintrin.h.
> * testsuite/gcc.target/i386/sgx.c New test
>
> libgcc/
> config/i386/cpuinfo.c (get_available_features): Handle FEATURE_SGX.
> config/i386/cpuinfo.h (FEATURE_SGX): New.

As suggested in [1], you should write multi-line enums like:

enum foo
{
  a = ...
  b = ...
}

OK with the above change(s), but please wait for Jakub, if he has some
more comments.

[1] https://www.gnu.org/prep/standards/standards.html

Thanks,
Uros.
[wwwdocs] gcc-3.4/changes.html -- remove broken Tru64/Alpha link to hp.com
This atrocious link is now broken (or kind of: it redirects to a generic
page about software support and the roadmap/future of Tru64). ;-)

Applied.

Gerald

Index: gcc-3.4/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.4/changes.html,v
retrieving revision 1.162
diff -u -r1.162 changes.html
--- gcc-3.4/changes.html	28 Jun 2014 20:35:50 -	1.162
+++ gcc-3.4/changes.html	30 Dec 2016 14:46:13 -
@@ -768,8 +768,7 @@
 allow utilizing the more obscure instructions of the CPU.
 Parameter passing of complex arguments has changed to match the
-http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V51A_HTML/ARH9MBTE/DTMNPLTN.HTM#normal-argument-list-structure";>
-ABI. This change is incompatible with previous GCC versions, but
+ABI. This change is incompatible with previous GCC versions, but
 does fix compatibility with the Tru64 compiler and several corner cases
 where GCC was incompatible with itself.
[PATCH, i386]: Do not reject registers without upper parts in ext_register_operand predicate
Hello!

We only have to copy registers without upper parts to a pseudo in
named patterns (extv, extzv and insv). Nowadays, it is the job of the
TARGET_LEGITIMATE_COMBINED_INSN target hook to prevent propagation of
unwanted hard registers into a combined insn. So, only check the
supported mode in the predicate and let the target hook do its job.

2016-12-30  Uros Bizjak

	* config/i386/predicates.md (ext_register_operand): Do not reject
	registers without upper parts here.
	* config/i386/i386.md (extv): Copy registers without upper parts
	in operand 1 to a pseudo.
	(extzv): Ditto.
	(insv): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

The patch will be committed soon.

Uros.

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 243969)
+++ config/i386/i386.md	(working copy)
@@ -2766,7 +2766,10 @@
   if (INTVAL (operands[2]) != 8 || INTVAL (operands[3]) != 8)
     FAIL;

-  if (! ext_register_operand (operands[1], VOIDmode))
+  unsigned int regno = reg_or_subregno (operands[1]);
+
+  /* Be careful to expand only with registers having upper parts.  */
+  if (regno <= LAST_VIRTUAL_REGISTER && !QI_REGNO_P (regno))
     operands[1] = copy_to_reg (operands[1]);
 })

@@ -2794,7 +2797,10 @@
   if (INTVAL (operands[2]) != 8 || INTVAL (operands[3]) != 8)
     FAIL;

-  if (! ext_register_operand (operands[1], VOIDmode))
+  unsigned int regno = reg_or_subregno (operands[1]);
+
+  /* Be careful to expand only with registers having upper parts.  */
+  if (regno <= LAST_VIRTUAL_REGISTER && !QI_REGNO_P (regno))
     operands[1] = copy_to_reg (operands[1]);
 })

@@ -2878,11 +2884,14 @@
   if (INTVAL (operands[1]) != 8 || INTVAL (operands[2]) != 8)
     FAIL;

-  dst = operands[0];
-
-  if (!ext_register_operand (dst, VOIDmode))
-    dst = copy_to_reg (dst);
+  unsigned int regno = reg_or_subregno (operands[0]);

+  /* Be careful to expand only with registers having upper parts.  */
+  if (regno <= LAST_VIRTUAL_REGISTER && !QI_REGNO_P (regno))
+    dst = copy_to_reg (operands[0]);
+  else
+    dst = operands[0];
+
   emit_insn (gen_insv_1 (dst, operands[3]));

   /* Fix up the destination if needed.  */
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md	(revision 243969)
+++ config/i386/predicates.md	(working copy)
@@ -85,21 +85,14 @@
   (and (match_code "reg")
        (match_test "REGNO (op) == FLAGS_REG")))

-;; Match an SI or HImode register for a zero_extract.
+;; Match a DI, SI or HImode register for a zero_extract.
 (define_special_predicate "ext_register_operand"
-  (match_operand 0 "register_operand")
-{
-  if ((!TARGET_64BIT || GET_MODE (op) != DImode)
-      && GET_MODE (op) != SImode && GET_MODE (op) != HImode)
-    return false;
-  if (SUBREG_P (op))
-    op = SUBREG_REG (op);
+  (and (match_operand 0 "register_operand")
+       (ior (and (match_test "TARGET_64BIT")
+                 (match_test "GET_MODE (op) == DImode"))
+            (match_test "GET_MODE (op) == SImode")
+            (match_test "GET_MODE (op) == HImode"))))

-  /* Be careful to accept only registers having upper parts.  */
-  return (REG_P (op)
-          && (REGNO (op) > LAST_VIRTUAL_REGISTER || QI_REGNO_P (REGNO (op))));
-})
-
 ;; Match register operands, but include memory operands for TARGET_SSE_MATH.
 (define_predicate "register_ssemem_operand"
   (if_then_else
[doc] doc/standards.texi: remove objc.toodarkpark.net reference
Applied (as revision 243975), and I plan on backporting to the GCC 6
and GCC 5 branches later.

Gerald

2016-12-30  Gerald Pfeifer

	* doc/standards.texi (Standards): Remove broken reference to
	objc.toodarkpark.net and avoid list with now just one item.

Index: doc/standards.texi
===================================================================
--- doc/standards.texi	(revision 243974)
+++ doc/standards.texi	(working copy)
@@ -261,14 +261,8 @@
 There is no formal written standard for Objective-C or Objective-C++@.
 The authoritative manual on traditional Objective-C (1.0) is
 ``Object-Oriented Programming and the Objective-C Language'':
-@itemize
-@item
 @uref{http://www.gnustep.org/@/resources/@/documentation/@/ObjectivCBook.pdf}
-is the original NeXTstep document;
-@item
-@uref{http://objc.toodarkpark.net}
-is the same document in another format.
-@end itemize
+is the original NeXTstep document.

 The Objective-C exception and synchronization syntax (that is, the
 keywords @code{@@try}, @code{@@throw}, @code{@@catch},
Go patch committed: use correct backend type for Type::gc_symbol_pointer
This patch by Than McIntosh fixes the Go frontend to wrap the return
from Type::gc_symbol_pointer with a type conversion to uintptr, since
the values returned are stored into structure fields with that type.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed
to mainline.

Ian

Index: gcc/go/gofrontend/MERGE
===================================================================
--- gcc/go/gofrontend/MERGE	(revision 243974)
+++ gcc/go/gofrontend/MERGE	(working copy)
@@ -1,4 +1,4 @@
-d9be5f5d7907cbc169424fe2b8532cc3919cad5b
+ebe9d824adca053066837b8b19461048ced34aff

 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/types.cc
===================================================================
--- gcc/go/gofrontend/types.cc	(revision 243899)
+++ gcc/go/gofrontend/types.cc	(working copy)
@@ -2138,7 +2138,10 @@ Type::gc_symbol_pointer(Gogo* gogo)
   Location bloc = Linemap::predeclared_location();
   Bexpression* var_expr =
       gogo->backend()->var_expression(t->gc_symbol_var_, VE_rvalue, bloc);
-  return gogo->backend()->address_expression(var_expr, bloc);
+  Bexpression* addr_expr =
+      gogo->backend()->address_expression(var_expr, bloc);
+  Btype* ubtype = Type::lookup_integer_type("uintptr")->get_backend(gogo);
+  return gogo->backend()->convert_expression(ubtype, addr_expr, bloc);
 }

 // A mapping from unnamed types to GC symbol variables.
Re: [PATCH] Enable SGX intrinsics
On Fri, Dec 30, 2016 at 03:37:14PM +0100, Uros Bizjak wrote:
> As suggested in [1], you should write multi-line enums like:
>
> enum foo
> {
>   a = ...
>   b = ...
> }

Sure. Plus it depends on whether users of the APIs should just write the
operands on their own as numbers, or as __SGX_E*, or as E*.
In the first case the patch sans formatting is reasonable, in the second
case the enums should be moved to file scope, and in the last case we have
to live with the namespace pollution.
The pdf you've referenced in the thread doesn't list the _encls_u32 and
_enclu_u32 intrinsics, so I think it depends on what ICC does (if it has
already shipped with such support, or on coordination with ICC if not).

Jakub
Re: [PATCH] Fix exgettext to handle multi-line help texts from *.opt files (PR translation/78745)
On Thu, 29 Dec 2016, Jakub Jelinek wrote:

> Hi!
>
> As mentioned in the PR, the option handling for multi-line help texts
> concatenates those lines with spaces in between (essentially replaces
> newlines with spaces), but exgettext extracts just the first line from the
> multiline help text and throws away the rest.
>
> With this patch, there are changes like:
> #: config/i386/i386.opt:583
> -msgid "Do dispatch scheduling if processor is bdver1, bdver2, bdver3, bdver4"
> +msgid ""
> +"Do dispatch scheduling if processor is bdver1, bdver2, bdver3, bdver4 or "
> +"znver1 and Haifa scheduling is selected."
> msgstr ""
> in gcc.pot.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com
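The behaviour described in the quoted message (help-text lines joined with single spaces, so that the extracted msgid covers the whole text) can be sketched in C as follows. `join_help_lines` is a hypothetical helper written only for this illustration; it is not code from exgettext or from GCC's option machinery.

```c
#include <stddef.h>
#include <string.h>

/* Join COUNT help-text lines into OUT (capacity OUT_SIZE bytes),
   separating them with single spaces, i.e. replacing each newline of a
   multi-line help text with a space.  Return 0 on success, -1 if the
   result does not fit.  Hypothetical helper for illustration only.  */
static int
join_help_lines (const char *const *lines, size_t count,
                 char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return -1;
  out[0] = '\0';

  for (size_t i = 0; i < count; i++)
    {
      size_t len = strlen (lines[i]);
      size_t need = len + (i > 0 ? 1 : 0);

      /* Keep one byte for the terminating NUL.  */
      if (used + need + 1 > out_size)
        return -1;
      if (i > 0)
        out[used++] = ' ';
      memcpy (out + used, lines[i], len);
      used += len;
      out[used] = '\0';
    }
  return 0;
}
```

Extracting only `lines[0]`, as the old exgettext behaviour amounted to, would drop everything after the first line of the help text.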
[PATCH, i386]: Merge *testqi_ext_3 insn pattern with its splitter
Also, use the wi::shifted_mask helper function instead of hardcoding the
mask computation.

No functional changes.

2016-12-30  Uros Bizjak

	* config/i386/i386.md (*testqi_ext_3): Merge insn pattern and
	corresponding splitter to define_insn_and_split.  Use
	wi::shifted_mask helper function to calculate mask.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline.

Uros.

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 243976)
+++ config/i386/i386.md	(working copy)
@@ -7924,42 +7924,33 @@
    (set_attr "mode" "QI")])

 ;; Combine likes to form bit extractions for some tests.  Humor it.
-(define_insn "*testqi_ext_3"
-  [(set (reg FLAGS_REG)
-        (compare (zero_extract:SWI248
-                   (match_operand 0 "nonimmediate_operand" "rm")
-                   (match_operand 1 "const_int_operand" "n")
-                   (match_operand 2 "const_int_operand" "n"))
-                 (const_int 0)))]
+(define_insn_and_split "*testqi_ext_3"
+  [(set (match_operand 0 "flags_reg_operand")
+        (match_operator 1 "compare_operator"
+          [(zero_extract:SWI248
+             (match_operand 2 "nonimmediate_operand" "rm")
+             (match_operand 3 "const_int_operand" "n")
+             (match_operand 4 "const_int_operand" "n"))
+           (const_int 0)]))]
   "ix86_match_ccmode (insn, CCNOmode)
-   && ((TARGET_64BIT && GET_MODE (operands[0]) == DImode)
-       || GET_MODE (operands[0]) == SImode
-       || GET_MODE (operands[0]) == HImode
-       || GET_MODE (operands[0]) == QImode)
+   && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
+       || GET_MODE (operands[2]) == SImode
+       || GET_MODE (operands[2]) == HImode
+       || GET_MODE (operands[2]) == QImode)
    /* Ensure that resulting mask is zero or sign extended operand.  */
-   && INTVAL (operands[2]) >= 0
-   && ((INTVAL (operands[1]) > 0
-        && INTVAL (operands[1]) + INTVAL (operands[2]) <= 32)
+   && INTVAL (operands[4]) >= 0
+   && ((INTVAL (operands[3]) > 0
+        && INTVAL (operands[3]) + INTVAL (operands[4]) <= 32)
        || (mode == DImode
-           && INTVAL (operands[1]) > 32
-           && INTVAL (operands[1]) + INTVAL (operands[2]) == 64))"
-  "#")
-
-(define_split
-  [(set (match_operand 0 "flags_reg_operand")
-        (match_operator 1 "compare_operator"
-          [(zero_extract
-             (match_operand 2 "nonimmediate_operand")
-             (match_operand 3 "const_int_operand")
-             (match_operand 4 "const_int_operand"))
-           (const_int 0)]))]
-  "ix86_match_ccmode (insn, CCNOmode)"
+           && INTVAL (operands[3]) > 32
+           && INTVAL (operands[3]) + INTVAL (operands[4]) == 64))"
+  "#"
+  "&& 1"
   [(set (match_dup 0) (match_op_dup 1 [(match_dup 2) (const_int 0)]))]
 {
   rtx val = operands[2];
   HOST_WIDE_INT len = INTVAL (operands[3]);
   HOST_WIDE_INT pos = INTVAL (operands[4]);
-  HOST_WIDE_INT mask;
   machine_mode mode, submode;

   mode = GET_MODE (val);
@@ -7990,13 +7981,10 @@
       val = gen_lowpart (QImode, val);
     }

-  if (len == HOST_BITS_PER_WIDE_INT)
-    mask = -1;
-  else
-    mask = (HOST_WIDE_INT_1 << len) - 1;
-  mask <<= pos;
+  wide_int mask
+    = wi::shifted_mask (pos, len, false, GET_MODE_PRECISION (mode));

-  operands[2] = gen_rtx_AND (mode, val, gen_int_mode (mask, mode));
+  operands[2] = gen_rtx_AND (mode, val, immed_wide_int_const (mask, mode));
 })

 ;; Convert HImode/SImode test instructions with immediate to QImode ones.
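For readers unfamiliar with the mask being computed, here is a minimal standalone sketch of the same idea in plain C: a run of `len` one bits starting at bit `pos`, truncated to `precision` bits. The function `demo_shifted_mask` is hypothetical and limited to 64 bits; it only mirrors the argument order `(pos, len, ..., precision)` seen in the patch and is not GCC's wide-int implementation.

```c
#include <stdint.h>

/* Return a mask with LEN consecutive one bits starting at bit POS,
   truncated to PRECISION bits.  Hypothetical 64-bit stand-in for the
   computation the patch delegates to wi::shifted_mask.  */
static uint64_t
demo_shifted_mask (unsigned pos, unsigned len, unsigned precision)
{
  /* A shift by 64 or more would be undefined, so handle a full-width
     run explicitly, as the removed hand-written code did.  */
  uint64_t mask = len >= 64 ? UINT64_MAX : (UINT64_C (1) << len) - 1;

  mask = pos >= 64 ? 0 : mask << pos;

  /* Drop any bits that fall outside the requested precision.  */
  if (precision < 64)
    mask &= (UINT64_C (1) << precision) - 1;
  return mask;
}
```

The explicit `len >= 64` case corresponds to the `len == HOST_BITS_PER_WIDE_INT` special case in the removed lines; leaving such edge cases to a shared helper is the usual motivation for a cleanup like this one.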
[PATCH], PR target/78900, Fix PowerPC __float128 signbit
The signbit-3.c test explicitly tests for the value coming from memory, a
vector register, or a GPR. Unfortunately, the code did not handle splitting up
the registers when the value was in a GPR. These patches add the GPR support.

While I was editing the code, I also did some cleanup.

I removed the Fsignbit mode attribute, since the only two modes used both use
the same attribute. This is a relic of the original code generation that also
provided optimized signbit support for DFmode/SFmode. The DFmode/SFmode part
got dropped (GCC 6 was in stage 3, and we needed to get signbit working for
__float128; it already worked for DFmode/SFmode, but the code generation could
be improved), so the attribute is no longer needed.

I also noticed that use of signbit tended to generate sign or zero extension.
Since the function only returns 0/1, I added combiner insns to eliminate the
extra zero/sign extend.

I have tested this on both big endian and little endian power8 systems. The
bootstrap and make check had no regressions. Is this ok to put into the trunk?

The same error appears on GCC 6 as well. Assuming the patch applies cleanly and
fixes the problem, can I install it on the GCC 6 branch as well after a
burn-in period?

2016-12-30  Michael Meissner

	PR target/78900
	* config/rs6000/rs6000.c (rs6000_split_signbit): Change some
	assertions.  Add support for doing the signbit if the IEEE 128-bit
	floating point value is in a GPR.
	* config/rs6000/rs6000.md (Fsignbit): Delete.
	(signbit<mode>2_dm): Delete using <Fsignbit> and just use "wa".
	Update the length attribute if the value is in a GPR.
	(signbit<mode>2_dm_ext): Add combiner pattern to eliminate the sign
	or zero extension instruction, since the value is always 0/1.
	(signbit<mode>2_dm2): Delete using <Fsignbit>.
-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 243966)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -25170,9 +25170,7 @@ rs6000_split_signbit (rtx dest, rtx src)
   rtx dest_di = (d_mode == DImode) ? dest : gen_lowpart (DImode, dest);
   rtx shift_reg = dest_di;

-  gcc_assert (REG_P (dest));
-  gcc_assert (REG_P (src) || MEM_P (src));
-  gcc_assert (s_mode == KFmode || s_mode == TFmode);
+  gcc_assert (FLOAT128_IEEE_P (s_mode) && TARGET_POWERPC64);

   if (MEM_P (src))
     {
@@ -25184,17 +25182,20 @@ rs6000_split_signbit (rtx dest, rtx src)

   else
     {
-      unsigned int r = REGNO (src);
+      unsigned int r = reg_or_subregno (src);

-      /* If this is a VSX register, generate the special mfvsrd instruction
-         to get it in a GPR.  Until we support SF and DF modes, that will
-         always be true.  */
-      gcc_assert (VSX_REGNO_P (r));
+      if (INT_REGNO_P (r))
+        shift_reg = gen_rtx_REG (DImode, r + (BYTES_BIG_ENDIAN == 0));

-      if (s_mode == KFmode)
-        emit_insn (gen_signbitkf2_dm2 (dest_di, src));
       else
-        emit_insn (gen_signbittf2_dm2 (dest_di, src));
+        {
+          /* Generate the special mfvsrd instruction to get it in a GPR.  */
+          gcc_assert (VSX_REGNO_P (r));
+          if (s_mode == KFmode)
+            emit_insn (gen_signbitkf2_dm2 (dest_di, src));
+          else
+            emit_insn (gen_signbittf2_dm2 (dest_di, src));
+        }
     }

   emit_insn (gen_lshrdi3 (dest_di, shift_reg, GEN_INT (63)));
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 243966)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -518,9 +518,6 @@ (define_mode_iterator FLOAT128 [(KF "TAR
 (define_mode_iterator SIGNBIT [(KF "FLOAT128_VECTOR_P (KFmode)")
                                (TF "FLOAT128_VECTOR_P (TFmode)")])

-(define_mode_attr Fsignbit [(KF "wa")
-                            (TF "wa")])
-
 ; Iterator for ISA 3.0 supported floating point types
 (define_mode_iterator FP_ISA3 [SF
                                DF
@@ -4744,7 +4741,7 @@ (define_expand "copysign<mode>3"
 (define_insn_and_split "signbit<mode>2_dm"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r,r")
         (unspec:SI
-         [(match_operand:SIGNBIT 1 "input_operand" "<Fsignbit>,m,r")]
+         [(match_operand:SIGNBIT 1 "input_operand" "wa,m,r")]
          UNSPEC_SIGNBIT))]
   "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
@@ -4754,7 +4751,24 @@ (define_insn_and_split "signbit<mode>2_d
   rs6000_split_signbit (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "length" "8,8,12")
+  [(set_attr "length" "8,8,4")
+   (set_attr "type" "mftgpr,load,integer")])
+
+(define_insn_and_split "*signbit<mode>2_dm_ext"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,r,r")
+        (any_extend:DI
+         (unspec:SI
+          [(match_operand:SIGNBIT 1 "input_operand" "wa,m,r")]
+          UNSPEC_SIGNBIT)))]
+
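The GPR case added in rs6000_split_signbit can be pictured with a small C sketch. It assumes, as the register arithmetic `r + (BYTES_BIG_ENDIAN == 0)` in the patch suggests, that the 128-bit value occupies two consecutive 64-bit registers and that the most significant doubleword sits in the first register on big endian and in the second on little endian. The function `demo_signbit128` and its array model of the register pair are hypothetical.

```c
#include <stdint.h>

/* REGS models two consecutive 64-bit GPRs holding an IEEE 128-bit
   value.  Select the register with the most significant doubleword,
   mirroring "r + (BYTES_BIG_ENDIAN == 0)" from the patch, then shift
   the sign bit down to bit 0 as the final lshrdi3 by 63 does.  */
static int
demo_signbit128 (const uint64_t regs[2], int big_endian)
{
  uint64_t high = regs[big_endian == 0];
  return (int) (high >> 63);
}
```

Since the result is always 0 or 1, a following sign or zero extension cannot change it, which is the observation behind the new `_ext` combiner pattern.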
[doc, committed] reorder entries in cppopts.texi
I've checked in the attached patch to reorder entries in cppopts.texi.
I put the most-commonly-used options (e.g. -D and -U) first, and the
options for debugging cpp last, and tried to group the things in between
a little better. There's no change to actual content here, just the
ordering.

-Sandra

2016-12-30  Sandra Loosemore

	gcc/
	* doc/cppopts.texi: Reorder table entries to put the most
	commonly-used options first and debug options last.

Index: gcc/doc/cppopts.texi
===================================================================
--- gcc/doc/cppopts.texi	(revision 243983)
+++ gcc/doc/cppopts.texi	(working copy)
@@ -39,6 +39,28 @@ are given on the command line.  All @opt
 Cancel any previous definition of @var{name}, either built in or
 provided with a @option{-D} option.

+@item -include @var{file}
+@opindex include
+Process @var{file} as if @code{#include "file"} appeared as the first
+line of the primary source file.  However, the first directory searched
+for @var{file} is the preprocessor's working directory @emph{instead of}
+the directory containing the main source file.  If not found there, it
+is searched for in the remainder of the @code{#include "@dots{}"} search
+chain as normal.
+
+If multiple @option{-include} options are given, the files are included
+in the order they appear on the command line.
+
+@item -imacros @var{file}
+@opindex imacros
+Exactly like @option{-include}, except that any output produced by
+scanning @var{file} is thrown away.  Macros it defines remain defined.
+This allows you to acquire all the macros from a header without also
+processing its declarations.
+
+All files specified by @option{-imacros} are processed before all files
+specified by @option{-include}.
+
 @item -undef
 @opindex undef
 Do not predefine any system-specific or GCC-specific macros.  The
@@ -177,57 +199,21 @@ a dependency output file as a side-effec
 Like @option{-MD} except mention only user header files, not system
 header files.

-@ifclear cppmanual
-@item -fpch-deps
-@opindex fpch-deps
-When using precompiled headers (@pxref{Precompiled Headers}), this flag
-will cause the dependency-output flags to also list the files from the
-precompiled header's dependencies.  If not specified only the
-precompiled header would be listed and not the files that were used to
-create it because those files are not consulted when a precompiled
-header is used.
-
-@item -fpch-preprocess
-@opindex fpch-preprocess
-This option allows use of a precompiled header (@pxref{Precompiled
-Headers}) together with @option{-E}.  It inserts a special @code{#pragma},
-@code{#pragma GCC pch_preprocess "@var{filename}"} in the output to mark
-the place where the precompiled header was found, and its @var{filename}.
-When @option{-fpreprocessed} is in use, GCC recognizes this @code{#pragma}
-and loads the PCH@.
+@item -fpreprocessed
+@opindex fpreprocessed
+Indicate to the preprocessor that the input file has already been
+preprocessed.  This suppresses things like macro expansion, trigraph
+conversion, escaped newline splicing, and processing of most directives.
+The preprocessor still recognizes and removes comments, so that you can
+pass a file preprocessed with @option{-C} to the compiler without
+problems.  In this mode the integrated preprocessor is little more than
+a tokenizer for the front ends.

-This option is off by default, because the resulting preprocessed output
-is only really suitable as input to GCC@.  It is switched on by
+@option{-fpreprocessed} is implicit if the input file has one of the
+extensions @samp{.i}, @samp{.ii} or @samp{.mi}.  These are the
+extensions that GCC uses for preprocessed files created by
 @option{-save-temps}.

-You should not write this @code{#pragma} in your own code, but it is
-safe to edit the filename if the PCH file is available in a different
-location.  The filename may be absolute or it may be relative to GCC's
-current directory.
-@end ifclear
-
-@item -include @var{file}
-@opindex include
-Process @var{file} as if @code{#include "file"} appeared as the first
-line of the primary source file.  However, the first directory searched
-for @var{file} is the preprocessor's working directory @emph{instead of}
-the directory containing the main source file.  If not found there, it
-is searched for in the remainder of the @code{#include "@dots{}"} search
-chain as normal.
-
-If multiple @option{-include} options are given, the files are included
-in the order they appear on the command line.
-
-@item -imacros @var{file}
-@opindex imacros
-Exactly like @option{-include}, except that any output produced by
-scanning @var{file} is thrown away.  Macros it defines remain defined.
-This allows you to acquire all the macros from a header without also
-processing its declarations.
-
-All files specified by @option{-imacros} are processed before all files
-specified by @option{-include}.
-
 @item -fdirectives-only
 @opindex fdirectives-only
 When preprocessing, handle dir
[patch] [libobjc] allow default/fallback in --with-target-bdw-gc-include configure option
This addresses PR libobjc/78697, allowing a common include dir for all
multilib variants. Tested with a libgc installation in /opt/gcc/include,
/opt/gcc/lib32, /opt/gcc/lib64 and configured with

  --prefix=/opt/gcc7
  --enable-languages=c,c++,objc
  --disable-shared
  --enable-objc-gc=yes
  --with-multilib-list=m32,m64
  --enable-checking=release
  --disable-bootstrap
  --with-target-bdw-gc-include=/opt/gcc/include
  --with-target-bdw-gc-lib=/opt/gcc/lib64,32=/opt/gcc/lib32

Ok for the trunk?

Matthias

libobjc/

2016-12-24  Matthias Klose

	PR libobjc/78697
	* configure.ac: Allow default for --with-target-bdw-gc-include.
	* configure: Regenerate.

Index: libobjc/configure.ac
===================================================================
--- libobjc/configure.ac	(revision 243987)
+++ libobjc/configure.ac	(working copy)
@@ -256,16 +256,19 @@
     for i in `echo $with_target_bdw_gc_include | tr ',' ' '`; do
       case "$i" in
         *=*) sd=${i%%=*}; d=${i#*=} ;;
-        *) sd=.; d=$i ;;
+        *) sd=.; d=$i; fallback=$i ;;
       esac
       if test "$mldir" = "$sd"; then
         bdw_val=$d
       fi
     done
-    if test "x$bdw_val" = x; then
+    if test "x$bdw_val" = x && test "x$bdw_inc_dir" = x && test "x$fallback" != x; then
+      bdw_inc_dir="$fallback"
+    elif test "x$bdw_val" = x; then
       AC_MSG_ERROR([no multilib path ($mldir) found in --with-target-bdw-gc-include])
+    else
+      bdw_inc_dir="$bdw_val"
     fi
-    bdw_inc_dir="$bdw_val"
   fi
   bdw_val=
   if test "x$with_target_bdw_gc_lib" != x; then
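The selection logic of the shell loop above can be restated as a small C sketch, which may be easier to step through than the configure fragment. `demo_pick_include` and `demo_copy` are hypothetical helpers, not part of libobjc; the sketch also ignores the extra condition in the patch that the fallback only applies while `bdw_inc_dir` is still empty.

```c
#include <stddef.h>
#include <string.h>

/* Copy LEN bytes of SRC into OUT as a NUL-terminated string.
   Return 0 on success, -1 if OUT is too small.  */
static int
demo_copy (char *out, size_t out_size, const char *src, size_t len)
{
  if (len + 1 > out_size)
    return -1;
  memcpy (out, src, len);
  out[len] = '\0';
  return 0;
}

/* SPEC is a comma-separated list of "DIR" or "SUBDIR=DIR" entries.
   A bare "DIR" belongs to the default multilib "." and also serves as
   the fallback.  Store the directory for MLDIR in OUT.  Return 0 on
   success, -1 if there is neither a match nor a fallback.  */
static int
demo_pick_include (const char *spec, const char *mldir,
                   char *out, size_t out_size)
{
  const char *match = NULL, *fallback = NULL;
  size_t match_len = 0, fallback_len = 0;
  size_t mldir_len = strlen (mldir);

  for (const char *p = spec; *p != '\0'; )
    {
      size_t n = strcspn (p, ",");
      if (n == 0)
        {
          /* Empty entry: skip the comma, as word splitting would.  */
          p++;
          continue;
        }

      /* Split at the first '=', like ${i%%=*} and ${i#*=}.  */
      const char *eq = memchr (p, '=', n);
      const char *sd = eq ? p : ".";
      size_t sd_len = eq ? (size_t) (eq - p) : 1;
      const char *d = eq ? eq + 1 : p;
      size_t d_len = eq ? n - sd_len - 1 : n;

      if (!eq)
        {
          fallback = d;
          fallback_len = d_len;
        }
      /* The last matching entry wins, as in the shell loop.  */
      if (sd_len == mldir_len && memcmp (sd, mldir, sd_len) == 0)
        {
          match = d;
          match_len = d_len;
        }

      p += n;
      if (*p == ',')
        p++;
    }

  if (match)
    return demo_copy (out, out_size, match, match_len);
  if (fallback)
    return demo_copy (out, out_size, fallback, fallback_len);
  return -1;
}
```

With the spec `/opt/gcc/include` from the test configuration above, every multilib directory, including `32`, resolves to the same include directory, which is the point of the change.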
[patch] [libobjc] fix build with --disable-shared and .la files
This addresses PR libobjc/78698, fixing the build with --disable-shared and .la files available. As mentioned in the bug report I'm not aware of a configure check to write a check using LIBTOOL_LINK and LIBTOOL_COMPILE, because these commands (using libtool) are only created by running the configure file, therefore checking for the presence of a .la file. Tested with a libgc installation in /opt/gcc/include, /opt/gcc/lib32, /opt/gcc/lib64 and configured with --prefix=/opt/gcc7 --enable-languages=c,c++,objc --disable-shared --enable-objc-gc=yes --with-multilib-list=m32,m64 --enable-checking=release --disable-bootstrap --with-target-bdw-gc-include=/opt/gcc/include --with-target-bdw-gc-lib=/opt/gcc/lib64,32=/opt/gcc/lib32 Ok for the trunk? Matthias libobjc/ 2016-12-24 Matthias Klose PR libobjc/78698 * configure.ac: Use the libgc.la file when available. * configure: Regenerate. gcc/ 2016-12-31 Matthias Klose * doc/install.texi: Allow default for --with-target-bdw-gc-include. libobjc/ 2016-12-24 Matthias Klose PR libobjc/78698 * configure.ac: Use the libgc.la file when available. * configure: Regenerate. gcc/ 2016-12-31 Matthias Klose * doc/install.texi: Allow default for --with-target-bdw-gc-include. Index: gcc/doc/install.texi === --- gcc/doc/install.texi (revision 243987) +++ gcc/doc/install.texi (working copy) @@ -2203,8 +2203,12 @@ The options @option{--with-target-bdw-gc-include} and @option{--with-target-bdw-gc-lib} must always be specified together for each multilib variant and they take precedence over -@option{--with-target-bdw-gc}. If none of these options are -specified, the library is assumed in default locations. +@option{--with-target-bdw-gc}. If @option{--with-target-bdw-gc-include} +is missing values for a multilib, then the value for the default +multilib is used. (e.g. @samp{--with-target-bdw-gc-include=/opt/bdw-gc/include} +@samp{--with-target-bdw-gc-lib=/opt/bdw-gc/lib64,32=/opt-bdw-gc/lib32}). 
+If none of these options are specified, the library is assumed in
+default locations.
 @end table

 @html
Index: libobjc/configure.ac
===
--- libobjc/configure.ac (revision 243987)
+++ libobjc/configure.ac (working copy)
@@ -290,45 +293,55 @@
 AC_MSG_ERROR([no multilib path ($mldir) found in --with-target-bdw-gc-lib])
       fi
       BDW_GC_CFLAGS="-I$bdw_inc_dir"
-      BDW_GC_LIBS="-L$bdw_lib_dir -lgc"
+      if test -f $bdw_lib_dir/libgc.la; then
+        BDW_GC_LIBS="$bdw_lib_dir/libgc.la"
+      else
+        BDW_GC_LIBS="-L$bdw_lib_dir -lgc"
+      fi
       AC_MSG_RESULT([found])
     fi
-    AC_MSG_CHECKING([for system boehm-gc])
-    save_CFLAGS=$CFLAGS
-    save_LIBS=$LIBS
-    CFLAGS="$CFLAGS $BDW_GC_CFLAGS"
-    LIBS="$LIBS $BDW_GC_LIBS"
-    dnl the link test is not good enough for ARM32 multilib detection,
-    dnl first check to link, then to run
-    AC_LINK_IFELSE(
-      [AC_LANG_PROGRAM([#include ],[GC_init()])],
-      [
-        AC_RUN_IFELSE([AC_LANG_SOURCE([[
-          #include
-          int main() {
-            GC_init();
-            return 0;
-          }
-          ]])],
-          [system_bdw_gc_found=yes],
-          [system_bdw_gc_found=no],
-          dnl assume no system boehm-gc for cross builds ...
-          [system_bdw_gc_found=no]
-        )
-      ],
-      [system_bdw_gc_found=no])
-    CFLAGS=$save_CFLAGS
-    LIBS=$save_LIBS
-    if test x$enable_objc_gc = xauto && test x$system_bdw_gc_found = xno; then
-      AC_MSG_WARN([system bdw-gc not found, not building libobjc_gc])
-      use_bdw_gc=no
-    elif test x$enable_objc_gc = xyes && test x$system_bdw_gc_found = xno; then
-      AC_MSG_ERROR([system bdw-gc required but not found])
-    else
+    case "$BDW_GC_LIBS" in
+    *libgc.la)
       use_bdw_gc=yes
-      AC_MSG_RESULT([found])
-    fi
+      ;;
+    *)
+      AC_MSG_CHECKING([for system boehm-gc])
+      save_CFLAGS=$CFLAGS
+      save_LIBS=$LIBS
+      CFLAGS="$CFLAGS $BDW_GC_CFLAGS"
+      LIBS="$LIBS $BDW_GC_LIBS"
+      dnl the link test is not good enough for ARM32 multilib detection,
+      dnl first check to link, then to run
+      AC_LINK_IFELSE(
+        [AC_LANG_PROGRAM([#include ],[GC_init()])],
+        [
+          AC_RUN_IFELSE([AC_LANG_SOURCE([[
+            #include
+            int main() {
+              GC_init();
+              return 0;
+            }
+            ]])],
+            [system_bdw_gc_found=yes],
+            [system_bdw_gc_found=no],
+            dnl assume no system boehm-gc for cross builds ...
+            [system_bdw_gc_found=no]
+          )
+        ],
+        [system_bdw_gc_found=no])
+      CFLAGS=$save_CFLAGS
+      LIBS=$save_LIBS
+      if test x$enable_objc_gc = xauto && test x$system_bdw_gc_found = xno; then
+        AC_MSG_WARN([system bdw-gc not found, not building libobjc_gc])
+        use_bdw_gc=no
+      elif test x$enable_objc_gc = xyes && test x$system_bdw_gc_found = xno; then
+        AC_MSG_ERROR([system bdw-gc required but not fou
[PATCH/AARCH64] Improve/correct ThunderX 1 cost model for Arith_shift
Hi,

Currently for the following function:

  int f(int a, int b)
  {
    return a + (b << 7);
  }

GCC produces:

  add w0, w0, w1, lsl 7

But for ThunderX 1 it is better to split the instruction, which allows better scheduling in most cases; the latency is the same. I get a small improvement in CoreMark, ~1%.

Currently the code does not take Arith_shift into account. Even though the comment

  /* Strip any extend, leave shifts behind as we will cost them through mult_cost. */

says the shift is not stripped, aarch64_strip_extend does strip it, and always has since the back-end was added to GCC.

Once I fixed the code around aarch64_strip_extend, I got a regression for ThunderX 1, because some shifts/extends (left shifts <=4 and/or zero extends) are considered free, so I needed to add a new tuning flag.

Note that I should get an even bigger improvement for ThunderX 2 CN99XX, but I have not measured it yet because I have not made the change to aarch64-cost-tables.h; I am waiting for approval of the renaming patch before submitting any of the cost table changes. I actually noticed this problem with that tuning first and then looked back at what I needed to do for ThunderX 1.

OK? Bootstrapped and tested on aarch64-linux-gnu without any regressions (both with and without --with-cpu=thunderx).

Thanks,
Andrew

ChangeLog:

* config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs): Increment Arith_shift and Arith_shift_reg by 1.
* config/aarch64/aarch64-tuning-flags.def (easy_shift_extend): New tuning flag.
* config/aarch64/aarch64.c (thunderx_tunings): Enable AARCH64_EXTRA_TUNE_EASY_SHIFT_EXTEND.
(aarch64_strip_extend): Add new argument and test for it.
(aarch64_easy_mult_shift_p): New function.
(aarch64_rtx_mult_cost): Call aarch64_easy_mult_shift_p and don't add a cost if it is true. Update calls to aarch64_strip_extend.
(aarch64_rtx_costs): Update calls to aarch64_strip_extend.
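
As a rough illustration of the rule behind the easy_shift_extend flag, here is a standalone C sketch. It is not the code from the patch: the real implementation works on RTL inside aarch64.c, and every name below (extend_kind, shifted_operand, easy_shift_extend_p, arith_operand_extra_cost) is hypothetical and does not exist in GCC. The sketch only models the rule as the flag's comment states it: a left shift of at most 4, with or without a zero extend, and a zero extend on its own are treated as free; anything else is charged the Arith_shift cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RTL operand; not a GCC type.  */
enum extend_kind { EXT_NONE, EXT_ZERO, EXT_SIGN };

struct shifted_operand
{
  enum extend_kind extend;  /* Extension applied to the register.  */
  unsigned int shift;       /* Left shift amount, 0 for no shift.  */
};

/* Return true if OP is "easy" in the sense of the flag comment: a left
   shift <= 4 with or without a zero extend, or a zero extend alone.
   A plain register (no shift, no extend) is not a shifted operand at
   all, so it is not classified as easy here.  */
static bool
easy_shift_extend_p (struct shifted_operand op)
{
  if (op.extend == EXT_SIGN || op.shift > 4)
    return false;
  return op.shift != 0 || op.extend == EXT_ZERO;
}

/* Extra cost charged for folding OP into an arithmetic instruction.
   ARITH_SHIFT_COST plays the role of the Arith_shift table entry.  */
static int
arith_operand_extra_cost (struct shifted_operand op, int arith_shift_cost)
{
  assert (arith_shift_cost >= 0);
  if (op.shift == 0 && op.extend == EXT_NONE)
    return 0;  /* Plain register operand, nothing to fold.  */
  return easy_shift_extend_p (op) ? 0 : arith_shift_cost;
}
```

In this model the `b << 7` operand from f above is not easy, so it is charged the full arith_shift_cost, while a shift by 2 or a bare zero extend costs nothing extra.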
Index: config/aarch64/aarch64-cost-tables.h
===
--- config/aarch64/aarch64-cost-tables.h (revision 243974)
+++ config/aarch64/aarch64-cost-tables.h (working copy)
@@ -32,8 +32,8 @@ const struct cpu_cost_table thunderx_ext
     0, /* Logical. */
     0, /* Shift. */
     0, /* Shift_reg. */
-    COSTS_N_INSNS (1), /* Arith_shift. */
-    COSTS_N_INSNS (1), /* Arith_shift_reg. */
+    COSTS_N_INSNS (1)+1, /* Arith_shift. */
+    COSTS_N_INSNS (1)+1, /* Arith_shift_reg. */
     COSTS_N_INSNS (1), /* UNUSED: Log_shift. */
     COSTS_N_INSNS (1), /* UNUSED: Log_shift_reg. */
     0, /* Extend. */
Index: config/aarch64/aarch64-tuning-flags.def
===
--- config/aarch64/aarch64-tuning-flags.def (revision 243974)
+++ config/aarch64/aarch64-tuning-flags.def (working copy)
@@ -35,4 +35,8 @@
    two load/stores are not at least 8 byte pairs. */
 AARCH64_EXTRA_TUNING_OPTION ("slow_unaligned_ldpw", SLOW_UNALIGNED_LDPW)

+/* Logical shift left <=4 with/without zero extend are considered easy
+   extended, also zero extends without the shift. */
+AARCH64_EXTRA_TUNING_OPTION ("easy_shift_extend", EASY_SHIFT_EXTEND)
+
 #undef AARCH64_EXTRA_TUNING_OPTION
Index: config/aarch64/aarch64.c
===
--- config/aarch64/aarch64.c (revision 243974)
+++ config/aarch64/aarch64.c (working copy)
@@ -714,7 +714,8 @@ static const struct tune_params thunderx
   0, /* max_case_values. */
   0, /* cache_line_size. */
   tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */
-  (AARCH64_EXTRA_TUNE_SLOW_UNALIGNED_LDPW) /* tune_flags. */
+  (AARCH64_EXTRA_TUNE_SLOW_UNALIGNED_LDPW
+   | AARCH64_EXTRA_TUNE_EASY_SHIFT_EXTEND) /* tune_flags. */
 };

 static const struct tune_params xgene1_tunings =
@@ -5918,9 +5919,10 @@ aarch64_strip_shift (rtx x)

 /* Helper function for rtx cost calculation. Strip an extend
    expression from X. Returns the inner operand if successful, or the
    original expression on failure. We deal with a number of possible
-   canonicalization variations here. */
+   canonicalization variations here. If STRIP_SHIFT is true, then
+   we can strip off a shift also. */
 static rtx
-aarch64_strip_extend (rtx x)
+aarch64_strip_extend (rtx x, bool strip_shift)
 {
   rtx op = x;

@@ -5944,7 +5946,8 @@ aarch64_strip_extend (rtx x)

   /* Now handle extended register, as this may also have an optional
      left shift by 1..4. */
-  if (GET_CODE (op) == ASHIFT
+  if (strip_shift
+      && GET_CODE (op) == ASHIFT
       && CONST_INT_P (XEXP (op, 1))
       && ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1))) <= 4)
     op = XEXP (op, 0);
@@ -5968,6 +5971,39 @@ aarch64_shift_p (enum rtx_code code)
   return code == ASHIFT || code == ASHIFTRT || code == LSHIFTRT;
 }
+
+/* Return true iff X is an easy shift without a sign extend. */
+