Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR in vectorization.
On Tue, Jan 28, 2014 at 4:17 PM, Bingfeng Mei wrote:
> I checked vectorization code, it seems that only relevant place
> vec_widen_mult_even/odd & vec_widen_mult_lo/hi are generated is in
> supportable_widening_operation. One of these pairs is selected, with priority
> given to vec_widen_mult_even/odd if it is a reduction loop. However, lo/hi
> pair seems to have wider usage than even/odd pair (non-loop? Non-reduction?).
> Maybe that's why AltiVec and x86 still implement both pairs. Is following
> patch OK?

Ok.

Thanks,
Richard.

> Index: gcc/ChangeLog
> ===
> --- gcc/ChangeLog (revision 207183)
> +++ gcc/ChangeLog (working copy)
> @@ -1,3 +1,9 @@
> +2014-01-28 Bingfeng Mei
> +
> + * doc/md.texi: Mention that a target shouldn't implement
> + vec_widen_(s|u)mul_even/odd pair if it is less efficient
> + than hi/lo pair.
> +
>  2014-01-28 Richard Biener
>
>  Revert
> Index: gcc/doc/md.texi
> ===
> --- gcc/doc/md.texi (revision 207183)
> +++ gcc/doc/md.texi (working copy)
> @@ -4918,7 +4918,8 @@ the output vector (operand 0).
>  Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
>  are vectors with N signed/unsigned elements of size S@. Multiply the high/low
>  or even/odd elements of the two vectors, and put the N/2 products of size 2*S
> -in the output vector (operand 0).
> +in the output vector (operand 0). A target shouldn't implement even/odd pattern
> +pair if it is less efficient than lo/hi one.
>
>  @cindex @code{vec_widen_ushiftl_hi_@var{m}} instruction pattern
>  @cindex @code{vec_widen_ushiftl_lo_@var{m}} instruction pattern
>
>
> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 28 January 2014 12:56
> To: Bingfeng Mei
> Cc: gcc@gcc.gnu.org
> Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR
> in vectorization.
>
> On Tue, Jan 28, 2014 at 12:08 PM, Bingfeng Mei wrote:
>> Thanks, Richard. It is not very clear from documents.
>>
>> "Signed/Unsigned widening multiplication. The two inputs (operands 1 and 2)
>> are vectors with N signed/unsigned elements of size S. Multiply the high/low
>> or even/odd elements of the two vectors, and put the N/2 products of size 2*S
>> in the output vector (operand 0)."
>>
>> So I thought that implementing both can help vectorizer to optimize more
>> loops.
>> Maybe we should improve documents.
>
> Maybe. But my answer was from the top of my head - so better double-check
> in the vectorizer sources.
>
> Richard.
>
>> Bingfeng
>>
>>
>>
>> -----Original Message-----
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: 28 January 2014 11:02
>> To: Bingfeng Mei
>> Cc: gcc@gcc.gnu.org
>> Subject: Re: VEC_WIDEN_MULT_(LO|HI)_EXPR vs. VEC_WIDEN_MULT_(EVEN|ODD)_EXPR
>> in vectorization.
>>
>> On Wed, Jan 22, 2014 at 1:20 PM, Bingfeng Mei wrote:
>>> Hi,
>>> I noticed there is a regression of 4.8 against ancient 4.5 in vectorization
>>> on our port. After a bit investigation, I found following code that prefer
>>> even|odd version instead of lo|hi one. This is obviously the case for
>>> AltiVec and maybe some other targets. But even|odd (expanding to a series
>>> of instructions) versions are less efficient on our target than lo|hi ones.
>>> Shouldn't there be a target-specific hook to do the choice instead of
>>> hard-coded one here, or utilizing some cost-estimating technique to compare
>>> two alternatives?
>>
>> Hmm, what's the reason for a target to support both? I think the idea
>> was that a target only supports either (the more efficient case).
>>
>> Richard.
>>
>>> /* The result of a vectorized widening operation usually requires
>>> two vectors (because the widened results do not fit into one
>>> vector).
>>> The generated vector results would normally be expected to be
>>> generated in the same order as in the original scalar computation,
>>> i.e. if 8 results are generated in each vector iteration, they are
>>> to be organized as follows:
>>> vect1: [res1,res2,res3,res4],
>>> vect2: [res5,res6,res7,res8].
>>>
>>> However, in the special case that the result of the widening
>>> operation is used in a reduction computation only, the order
>>> doesn't
>>> matter (because when vectorizing a reduction we change the order of
>>> the computation). Some targets can take advantage of this and
>>> generate more efficient code. For example, targets like Altivec,
>>> that support widen_mult using a sequence of {mult_even,mult_odd}
>>> generate the following vectors:
>>> vect1: [res1,res3,res5,res7],
>>> vect2: [res2,res4,res6,res8].
>>>
>>> When vectorizing outer-loops, we execute
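The ordering argument in the quoted vectorizer comment can be checked with a small standalone model. The sketch below is not GCC code: the helper names (`widen_mult`, `widen_mult_lo`, `reduce_pair`, and so on) are hypothetical, the element types are chosen only for illustration, and "lo" is assumed to mean the first half of the vector (in GCC the actual half depends on the target's endianness). It shows that lo/hi keeps the products in scalar order, that even/odd interleaves them, and that a sum reduction over either pair gives the same value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using Narrow = std::int16_t;  // input element of size S
using Wide   = std::int32_t;  // product of size 2*S

// Hypothetical helper: multiply N/2 element pairs, where index(i) selects
// which input element feeds output slot i.
template <typename IndexFn>
std::vector<Wide> widen_mult(const std::vector<Narrow>& a,
                             const std::vector<Narrow>& b, IndexFn index) {
    assert(a.size() == b.size() && a.size() % 2 == 0);
    std::vector<Wide> out(a.size() / 2);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const std::size_t k = index(i);
        out[i] = static_cast<Wide>(a[k]) * static_cast<Wide>(b[k]);
    }
    return out;
}

// lo/hi pair: first half and second half (assumed orientation, see lead-in).
std::vector<Wide> widen_mult_lo(const std::vector<Narrow>& a,
                                const std::vector<Narrow>& b) {
    return widen_mult(a, b, [](std::size_t i) { return i; });
}
std::vector<Wide> widen_mult_hi(const std::vector<Narrow>& a,
                                const std::vector<Narrow>& b) {
    const std::size_t half = a.size() / 2;
    return widen_mult(a, b, [half](std::size_t i) { return i + half; });
}

// even/odd pair: elements at even indices and at odd indices.
std::vector<Wide> widen_mult_even(const std::vector<Narrow>& a,
                                  const std::vector<Narrow>& b) {
    return widen_mult(a, b, [](std::size_t i) { return 2 * i; });
}
std::vector<Wide> widen_mult_odd(const std::vector<Narrow>& a,
                                 const std::vector<Narrow>& b) {
    return widen_mult(a, b, [](std::size_t i) { return 2 * i + 1; });
}

// Sum reduction over both result vectors; the order of the terms is irrelevant.
std::int64_t reduce_pair(const std::vector<Wide>& x,
                         const std::vector<Wide>& y) {
    return std::accumulate(x.begin(), x.end(), std::int64_t{0}) +
           std::accumulate(y.begin(), y.end(), std::int64_t{0});
}
```

For a non-reduction use, only the lo/hi results can be stored back in scalar order without an extra permutation. This may be what the message means when it says the lo/hi pair has wider usage.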
Enable debug info
Dear All,

We need to support emitting debug info for our private port on GCC 4.8.1.

I was under the impression that passing -g on the command line would, by default, emit the DWARF debugging symbols and info, but I was wrong here.

Could anyone in the group point me to some references or shed some light on how to enable the debug options in the compiler?

I appreciate your comments. Thank you,
~Umesh
Re: Enable debug info
On 01/29/2014 09:36 AM, Umesh Kalappa wrote:
> I was in impression using option -g in the commandline by default,
> will emit the dwarf debugging symbols and the info, but I was wrong
> here.

It usually does.

Andrew.
type promotion
Hi All,

I am porting GCC 4.8.1 to a private target that has 8-bit registers, which can be used as pairs for 16-bit values: AB and CD are valid pairs, but BC and AD are not.

I am stuck on type promotion for code like this:

    int i;
    unsigned char c;
    int test ()
    {
      i = c;
    }

I defined the zero_extendqihi2 pattern for the above C construct like this:

    (define_expand "zero_extendqihi2"
      [(set (match_operand:HI 0 "" "")
            (zero_extend:HI (match_operand:QI 1 "" "")))]
      ""
    {
      if (!reload_completed)
        {
          if (!REG_P (operands[1]))
            operands[1] = force_reg (QImode, operands[1]);
          /* Here I need to force GCC to use the next consecutive paired
             register: B if operands[1] is in A, or D if operands[1] is in C.  */
        }
    })

How do I model the above requirement in the backend?

Thank you
~Umesh
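The thread shown here gives no answer to this question. One mechanism GCC 4.8 offers for this kind of restriction is the `HARD_REGNO_MODE_OK` target macro, used together with `HARD_REGNO_NREGS`. It lets a port state that a 16-bit value may only start at certain hard registers. Whether that fits this particular port is an assumption. The sketch below is a standalone model of the pairing rule, not backend code: the register numbering (A=0, B=1, C=2, D=3) and all function names are hypothetical.

```cpp
#include <cassert>

// Model only: 8-bit (QI) and 16-bit (HI) machine modes.
enum class Mode { QI, HI };

// Assumed hard register numbering, so that valid pairs start at even numbers.
enum Reg : unsigned { A = 0, B = 1, C = 2, D = 3, NUM_REGS = 4 };

// Models the role of HARD_REGNO_MODE_OK: any register can hold an 8-bit
// value, but a 16-bit value may only start at A or C (the pairs AB and CD).
bool hard_regno_mode_ok(unsigned regno, Mode mode) {
    assert(regno < NUM_REGS);
    if (mode == Mode::QI)
        return true;
    return regno % 2 == 0;
}

// Models the role of HARD_REGNO_NREGS: a 16-bit value occupies two
// consecutive hard registers, an 8-bit value occupies one.
unsigned hard_regno_nregs(Mode mode) {
    return mode == Mode::HI ? 2u : 1u;
}

// The second register of a valid pair: B for A, D for C.
unsigned pair_partner(unsigned regno) {
    assert(hard_regno_mode_ok(regno, Mode::HI));
    return regno + 1;
}
```

With a rule of this shape, the register allocator would never be offered BC or AD as a 16-bit location, so the expander would not need to pick the partner register by hand.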
Re: A 404 error in gcc-4.8.2 onlinedocs
Hi Gerald,

Thank you for fixing this. I'm glad to hear that a permanent workaround will be there in 4.8.3.

Cheers,
Gleb

On Mon, Jan 27, 2014 at 4:27 AM, Gerald Pfeifer wrote:
> Hi Gleb,
>
> On Sun, 26 Jan 2014, Gleb Smirnov wrote:
>> I have been browsing the online docs here:
>> http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/
>> Turns out that clicking on "3.6 Options Controlling Objective-C and
>> Objective-C++ Dialects" results in a 404 error.
>>
>> That is,
>> http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Objective-C-and-Objective-C_002b_002b-Dialect-Options.html
>> does not exist.
>
> thanks for the report, I just fixed this "manually".
>
> For a bit more background refer to
> http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00047.html and
> http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00135.html ;
> in a nutshell, this was due to what I consider broken design in
> makeinfo and an incomplete workaround on our side.
>
> I did not touch the documentation of released versions of GCC
> and the docs for GCC 4.8.3 will not exhibit this problem when
> created, but since you ran into this and reported it, I now
> did that fixup for gcc-4.8.2/gcc.
>
> Gerald
RE: Enable debug info
> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Umesh Kalappa
> Sent: Wednesday, January 29, 2014 4:36 AM
> To: gcc@gcc.gnu.org
> Subject: Enable debug info
>
> Dear All,
>
> We need to support the debug info emit for our private port on gcc 4.8.1.
>
> I was in impression using option -g in the commandline by default, will emit
> the dwarf debugging symbols and the info, but I was wrong here.
>
> Anyone in the group point me some references or throw some light on
> how do i enable debug options in the compiler.

What version of GDB are you using? If you are not using the latest one, the DWARF version might not match. Try adding both -g and -gdwarf-2.

>
> Appreciate your comments and Thank you
> ~Umesh
Re: gimple_build_call for a class constructor
I found my problem - I had forgotten about the cloned and generated constructors. For this case, there were 9 constructors generated - 3 sets of three with names of 'CTestClass', '__base_ctor' and '__comp_ctor'. The first two sets of constructors took references as arguments, I'm assuming copy and move constructors, and the third set took no arguments and was the one I wanted. When I used the '__comp_ctor' constructor with no arguments all worked correctly.

The following code snippet is the first pass of what I did to identify the right no-arg constructor to insert into the global initialization routine:

--
    MethodList methods( TYPE_METHODS( declType ) );    // MethodList is a template class I created to iterate over TREE_LISTs

    for( MethodList::iterator itrMethod = methods.begin(); itrMethod != methods.end(); ++itrMethod )
    {
        ctorMethod = (const tree&)*itrMethod;

        if(( strcmp( IDENTIFIER_POINTER( DECL_NAME( ctorMethod )), "__comp_ctor " ) == 0 ) &&
           ( TREE_CHAIN( DECL_ARGUMENTS( ctorMethod ) ) == NULL_TREE ))
        {
            break;
        }
    }
--

For clarity, the FUNCTION_DECL does take an argument, the pointer to the variable to be initialized. Any arguments to the FUNCTION_DECL beyond a pointer to the variable to be initialized would be arguments to the class constructor itself.

Thanks,
Stephan

On Tuesday, January 28, 2014 11:26 PM, Stephan Friedl wrote:

I am building a GCC plugin and am trying to create a call to a constructor for a global variable. The class is declared in a .cpp file and I have a global instance of the class declared in the file as well. The class declaration for the global instance I am trying to create follows:

--
    namespace LocalTestNamespace
    {
        class CTestClass
        {
        public :

            CTestClass()
            {
                std::cout << "Test Class Initialized." << std::endl;
            }
        };
    }

    LocalTestNamespace::CTestClass sourceCodeGlobalTestClass;    // g++ parser generates the initialization statement for this global

In my plugin, I create a global variable for 'CTestClass' and then attempt to invoke the constructor for it in the '__static_initialization_and_destruction_0' function. Below is a snippet of the code to create the gimple statement and insert it into the initialization function. The plugin runs just before the call flow graph generator pass.

--
    tree addr_var_decl = build_fold_addr_expr( globalDeclaration );    // globalDeclaration points to the VAR_DECL I created
    tree constructor = CLASSTYPE_CONSTRUCTORS( declType );             // declType is the tree for CTestClass

    gimple initializationStatement = gimple_build_call( OVL_CURRENT( constructor ), 1, addr_var_decl );

    debug_gimple_stmt( initializationStatement );    // the debug output of the statement looks fine

    gsi_insert_before( &insertionPoint, initializationStatement, GSI_SAME_STMT );    // insertionPoint is just before the goto following the calls to global initializers
--

When I run this code, the statement gets inserted but the assembler fails. Looking at the assembly output reveals the following at the end of the initializer:

--
    movl    $sourceCodeGlobalTestClass, %edi                         // the global in the source code
    call    _ZN18LocalTestNamespace10CTestClassC1Ev                  // call to the class constructor created by the g++ parser
    movl    $testCTestClassVar, %edi                                 // the global I created in the plugin
    call    _ZN18LocalTestNamespace10CTestClassC1EOS0_ *INTERNAL*    // call to the class constructor generated by the code snippet above and the gcc error
--

Using c++filt the names demangle as:

    _ZN18LocalTestNamespace10CTestClassC1Ev    =>> LocalTestNamespace::CTestClass::CTestClass()
    _ZN18LocalTestNamespace10CTestClassC1EOS0_ =>> LocalTestNamespace::CTestClass::CTestClass(LocalTestNamespace::CTestClass&&)

Clearly the call I am building is incorrect and I have tried numerous variations with the same results. If I manually edit the assembly output file and change the 'C1EOS0_' suffix to 'C1Ev' and strip out the '*INTERNAL*', I can run the assembler on the modified file and generate an executable that works perfectly.

I have searched for examples of using gimple_build_call() to generate calls to C++ class constructors but haven't tripped over any examples. I would greatly appreciate any suggestions on how to generate the appropriate constructor call.

Thanks,
Stephan
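The c++filt check described in the message can also be done from code, which may be handy when debugging a plugin that picks the wrong constructor overload. The sketch below uses `abi::__cxa_demangle` from `<cxxabi.h>`, which is available with GCC and Clang toolchains. The wrapper name `demangle` is hypothetical and is not part of the original plugin.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium C++ ABI symbol name,
// or an empty string if the name cannot be demangled.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller must release it with free.
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && raw != nullptr) ? raw : "";
    std::free(raw);
    return result;
}
```

Applied to the two symbols from the assembly listing, the `C1Ev` suffix demangles to the no-argument constructor and `C1EOS0_` to the one taking `CTestClass&&`. This matches the observation that the first overload returned by `OVL_CURRENT` was the move constructor rather than the default one.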