Dear Martin,

Thank you, that makes sense. If I could trouble you with one more question: I'm having a hard time using AlignedVector, and I can't find any examples of its use online.
I'd like to build the AlignedVector with a series of push_back calls, but I get compiler errors if I try this:

    dealii::AlignedVector<typeScalar> scalar_vars;
    typeScalar var1(data, 0);
    scalar_vars.push_back(var1);

telling me that:

    /Applications/deal.II.app/Contents/Resources/include/deal.II/base/aligned_vector.h:640:21: error: no matching constructor for initialization of 'dealii::FEEvaluation<2, 1, 2, 1, double>'
        new (_end_data) T;

and:

    /Applications/deal.II.app/Contents/Resources/include/deal.II/base/aligned_vector.h:641:16: error: object of type 'dealii::FEEvaluation<2, 1, 2, 1, double>' cannot be assigned because its copy assignment operator is implicitly deleted
        *_end_data++ = in_data;

I also tried:

    dealii::AlignedVector<typeScalar> scalar_vars;
    scalar_vars.reserve(1);
    typeScalar var1(data, 0);
    scalar_vars[0] = var1;

but still got the "error: object of type 'dealii::FEEvaluation<2, 1, 2, 1, double>' cannot be assigned because its copy assignment operator is implicitly deleted" error.

How should I be making my AlignedVector?

Thanks,
Steve

On Monday, February 20, 2017 at 3:23:09 PM UTC-5, Martin Kronbichler wrote:
>
> Dear Stephen,
>
> The problem is data alignment: You create an std::vector<FEEvaluation>
> that internally arranges its data under the assumption that the start
> address of FEEvaluation is divisible by 32 (the length of the
> vectorization). If you put an FEEvaluation object on the stack, the
> compiler will automatically do it right. However, inside an std::vector
> it would be up to the std::vector to ensure this, but on usual x86-64
> machines it only aligns to 16-byte boundaries. This is also why SSE2
> works, because it only needs 16-byte alignment.
>
> The solution is to use AlignedVector<typeScalar> scalar_vars instead of
> std::vector.
> The alternative is to wait until pull request 3980
> (https://github.com/dealii/dealii/pull/3980) gets merged and then grab
> the newest developer version, because with that we will start using an
> external scratch data array that always has the correct alignment.
>
> Best,
> Martin
>
> On 20.02.2017 20:50, Stephen DeWitt wrote:
>
> Hello,
> I recently realized that I should be using the "-march=native" flag for
> optimal performance of matrix-free codes. The application that I've been
> using works fine with just SSE2, but with AVX enabled I'm getting a
> segfault. Step-48 works fine, so I don't think it is an installation
> issue.
>
> The function where it occurs is similar to the "local_apply" function in
> step 48:
>
> template <int dim>
> void getRHS(const MatrixFree<dim,double> &data,
>             std::vector<dealii::parallel::distributed::Vector<double>*> &dst,
>             const std::vector<dealii::parallel::distributed::Vector<double>*> &src,
>             const std::pair<unsigned int,unsigned int> &cell_range) const{
>
>   //initialize FEEvaluation objects
>   std::vector<typeScalar> scalar_vars;
>
>   for (unsigned int i=0; i<num_var; i++){
>     typeScalar var(data, i);
>     scalar_vars.push_back(var);
>   }
>
>   //loop over cells
>   for (unsigned int cell=cell_range.first; cell<cell_range.second; ++cell){
>
>     // Initialize, read DOFs, and set evaluation flags
>     scalar_vars[varInfoListRHS[i].index].reinit(cell);
>     scalar_vars[varInfoListRHS[i].index].read_dof_values_plain(*src[varInfoListRHS[i].global_var_index]);
>     scalar_vars[varInfoListRHS[i].index].evaluate(need_value[i], need_gradient[i], need_hessian[i]); // <--- segfault happens here!
>
>   }
>
>   unsigned int num_q_points;
>   num_q_points = scalar_vars[0].n_q_points;
>
>   //loop over quadrature points
>   for (unsigned int q=0; q<num_q_points; ++q){
>   (etc.)
>
>
> The segfault happens during the "evaluate" call.
> GDB tells me that it happens on line 5478 of
> /include/deal.II/matrix_free/fe_evaluation.h, in
> EvaluatorTensorProduct::apply:
>
>   xp[i] = in[stride*i] - in[stride*(mm-1-i)];
>
> Using the debugger to step through EvaluatorTensorProduct::apply, nothing
> seems obviously wrong. As expected, all of the vectorized arrays are four
> doubles long. The line above evaluates to xp[0]=in[0]-in[1].
>
> Has anyone else had this issue? Does anyone have ideas what the problem
> could be, or what I should be looking for?
>
> Thanks!
> Steve
>
> System: Cluster running CentOS 7, with Intel Xeon E5-2670 processors, GCC
> v5.4.0
>
> --
> The deal.II project is located at http://www.dealii.org/
> For mailing list/forum options, see
> https://groups.google.com/d/forum/dealii?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "deal.II User Group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to dealii+un...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
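
For illustration, here is a minimal standalone sketch of the two constraints discussed in this thread: the elements need 32-byte alignment, and the element type has no default constructor and no copy assignment. It does not use deal.II. `Evaluator`, `AlignedBuffer` and `is_aligned` are hypothetical names made up for this example, not the FEEvaluation or AlignedVector API. The approach (over-allocate, align the start address with std::align, then copy-construct each element in place) is only one possible way to satisfy both constraints, shown under those assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical stand-in for FEEvaluation (NOT the deal.II class): it asks
// for 32-byte alignment, has no default constructor, and its const members
// make the copy assignment operator implicitly deleted.
struct alignas(32) Evaluator
{
  Evaluator(const double *data, unsigned int index)
    : data(data), index(index), scratch{0.0, 0.0, 0.0, 0.0}
  {}

  const double *const data;
  const unsigned int  index;
  double              scratch[4];
};

static_assert(!std::is_default_constructible<Evaluator>::value,
              "no default constructor");
static_assert(!std::is_copy_assignable<Evaluator>::value,
              "no copy assignment");
static_assert(std::is_copy_constructible<Evaluator>::value,
              "copy construction is still available");

// True if the address p is divisible by alignment.
inline bool is_aligned(const void *p, std::size_t alignment)
{
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical fixed-capacity container: the storage is aligned to
// alignof(T) and elements are copy-constructed in place, so T needs neither
// a default constructor nor an assignment operator.
template <typename T>
class AlignedBuffer
{
public:
  explicit AlignedBuffer(std::size_t capacity)
    : raw_(nullptr), begin_(nullptr), size_(0), capacity_(capacity)
  {
    // Over-allocate by alignof(T) bytes so an aligned start always fits.
    std::size_t space = capacity * sizeof(T) + alignof(T);
    raw_ = std::malloc(space);
    if (raw_ == nullptr)
      throw std::bad_alloc();
    void *p = raw_;
    // std::align moves p forward to the next address divisible by alignof(T).
    begin_ = static_cast<T *>(
      std::align(alignof(T), capacity * sizeof(T), p, space));
    assert(begin_ != nullptr);
  }

  ~AlignedBuffer()
  {
    // Destroy the elements in reverse order, then release the raw block.
    while (size_ > 0)
      begin_[--size_].~T();
    std::free(raw_);
  }

  AlignedBuffer(const AlignedBuffer &) = delete;
  AlignedBuffer &operator=(const AlignedBuffer &) = delete;

  void push_back(const T &value)
  {
    assert(size_ < capacity_);
    new (begin_ + size_) T(value); // placement new with the copy constructor
    ++size_;
  }

  T &operator[](std::size_t i)
  {
    assert(i < size_);
    return begin_[i];
  }

  std::size_t size() const { return size_; }

private:
  void       *raw_;
  T          *begin_;
  std::size_t size_;
  std::size_t capacity_;
};
```

With these definitions, `AlignedBuffer<Evaluator> buf(n);` followed by `buf.push_back(Evaluator(values, i));` only requires the copy constructor, and each `&buf[i]` is divisible by 32. Whether the same pattern works with the AlignedVector of a given deal.II version depends on how its push_back is implemented.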