Sorry, I was wrong. Of course, it is the other way round.
The fast one is 3 times faster.
-Simon
On Sun, 23 Oct 2022 at 10:37, Peter Munch <peterrmue...@gmail.com> wrote:
Now, I am lost. The fast one is 3 times slower!?
Peter
On Sunday, 23 October 2022 at 10:33:38 UTC+2 Simon wrote:
Certainly.
When using the slow path, i.e. MappingQ in version 9.3.2, the CPU time is
about 6.3 seconds.
With the fast path, i.e. MappingQGeneric in version 9.3.2, the CPU
time is about 18.7 seconds.
Crudely, the .reinit function associated with the FEPointEvaluation
objects is called about 1
Happy about that! May I ask you to post the results here? I am curious
since I never actually compared timings (and only blindly trusted Martin).
Thanks,
Peter
On Saturday, 22 October 2022 at 16:46:16 UTC+2 Simon wrote:
Yes, the issue is resolved and the computation time decreased significantly.
Thank you all!
-Simon
On Sat, 22 Oct 2022 at 12:57, Peter Munch <peterrmue...@gmail.com> wrote:
You are right. Release 9.3 uses the slow path for MappingQ. The reason is
that here
https://github.com/dealii/dealii/blob/ccfaddc2bab172d9d139dabc044d028f65bb480a/include/deal.II/matrix_free/fe_point_evaluation.h#L708-L711
we check for MappingQGeneric. At that time MappingQ and MappingQGeneric
I revised the appendix from my last message a little and now attach a
minimal working example (just 140 lines) along with a CMakeLists.txt.
After checking the profiling results from valgrind, the combination of
MappingQ with FE_Q does *not* take the fast path.
For info: I use dealii version 9.3.2
" When you use FEPointEvaluation, you should construct it only once and
re-use the same object for different points. Furthermore, you should also
avoid creating "p_dofs" and the "std::vector" near the I was not clear
with my original message. Anyway, the problem is the FEValues object that
gets u
Dear Simon,
When you use FEPointEvaluation, you should construct it only once and
re-use the same object for different points. Furthermore, you should
also avoid creating "p_dofs" and the "std::vector" near the I was not
clear with my original message. Anyway, the problem is the FEValues
ob
" What type of Mapping are you using? If you take a look at
https://github.com/dealii/dealii/blob/ad13824e599601ee170cb2fd1c7c3099d3d5b0f7/source/matrix_free/fe_point_evaluation.cc#L40-L95
you can see when the fast path of FEPointEvaluation is taken. Indeed, the
slow path is (FEValues). One questio
> FEPointEvaluation creates an FEValues object along with a quadrature
object under the hood.
Closer inspection revealed that all constructors, destructors,...
associated with FEPointEvaluation
need roughly 5000 instructions more (per call!).
That said, FEValues is indeed the faster approach, at
Update:
I profiled my program with valgrind --tool=callgrind and could figure out
that
FEPointEvaluation creates an FEValues object along with a quadrature object
under the hood.
Closer inspection revealed that all constructors, destructors,...
associated with FEPointEvaluation
need roughly 5000 i
Dear Martin and Wolfgang,
" You seem to be looking for FEPointEvaluation. That class is shown in
step-19 and provides, for simple FiniteElement types, a much faster way to
evaluate solutions at arbitrary points within a cell. Do you want to give
it a try? "
I implemented the FEPointEvaluation app
On 10/19/22 08:45, Simon Wiesheier wrote:
What I want to do boils down to the following:
Given the reference co-ordinates of a point 'p', along with the cell on
which 'p' lives,
give me the value and gradient of a finite element function evaluated at
'p'.
My idea was to create a quadrature o
Dear Simon,
You seem to be looking for FEPointEvaluation. That class is shown in
step-19 and provides, for simple FiniteElement types, a much faster way to
evaluate solutions at arbitrary points within a cell. Do you want to give
it a try? The issue you are facing is that FEValues that you are usi
" It's an environment variable. "
I did
$DEAL_II_NUM_THREADS
and the variable is not set.
But if it were set to one, why would this explain the gap between cpu and
wall time?
" My point is the constructor should not be called millions of times. You
are not going to be able to get that function 10
Simon,
On Wed, 19 Oct 2022 at 09:33, Simon Wiesheier wrote:
Thank you for your answer!
" Did you set DEAL_II_NUM_THREADS=1?"
How can I double-check that?
ccmake .
only shows me the variables CMAKE_BUILD_TYPE and deal.II_DIR.
But I do not know if this is the right place to look.
" That could explain why CPU and Wall time are different. Finally, if I
Simon,
The best way to profile a code is to use a profiler. It can give a lot more
information than what simple timers can do. You say that your code is not
parallelized but by default deal.II is multithreaded. Did you set
DEAL_II_NUM_THREADS=1? That could explain why CPU and Wall time are
di