------- Comment #10 from paul dot richard dot thomas at gmail dot com
2010-06-05 06:55 -------
Subject: Re: inline matmul for small matrix sizes
Dear Thomas,
> The preferred way would therefore be to state the rank 2 * rank 2 problem as
>
> do i=1,m
>   do j=1,n
>     c(i,j) = sum(a(i,:) * b(:,j))
>   end do
> end do
>
> with the inner dot product borrowed using the scalarizer (borrowing
> from dot_product), and the outer loops using either hand-crafted
> TREE code or calling the DO translation.
Yes, that is reasonable. Otherwise, you could borrow a little trick
that I used for allocatable components, at trans-array.c:6020:
  gfc_add_expr_to_block (&loopbody, tmp);

  /* Build the loop and return. */
  gfc_init_loopinfo (&loop);
  loop.dimen = 1;
  loop.from[0] = gfc_index_zero_node;
  loop.loopvar[0] = index;
  loop.to[0] = nelems;
  gfc_trans_scalarizing_loops (&loop, &loopbody);
  gfc_add_block_to_block (&fnblock, &loop.pre);

  tmp = gfc_finish_block (&fnblock);
  if (null_cond != NULL_TREE)
    tmp = build3_v (COND_EXPR, null_cond, tmp,
                    build_empty_stmt (input_location));

Here, tmp in the first line is the expression (or finished block) that
forms the body of the loop. Earlier on in that code, you will find an
expression involving the index.
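
For orientation only, here is a rough sketch in plain C of the loop
shape that inlined code for the quoted rank 2 * rank 2 case would
amount to: two outer loops over i and j, with the dot product as the
inner kernel. It is not gfortran code and uses no compiler internals.
The names small_matmul and dot_row_col are made up for illustration,
and the column-major, contiguous layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dot product of row i of a (m x k) with
   column j of b (k x n).  Both arrays are assumed to be contiguous
   and column-major, as Fortran arrays are.  Indices are zero-based.  */
static double
dot_row_col (const double *a, const double *b, size_t m, size_t k,
             size_t i, size_t j)
{
  double sum = 0.0;
  for (size_t l = 0; l < k; l++)
    sum += a[i + l * m] * b[l + j * k];
  return sum;
}

/* Hypothetical driver: c(i,j) = sum (a(i,:) * b(:,j)) for an
   m x k matrix a, a k x n matrix b and an m x n result c.  */
void
small_matmul (const double *a, const double *b, double *c,
              size_t m, size_t k, size_t n)
{
  assert (a != NULL && b != NULL && c != NULL);
  for (size_t i = 0; i < m; i++)
    for (size_t j = 0; j < n; j++)
      c[i + j * m] = dot_row_col (a, b, m, k, i, j);
}
```

In the compiler, the inner loop would presumably come from the
scalarizer, and only the two outer loops would need to be built by
hand or via the single-dimension loopinfo trick shown above.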
Cheers
Paul
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37131