------- Comment #10 from paul dot richard dot thomas at gmail dot com 2010-06-05 06:55 -------
Subject: Re: inline matmul for small matrix sizes

Dear Thomas,

> The preferred way would therefore be to state the rank 2 * rank 2 problem as
>
>    do i=1,m
>      do j=1,n
>        c(i,j) = sum(a(i,:) * b(:,j))
>      end do
>    end do
>
> with the inner dot product borrowed using the scalarizer (borrowing
> from dot_product), and the outer loops using either hand-crafted
> TREE code or calling the DO translation.

Yes that is reasonable. Otherwise, you could borrow a little trick that I
used in allocatable components:

trans-array.c:6020

      gfc_add_expr_to_block (&loopbody, tmp);

      /* Build the loop and return.  */
      gfc_init_loopinfo (&loop);
      loop.dimen = 1;
      loop.from[0] = gfc_index_zero_node;
      loop.loopvar[0] = index;
      loop.to[0] = nelems;
      gfc_trans_scalarizing_loops (&loop, &loopbody);
      gfc_add_block_to_block (&fnblock, &loop.pre);

      tmp = gfc_finish_block (&fnblock);
      if (null_cond != NULL_TREE)
        tmp = build3_v (COND_EXPR, null_cond, tmp,
                        build_empty_stmt (input_location));

Here tmp in the first line is the expression or finished block within the
loop. Earlier on, you will find an expression involving the index.

Cheers

Paul

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37131
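
For reference, the loop structure quoted above (rank 2 * rank 2 stated as two outer loops around a dot product) can be sketched in plain C. This is only an illustration of the loop nest, not front-end code: the name matmul_small and the row-major, zero-based layout are assumptions made for the sketch, whereas Fortran arrays are column-major and one-based.

```c
#include <stddef.h>

/* Hypothetical helper, not part of gfortran: c = a * b, where
   a is m x k, b is k x n and c is m x n, all stored row-major.  */
void
matmul_small (size_t m, size_t n, size_t k,
              const double *a, const double *b, double *c)
{
  for (size_t i = 0; i < m; i++)
    for (size_t j = 0; j < n; j++)
      {
        /* Inner dot product: sum (a(i,:) * b(:,j)).  */
        double sum = 0.0;
        for (size_t l = 0; l < k; l++)
          sum += a[i * k + l] * b[l * n + j];
        c[i * n + j] = sum;
      }
}
```

In the quoted proposal the innermost loop is what the scalarizer would generate, and the two outer loops are the part that would be built by hand or through the DO translation.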