------- Comment #10 from roger at eyesopen dot com 2007-04-27 18:20 ------- Paul's fix looks correct to me. It appears that when the "#if 0" was added to disable broken loop shifting at some point in the distant past, the critical functionality.
if (nDepend) break; was accidentally removed. This is similar to the idiom used a few lines earlier "nDepend = 1; break;" I would normally have suggested that the type of nDepend and the return type of gfc_dep_resolver be changed to bool, and the dead "#if 0" code removed to clean-up this code... However, I think a much better longer term strategy is to completely remove gfc_conv_resolve_dependencies, and gfc_dep_resolver and the automatic handling of temporaries in the scalarizer. Instead, in line with the TODO above gfc_conv_resolve_dependencies, I think it's much better to handle this at a higher level using the new/preferred gfc_check_dependency API that's used through-out the front-end. For example, when lhs=rhs has a dependency, this can be transformed into "tmp=rhs; lhs=tmp" during lowering or at the front-end tree-level, which greatly simplifies the complex code in the scalarizer. It also allows the post-assignment copy "lhs=tmp" to be performed efficiently [though I may have already implemented that in the scalarizer already :-)]. One benefit of doing this at a higher level of abstraction is that loop-reversal isn't a simple matter of running our default scalarized loop backwards (as hinted by the comment, and the PRs in bugzilla). Consider the general case of a(s1:e1,s2:e2,s3:e3) = rhs. The dependency analysis in gfc_check_dependency may show that s1:e1 needs to run forwards to avoid a conflict, and that s3:e3 needs to run backwards for the same reason. It's trivial then to treat this almost like a source-to-source transformation (ala KAP) and convert the assignment into a(s1:e1:1,s2:e2,s3:e3:-1) instead of attempting to shoehorn all this into the scalarizer at a low-level and perform book-keeping on the direction vectors. By the time, we've lowered gfc_expr to gfc_ss, we've made things much harder for ourselves. And things get much more complex once we start seriously tackling CSHIFT, TRANSPOSE and friends! Keeping the scalarizer simple also eases support for autovectorization and/or moving it into the middle-end (the topic of Toon's GCC summit presentation). I'm as surprised as Paul that this hasn't been a problem before. I suspect it's because we use the alternate gfc_check_dependency in the vast majority of cases, and the empirical observation in the research literature that most f90 array assignments in real code don't carry a dependency. Roger -- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31711