On Mon, 10 Nov 2003, Joe Wreschnig wrote:
> A program that is CPU-bound *and* can be encoded more efficiently will
> benefit from compiler optimizations. Some CPU bound things just aren't
> going to be helped much by vectorization, instruction reordering, etc. I
> mean, integer multiply is integer multiply.

But if the target CPU supports pipelining and has multiple multiplication units (which means it can do several multiplies in parallel), or can do a 128-bit multiply, or a 64-bit multiply, in a single operation, then it's more efficient to do a partial loop unroll and thereby get faster code, because of better parallelization. (Sorry, I read Dr. Dobb's last week.)
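
To make the partial-unroll idea concrete, here is a rough sketch in C. The function names (dot_simple, dot_unrolled4) and the unroll factor of 4 are made up for illustration; they are not from any particular compiler or article. The point is only that the unrolled loop has four multiplies per iteration that don't depend on each other, so a CPU with a pipelined multiplier or several multiply units has something to overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward version: one multiply per iteration, and every
   addition depends on the result of the previous one. */
uint32_t dot_simple(const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Partially unrolled by 4 (an arbitrary factor chosen for this sketch).
   The four accumulators are independent, so the four multiply-adds in
   the loop body can be issued without waiting on each other. */
uint32_t dot_unrolled4(const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    /* Handle the 0 to 3 elements left over when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += a[i] * b[i];

    return s0 + s1 + s2 + s3;
}
```

Both functions return the same value, since unsigned arithmetic wraps and the order of the additions doesn't matter. Whether the second one is actually faster depends on the CPU and on what the compiler already does by itself (gcc's -funroll-loops, for example), so it needs measuring on the target.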