On Tue, 2009-12-01 at 05:47 -0800, Tim Prince wrote:
> amjad ali wrote:
> > Hi,
> > thanks T.Prince,
> >
> > Your saying:
> > "I'll just mention that we are well into the era of 3 levels of
> > programming parallelization: vectorization, threaded parallel (e.g.
> > OpenMP), and process parallel (e.g. MPI)." is a really great new
> > learning for me. Now I can perceive better.
> >
> > Can you please explain a bit about:
> >
> > "This application gains significant benefit from cache blocking, so
> > vectorization has more opportunity to gain than for applications which
> > have less memory locality."
> >
> > So should I conclude from your reply that even if we have a single-core
> > processor in a PC, we can still get the benefit of auto-vectorization?
> > And we do not need free cores to benefit from auto-vectorization?
> >
> > Thank you very much.
> Yes, we were using auto-vectorization from before the beginnings of MPI,
> back in the days of single-core CPUs; in fact, it would often show a
> greater gain then than it did on later multi-core CPUs.
> The reason for the greater effectiveness of auto-vectorization with cache
> blocking, and possibly on single-core CPUs, would be less saturation of
> the memory bus.
Just for the record, there's a huge difference between "back in the days of single-core CPUs" and "before the beginnings of MPI": they're separated by a decade or two. Vectorisation (automatic or otherwise) is useful on pipelined architectures, which go back a long way, at least to the 80s. They do predate MPI, I think, but not parallel programming and message passing in general. Multi-core chips are Johnny-come-latelies.