Hi, thanks T.Prince. Your statement, "I'll just mention that we are well into the era of 3 levels of programming parallelization: vectorization, threaded parallel (e.g. OpenMP), and process parallel (e.g. MPI)," was really useful and new to me. I understand the overall picture better now.
Could you please explain this part a bit more: "This application gains significant benefit from cache blocking, so vectorization has more opportunity to gain than for applications which have less memory locality." Should I conclude from your reply that even on a PC with a single-core processor we can still benefit from auto-vectorization, and that we do not need free cores to benefit from it? Thank you very much.
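To make my question concrete, here is a minimal sketch of the kind of loop I have in mind. The function name `scale_add` and its parameters are made up for illustration; they are not from your application. My understanding, which may be wrong, is that a compiler with optimization and auto-vectorization enabled could turn this loop into SIMD instructions that process several elements per iteration, all inside one thread on one core, with no OpenMP or MPI involved.

```c
#include <stddef.h>

/* Hypothetical example: a plain single-threaded loop, y = a*x + y.
 * There are no threads and no processes here. Any speedup from
 * vectorization would come from SIMD instructions on the one core
 * that runs this function.
 * 'restrict' tells the compiler that x and y do not overlap, which
 * removes one common obstacle to auto-vectorization. */
void scale_add(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

If that is right, then I assume the number of free cores only matters for the other two levels (threads and processes), and not for this one.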