On 12/11/2015 03:49, Li, Liang Z wrote:
> I am very surprised about the live migration performance result when
> I use your 'memeqzero4_paolo' instead of these SSE2 intrinsics to
> check the zero pages.
What code were you using? Remember I suggested using only unsigned long
checks, like

    unsigned long *p = ...

    if (p[0] || p[1] || p[2] || p[3] ||
        memcmp(p + 4, p, size - 4 * sizeof(unsigned long)) != 0)
        return BUFFER_NOT_ZERO;
    else
        return BUFFER_ZERO;

> The total live migration time increased about 8%! Not decreased.
> Although in the unit test your 'memeqzero4_paolo' has better
> performance, any idea?

You only tested the case of zero pages. But real pages are usually not
zero, even if they have a few zero bytes at the beginning. It's very
important to optimize the initial check before the memcmp call.

Paolo
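As a minimal sketch of how the suggested check could be wrapped into a
complete function (the name buffer_is_zero_sketch, the bool return in
place of the BUFFER_ZERO/BUFFER_NOT_ZERO constants, and the size and
alignment assumptions are illustrative here, not QEMU's actual code):

    #include <stdbool.h>
    #include <string.h>

    /* Sketch: returns true if the buffer is all zero bytes.
     * Assumes size >= 4 * sizeof(unsigned long) and that buf is
     * suitably aligned for unsigned long accesses. */
    static bool buffer_is_zero_sketch(const void *buf, size_t size)
    {
        const unsigned long *p = buf;

        /* Cheap early exit: most non-zero pages fail here. */
        if (p[0] || p[1] || p[2] || p[3]) {
            return false;
        }

        /* The first four words are zero, so the whole buffer is zero
         * iff it matches a copy of itself shifted by four words. */
        return memcmp(p + 4, buf, size - 4 * sizeof(unsigned long)) == 0;
    }

The explicit word checks are the point made above: a typical non-zero
page fails within the first few words, so the memcmp over the rest of
the page is rarely reached.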