Re: [Qemu-devel] [PATCH v2 2/3] qed: add zero write detection support

Mark Wu Thu, 08 Dec 2011 06:30:19 -0800

I tried to optimize the zero-detection code with SSE instructions. The idea comes from Paolo's patch "migration: vectorize is_dup_page". It was expected to give a noticeable improvement, but I didn't see any improvement in the qemu-io test, even after increasing the image size to 5GB. The following is my test patch. Could you please review it to see whether I made a mistake and whether SSE can help with zero detection?


Thanks.



diff --git a/block/qed.c b/block/qed.c
index 75a44f3..61e4a27 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -998,6 +998,14 @@ static void qed_aio_write_l2_update_cb(void *opaque, int ret)
     qed_aio_write_l2_update(acb, ret, acb->cur_cluster);
 }

+#ifdef __SSE2__
+#include <emmintrin.h>
+#define VECTYPE        __m128i
+#define SPLAT(p)       _mm_set1_epi8(*(p))
+#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
+#define VECTYPE_ZERO   _mm_setzero_si128()
+#endif
+
 /**
  * Determine if we have a zero write to a block of clusters
  *
@@ -1027,6 +1035,19 @@ static bool qed_is_zero_write(QEDAIOCB *acb)
         }

         v = iov->iov_base;
+
+#ifdef __SSE2__
+       if ((iov->iov_len & 0x0f)) {
+            VECTYPE zero = VECTYPE_ZERO;
+            VECTYPE *p = (VECTYPE *)v;
+            for(j = 0; j < iov->iov_len / sizeof(VECTYPE); j++) {
+                 if (!ALL_EQ(p[j], zero)) {
+                    return false;
+                 }
+            }
+            continue;
+        }
+#endif
         for (j = 0; j < iov->iov_len; j += sizeof(v[0])) {
             if (v[j >> 3]) {
                 return false;
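
One thing worth checking in the hunk above: the guard `if ((iov->iov_len & 0x0f))` is true only when iov_len is NOT a multiple of 16, so buffers whose length is a multiple of 16 never reach the SSE loop and always fall through to the scalar loop. When the guard is true, the vector loop covers only the first iov_len / 16 full chunks and the `continue` skips the remaining tail bytes. The following is a minimal standalone sketch of the check the patch seems to be aiming for, not the QED code itself. The function name buffer_is_zero_sketch is hypothetical, and it uses unaligned loads on the assumption that iov_base is not guaranteed to be 16-byte aligned.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __SSE2__
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper (not part of QEMU): return true if the len bytes
 * starting at buf are all zero.
 */
bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    size_t i = 0;

#ifdef __SSE2__
    const __m128i zero = _mm_setzero_si128();

    /*
     * Vector path: compare 16 bytes per iteration. _mm_loadu_si128 is
     * used so that buf does not have to be 16-byte aligned.
     */
    for (; i + sizeof(__m128i) <= len; i += sizeof(__m128i)) {
        __m128i v = _mm_loadu_si128((const __m128i *)(p + i));

        /* The mask is 0xFFFF only if all 16 bytes compared equal to zero. */
        if (_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero)) != 0xFFFF) {
            return false;
        }
    }
#endif

    /* Scalar path: the tail, or the whole buffer when SSE2 is unavailable. */
    for (; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}
```

Even with the guard corrected, the qemu-io numbers may not change much if the test is dominated by I/O rather than by the zero check, so timing the check in isolation would probably be a more direct measurement.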
