Re: [Qemu-devel] [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function

Laszlo Ersek Tue, 15 Apr 2014 02:26:36 -0700

On 04/15/14 01:55, Michael R. Hines wrote:
> On 04/14/2014 05:19 PM, Laszlo Ersek wrote:
>> On 04/14/14 04:27, Amos Kong wrote:
>>> We already have a function buffer_is_zero() in util/cutils.c
>>>
>>> Signed-off-by: Amos Kong <ak...@redhat.com>
>>> ---
>>>   arch_init.c | 9 ++-------
>>>   1 file changed, 2 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch_init.c b/arch_init.c
>>> index 60c975d..342e5dc 100644
>>> --- a/arch_init.c
>>> +++ b/arch_init.c
>>> @@ -152,11 +152,6 @@ int qemu_read_default_config_files(bool userconfig)
>>>       return 0;
>>>   }
>>>   -static inline bool is_zero_range(uint8_t *p, uint64_t size)
>>> -{
>>> -    return buffer_find_nonzero_offset(p, size) == size;
>>> -}
>>> -
>>>   /* struct contains XBZRLE cache and a static page
>>>      used by the compression */
>>>   static struct {
>>> @@ -587,7 +582,7 @@ static int ram_save_block(QEMUFile *f, bool
>>> last_stage)
>>>                           acct_info.dup_pages++;
>>>                       }
>>>                   }
>>> -            } else if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>>> +            } else if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
>>>                   acct_info.dup_pages++;
>>>                   bytes_sent = save_block_hdr(f, block, offset, cont,
>>>                                               RAM_SAVE_FLAG_COMPRESS);
>>> @@ -983,7 +978,7 @@ static inline void
>>> *host_from_stream_offset(QEMUFile *f,
>>>    */
>>>   void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
>>>   {
>>> -    if (ch != 0 || !is_zero_range(host, size)) {
>>> +    if (ch != 0 || !buffer_is_zero(host, size)) {
>>>           memset(host, ch, size);
>>>       }
>>>   }
>>>
>> This seems to be correct, functionally -- apparently buffer_is_zero()
>> has laxer size requirements than buffer_find_nonzero_offset(). However,
>> I think the latter might be faster.
>>
>> For ram_save_block() I guess the difference is negligible. But
>> ram_handle_compressed() is also called from "migration-rdma.c", where I
>> can't even guess if a little bit of slowdown would count.
>>
>> I'm OK with the patch if Michael (CC'd) is.
>>
>> Thanks
>> Laszlo
>>
> 
> Thanks for the CC.
> 
> Actually, it looks like buffer_is_zero() is calling
> buffer_find_nonzero_offset()
> as a "first try" anyway


I have no idea how I managed to miss that.

> - which is the same thing RDMA is doing. So, all
> calls to ram_handle_compressed() should hit the branch target there in
> buffer_is_zero() for can_use_buffer_find_nonzero_offset() and automatically
> drop into the vectorized-optimization to search for zeros, so there
> shouldn't
> be any change in performance. The same should apply for TCP migration
> as well - it's working on page-granularity, which is always aligned to
> 32 or 64 bits.
> 
> Paolo? I see that some of the block-migration code and the qemu-img code
> is also
> calling buffer_is_zero() - are you guys depending on the performance of any
> buffer_is_zero() calls to use the vector-optimized version like
> migration does?

The patch doesn't change buffer_is_zero() internally, so callers of
buffer_is_zero() are unaffected. The patch turns two indirect callers of
buffer_find_nonzero_offset() into two "differently indirect" callers of
the same (which I now see thanks to your explanation). Hence,

Before:
ram_save_block        -> is_zero_range -> buffer_find_nonzero_offset
ram_handle_compressed -> is_zero_range -> buffer_find_nonzero_offset

After:
ram_save_block        -> buffer_is_zero -> buffer_find_nonzero_offset
ram_handle_compressed -> buffer_is_zero -> buffer_find_nonzero_offset

Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thanks!
Laszlo

Re: [Qemu-devel] [Qemu-trivial] [PATCH] arch_init.c: remove duplicate function

Reply via email to