Am 20.03.2017 um 12:49 schrieb Fam Zheng: > On Mon, 03/20 12:21, Paolo Bonzini wrote: >> >> On 20/03/2017 03:46, Fam Zheng wrote: >>> On Fri, 03/17 12:20, Peter Lieven wrote: >>>> Am 17.03.2017 um 12:16 schrieb Paolo Bonzini: >>>>> On 17/03/2017 12:11, Peter Lieven wrote: >>>>>>>> like VMDK or QCOW2 shouldn't we trust the information from the l2 >>>>>>>> tables in the VMDK or QCOW2? >>>>>>> It provides additional information, for example it works better with >>>>>>> prealloc=metadata. >>>>>> Okay, understood. Can you imagine of a away to conditionally avoid this >>>>>> second callout? In my case we have an additional >>>>>> lseek for each cluster. For a 20GB file this are approx. 327k calls to >>>>>> lseek. And if the file has no preallocated metadata >>>>>> it will likely not improve anything. And even if the metadata is >>>>>> prealloced what is the allocation status of the clusters? >>>>> If the metadata is preallocated, cluster will (or should) show up as >>>>> zero, speeding up the copy. >>>> Okay, in this case the second call out to *file will not happen. It only >>>> happens if the metadata says it contains data. >>>> So where does it actually help? >>>> >>>> The condition is: (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) && >>>> (ret & BDRV_BLOCK_OFFSET_VALID)) >>>> >>>> So from my view it can only have any effect if the metadata returns >>>> BDRV_BLOCK_DATA, but the protocol driver returns >>>> BDRV_BLOCK_ZERO. >>>> >>>> This can only happen if I partially write to a cluster, or am I wrong here? >>> I think you have a point. The metadata should have said BDRV_BLOCK_ZERO if >>> protocol would say BDRV_BLOCK_ZERO - there is no reason the format driver >>> cannot >>> know. >> That's true of qcow2, but many formats (including raw :)) don't know >> about BDRV_BLOCK_ZERO. > Raw is a little special, it could have forwarded the call to *file in its > BlockDriver callback. Most formats with metadata stores zero/nonzero > information
I thinks thats the check *file != bs for, right? > in L1/L2 tables. For qcow2 and VMDK I think it's okay to just trust meta data > on > zero/nonzero. So what are the really Formats that we don't trust? And is it worth that we add this (maybe expensive) callout to lseek via find_allocation for any format altough we already know the answer from the metadata? Peter