On 29.10.19 13:19, Vladimir Sementsov-Ogievskiy wrote:
> 29.10.2019 15:11, Max Reitz wrote:
>> On 29.10.19 13:05, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.10.2019 14:55, Max Reitz wrote:
>>>> On 29.10.19 12:48, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 29.10.2019 11:50, Max Reitz wrote:
>>>>>> On 28.10.19 12:25, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> 28.10.2019 14:04, Kevin Wolf wrote:
>>>>>>>> On 27.10.2019 at 13:35, Stefan Hajnoczi wrote:
>>>>>>>>> On Fri, Oct 25, 2019 at 11:58:46AM +0200, Max Reitz wrote:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>>> (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f.
>>>>>>>>>> To my knowledge I’m the only one who has provided any
>>>>>>>>>> benchmarks for this commit, and even then I was a bit
>>>>>>>>>> skeptical because it performs well in some cases and badly
>>>>>>>>>> in others.  I concluded that it’s probably worth it because
>>>>>>>>>> the “some cases” are more likely to occur.
>>>>>>>>>>
>>>>>>>>>> Now we have this problem of corruption here (granted, due to
>>>>>>>>>> a bug in the XFS driver), and another report of massively
>>>>>>>>>> degraded performance on ppc64
>>>>>>>>>> (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 –
>>>>>>>>>> sorry, a private BZ; I hate that :-/  The report is about
>>>>>>>>>> 40 % worse performance for an in-guest fio write benchmark.)
>>>>>>>>>>
>>>>>>>>>> So I have to ask what the justification for keeping
>>>>>>>>>> c8bb23cbdbe32f is.  How much does performance actually
>>>>>>>>>> increase with it?
>>>>>>>>>> (On non-(ppc64+XFS) machines, obviously)
>>>>>>>>>>
>>>>>>>>>> Advantages:
>>>>>>>>>> + Trivial
>>>>>>>>>> + No layering violations
>>>>>>>>>> + We wouldn’t need to keep track of whether the kernel bug
>>>>>>>>>>   has been fixed or not
>>>>>>>>>> + Fixes the ppc64+XFS performance problem
>>>>>>>>>>
>>>>>>>>>> Disadvantages:
>>>>>>>>>> - Reverts cluster allocation performance to
>>>>>>>>>>   pre-c8bb23cbdbe32f levels, whatever that means
>>>>>>>>>
>>>>>>>>> My favorite because it is clean and simple, but Vladimir has a
>>>>>>>>> valid use case for requiring this performance optimization, so
>>>>>>>>> reverting isn't an option.
>>>>>>>>
>>>>>>>> Vladimir also said that qcow2 subclusters would probably also
>>>>>>>> solve his problem, so maybe reverting and applying the
>>>>>>>> subcluster patches instead is a possible solution, too?
>>>>>>>
>>>>>>> I'm not sure about the SSD case; it may need the write-zero
>>>>>>> optimization anyway.
>>>>>>
>>>>>> What exactly do you need?  Do you actually need to write zeroes
>>>>>> (e.g. because you’re storing images on block devices), or would it
>>>>>> be sufficient to just drop the COW areas when
>>>>>> bdrv_has_zero_init() and bdrv_has_zero_init_truncate() are true?
>>>>>
>>>>> Hmm, what do you mean?  We need to zero the COW areas.  So, the
>>>>> original way is to write real zeroes, and the optimized way is to
>>>>> fallocate.  What do you mean by “drop”?  Mark subclusters as
>>>>> zeroes in the metadata?
>>>>
>>>> Why do you need to zero the COW areas?  For normal files, any data
>>>> will read as zero if you didn’t write anything there.
>>>
>>> Hmm, but when allocating a new cluster in qcow2, it's not guaranteed
>>> to be zero, as it may be a reused, previously allocated cluster.
>>
>> Hm, right.  We could special-case something for beyond the EOF, but I
>> don’t know whether that really makes it better.
>>
>> OTOH, maybe allocations at the EOF are the real bottleneck.
>> Reusing existing clusters should be rare enough that maybe the
>> existing code, which explicitly writes zeroes, is sufficient.
>
> But, as I understand it, pre-EOF fallocates are safe on XFS?  So, we
> could just drop the fallocate call past-EOF (it reads as zero anyway)
> and keep the fallocate pre-EOF (where it's safe)?
But that would probably still be slow.  And there’s the question of how
much complexity we want to heap onto this.

Max