On 13/05/2020 17:42, Mark Johnston wrote: > On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote: >> On 13/05/2020 10:35, Andriy Gapon wrote: >>> In r329363 I re-worked zfs_getpages and introduced range locking to it. >>> At the time I believed that it was safe and maybe it was, please see the >>> commit >>> message. >>> There, indeed, have been many performance / concurrency improvements to the >>> VM >>> system and r358443 is one of them. >> >> Thinking more about it, it could be r352176. >> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are >> not >> equivalent to the code that they replaced. >> The original code would check valid field before any locking and it would >> attempt any locking / busing if a page is invalid. The object was required >> to >> be locked though. >> The new code tries to busy the page in any case. >> >>> I am not sure how to resolve the problem best. Maybe someone who knows the >>> latest VM code better than me can comment on my assumptions stated in the >>> commit >>> message. > > The general trend has been to use the page busy lock as the single point > of synchronization for per-page state. As you noted, updates to the > valid bits were previously interlocked by the object lock, but this is > coarse-grained and hurts concurrency. I think you are right that the > range locking in getpages was ok before the recent change, but it seems > preferable to try and address this in ZFS. > >>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking >>> in >>> this corner of the code because of a similar deadlock a long time ago. > > Do they just not implement readahead?
I think so, but not 100% sure. I recall seeing a comment in illumos code that they do not care about read-ahead because there is ZFS prefetch and the data will be cached in ARC. That makes sense from the I/O point of view, but it does not help with page faults. > Can you explain exactly what the > range lock accomplishes here - is it entirely to ensure that znode block > size remains stable? As far as I can recall, this is the reason indeed. -- Andriy Gapon _______________________________________________ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"