On Wed, Jun 11, 2025 at 02:48:16PM -0700, Nicolin Chen wrote: > On Wed, Jun 11, 2025 at 10:19:56AM -0700, Nicolin Chen wrote: > > On Wed, Jun 11, 2025 at 10:04:35AM +0200, Thomas Weißschuh wrote: > > > On Wed, Jun 11, 2025 at 12:05:25AM -0700, Nicolin Chen wrote: > > > > 2) parent doesn't seem to wait for the setup() to complete.. > > > > > > setup() is called in the child (L431) right before the testcase itself is > > > called (L436). The parent waits for the child to exit (L439) before > > > unmapping. > > > > > > > 3) when parent runs faster than the child that is still running > > > > setup(), the parent unmaps the no_teardown and set it to NULL, > > > > then UAF in the child, i.e. signal 11? > > > > > > That should never happen as the waitpid() will block until the child > > > running > > > setup() and the testcase itself have exited. > > > > Ah, maybe I was wrong about these narratives. But the results show > > that iommufd_dirty_tracking_teardown() was not called in the failed > > cases: > > Here is a new finding... > > As you replied that I was wrong about the race between the parent > and the child processes, the parent does wait for the completion > of the child. But the child exited with status=139 i.e. signal 11 > due to UAF, which however is resulted from the iommufd test code: > > FIXTURE_SETUP(iommufd_dirty_tracking) > { > .... > vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE, > ^ > | > after this line, the _metadata->no_teardown is set to NULL. > > So, the child process accessing this NULL pointer crashed with the > signal 11.. > > And I did a further experiment by turning "bool *no_teardown" to a > "bool no_teardown". Then, the mmap() in iommufd_dirty_tracking will > set _metadata->teardown_fn function pointer to NULL..
So, the test case sets an alignment with HUGEPAGE_SIZE=512MB while allocating buffer_size=64MB: rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size); vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE, this gives the self->buffer a location that is 512MB aligned, but only mmap part of one 512MB huge page. On the other hand, _metadata->no_teardown was mmap() outside the range of the [self->buffer, self->buffer + 64MB), but within the range of [self->buffer, self->buffer + 512MB). E.g. _metadata->no_teardown = 0xfffbfc610000 // inside range2 below buffer=[0xfffbe0000000, fffbe4000000) // range1 buffer=[0xfffbe0000000, fffc00000000) // range2 Then ,the "vrc = mmap(..." overwrites the _metadata->no_teardown location to NULL.. The following change can fix, though it feels odd that the buffer has to be preserved with the entire huge page: --------------------------------------------------------------- @@ -2024,3 +2027,4 @@ FIXTURE_SETUP(iommufd_dirty_tracking) - rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size); + rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, + __ALIGN_KERNEL(variant->buffer_size, HUGEPAGE_SIZE)); if (rc || !self->buffer) { --------------------------------------------------------------- Any thought? Thanks Nicolin