Hello Thomas,

17.09.2024 04:01, Thomas Munro wrote:
> On Mon, Sep 16, 2024 at 6:00 PM Alexander Lakhin <exclus...@gmail.com> wrote:
>> So this leak looks like a recent and still existing defect.
>
> From my cartoon-like understanding of Windows, I would guess that if
> event handles created by a program are leaked after it has exited, it
> would normally imply that they've been duplicated somewhere else that
> is still running (for example see the way that PostgreSQL's
> dsm_impl_pin_segment() calls DuplicateHandle() to give a copy to the
> postmaster, so that the memory segment continues to exist after the
> backend exits), and if it's that, you'd be able to see the handle
> count going up in the process monitor for some longer running process
> somewhere (as seen in this report from the Chrome hackers[1]).  And if
> it's not that, then I would guess it would have to be a kernel bug
> because something outside userspace must be holding onto/leaking
> handles.  But I don't really understand Windows beyond trying to debug
> PostgreSQL at a distance, so my guesses may be way off.  If we wanted
> to try to find a Windows expert to look at a standalone repro, does
> your PS script work with *any* source directory, or is there something
> about the initdb template, in which case could you post it in a .zip
> file so that a non-PostgreSQL person could see the failure mode?
>
> [1] https://randomascii.wordpress.com/2021/07/25/finding-windows-handle-leaks-in-chromium-and-others/

That's very interesting reading. I'll try to research the issue in that
depth later (though I guess this case is different: after logging off and
logging in as another user, I can't see any processes belonging to the
first one, while those "Event objects" in non-paged pool still occupy
memory), but finding a Windows expert who could perhaps look at
robocopy's sources would be good too (and more productive).
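
A minimal sketch of how the duplicated-handle theory could be checked from
PowerShell: snapshot per-process handle counts, run the repro, then list the
processes whose count grew the most. The variable name $before and the
HandleDelta column are made up for illustration, and this has not been run
against the repro in this thread.

```powershell
# Snapshot per-process handle counts, keyed by process Id.
$before = @{}
foreach ($p in Get-Process) { $before[$p.Id] = $p.HandleCount }

# ... run the repro here ...

# Report the processes whose handle count grew the most since the snapshot.
Get-Process |
    Where-Object { $before.ContainsKey($_.Id) } |
    Select-Object Name, Id, @{ Name = 'HandleDelta'; Expression = { $_.HandleCount - $before[$_.Id] } } |
    Sort-Object HandleDelta -Descending |
    Select-Object -First 10
```

If no surviving process shows a steadily growing count while the non-paged
pool keeps increasing, that would point away from the duplicated-handle
explanation.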

So, the repro we can show is:
# Create a source directory with 1000 small, non-empty files.
rm -r c:\temp\source
mkdir c:\temp\source
for ($i = 1; $i -le 1000; $i++)
{
    echo 1 > "c:\temp\source\$i"
}

# Copy it repeatedly and print the non-paged pool size after each copy.
for ($i = 1; $i -le 1000; $i++)
{
    echo "iteration $i"
    rm -r c:\temp\target
    robocopy.exe /E /NJH /NFL /NDL /NP c:\temp\source c:\temp\target
    Get-WmiObject -Class Win32_PerfRawData_PerfOS_Memory | % PoolNonpagedBytes
}

It produces for me (on Windows 10 [Version 10.0.19045.4780]):
iteration 1
...
216887296
...
iteration 1000


------------------------------------------------------------------------------

               Total    Copied   Skipped  Mismatch    FAILED    Extras
    Dirs :         1         1         0         0         0         0
   Files :      1000      1000         0         0         0         0
   Bytes :     7.8 k     7.8 k         0         0         0         0
   Times :   0:00:00   0:00:00                       0:00:00   0:00:00


   Speed :               17660 Bytes/sec.
   Speed :               1.010 MegaBytes/min.
   Ended : Monday, September 16, 2024 8:58:09 PM

365080576

Just "touch c:\temp\source\$i" is not enough; the files must be non-empty
for the leak to happen.
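
For scale, a rough calculation from the two PoolNonpagedBytes figures above.
It assumes the first figure was sampled after iteration 1 and the second
after iteration 1000 (the elided output does not say so explicitly), so
treat the per-iteration and per-file numbers as estimates only.

```powershell
# Figures copied from the output above; which iterations they belong to is an assumption.
$first = 216887296     # assumed: sampled after iteration 1
$last  = 365080576     # sampled after iteration 1000
$iterations   = 999    # iterations between the two samples, under that assumption
$filesPerIter = 1000   # files copied by each robocopy run

$growth  = $last - $first            # total growth in bytes
$perIter = $growth / $iterations     # average growth per robocopy run
$perFile = $perIter / $filesPerIter  # average growth per copied file

"{0:N0} bytes total, {1:N0} bytes per iteration, {2:N1} bytes per copied file" -f $growth, $perIter, $perFile
```

Under those assumptions the pool grew by 148193280 bytes (about 141 MiB),
which is roughly 148 KB per robocopy run, or about 148 bytes per copied file.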

Best regards,
Alexander

