Eryk Sun <eryk...@gmail.com> added the comment:

> C:\Users>fsutil file queryfileid u:\test\test.jpg
> File ID is 0x00000000000029d500000000000004ae

ReFS uses a 128-bit file ID, which I gather consists of a 64-bit directory ID 
and a 64-bit relative ID. (Take this with a grain of salt. AFAIK, Microsoft 
hasn't published a spec for ReFS.) The latter is 0 for the directory itself and 
increments by 1 for each file created in the directory, with no reuse of 
previous values if a file is deleted or moved. If that's correct, and if 
"test.jpg" was created in "\test", then the directory ID of "\test" is 0x29d5, 
and the relative file ID is 0x4ae. 
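
To make the split concrete, here is a minimal sketch that cuts the value 
reported by fsutil into two 64-bit halves. The interpretation of the halves 
(high half = directory ID, low half = relative ID) is my assumption as stated 
above, not a documented layout, and split_refs_file_id is just a name made up 
for this example.

```python
MASK64 = (1 << 64) - 1

def split_refs_file_id(file_id_128):
    """Split a 128-bit ReFS file ID into (high 64 bits, low 64 bits).

    Assumption: the high half is the directory ID and the low half is the
    relative ID. This layout is inferred, not taken from a published spec.
    """
    directory_id = file_id_128 >> 64
    relative_id = file_id_128 & MASK64
    return directory_id, relative_id

# Value reported by "fsutil file queryfileid u:\test\test.jpg".
fsutil_id = 0x0000_0000_0000_29d5_0000_0000_0000_04ae

directory_id, relative_id = split_refs_file_id(fsutil_id)
print(hex(directory_id), hex(relative_id))
```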

> >>> from pathlib import Path
> >>> hex(Path('U:/test/test.jpg').stat().st_ino)
> '0x4000000004ae29d5'

os.stat calls the WinAPI function GetFileInformationByHandle, which returns a 
64-bit file ID. It appears that ReFS generates this ID by concatenating the 
relative ID and directory ID in a way that is "not guaranteed to be unique", 
according to the BY_HANDLE_FILE_INFORMATION [1] docs. 
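
As a rough consistency check on this one sample, the reported st_ino can be 
pulled apart and compared with the two IDs from the fsutil output. The 16-bit 
field boundaries below are an assumption read off this single value. I don't 
know the real field widths or what the high bits mean, so treat this as an 
observation rather than a description of how ReFS builds the ID.

```python
# Values taken from the report above.
st_ino = 0x4000000004ae29d5
directory_id = 0x29d5   # high 64 bits of the 128-bit ID
relative_id = 0x4ae     # low 64 bits of the 128-bit ID

# Assumed field boundaries, chosen only because they fit this sample.
low16 = st_ino & 0xFFFF
next16 = (st_ino >> 16) & 0xFFFF
top32 = st_ino >> 32

print(hex(low16), hex(next16), hex(top32))
```

If both 64-bit IDs really are squeezed into narrower fields like this, two 
different files could end up with the same 64-bit value, which would fit the 
"not guaranteed to be unique" caveat.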

I haven't checked whether this 64-bit file ID can even be used successfully 
with OpenFileById [2]. It could be that ReFS simply fails an open-by-ID request 
unless it includes the full 128-bit ID (i.e. ExtendedFileIdType).

You can request the 128-bit ID as a FILE_ID_128 record (an array of 16 bytes) 
by calling GetFileInformationByHandleEx with the FileIdInfo information class 
[3][4]. Maybe os.stat should try to query the 128-bit ID and use it as st_ino 
(or st_ino_128) if it's available. However, looking into my crystal ball, I 
don't see this happening, unless someone makes a strong case in its favor.
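
For anyone who wants to experiment in the meantime, here is an untested-on-ReFS 
ctypes sketch of such a query. The helper names (query_file_id_128, 
file_id_to_int) are made up for this example. Reading the 16 bytes as a 
little-endian integer is an assumption on my part, chosen so that the result 
should line up with what fsutil prints. query_file_id_128 only works on 
Windows; the structure definitions and the conversion helper are portable.

```python
import ctypes
import os

FILE_ID_INFO_CLASS = 18          # FileIdInfo in FILE_INFO_BY_HANDLE_CLASS
FILE_READ_ATTRIBUTES = 0x0080
FILE_SHARE_ALL = 0x7             # share read, write and delete
OPEN_EXISTING = 3
FILE_FLAG_BACKUP_SEMANTICS = 0x02000000   # needed to open directories
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value

class FILE_ID_128(ctypes.Structure):
    _fields_ = [("Identifier", ctypes.c_ubyte * 16)]

class FILE_ID_INFO(ctypes.Structure):
    _fields_ = [("VolumeSerialNumber", ctypes.c_ulonglong),
                ("FileId", FILE_ID_128)]

def file_id_to_int(file_id):
    # Assumption: the 16 bytes form a little-endian 128-bit integer.
    return int.from_bytes(bytes(file_id.Identifier), "little")

def query_file_id_128(path):
    """Return (volume serial number, 128-bit file ID). Windows only."""
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = ctypes.c_void_p
    kernel32.CreateFileW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p]
    kernel32.GetFileInformationByHandleEx.restype = ctypes.c_int
    kernel32.GetFileInformationByHandleEx.argtypes = [
        ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p, ctypes.c_uint32]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    handle = kernel32.CreateFileW(
        os.fspath(path), FILE_READ_ATTRIBUTES, FILE_SHARE_ALL, None,
        OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, None)
    if handle == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        info = FILE_ID_INFO()
        ok = kernel32.GetFileInformationByHandleEx(
            handle, FILE_ID_INFO_CLASS, ctypes.byref(info),
            ctypes.sizeof(info))
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        kernel32.CloseHandle(handle)
    return info.VolumeSerialNumber, file_id_to_int(info.FileId)
```

On Windows, something like hex(query_file_id_128(r'U:\test\test.jpg')[1]) 
could then be compared against the fsutil output.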

> The problem does *not* exist on an NTFS volume:
> 
> C:\Users>fsutil file queryfileid o:\OneDrive\test\test.jpg
> File ID is 0x0000000000000000000300000001be39

NTFS uses a 64-bit file ID, which consists of a 48-bit MFT record number and a 
16-bit sequence number. The latter gets incremented when an MFT record is 
reused in order to detect stale references. In the above case, the 48-bit 
record number is 0x00000001be39, and the sequence number is 0x0003.
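
The same decomposition as a short sketch, where split_ntfs_file_id is a name 
invented for this example:

```python
def split_ntfs_file_id(file_id):
    """Split a 64-bit NTFS file ID into (MFT record number, sequence number)."""
    record_number = file_id & ((1 << 48) - 1)    # low 48 bits
    sequence_number = (file_id >> 48) & 0xFFFF   # high 16 bits
    return record_number, sequence_number

# Low 64 bits of the value reported by fsutil for the NTFS file.
ntfs_id = 0x0003_0000_0001_be39

record_number, sequence_number = split_ntfs_file_id(ntfs_id)
print(hex(record_number), hex(sequence_number))
```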

[1]: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/ns-fileapi-by_handle_file_information
[2]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-openfilebyid
[3]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getfileinformationbyhandleex
[4]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_id_info

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40095>
_______________________________________