Is it possible these applications are using mmap() to do the IO? I'm not sure whether mmap is (or can be) effectively tracked at the user/kernel interface (which is what the llite stats are showing).
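To illustrate the distinction, here is a minimal Python sketch of the two access paths. It is not taken from PyTorch or DALI, and the function names are made up for illustration. With read(), each transfer is an explicit system call; with mmap(), the data is brought in by page faults when the mapping is touched. Whether the llite counters capture the second path is exactly the open question above.

```python
import mmap
import os
import tempfile


def read_with_syscall(path):
    """Read the whole file through explicit read() system calls."""
    with open(path, "rb") as f:
        return f.read()


def read_with_mmap(path):
    """Map the file read-only and copy its bytes out of the mapping.

    No read() call is issued for the data itself; pages are faulted in
    when the mapping is accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]


def demo():
    """Write a scratch file and check both paths return the same bytes."""
    payload = os.urandom(1 << 20)  # 1 MiB of test data
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        return read_with_syscall(path) == payload and read_with_mmap(path) == payload
    finally:
        os.unlink(path)
```

One way to test the hypothesis would be to point each function at a file on the Lustre mount and compare `lctl get_param llite.*.stats` before and after each call. This assumes the file is not already in the client page cache.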
You _might_ be able to see the page faults in the "vmstat 1" output? I'm of course happy to be proven wrong by adding stats counters for this (e.g. count page faults, etc).

Cheers, Andreas

On Nov 19, 2024, at 07:55, Martin, Philipp <pm.mar...@itc.rwth-aachen.de> wrote:

Hi all,

We are having an issue where the statistics file in `.../lustre/llite/*/stats` does not show the read or write bytes for some traffic. File opens & closes are being recorded and the read/write activity is shown in the `osc/*/stats` files as expected, but it would be more convenient to see the aggregated results rather than having to sum up the data for every storage target.

What would cause traffic to not be shown under llite? Can certain I/O bypass the LLITE subsystem? For reference, we have noticed this specifically for machine learning tools using PyTorch and nVidia DALI.

I'd be grateful for any hints!

Philipp

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org