If you are evicting a client by NID, then use the "nid:" keyword:

    lctl set_param mdt.*.evict_client=nid:10.68.178.25@tcp

Otherwise, it expects the input to be a client UUID (which allows evicting a
single export from a client that mounts the filesystem multiple times).
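
To make the two input forms concrete, here is a minimal sketch of a wrapper that picks the right form. The function names evict_arg and evict_cmd are hypothetical (they are not part of lctl), and the rule "anything containing @ is a NID" is an assumption that holds for NIDs like the one above. It only prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: build the value for evict_client.
# Assumption: a NID always contains "@" (e.g. 10.67.178.25@tcp) and a
# client UUID never does, so "@" is used to tell the two apart.
evict_arg() {
    case "$1" in
        nid:*) printf '%s\n' "$1" ;;       # already prefixed, keep as-is
        *@*)   printf 'nid:%s\n' "$1" ;;   # looks like a NID, add "nid:"
        *)     printf '%s\n' "$1" ;;       # otherwise treat as a UUID
    esac
}

# Hypothetical helper: print the eviction command (dry run only).
# $1 is a NID or UUID, $2 is the MDT name (defaults to the "*" wildcard).
evict_cmd() {
    printf 'lctl set_param mdt.%s.evict_client=%s\n' "${2:-*}" "$(evict_arg "$1")"
}
```

For example, `evict_cmd 10.67.178.25@tcp data-MDT0000` prints the nid: form for that target, while `evict_cmd "$(cat /proc/fs/lustre/mdt/data-MDT0000/exports/10.67.178.25@tcp/uuid)" data-MDT0000` would print the UUID form, using the uuid file shown in the quoted listing. Note that the export directory in that listing is named 10.67.178.25@tcp, whereas the command was run with 10.68.178.25@tcp, so it is worth double-checking which NID is intended.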

That said, the client *should* be evicted by the server automatically, so it
isn't clear why this isn't happening.  Possibly this is something at the LNet
level (which unfortunately I don't know much about)?
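
For what it's worth, the -110 and -111 in the quoted log are the standard Linux errno values ETIMEDOUT and ECONNREFUSED. The following is only a rough sketch for summarizing which peer NIDs are failing recovery and how. The function names are hypothetical, and it assumes each message sits on a single line in /var/log/messages (the copy quoted below is wrapped by the mail client):

```shell
#!/bin/sh
# Hypothetical helper: map the negative errno in an LNet recovery message
# to its Linux name. Only the two values seen in the quoted log are handled.
errno_name() {
    case "$1" in
        -110) echo "ETIMEDOUT" ;;
        -111) echo "ECONNREFUSED" ;;
        *)    echo "unknown" ;;
    esac
}

# Read kernel log lines on stdin and print one line per distinct
# (NID, errno) pair in the form "NID errno NAME".
summarize_recovery_failures() {
    sed -n 's/.*peer NI (\([^)]*\)) recovery failed with \(-[0-9]*\).*/\1 \2/p' |
    sort -u |
    while read -r nid err; do
        echo "$nid $err $(errno_name "$err")"
    done
}
```

It could be run as `grep LNetError /var/log/messages | summarize_recovery_failures` on the MDS.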

Cheers, Andreas

> On Dec 6, 2023, at 13:23, Huang, Qiulan via lustre-discuss 
> <[email protected]> wrote:
> 
> 
> 
> Hello all,
> 
> 
> We removed some clients two weeks ago but we see the Lustre server is still 
> trying to handle the lnet recovery reply to those clients (the error log is 
> posted as below). And they are still listed in the exports dir.
> 
> 
> I tried to run the command below to evict the clients, but it failed with
> the error "no exports found":
> 
> lctl set_param mdt.*.evict_client=10.68.178.25@tcp
> 
> 
> Do you know how to clean up the removed, deprecated clients? Any
> suggestions would be greatly appreciated.
> 
> 
> 
> For example:
> 
> [root@mds2 ~]# ll /proc/fs/lustre/mdt/data-MDT0000/exports/10.67.178.25@tcp/
> total 0
> -r--r--r-- 1 root root 0 Dec  5 15:41 export
> -r--r--r-- 1 root root 0 Dec  5 15:41 fmd_count
> -r--r--r-- 1 root root 0 Dec  5 15:41 hash
> -rw-r--r-- 1 root root 0 Dec  5 15:41 ldlm_stats
> -r--r--r-- 1 root root 0 Dec  5 15:41 nodemap
> -r--r--r-- 1 root root 0 Dec  5 15:41 open_files
> -r--r--r-- 1 root root 0 Dec  5 15:41 reply_data
> -rw-r--r-- 1 root root 0 Aug 14 10:58 stats
> -r--r--r-- 1 root root 0 Dec  5 15:41 uuid
> 
> 
> 
> 
> 
> /var/log/messages:Dec  6 12:50:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 13:05:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec  6 13:05:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 13:20:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec  6 13:20:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 13:35:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec  6 13:35:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 13:50:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec  6 13:50:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 14:05:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec  6 14:05:17 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 14:20:16 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec  6 14:20:16 mds2 kernel: LNetError: 
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous 
> similar message
> /var/log/messages:Dec  6 14:30:17 mds2 kernel: LNetError: 
> 3806712:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.176.25@tcp) recovery failed with -111
> /var/log/messages:Dec  6 14:30:17 mds2 kernel: LNetError: 
> 3806712:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 3 previous 
> similar messages
> /var/log/messages:Dec  6 14:47:14 mds2 kernel: LNetError: 
> 3812070:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.176.25@tcp) recovery failed with -111
> /var/log/messages:Dec  6 14:47:14 mds2 kernel: LNetError: 
> 3812070:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 8 previous 
> similar messages
> /var/log/messages:Dec  6 15:02:14 mds2 kernel: LNetError: 
> 3817248:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI 
> (10.67.176.25@tcp) recovery failed with -111
> 
> 
> Regards,
> Qiulan
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

--
Andreas Dilger
Lustre Principal Architect
Whamcloud






