Here the failover is designed in such a way that the IP address moves (fails 
over) with the OST and becomes active on the other server.

This is probably the source of your problem. I would suggest assigning unique 
IP addresses to each OSS.
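
For example, a rough sketch of what that could look like on the Lustre side (the 
10.99.100.150/151 addresses below are placeholders for whatever unique static NIDs 
you assign to the two servers; --erase-params drops all existing parameters, so 
mgsnode has to be given again, and the target should be unmounted when you run it):

[oss000 ~]$ sudo tunefs.lustre --erase-params \
      --param="mgsnode=10.99.100.221@tcp1" \
      --param="failover.node=10.99.100.150@tcp1" \
      --param="failover.node=10.99.100.151@tcp1" \
      /dev/sda

The client should then see both servers' real NIDs as failover_nids instead of a 
single floating address that resolves to 0@lo on whichever node currently owns it.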

Chris Horn

From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of 
Backer <backer.k...@gmail.com>
Date: Tuesday, November 5, 2024 at 10:19 PM
To: Backer via lustre-discuss <lustre-discuss@lists.lustre.org>, 
lustre-de...@lists.lustre.org <lustre-de...@lists.lustre.org>
Subject: Re: [lustre-discuss] Lustre switching to loop back lnet interface when 
it is not desired
Any ideas on how to avoid using 0@lo as failover_nids? Please see below.

On Tue, 5 Nov 2024 at 12:34, Backer <backer.k...@gmail.com> wrote:
Hi,

I am mounting the Lustre file system on the OSS. Some of the OSTs are locally 
attached to this OSS.

The failover IP on the OST is "10.99.100.152". It is a local LNet NID on the OSS. 
However, when the client mounts the filesystem, the import automatically changes to 
0@lo. This is undesirable because when the OST fails over to another server, the 
client keeps trying to connect to 0@lo even though the OST is no longer on the same 
host, and the client filesystem mount hangs forever.

Here the failover is designed in such a way that the IP address moves (fails 
over) with the OST and becomes active on the other server.

How can I make the import point to the real IP rather than the loopback, so that 
failover works?
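
For what it's worth, the failover NID is indeed local to this node, which I assume 
is why the import gets rewritten to 0@lo when the mount happens on the OSS itself. 
Listing the local NIDs with lctl shows it (other NIDs omitted):

[oss000 ~]$ sudo lctl list_nids
10.99.100.152@tcp1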


[oss000 ~]$ lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
fs-MDT0000_UUID     29068444       25692    26422344   1% /mnt/fs[MDT:0]
fs-OST0000_UUID     50541812    30160292    17743696  63% /mnt/fs[OST:0]
fs-OST0001_UUID     50541812    29301740    18602248  62% /mnt/fs[OST:1]
fs-OST0002_UUID     50541812    29356508    18547480  62% /mnt/fs[OST:2]
fs-OST0003_UUID     50541812     8822980    39081008  19% /mnt/fs[OST:3]

filesystem_summary:    202167248    97641520    93974432  51% /mnt/fs

[oss000 ~]$ df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                     30G     0   30G   0% /dev
tmpfs                        30G  8.1M   30G   1% /dev/shm
tmpfs                        30G   25M   30G   1% /run
tmpfs                        30G     0   30G   0% /sys/fs/cgroup
/dev/mapper/ocivolume-root   36G   17G   19G  48% /
/dev/sdc2                  1014M  637M  378M  63% /boot
/dev/mapper/ocivolume-oled   10G  2.5G  7.6G  25% /var/oled
/dev/sdc1                   100M  5.1M   95M   6% /boot/efi
tmpfs                       5.9G     0  5.9G   0% /run/user/987
tmpfs                       5.9G     0  5.9G   0% /run/user/0
/dev/sdb                     49G   28G   18G  62% /fs-OST0001
/dev/sda                     49G   29G   17G  63% /fs-OST0000
tmpfs                       5.9G     0  5.9G   0% /run/user/1000
10.99.100.221@tcp1:/fs  193G   94G   90G  51% /mnt/fs

[oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda
checking for existing Lustre data: found

   Read previous values:
Target:     fs-OST0000
Index:      0
Lustre FS:  fs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.100.221@tcp1 
failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1


   Permanent disk data:
Target:     fs-OST0000
Index:      0
Lustre FS:  fs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.100.221@tcp1 
failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1

exiting before disk write.


[oss000 proc]# cat /proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import
import:
    name: fs-OST0000-osc-ffff89c57672e000
    target: fs-OST0000_UUID
    state: IDLE
    connect_flags: [ write_grant, server_lock, version, request_portal, 
max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize, 
alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink, full20, 
layout_lock, 64bithash, object_max_bytes, jobstats, einprogress, grant_param, 
lvb_type, short_io, lfsck, bulk_mbits, second_flags, lockaheadv2, 
increasing_xid, client_encryption, lseek, reply_mbits ]
    connect_data:
       flags: 0xa0425af2e3440078
       instance: 39
       target_version: 2.15.3.0
       initial_grant: 8437760
       max_brw_size: 4194304
       grant_block_size: 4096
       grant_inode_size: 32
       grant_max_extent_size: 67108864
       grant_extent_tax: 24576
       cksum_types: 0xf7
       max_object_bytes: 17592186040320
    import_flags: [ replayable, pingable, connect_tried ]
    connection:
       failover_nids: [ 0@lo, 0@lo ]
       current_connection: 0@lo
       connection_attempts: 1
       generation: 1
       in-progress_invalidations: 0
       idle: 36 sec
    rpcs:
       inflight: 0
       unregistering: 0
       timeouts: 0
       avg_waittime: 2627 usec
    service_estimates:
       services: 1 sec
       network: 1 sec
    transactions:
       last_replay: 0
       peer_committed: 0
       last_checked: 0
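
The import above was read from /proc; the same data can be read with lctl 
get_param, which may be more portable across releases:

[oss000 ~]$ sudo lctl get_param osc.fs-OST0000-osc-*.import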
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
