On Wednesday, January 23, 2019 at 1:48:19 PM UTC-8, [email protected] wrote:
> We have a LIO target on RHEL 7.5 with the LUN created using fileio through
> targetcli. We exported it to a RHEL initiator on the same box (we tried
> another box as well). When we run mkfs for ext3/ext4 on the LUN, it fails
> with the following message and the filesystem cannot be mounted.
>
> -----------------------------------------------------------------------------
> [root@linux_machine /]# mkfs -t ext4 /dev/sdh
> mke2fs 1.42.9 (28-Dec-2013)
> /dev/sdh is entire device, not just one partition!
> Proceed anyway? (y,n) y
> Filesystem label=
> OS type: Linux
> Block size=4096 (log=2)
> Fragment size=4096 (log=2)
> Stride=0 blocks, Stripe width=1024 blocks
> 2621440 inodes, 10485760 blocks
> 524288 blocks (5.00%) reserved for the super user
> First data block=0
> Maximum filesystem blocks=2157969408
> 320 block groups
> 32768 blocks per group, 32768 fragments per group
> 8192 inodes per group
> Superblock backups stored on blocks:
>         32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
>         4096000, 7962624
>
> Allocating group tables: done
> Writing inode tables: done
> Creating journal (32768 blocks): done
> Writing superblocks and filesystem accounting information:
> Warning, had trouble writing out superblocks.
> -----------------------------------------------------------------------------
>
> While the above task fails, /var/log/messages on the initiator shows the
> following errors.
> -----------------------------------------------------------------------------
> kernel: connection1:0: detected conn error (1020)
> iscsid: Kernel reported iSCSI connection 1:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
> iscsid: connection1:0 is operational after recovery (1 attempts)
> kernel: connection1:0: detected conn error (1020)
> iscsid: Kernel reported iSCSI connection 1:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
> iscsid: connection1:0 is operational after recovery (1 attempts)
> kernel: connection1:0: detected conn error (1020)
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 54 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 5505040
> kernel: Buffer I/O error on dev sdf, logical block 688130, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688131, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688132, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688133, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688134, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688135, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688136, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688137, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688138, lost async page write
> kernel: Buffer I/O error on dev sdf, logical block 688139, lost async page write
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 50 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 5242896
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 4c 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 4980752
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 48 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 4718608
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 44 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 4456464
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 40 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 4194320
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 3c 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 3932176
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 38 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 3670032
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 34 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 3407888
> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 30 00 10 00 10 00 00
> kernel: blk_update_request: I/O error, dev sdf, sector 3145744
> iscsid: Kernel reported iSCSI connection 1:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
> iscsid: connection1:0 is operational after recovery (1 attempts)
> -----------------------------------------------------------------------------
>
> Upon further debugging we found that the target's TCP receive window is
> filling up under the write load the initiator generates during mkfs.
> We then tried dd on the initiator with oflag=direct to perform synchronous
> writes, and this time we did not hit the issue. If we run dd without
> oflag=direct, we see the same error messages in /var/log/messages as with
> mkfs.
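The buffered-versus-direct comparison described above can be reproduced on a
scratch file rather than the live LUN (a minimal sketch; the file path is a
placeholder, and in the original report the writes went to the exported
/dev/sdh):

```shell
# Scratch file stands in for the LUN; the original test wrote to /dev/sdh,
# which of course must not be overwritten on a live system.
IMG=$(mktemp /var/tmp/dd_probe.XXXXXX)

# Buffered (page-cache) writes -- the case that triggered the errors.
# The kernel queues dirty pages and flushes them in large bursts.
dd if=/dev/zero of="$IMG" bs=1M count=16 conv=fsync 2>/dev/null \
    && BUFFERED=ok || BUFFERED=failed

# oflag=direct bypasses the page cache, so each write is issued and
# completed synchronously, naturally pacing the target.  bs must be a
# multiple of the device's logical block size for O_DIRECT to succeed.
dd if=/dev/zero of="$IMG" bs=1M count=16 oflag=direct 2>/dev/null \
    && DIRECT=ok || DIRECT="unsupported on this filesystem"

echo "buffered: $BUFFERED"
echo "direct: $DIRECT"
rm -f "$IMG"
```

If the buffered run stalls or errors while the direct run is clean, the
bottleneck is in how fast the target drains large bursts of cached writes,
which is consistent with the TCP window filling up as described.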
So that shows this has nothing to do with ext3/ext4; it has to do with your
network.

> Following are the things we tried:
> 1) We tried increasing the TCP receive window on the target beyond its
>    existing size. It did not help.
> 2) We tried increasing MaxRecvDataSegmentLength, MaxBurstLength, and
>    FirstBurstLength on the target side. This helped in the sense that it
>    delayed the occurrence of the errors, but they were still seen.
> 3) We also changed node.session.timeo.replacement_timeout,
>    node.conn[0].timeo.noop_out_interval, node.conn[0].timeo.noop_out_timeout,
>    and node.session.err_timeo.abort_timeout on the initiator side.
>    None of these solved the problem.
>
> Our questions:
> 1) What could be causing this issue? Why is the target daemon so slow?
> 2) What other tunables could we try?
>
> Environment details:
> OS: Red Hat Enterprise Linux Server release 7.5 (Maipo)
> Kernel version: 3.10.0-862.el7.x86_64
>
> PFA image: in the Wireshark capture, 10.182.110.221 is the target and
> 10.182.111.167 is the initiator.
>
> [image: tcp_reset (002).jpg]

And you say you get the same TCP congestion when the initiator and target are
on the same system? If so, can you try using 127.0.0.1?

Your distro packages look quite old. Are they all up to date with current
patches/fixes? What version of targetcli-fb do you have?

I'm afraid I know little about networking issues, but if the issue persists
over loopback, that would seem to eliminate any issues with your switches.

-- 
You received this message because you are subscribed to the Google Groups
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.
