** Tags added: kernel-da-key

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1696445

Title:
  OpenPower: Some multipaths temporarily have only a single path

Status in linux package in Ubuntu:
  New

Bug description:
  [Impact]

   * The SES driver causes a long delay in disk discovery when the disk
     enclosure holds a large number of disks; the delay grows with the
     number of disks attached.

   * This delays the addition and visibility of the disk devices to
     userspace, which, among other things, leaves multipath devices
     without multiple paths until disk discovery finally finishes.

   * The fix significantly shortens the time the SES driver takes to
     handle disk discovery, and causes no extra delays, by removing a
     superfluous SCSI command sent to the enclosure.

  [Test Case]

   * Load the module to access the enclosure and its disks; e.g.,

     $ sudo modprobe mpt3sas

   * Notice the interval between the discovery of each disk; e.g., in
     dmesg:

     $ dmesg -T | grep 'Attached SCSI disk' | tail -n2
     [Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
     [Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk

   * With the fix, consecutive disks should be discovered within about
     the same second:

     $ dmesg -T | grep 'Attached SCSI disk' | tail -n2
     [Wed Jun 7 13:11:59 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
     [Wed Jun 7 13:11:59 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk

  [Regression Potential]

   * The power status of the disks in the enclosure is no longer checked
     during probe time. However, the patch demonstrates that the initial
     value was never used in any way. So, little regression potential.

   * Nonetheless, users of SES enclosures who verify the power status of
     disks in the enclosure might _theoretically_ see a problem if the
     fix has a defect (none has been found so far).

  [Other Info]

   * None at this time.

  Problem Description:
  ====================

  This week, I went ahead and scaled up my test configuration to the
  maximum configuration, 2x5U84_Enclosures,_MaxCfg_168HDDs. This time,
  it hit a different issue.
  The issue is that some multipaths only have a single path and no
  redundancy. Others have multiple paths and redundancy.

  Checkpoint #1:
  ==============

  - system reboot around 2pm (14:00)

  Checkpoint #2:
  ==============

  - It took several minutes for the first disk to be detected.

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  103
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
  [Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
  [Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
  [Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
  [Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
  [Thu Jun 1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
  [Thu Jun 1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
  [Thu Jun 1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
  [Thu Jun 1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
  [Thu Jun 1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
  [Thu Jun 1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
  root@smb1p1:~# ...

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  142
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' | grep 'Attached SCSI disk' | tail
  [Thu Jun 1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
  [Thu Jun 1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
  [Thu Jun 1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
  [Thu Jun 1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
  [Thu Jun 1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
  [Thu Jun 1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
  [Thu Jun 1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
  [Thu Jun 1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
  [Thu Jun 1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
  [Thu Jun 1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
  root@smb1p1:~# ...

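The gap of roughly five seconds between consecutive 'Attached SCSI disk' lines in the dmesg output above can also be measured with a small script instead of by eye. The following is only a sketch, not part of the original report: the function name attach_intervals and the regular expression are illustrative assumptions, and it assumes `dmesg -T` prints English weekday and month names. The timestamps printed by `dmesg -T` are approximate, so treat the gaps as indicative.

```python
import re
from datetime import datetime

# Matches lines such as (hypothetical pattern, written for this sketch):
# "[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk"
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+sd\s+\S+\s+\[(?P<dev>sd[a-z]+)\]\s+Attached SCSI disk"
)


def attach_intervals(lines):
    """Return (device, seconds since the previous attach) for each disk after the first."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            # Assumes English day/month names, as in the logs above.
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append((m.group("dev"), ts))
    return [
        (dev, (ts - prev_ts).total_seconds())
        for (_, prev_ts), (dev, ts) in zip(events, events[1:])
    ]


# Sample input copied from the dmesg output in this report.
sample = """\
[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
[Thu Jun 1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
[Thu Jun 1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
""".splitlines()

for dev, gap in attach_intervals(sample):
    print(f"{dev}: {gap:.0f}s after previous disk")
```

On a live system, the same function could be fed the output of `dmesg -T` saved to a file; with the fix applied, the gaps should drop to about zero.
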
  - After 43 minutes, the multipath -ll command shows some multipaths
    with only a single path and no redundancy, and some with multiple
    paths and redundancy.

  root@smb1p1:~# date
  Thu Jun 1 14:43:00 CDT 2017
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  252
  root@smb1p1:~# ...

  - After 47 minutes, the multipath -ll command still shows some
    multipaths with only a single path and no redundancy.

  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  288
  root@smb1p1:~#

  - 51 minutes after system reboot, it looks like all disks are
    discovered and the multipath is correctly built.

  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  336

  == Comment: #24 - Mauricio Faria De Oliveira - 2017-06-06 11:42:59 ==

  Hi Paul,

  Per your logs, yes, it's the slowness with the SES driver.

  I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it
  into 16.04.2 and 16.04.3.

  Thanks,
  Mauricio

  == Comment: #26 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-06-06 12:06:32 ==

  The patch applies cleanly in the master-next branch of
  ubuntu-zesty.git and ubuntu-yakkety.git.

  Mirroring to Canonical to get a LP bug number, required in the
  submission process.

  == Comment: #27 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-06-06 12:07:58 ==

  The commit is [1].

  commit 75106523f39751390b5789b36ee1d213b3af1945
  Author: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
  Date: Wed Apr 5 12:18:19 2017 -0300

      scsi: ses: don't get power status of SES device slot on probe

  [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1696445/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp