Public bug reported:

Problem Description
=================================
I am facing this issue with Texan Flash storage 840 disks which are attached
via the Coho and Sailfish adapters.

The Coho adapter with 840 storage presents 3G disks, and the Sailfish adapter
with 840 presents 12G disks.

I am able to see those disks in the lsblk output but not in the multipath -ll
output.

0004:01:00.0     Coho: Saturn-X          U78C9.001.WZS0060-P1-C6         
0x10000090fa2a51f8      host10                  Online
0004:01:00.1     Coho: Saturn-X          U78C9.001.WZS0060-P1-C6         
0x10000090fa2a51f9      host11                  Online

0005:09:00.0     Sailfish: QLogic 8GB    U78C9.001.WZS0060-P1-C9         
0x21000024ff787778      host2           Online
0005:09:00.1     Sailfish: QLogic 8GB    U78C9.001.WZS0060-P1-C9         
0x21000024ff787779      host4           Online


root@luckyv1:/dev/disk# multipath -ll | grep "size=3.0G" -B 1
root@luckyv1:/dev/disk# multipath -ll | grep "size=12G" -B 1
root@luckyv1:/dev/disk#

== Comment: #3 - Luciano Chavez <cha...@us.ibm.com> - 2016-09-20 20:22:20 ==
I edited /etc/multipath.conf and added
verbosity 6

to crank up the output, ran multipath -ll, and saved the output off to a text
file (attached). All the paths using the directio checker failed, and those
using the tur checker seem to work.

Sep 20 20:07:36 | loading //lib/multipath/libcheckdirectio.so checker
Sep 20 20:07:36 | loading //lib/multipath/libprioconst.so prioritizer
Sep 20 20:07:36 | Discover device 
/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai
Sep 20 20:07:36 | sdai: udev property ID_WWN whitelisted
Sep 20 20:07:36 | sdai: not found in pathvec
Sep 20 20:07:36 | sdai: mask = 0x25
Sep 20 20:07:36 | sdai: dev_t = 66:32
Sep 20 20:07:36 | open 
'/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/block/sdai/size'
Sep 20 20:07:36 | sdai: size = 20971520
Sep 20 20:07:36 | sdai: vendor = IBM
Sep 20 20:07:36 | sdai: product = FlashSystem-9840
Sep 20 20:07:36 | sdai: rev = 1442
Sep 20 20:07:36 | sdai: h:b:t:l = 3:0:0:0
Sep 20 20:07:36 | SCSI target 3:0:0 -> FC rport 3:0-2
Sep 20 20:07:36 | sdai: tgt_node_name = 0x500507605e839800
Sep 20 20:07:36 | open 
'/sys/devices/pci0000:00/0000:00:00.0/0000:01:00.0/host3/rport-3:0-2/target3:0:0/3:0:0:0/state'
Sep 20 20:07:36 | sdai: path state = running
Sep 20 20:07:36 | sdai: get_state
Sep 20 20:07:36 | sdai: path_checker = directio (internal default)
Sep 20 20:07:36 | sdai: checker timeout = 30 ms (internal default)
Sep 20 20:07:36 | io_setup failed
Sep 20 20:07:36 | sdai: checker init failed

== Comment: #4 - Luciano Chavez <cha...@us.ibm.com> - 2016-09-20 21:00:23 ==
Hello Mauricio,

I edited /etc/multipath.conf to use the tur path checker but the output
still shows directio. Ideas?

devices {
        device {
                vendor                  "IBM     "
                product                 "FlashSystem-9840"
                path_selector           "round-robin 0"
                path_grouping_policy    multibus
                path_checker            tur
                rr_weight               uniform
                no_path_retry           fail
                failback                immediate
                dev_loss_tmo            300
                fast_io_fail_tmo        25
        }
}
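
One way to inspect the configuration multipathd actually merged, assuming the
daemon is running, is its interactive command interface, e.g.:

        # multipathd -k'show config' | grep -B2 -A8 FlashSystem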

== Comment: #7 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2016-09-27 
18:32:57 ==
The function is failing at the io_setup() system call.

        @ checkers/directio.c

        int libcheck_init (struct checker * c)
        {
                unsigned long pgsize = getpagesize();
                struct directio_context * ct;
                long flags;

                ct = malloc(sizeof(struct directio_context));
                if (!ct)
                        return 1;
                memset(ct, 0, sizeof(struct directio_context));

                if (io_setup(1, &ct->ioctx) != 0) {
                        condlog(1, "io_setup failed");
                        free(ct);
                        return 1;
                }
        <...>

The syscall is failing w/ EAGAIN

        # grep ^io_setup multipath_-v2_-d.strace
        io_setup(1, 0x100163c9130)              = -1 EAGAIN (Resource 
temporarily unavailable)
        io_setup(1, 0x10015bae2c0)              = -1 EAGAIN (Resource 
temporarily unavailable)
        io_setup(1, 0x100164d65a0)              = -1 EAGAIN (Resource 
temporarily unavailable)
        io_setup(1, 0x10016429f20)              = -1 EAGAIN (Resource 
temporarily unavailable)
        io_setup(1, 0x100163535c0)              = -1 EAGAIN (Resource 
temporarily unavailable)
        io_setup(1, 0x10016368510)              = -1 EAGAIN (Resource 
temporarily unavailable)
        <...>

According to the manpage (man 2 io_setup)

        NAME
               io_setup - create an asynchronous I/O context

        DESCRIPTION
               The io_setup() system call creates an asynchronous I/O context 
suitable for concurrently processing nr_events operations. <...>

        ERRORS
               EAGAIN The specified nr_events exceeds the user's limit of 
available events, as defined in /proc/sys/fs/aio-max-nr.


On luckyv1:

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr
        65536

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
        130560

According to linux's Documentation/sysctl/fs.txt [1]

        aio-nr & aio-max-nr:

        aio-nr is the running total of the number of events specified on the
        io_setup system call for all currently active aio contexts.  If aio-nr
        reaches aio-max-nr then io_setup will fail with EAGAIN.  Note that
        raising aio-max-nr does not result in the pre-allocation or re-sizing
        of any kernel data structures.

Interestingly, aio-nr is greater than aio-max-nr. Hm.

Increased aio-max-nr to 262144, and could get some more maps created.

Accidentally killed multipathd and got this:

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-max-nr
        262144

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
        0

With just a few io_setup failures (previously there were hundreds):

        root@luckyv1:~/mauricfo/bz146849/sep27# multipath -v2
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | io_setup failed
        Sep 27 17:11:22 | mpathcs: ignoring map
        Sep 27 17:11:22 | mpathcs: ignoring map

Then noticed a huge increase:

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
        354560

multipathd was restarted automatically.


The number decreases when multipathd is being stopped/shutdown:

        leave this running:
        root@luckyv1:~# systemctl stop multipathd

        and watch it:
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
        354560
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
        276480
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
        267520
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr
        262400
        ...
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
        43520
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
        29440
        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
        0

Start it, and aio-nr goes high:

        root@luckyv1:~# cat /proc/sys/fs/aio-nr;
        0

        root@luckyv1:~# systemctl start multipathd

        root@luckyv1:~# cat /proc/sys/fs/aio-nr;
        523520


And it seems there's something very wrong between the number of requests that
succeed and the number reported in aio-nr:

        root@luckyv1:~/mauricfo/bz146849/sep27# grep -c 'io_setup.*= 0' 
strace.multipathd_-d_-s.io_setup 
        409

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
        519680

        root@luckyv1:~/mauricfo/bz146849/sep27# killall -9 multipathd

        root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
        0

All requests are for a single (one) AIO context (nr_events parameter):

        root@luckyv1:~/mauricfo/bz146849/sep27# grep -m3 'io_setup.*= 0' 
strace.multipathd_-d_-s.io_setup 
        130184 io_setup(1, [70366829412352])    = 0
        130184 io_setup(1, [70366818795520])    = 0
        130184 io_setup(1, [70366818729984])    = 0

Checking further...

[1] https://www.kernel.org/doc/Documentation/sysctl/fs.txt

== Comment: #8 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2016-09-27 
18:56:08 ==
This attached test-case demonstrates that for each io_setup() request of 1
nr_event, 1280 events actually seem to be allocated.
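
The attached source isn't reproduced in the thread; a minimal sketch
consistent with the output below (a hypothetical reconstruction, not the
original file) could be:

        /* io_setup.c (hypothetical reconstruction of the attached test-case):
         * create one AIO context of 1 event, sleep so /proc/sys/fs/aio-nr can
         * be inspected, then destroy it.
         */
        #include <libaio.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                io_context_t ctx = 0;
                int rc;

                rc = io_setup(1, &ctx);     /* request nr_events = 1 */
                printf("io_setup rc = %d\n", rc);

                printf("sleeping 10 seconds...\n");
                sleep(10);                  /* window to cat /proc/sys/fs/aio-nr */

                rc = io_destroy(ctx);       /* releases the slots again */
                printf("io_destroy rc = %d\n", rc);
                return 0;
        }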


root@luckyv1:~/mauricfo/bz146849/sep27# gcc -o io_setup io_setup.c -laio

root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
0

root@luckyv1:~/mauricfo/bz146849/sep27# ./io_setup &
[1] 12352
io_setup rc = 0
sleeping 10 seconds...

root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
1280

<...>
io_destroy rc = 0

[1]+  Done                    ./io_setup

root@luckyv1:~/mauricfo/bz146849/sep27# cat /proc/sys/fs/aio-nr 
0

== Comment: #9 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2016-09-27 
19:17:13 ==
(In reply to comment #8)
> This attached test-case demonstrates that for each io_setup() request of 1
> nr_event, 1280 events actually seem to be allocated.

The kernel code for the io_setup() syscall (call to ioctx_alloc())
indeed does that.

SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
...
        ioctx = ioctx_alloc(nr_events);
...

static struct kioctx *ioctx_alloc(unsigned nr_events)
...
        nr_events = max(nr_events, num_possible_cpus() * 4);
        nr_events *= 2;
...


The math is:

        root@luckyv1:~# cat /sys/devices/system/cpu/possible 
        0-159

        root@luckyv1:~# echo $((160*4*2))
        1280

Now, we need to understand why.

== Comment: #10 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2016-09-27 
19:22:44 ==
Anyway, the number of paths to FlashSystem-9840 in the system is 424.

    root@luckyv1:~/mauricfo/bz146849/sep27# grep FlashSystem-9840 
/sys/block/sd*/device/model | wc -l
    424

For each one there's an io_setup() call to allocate 1 nr_event, which
actually becomes 1280, so we have in total... 542720, when the default
aio-max-nr is 65536

    root@luckyv1:~/mauricfo/bz146849/sep27# echo $((424*1280))  
    542720

So, obviously, it will exceed the max amount available (it already did,
for some reason, as aio-nr showed 130560).

The next step is to understand why this is done, whether this is a bug at all
(you know, that multiplier of 160 * 4 is big on Power because we have lots of
threads w/ P8/SMT8), and if so, whether there's a proper "fix" for this. It
may very well be working correctly, in which case we'll document this along
with instructions to increase the limit for FlashSystem-9840.

== Comment: #11 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2016-09-27 
19:30:47 ==
(In reply to comment #10)
> For each one there's an io_setup() call to allocate 1 nr_event, which
> actually becomes 1280, so we have in total... 542720, when the default
> aio-max-nr is 65536
<...>
> So, obviously, it will exceed the max amount available (it already did, for
> some reason, as aio-nr showed 130560).

Got that; the check (in fs/aio.c, ioctx_alloc) is against 2x aio-max-nr:

        if (aio_nr + nr_events > (aio_max_nr * 2UL) ||
            aio_nr + nr_events < aio_nr)

130560 + 1280 = 131840
which is greater than
65536 * 2 = 131072

so the check fails from that point on.

So the default of 64k aio-max-nr is enough for only 102 paths (130560 /
1280).
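
The same arithmetic, shell-style:

        $ echo $((130560 + 1280)) $((65536 * 2))
        131840 131072

        $ echo $((65536 * 2 / 1280))
        102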

== Comment: #14 - LEKSHMI C. PILLAI <lekshmi.cpil...@in.ibm.com> - 2016-09-28 
07:57:03 ==
Hi

I applied the workaround to increase aio-max-nr.

I did it on luckyv1. I changed the default value to 1048576; it didn't work.
Then I changed it to 4194304:

root@luckyv1:/test/lucky# cat /proc/sys/fs/aio-max-nr
4194304

How I did it: edited the value in /etc/sysctl.conf and ran
sysctl -p /etc/sysctl.conf.
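
For reference, the persistent change is a single sysctl setting (4194304
being the value that worked here):

    # in /etc/sysctl.conf
    fs.aio-max-nr = 4194304

    # apply without a reboot
    sysctl -p /etc/sysctl.conf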

After that I am able to see all the disks.

Thanks for the workaround


Thanks
Lekshmi

== Comment: #15 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2016-09-28 
08:44:39 ==
(In reply to comment #14)
> I applied the workaround to increase aio-max-nr.
> 
> I did it on luckyv1. I changed the default value to 1048576; it didn't work.
> Then I changed it to 4194304
<...>
> After that I am able to see all the disks

Yes, the actual value that works may vary depending on what multipathd is
doing, and on when you ran which commands relative to multipathd running.

The point is, multipathd allocates a number of events. I'd expect 424 FS9840
paths * 1280 events/path = 542720 events (assuming multipathd doesn't
allocate a few for itself, path-independent), so theoretically, half of that
(271360) or close should pass.
However, if multipathd was already running (i.e., had allocated a number of
events), and /then/ you ran the multipath command to test, multipath would
try to allocate /more/ events on top of those already allocated by
multipathd, which would be much more likely to fail because a large number
was already allocated.

> Thanks for the workaround

You're welcome!


== Comment: #45 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-09-19 
18:32:10 ==
Verification of this commit with the linux-hwe-edge kernel in -proposed,
using the attached test-case "io_setup_v2.c":

    commit 2a8a98673c13cb2a61a6476153acf8344adfa992
    Author: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
    Date:   Wed Jul 5 10:53:16 2017 -0300

        fs: aio: fix the increment of aio-nr and counting against aio-max-nr

Test-case (attached)

    $ sudo apt-get install gcc libaio-dev
    $ gcc -o io_setup_v2 io_setup_v2.c -laio
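
The attached source isn't reproduced in the thread either; a sketch
consistent with the runs below (argument handling and names are a
hypothetical reconstruction) could be:

    /* io_setup_v2.c (hypothetical reconstruction): allocate AIO contexts of
     * nr_events each until io_setup() fails, print the final rc and count,
     * then wait so /proc/sys/fs/aio-nr can be inspected before exiting.
     */
    #include <libaio.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
            int nr_events   = (argc > 1) ? atoi(argv[1]) : 1;
            int nr_requests = (argc > 2) ? atoi(argv[2]) : 65536;
            int rc = 0, i;

            printf("nr_events: %d, nr_requests: %d\n", nr_events, nr_requests);

            for (i = 0; i < nr_requests; i++) {
                    io_context_t ctx = 0;

                    rc = io_setup(nr_events, &ctx);  /* libaio returns -errno */
                    if (rc < 0)
                            break;   /* -11 = EAGAIN, -12 = ENOMEM */
            }
            printf("rc = %d, i = %d\n", rc, i);

            pause();   /* keep contexts alive (aio-nr stays up); ^Z or kill */
            return 0;
    }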

Original kernel:

    - Only 409 io_contexts could be allocated, 
    but that took 130880 [ div by 2, per bug] = 65440 slots out of 65535

    $ uname -rv
    4.11.0-14-generic #20~16.04.1-Ubuntu SMP Wed Aug 9 09:06:18 UTC 2017

    $ ./io_setup_v2 1 65536
    nr_events: 1, nr_requests: 65536
    rc = -11, i = 409
    ^Z
    [1]+  Stopped                 ./io_setup_v2 1 65536

    $ cat /proc/sys/fs/aio-nr 
    130880

    $ cat /proc/sys/fs/aio-max-nr 
    65536

    $ kill %%

Patched kernel:

    - Now 65515 io_contexts could be allocated out of 65535 (much better)
      (and reporting correctly, without div by 2.)

    $ uname -rv
    4.11.0-140-generic #20~16.04.1+bz146489 SMP Tue Sep 19 17:46:15 CDT 2017

    $ ./io_setup_v2 1 65536
    nr_events: 1, nr_requests: 65536
    rc = -12, i = 65515
    ^Z
    [1]+  Stopped                 ./io_setup_v2 1 65536

    $ cat /proc/sys/fs/aio-nr 
    65515

    $ kill %%

** Affects: ubuntu-power-systems
     Importance: Critical
     Assignee: Canonical Kernel Team (canonical-kernel-team)
         Status: New

** Affects: kernel-package (Ubuntu)
     Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
         Status: New


** Tags: architecture-ppc64le bugnameltc-146489 severity-critical 
targetmilestone-inin16043


-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1718397

Title:
  multipath -ll is not showing the disks which are actually multipath
