On 08/22/2013 06:15 PM, Aaron Jones wrote:
> I have a rather odd use case where I would like to have several thousand
> iSCSI sessions (one connection per session) on a single initiator, but
> I'm having trouble reaching beyond 6000-8000 before things start breaking.
I do not think people have tried that many. I think at Red Hat we only
tested with 1K targets/sessions. How many LUs per session do you have?

There are some limits due to how many file handles we can have open. That
is currently hard coded to "#define ISCSI_MAX_FILES 16384", though I do
not think it lines up as 1 session == 1 file. There are other limits (the
host number and session number are 32 bits, for example), but you are not
close to them.

> My hardware has 72GB memory, 12 Xeon Cores at 2.5GHz, and 2 10GB
> ethernet ports. For software I'm running on Ubuntu 12.04 (with Linux
> kernel 3.8) with open-iscsi 2.0-873.
>
> First, I make several ifaces using the same hardware so I can present
> different iqns to the target:
>
> iscsiadm -m iface -I testiface.0 -o new
> iscsiadm -m iface -I testiface.0 -o update -n iface.initiatorname -v iqn.something.0000
> iscsiadm -m iface -I testiface.0 -o update -n iface.iface_num -v 0
> iscsiadm -m iface -I testiface.1 -o new
> iscsiadm -m iface -I testiface.1 -o update -n iface.initiatorname -v iqn.something.0001
> iscsiadm -m iface -I testiface.1 -o update -n iface.iface_num -v 1
>
> Then I perform discovery for each iface. The discovery may report up to
> 2000 targets per initiator iface that I'm using:
>
> iscsiadm -m discovery -t sendtargets -p some.ip.address -I testiface.0 -o new
> iscsiadm -m discovery -t sendtargets -p some.ip.address -I testiface.1 -o new
>
> Finally, I try to log in to everything:
>
> iscsiadm -m node -L all
>
> After iscsid crunches on it for a while (seemingly single threaded from
> watching top), eventually all of the iSCSI sessions are created (I can
> see the sessions are established on the targets).

It is single threaded, but each operation does not block, or at least it
should not block for long. It does not wait for login to complete, for
example, so if you see iscsid blocking for a long time on an operation,
let me know.
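As a side note on the file-handle limits mentioned above: besides the hard-coded ISCSI_MAX_FILES, the per-process and system-wide descriptor ceilings can also matter at this scale. A minimal sketch for checking them on Linux (standard procfs paths, nothing open-iscsi specific, and not a confirmed cause of the failures in this report):

```shell
# Per-process open-file limit inherited by daemons started from this shell
ulimit -n

# System-wide file handle limits (standard Linux procfs locations)
cat /proc/sys/fs/file-max   # ceiling on open file handles system-wide
cat /proc/sys/fs/file-nr    # allocated / unused / max
```

If `ulimit -n` comes back at a low default such as 1024, raising it for the iscsid process (e.g. via /etc/security/limits.conf or the init script) would be worth trying before chasing anything else.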
> After the sessions are created by iscsid, iscsiadm then begins printing:
>
> Login to [iface: testiface.1, target: iqn.something.else, portal:
> some.ip.address,3260] successful.
>
> However, after quite a few successes, I finally start getting:
>
> iscsiadm: Could not login to [iface: testiface.5, target:
> iqn.something.else, portal: some.ip.address,3260].
> iscsiadm: initiator reported error (9 - internal error)
>
> I did an strace of iscsid during this, and it seems to be getting ENOMEM
> back on clone system calls. Looking at ps after I start getting
> failures, there are quite a few defunct iscsid processes in the list
> that I guess the parent process has not attended to yet. What seems
> strange to me is that I seem to have quite a lot of free memory looking
> at vmstat/free (at least right up to the point where things start going
> crazy). I added a 300GB SSD as swap space, but that did not seem to
> affect the problem, nor did tweaking Linux's overcommit settings.
> Eventually the issues get so bad that bash complains about not being
> able to fork processes due to a lack of memory. Would 8000 sessions
> really take 72GB of memory?
>
> strace snippet:
>
> clone(child_stack=0,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7fc24aa7a9d0) = -1 ENOMEM (Cannot allocate memory)
>
> If I try to log in for each iface individually, it is FAR slower. That
> is to say, the first 2000 sessions are established rather quickly, but
> logging in things after that is extremely slow, i.e. using the commands:
>
> iscsiadm -m node -l -I testiface.0
> iscsiadm -m node -l -I testiface.1
>
> The first command goes fairly quickly, but the second command takes an
> extreme amount of time.

I am not sure what you mean here. Did you mean that after 2K sessions,
when you run those commands, they are both slow, or is one command fast
and one slow? Do you know what part of the login process is slow?
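As an aside on the clone() ENOMEM in the strace above: per the clone(2) man page, ENOMEM there means the kernel could not allocate the child's task structure or copy the needed parts of the parent's context. That is kernel-side memory, which can run out even while vmstat/free show plenty of userspace memory, and the defunct (zombie) children keep consuming task slots until they are reaped. A hedged sketch of the task-related ceilings worth ruling out (standard Linux procfs/sysctl paths; a general diagnostic, not a confirmed cause for this report):

```shell
# Kernel-wide ceilings on tasks (standard Linux procfs locations)
cat /proc/sys/kernel/threads-max   # max tasks/threads system-wide
cat /proc/sys/kernel/pid_max       # highest PID before the kernel wraps around

# Per-process resource limits in the current shell (bash builtins);
# iscsid's own limits may differ depending on how it was started
ulimit -u   # max user processes (RLIMIT_NPROC)
ulimit -v   # virtual address-space limit (RLIMIT_AS)
```

If any of these turn out to be near the session count when the failures start, raising them would be a cheap experiment before deeper debugging.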
Is it the iscsi login, the scsi scanning, or some other operation?

There is not really enough info to go by in the bug report. It would take
some major debugging. When I get time I will try it here.

--
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/groups/opt_out.
