RE: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts.
Sean,

What is the model for how a client connects, say for iSCSI, when the client and server both support iWARP and 10GbE or 1GbE, and would like to set up the most performant connection for the ULP?

Thanks,

Arkady Kanevsky                      email: [EMAIL PROTECTED]
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16          fax: 781-895-1195
Waltham, MA 02451                    central phone: 781-768-5300

> -----Original Message-----
> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 27, 2007 2:39 PM
> To: Steve Wise
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
> linux-kernel@vger.kernel.org
> Subject: Re: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only"
> interfaces to avoid 4-tuple conflicts.
>
> > The sysadmin creates "for iwarp use only" alias interfaces of the form
> > "devname:iw*" where devname is the native interface name (e.g. eth0)
> > for the iwarp netdev device. The alias label can be anything starting
> > with "iw". The "iw" immediately after the ':' is the key used by the
> > iw_cxgb3 driver.
>
> I'm still not sure about this, but haven't come up with anything better
> myself. And if there's a good chance of other RNICs needing the same
> support, I'd rather see the common code separated out, even if just
> encapsulated within this module for easy re-use.
>
> As for the code, I have a couple of questions about whether deadlock and
> a race condition are possible, plus a few minor comments.
>
> > +static void insert_ifa(struct iwch_dev *rnicp, struct in_ifaddr *ifa)
> > +{
> > +	struct iwch_addrlist *addr;
> > +
> > +	addr = kmalloc(sizeof *addr, GFP_KERNEL);
> > +	if (!addr) {
> > +		printk(KERN_ERR MOD "%s - failed to alloc memory!\n",
> > +		       __FUNCTION__);
> > +		return;
> > +	}
> > +	addr->ifa = ifa;
> > +	mutex_lock(&rnicp->mutex);
> > +	list_add_tail(&addr->entry, &rnicp->addrlist);
> > +	mutex_unlock(&rnicp->mutex);
> > +}
>
> Should this return success/failure?
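[Editor's sketch: Sean's suggestion that insert_ifa() return success/failure could look like the model below. This is a hypothetical userspace sketch, not the driver code — the kernel list_head and mutex are replaced by a plain singly linked list so it stays self-contained.]

```c
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-ins for the driver structures in the patch. */
struct iwch_addrlist {
	void *ifa;                     /* stands in for struct in_ifaddr * */
	struct iwch_addrlist *next;
};

struct iwch_dev {
	struct iwch_addrlist *addrlist;  /* simplified list head */
};

/* Returning -ENOMEM lets the caller skip the follow-up listen update
 * when the allocation fails, instead of continuing with an address
 * the driver is not actually tracking. */
static int insert_ifa(struct iwch_dev *rnicp, void *ifa)
{
	struct iwch_addrlist *addr = malloc(sizeof *addr);

	if (!addr)
		return -ENOMEM;
	addr->ifa = ifa;
	addr->next = rnicp->addrlist;  /* the patch uses list_add_tail() */
	rnicp->addrlist = addr;
	return 0;
}
```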
> > +static int nb_callback(struct notifier_block *self, unsigned long event,
> > +		       void *ctx)
> > +{
> > +	struct in_ifaddr *ifa = ctx;
> > +	struct iwch_dev *rnicp = container_of(self, struct iwch_dev, nb);
> > +
> > +	PDBG("%s rnicp %p event %lx\n", __FUNCTION__, rnicp, event);
> > +
> > +	switch (event) {
> > +	case NETDEV_UP:
> > +		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
> > +		    is_iwarp_label(ifa->ifa_label)) {
> > +			PDBG("%s label %s addr 0x%x added\n",
> > +			     __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
> > +			insert_ifa(rnicp, ifa);
> > +			iwch_listeners_add_addr(rnicp, ifa->ifa_address);
>
> If insert_ifa() fails, what will iwch_listeners_add_addr() do? (I'm not
> easily seeing the relationship between the address list and the listen
> list at this point.)
>
> > +		}
> > +		break;
> > +	case NETDEV_DOWN:
> > +		if (netdev_is_ours(rnicp, ifa->ifa_dev->dev) &&
> > +		    is_iwarp_label(ifa->ifa_label)) {
> > +			PDBG("%s label %s addr 0x%x deleted\n",
> > +			     __FUNCTION__, ifa->ifa_label, ifa->ifa_address);
> > +			iwch_listeners_del_addr(rnicp, ifa->ifa_address);
> > +			remove_ifa(rnicp, ifa);
> > +		}
> > +		break;
> > +	default:
> > +		break;
> > +	}
> > +	return 0;
> > +}
> > +
> > +static void delete_addrlist(struct iwch_dev *rnicp)
> > +{
> > +	struct iwch_addrlist *addr, *tmp;
> > +
> > +	mutex_lock(&rnicp->mutex);
> > +	list_for_each_entry_safe(addr, tmp, &rnicp->addrlist, entry) {
> > +		list_del(&addr->entry);
> > +		kfree(addr);
> > +	}
> > +	mutex_unlock(&rnicp->mutex);
> > +}
> > +
> > +static void populate_addrlist(struct iwch_dev *rnicp)
> > +{
> > +	int i;
> > +	struct in_device *indev;
> > +
> > +	for (i = 0; i < rnicp->rdev.port_info.nports; i++) {
> > +		indev = in_dev_get(rnicp->rdev.port_info.lldevs[i]);
> > +		if (!indev)
> > +			continue;
> > +		for_ifa(indev)
> > +			if (is_iwarp_label(ifa->ifa_label)) {
> > +				PDBG("%s label %s addr 0x%x added\n",
> > +				     __FUNCTION__, ifa->ifa_label,
> > +				     ifa->ifa_address);
> > +				insert_ifa(rnicp, ifa);
> > +			}
> > +		endfor_ifa(indev);
> > +	}
> > +}
> > +
> >  static void rnic_init(struct iwch_dev *rnicp)
> >  {
> >  	PDBG("%s iwch_dev %p\n", __FUNCTION__, rnicp);
> > @@ -70,6 +187,12 @@ static void rnic_init(struct iwch_dev *r
> >  	idr_init(&rnicp->qpidr);
> >  	idr_init(&rnicp->mmidr);
> >  	spin_lock_init(&rnicp->lock);
> > +	INIT_LIST_HEAD(&rnicp->addrlist);
> > +	INI
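[Editor's sketch: one way to address Sean's question about insert_ifa() failing is for the NETDEV_UP branch to check the result before touching the listen list, so the two lists never diverge. A minimal userspace model — the iwch_* functions here are stubs for illustration, not the driver code:]

```c
#include <errno.h>

static int fail_alloc;   /* test hook: force the tracking step to fail */
static int listeners;    /* how many addresses we are listening on */

/* Stub: models an insert_ifa() that reports allocation failure. */
static int insert_ifa(void *rnicp, void *ifa)
{
	(void)rnicp; (void)ifa;
	return fail_alloc ? -ENOMEM : 0;
}

/* Stub: models setting up listens on a newly added iwarp address. */
static void iwch_listeners_add_addr(void *rnicp, unsigned int addr)
{
	(void)rnicp; (void)addr;
	listeners++;
}

/* Model of the NETDEV_UP branch: only listen on an address we managed
 * to record, keeping the address list and listen list in sync. */
static int handle_netdev_up(void *rnicp, void *ifa, unsigned int addr)
{
	int ret = insert_ifa(rnicp, ifa);

	if (ret)
		return ret;
	iwch_listeners_add_addr(rnicp, addr);
	return 0;
}
```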
RE: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts.
Sean,

IB aside, it looks like a ULP which is capable of being both RDMA-aware and RDMA-unaware, like iSER and iSCSI, NFS-RDMA and NFS, SDP and sockets, will be treated as two separate ULPs. Each has its own IP address, since there is a different IP address for the iWARP port and the "regular" Ethernet port. So it falls on the users of the ULPs to "handle" it via DNS or some other service. Is this "acceptable" to users? I doubt it.

Recall that ULPs are going in the opposite direction by having a different port number for the RDMA-aware and RDMA-unaware versions of the ULP. This way, the ULP "connection manager" handles RDMA-ness under the covers, while users plug in an IP address for the server to connect to.

Thanks,

Arkady Kanevsky                      email: [EMAIL PROTECTED]
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16          fax: 781-895-1195
Waltham, MA 02451                    central phone: 781-768-5300

> -----Original Message-----
> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 27, 2007 3:12 PM
> To: Kanevsky, Arkady; Sean Hefty; Steve Wise
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: RE: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only"
> interfaces to avoid 4-tuple conflicts.
>
> > What is the model for how a client connects, say for iSCSI, when
> > client and server both support iWARP and 10GbE or 1GbE, and would
> > like to set up the most performant connection for the ULP?
>
> For the most performant connection, the ULP would use IB, and all these
> problems go away. :)
>
> This proposal is for each iwarp interface to have its own IP address.
> Clients would need an iwarp-usable address of the server and would
> connect using rdma_connect(). If that call (or rdma_resolve_addr/route)
> fails, the client could try connecting using sockets, aoi, or some other
> interface. I don't see that Steve's proposal changes anything from the
> client's perspective.
>
> - Sean
> _______________________________________________
> general mailing list
> [EMAIL PROTECTED]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
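[Editor's sketch: Sean's client-side model above — try the RDMA path first, fall back to sockets if it fails — can be sketched as below. try_rdma_connect() and try_socket_connect() are hypothetical stubs; in real code the first would be the rdma_resolve_addr()/rdma_resolve_route()/rdma_connect() sequence from librdmacm and the second a plain connect().]

```c
static int rdma_path_up;   /* test hook: is the iWARP path usable? */

enum transport { XPORT_NONE, XPORT_IWARP, XPORT_SOCKETS };

/* Stub for the librdmacm resolve/connect sequence. */
static int try_rdma_connect(const char *server)
{
	(void)server;
	return rdma_path_up ? 0 : -1;
}

/* Stub for a plain sockets connect(). */
static int try_socket_connect(const char *server)
{
	(void)server;
	return 0;
}

/* Prefer the RDMA path; any failure there drops the ULP back to TCP,
 * which is exactly the client behavior Sean describes. */
static enum transport connect_best(const char *server)
{
	if (try_rdma_connect(server) == 0)
		return XPORT_IWARP;
	if (try_socket_connect(server) == 0)
		return XPORT_SOCKETS;
	return XPORT_NONE;
}
```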
RE: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts.
Exactly, it forces the burden onto the administrator. And one will be forced to try one mount for iWARP and, if it does not work, issue another one over TCP or UDP. Yuck!

And the server will need to listen on a different IP address, and a simple * will not work, since it will need to listen in two different domains. Have we run this proposal by administrators?

Thanks,

Arkady Kanevsky                      email: [EMAIL PROTECTED]
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16          fax: 781-895-1195
Waltham, MA 02451                    central phone: 781-768-5300

> -----Original Message-----
> From: Steve Wise [mailto:[EMAIL PROTECTED]
> Sent: Friday, September 28, 2007 3:47 PM
> To: Kanevsky, Arkady
> Cc: Sean Hefty; Sean Hefty; [EMAIL PROTECTED]; [EMAIL PROTECTED];
> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: Re: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only"
> interfaces to avoid 4-tuple conflicts.
>
> Kanevsky, Arkady wrote:
> > Sean,
> > IB aside, it looks like a ULP which is capable of being both
> > RDMA-aware and RDMA-unaware, like iSER and iSCSI, NFS-RDMA and NFS,
> > SDP and sockets, will be treated as two separate ULPs.
> > Each has its own IP address, since there is a different IP address
> > for the iWARP port and the "regular" Ethernet port. So it falls on
> > the users of the ULPs to "handle" it via DNS or some other service.
> > Is this "acceptable" to users? I doubt it.
> >
> > Recall that ULPs are going in the opposite direction by having a
> > different port number for the RDMA-aware and RDMA-unaware versions
> > of the ULP. This way, the ULP "connection manager" handles RDMA-ness
> > under the covers, while users plug in an IP address for the server
> > to connect to.
> > Thanks,
>
> Arkady, I'm confused about how this proposed design changes the
> behavior of the ULPs that run on TCP and iWARP. I don't see much
> difference from the point of view of the ULPs.
>
> The NFS-RDMA server, for example, will not need to change, since it
> binds to address 0.0.0.0, which will translate into a bind/listen on the
> specific iwarp address for each iwarp device on the rdma side, and
> address 0.0.0.0 for the TCP side.
>
> Am I missing your point?
>
> The real pain, IMO, with this solution is that it FORCES the admins to
> use 2 subnets when 1 is sufficient, if the net maintainers would unify
> the port space...
>
> Steve.
>
> > Arkady Kanevsky                      email: [EMAIL PROTECTED]
> > Network Appliance Inc.               phone: 781-768-5395
> > 1601 Trapelo Rd. - Suite 16          fax: 781-895-1195
> > Waltham, MA 02451                    central phone: 781-768-5300
> >
> >> -----Original Message-----
> >> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> >> Sent: Thursday, September 27, 2007 3:12 PM
> >> To: Kanevsky, Arkady; Sean Hefty; Steve Wise
> >> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
> >> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> >> Subject: RE: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only"
> >> interfaces to avoid 4-tuple conflicts.
> >>
> >>> What is the model for how a client connects, say for iSCSI, when
> >>> client and server both support iWARP and 10GbE or 1GbE, and would
> >>> like to set up the most performant connection for the ULP?
> >>
> >> For the most performant connection, the ULP would use IB, and all
> >> these problems go away. :)
> >>
> >> This proposal is for each iwarp interface to have its own IP address.
> >> Clients would need an iwarp-usable address of the server and would
> >> connect using rdma_connect(). If that call (or rdma_resolve_addr/route)
> >> fails, the client could try connecting using sockets, aoi, or some
> >> other interface. I don't see that Steve's proposal changes anything
> >> from the client's perspective.
> >>
> >> - Sean
RE: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts.
Sean,

Not so simple. How does the client application know where to connect? Does this proposal force applications to choose the "right" network? Currently MPA or the ULP, not the application, handles it. Why would we want to change that?

Sean, I may be beating a dead horse, but I recall that one of the main selling points of RDMA was that it is a magical boost to performance with no changes to applications. Just plug it in and voila: performance goes up and CPU utilization for the network stack goes down. Win-win.

Thanks,

Arkady Kanevsky                      email: [EMAIL PROTECTED]
Network Appliance Inc.               phone: 781-768-5395
1601 Trapelo Rd. - Suite 16          fax: 781-895-1195
Waltham, MA 02451                    central phone: 781-768-5300

> -----Original Message-----
> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> Sent: Friday, September 28, 2007 5:35 PM
> To: Kanevsky, Arkady
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
> linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
> Subject: Re: [ofa-general] [PATCH v3] iw_cxgb3: Support "iwarp-only"
> interfaces to avoid 4-tuple conflicts.
>
> Kanevsky, Arkady wrote:
> > Exactly, it forces the burden onto the administrator.
> > And one will be forced to try one mount for iWARP and, if it does not
> > work, issue another one over TCP or UDP. Yuck!
> >
> > And the server will need to listen on a different IP address, and a
> > simple * will not work, since it will need to listen in two different
> > domains.
>
> The server already has to call listen twice: once for the rdma_cm and
> once for sockets. Similarly on the client side, the connect must be made
> over the rdma_cm or sockets. I really don't see any impact on the
> application from this approach.
>
> We just end up separating the port space based on networking addresses,
> rather than keeping the problem at the transport level. If you have an
> alternate approach that will be accepted upstream, feel free to post it.
>
> - Sean
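[Editor's sketch: Sean's "listen twice" point, shown for the sockets half. The wildcard bind below is the "simple *" being discussed; the rdma_cm half would be the analogous rdma_create_id()/rdma_bind_addr()/rdma_listen() sequence from librdmacm, omitted so the sketch stays self-contained. Port 0 asks the kernel to pick a free port.]

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind to the IPv4 wildcard address and start listening.
 * Returns the listening fd, or -1 on failure. */
static int listen_any(unsigned short port)
{
	struct sockaddr_in sin;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&sin, 0, sizeof sin);
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);   /* the "simple *" bind */
	sin.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *)&sin, sizeof sin) < 0 ||
	    listen(fd, 16) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Under Steve's proposal, the rdma_cm listen on 0.0.0.0 would translate to listens on the specific iwarp alias addresses, while the sockets listen above keeps covering everything else.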