On 10/15/15, 12:17 PM, "Alexander Duyck" <alexander.duyck at gmail.com> wrote:
>On 10/15/2015 08:43 AM, Alex Forster wrote:
>> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck at gmail.com> wrote:
>>
>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck at gmail.com> wrote:
>>>>
>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>>>>> parameters are comma separated with one index per port. So if you have
>>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>>>>> you would need four '1's.
>>>>
>>>> This seemed very promising. I compiled and installed the out of tree ixgbe
>>>> driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg shows all
>>>> eight "allow_unsupported_sfp enabled" messages but the last four ports
>>>> still error out with the unsupported SFP message when running the tests.
>>>>
>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>> verification code in ixgbe, are there any other tips I should know?
>>>
>>> Can you send me the command you used to load the module, and the exact
>>> number of ixgbe ports you have in the system? With that I could then
>>> verify that the command was entered correctly as it is possible there
>>> could still be an issue in the way the command was entered.
>>>
>>> One other possibility is that when the driver loads each load counts as
>>> an instance in the module parameter array. So if for example you unbind
>>> the driver on one port and then later rebind it you will have consumed
>>> one of the values in the array. Do it enough times and you exceed the
>>> bounds of the array as you entered it and it will simply use the default
>>> value of 0.
>>>
>>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>>> you have the out-of-tree driver loaded and not the in kernel.
>>>
>>> - Alex
>>>
>>
>> Er, let me try that again.
>>
>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>
>>
>> Alex Forster
>>
>>
>
>It looks like you are probably seeing interfaces be unbound and then
>rebound. As such you are likely pushing things outside of the array
>boundary. One solution might just be to add more ",1"s if you are only
>going to be doing this kind of thing at boot up. The upper limit for
>the array is 32 entries so as long as you only are setting this up once
>you could probably get away with that.
>
>An alternative would be to modify the definition of the parameter in
>ixgbe_param.c. If you look through the file you should find several
>lines like below:
>        struct ixgbe_option opt = {
>            .type = enable_option,
>            .name = "allow_unsupported_sfp",
>            .err  = "defaulting to Disabled",
>            .def  = OPTION_DISABLED
>        };
>
>If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>reinstall your driver you should be able to have it install without any
>issues.
>
>- Alex
>

Yeah, I've had roughly the same thought process since you mentioned the
args array. My first idea was "maybe the driver can't fit all of my 1's"
but I saw it was defined at 32. Then I decided to just patch the whole
allow_unsupported_sfp option out

https://gist.github.com/AlexForster/112fd822704caf804849

but I'm still failing.

I've been digging a bit, and I'm failing here in ixgbe_main.c...

    /* reset_hw fills in the perm_addr as well */
    hw->phy.reset_if_overtemp = true;
    err = hw->mac.ops.reset_hw(hw);
    hw->phy.reset_if_overtemp = false;
    if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
        err = IXGBE_SUCCESS;
    } else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
        e_dev_err("failed to load because an unsupported SFP+ or QSFP "
                  "module type was detected.\n");
        e_dev_err("Reload the driver after installing a supported "
                  "module.\n");
        goto err_sw_init;
    } else if (err) {
        e_dev_err("HW Init failed: %d\n", err);
        goto err_sw_init;
    }

I've attempted a hand-stacktrace and came up with the following...

ixgbe_82599.c at 1016
  * ixgbe_reset_hw_82599() is defined
  * calls phy->ops.init() which potentially returns
    IXGBE_ERR_SFP_NOT_SUPPORTED

ixgbe_82599.c at 102
  * ixgbe_init_phy_ops_82599() is defined
  * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
    phy->ops.identify()

ixgbe_82599.c at 2085
  * ixgbe_identify_phy_82599() is defined
  * calls ixgbe_identify_module_generic()

ixgbe_phy.c at 1281
  * ixgbe_identify_module_generic() is defined
  * calls ixgbe_identify_qsfp_module_generic()

ixgbe_phy.c at 1663
  * ixgbe_identify_qsfp_module_generic() is defined
  * We fail somewhere before the ending call to ixgbe_get_device_caps()
    which does take allow_unsupported_sfp into account
  * Possibility: hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER,
    &identifier) != IXGBE_SFF_IDENTIFIER_QSFP_PLUS
  * Possibility: active_cable != true

And then I'm over my head. Should I assume from here that the most likely
explanation is a bad transceiver or bad fiber?

Alex Forster
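
The trace above suggests that the error can be returned before the allow_unsupported_sfp option is ever looked at. A minimal sketch of that ordering in C, as a toy model only: every toy_ name is hypothetical, the status values are made up, and the control flow reflects the two "Possibility" bullets rather than the actual ixgbe source. It also assumes the module is not otherwise recognised as supported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes; the real driver's values are not used here. */
enum toy_status { TOY_SUCCESS = 0, TOY_ERR_SFP_NOT_SUPPORTED = -1 };

/* Hypothetical summary of what the identify step learns about a module. */
struct toy_module {
	bool identifier_is_qsfp_plus; /* identifier read matched QSFP+ */
	bool active_cable;            /* cable was recognised as active */
	bool allow_unsupported_sfp;   /* the module option (or patched default) */
};

/* Stands in for the QSFP identify step at the bottom of the trace. */
enum toy_status toy_identify_qsfp(const struct toy_module *m)
{
	assert(m != NULL);
	/* Early exits: the option is never consulted on these paths. */
	if (!m->identifier_is_qsfp_plus)
		return TOY_ERR_SFP_NOT_SUPPORTED;
	if (!m->active_cable)
		return TOY_ERR_SFP_NOT_SUPPORTED;
	/* Only the final check takes the override into account. */
	return m->allow_unsupported_sfp ? TOY_SUCCESS
					: TOY_ERR_SFP_NOT_SUPPORTED;
}

/* Stands in for the PHY init step: passes the identify result upward. */
enum toy_status toy_init_phy_ops(const struct toy_module *m)
{
	return toy_identify_qsfp(m);
}

/* Stands in for the reset step whose return value the probe code checks. */
enum toy_status toy_reset_hw(const struct toy_module *m)
{
	return toy_init_phy_ops(m);
}
```

In this model, toy_reset_hw() reports the module as unsupported whenever either early check fails, even with allow_unsupported_sfp set. That would be consistent with the failure persisting after the option is forced on, but it does not by itself say whether the transceiver, the fiber, or the identification path is at fault.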