[dpdk-dev] Question about unsupported transceivers

Alex Forster alex at alexforster.com
Thu Oct 15 19:13:33 CEST 2015


On 10/15/15, 12:17 PM, "Alexander Duyck" <alexander.duyck at gmail.com> wrote:


>On 10/15/2015 08:43 AM, Alex Forster wrote:
>> On 10/15/15, 11:30 AM, "Alexander Duyck" <alexander.duyck at gmail.com> wrote:
>>
>>> On 10/15/2015 07:46 AM, Alex Forster wrote:
>>>> On 10/13/15, 4:34 PM, "Alexander Duyck" <alexander.duyck at gmail.com>
>>>> wrote:
>>>>
>>>>> If you are using Intel's out-of-tree ixgbe driver I believe the module
>>>>> parameters are comma separated with one index per port.  So if you have
>>>>> two ports you should be passing "allow_unsupported_sfp=1,1", and for 4
>>>>> you would need four '1's.
>>>>
>>>> This seemed very promising. I compiled and installed the out-of-tree
>>>> ixgbe driver and set the option in /etc/modprobe.d/ixgbe.conf. dmesg
>>>> shows all eight "allow_unsupported_sfp enabled" messages but the last
>>>> four ports still error out with the unsupported SFP message when
>>>> running the tests.
>>>>
>>>> Before I start arbitrarily trying to patch out parts of the SFP
>>>> verification code in ixgbe, are there any other tips I should know?
>>>
>>> Can you send me the command you used to load the module, and the exact
>>> number of ixgbe ports you have in the system?  With that I could then
>>> verify that the command was entered correctly as it is possible there
>>> could still be an issue in the way the command was entered.
>>>
>>> One other possibility is that each time the driver loads on a port, that
>>> load counts as an instance in the module parameter array.  So if, for
>>> example, you unbind the driver on one port and then later rebind it, you
>>> will have consumed one of the values in the array.  Do it enough times
>>> and you exceed the bounds of the array as you entered it, and the driver
>>> will simply use the default value of 0.
>>>
>>> Also the output of "ethtool -i <ethX>" would be useful to verify that
>>> you have the out-of-tree driver loaded and not the in-kernel one.
>>>
>>> - Alex
>>>
>>
>> Er, let me try that again.
>>
>> https://gist.github.com/AlexForster/f5372c5b60153d278089
>>
>>
>> Alex Forster
>>
>>
>
>It looks like you are probably seeing interfaces be unbound and then
>rebound.  As such you are likely pushing things outside of the array
>boundary.  One solution might just be to add more ",1"s if you are only
>going to be doing this kind of thing at boot up.  The upper limit for
>the array is 32 entries so as long as you only are setting this up once
>you could probably get away with that.
>
>An alternative would be to modify the definition of the parameter in
>ixgbe_param.c.  If you look through the file you should find several
>lines like the ones below:
>	struct ixgbe_option opt = {
>			.type = enable_option,
>			.name = "allow_unsupported_sfp",
>			.err  = "defaulting to Disabled",
>			.def  = OPTION_DISABLED
>		};
>
>If you modify the .def value to "OPTION_ENABLED", and then rebuild and
>reinstall your driver you should be able to have it install without any
>issues.
>
>- Alex
>

Yeah, I've had roughly the same thought process since you mentioned the
args array. My first idea was "maybe the driver can't fit all of my 1's"
but I saw it was defined at 32. Then I decided to just patch the whole
allow_unsupported_sfp option out
(https://gist.github.com/AlexForster/112fd822704caf804849), but I'm still
failing.
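
For reference, here is a minimal sketch of the args-array behavior as it
was described earlier in this thread: every probe takes the next slot,
an unbind does not give the slot back, and once the supplied values run
out the option falls back to its default of 0. This is not the driver's
code. Every sketch_* name is made up for illustration, and only the
32-entry limit and the "default is disabled" behavior come from the
thread.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NIC      32   /* upper limit mentioned in this thread */
#define SKETCH_OPTION_UNSET (-1) /* hypothetical marker for "no value given" */
#define SKETCH_DEFAULT      0    /* option disabled by default */

/* Hypothetical model of a per-instance module parameter array. */
struct sketch_param {
	int values[SKETCH_MAX_NIC];
	size_t probes; /* probes seen so far; never decremented on unbind */
};

/* Store the comma-separated values given at load time; mark the rest unset. */
void sketch_param_init(struct sketch_param *p, const int *given, size_t n_given)
{
	size_t i;

	assert(n_given <= SKETCH_MAX_NIC);
	for (i = 0; i < SKETCH_MAX_NIC; i++)
		p->values[i] = (i < n_given) ? given[i] : SKETCH_OPTION_UNSET;
	p->probes = 0;
}

/* Each probe (first bind or rebind) consumes the next slot in the array. */
int sketch_probe(struct sketch_param *p)
{
	size_t idx = p->probes++;

	if (idx >= SKETCH_MAX_NIC || p->values[idx] == SKETCH_OPTION_UNSET)
		return SKETCH_DEFAULT; /* ran past the values that were supplied */
	return p->values[idx];
}
```

With eight 1's supplied, the first eight calls to sketch_probe() return 1
and every later call returns 0, which is the pattern a rebind of already
probed ports would produce under this model.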

I've been digging a bit, and I'm failing here in ixgbe_main.c...

/* reset_hw fills in the perm_addr as well */
hw->phy.reset_if_overtemp = true;
err = hw->mac.ops.reset_hw(hw);
hw->phy.reset_if_overtemp = false;
if (err == IXGBE_ERR_SFP_NOT_PRESENT) {
	err = IXGBE_SUCCESS;
} else if (err == IXGBE_ERR_SFP_NOT_SUPPORTED) {
	e_dev_err("failed to load because an unsupported SFP+ or QSFP "
		  "module type was detected.\n");
	e_dev_err("Reload the driver after installing a supported "
		  "module.\n");
	goto err_sw_init;
} else if (err) {
	e_dev_err("HW Init failed: %d\n", err);
	goto err_sw_init;
}


I've attempted a hand-stacktrace and came up with the following...

ixgbe_82599.c at 1016
 * ixgbe_reset_hw_82599() is defined
 * calls phy->ops.init() which potentially returns
   IXGBE_ERR_SFP_NOT_SUPPORTED

ixgbe_82599.c at 102
 * ixgbe_init_phy_ops_82599() is defined
 * IXGBE_ERR_SFP_NOT_SUPPORTED is returned after calling
   phy->ops.identify()

ixgbe_82599.c at 2085
 * ixgbe_identify_phy_82599() is defined
 * calls ixgbe_identify_module_generic()

ixgbe_phy.c at 1281
 * ixgbe_identify_module_generic() is defined
 * calls ixgbe_identify_qsfp_module_generic()

ixgbe_phy.c at 1663
 * ixgbe_identify_qsfp_module_generic() is defined
 * We fail somewhere before the ending call to ixgbe_get_device_caps(),
   which does take allow_unsupported_sfp into account

 * Possibility: the identifier byte read via
   hw->phy.ops.read_i2c_eeprom(hw, IXGBE_SFF_IDENTIFIER, &identifier)
   is not IXGBE_SFF_IDENTIFIER_QSFP_PLUS
 * Possibility: active_cable != true

And then I'm over my head. Should I assume from here that the most likely
explanation is a bad transceiver or bad fiber?
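
To make the hypothesis concrete, here is a rough sketch of the shape I
think the call chain has: the checks that run before the override is
consulted return "not supported" no matter how allow_unsupported_sfp is
set. This is a simplified model, not the ixgbe source. All sketch_*
names, the status values, and the identifier constant are illustrative
assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_status {
	SKETCH_SUCCESS = 0,
	SKETCH_ERR_SFP_NOT_SUPPORTED = -1 /* placeholder value */
};

#define SKETCH_ID_QSFP_PLUS 0x0D /* illustrative identifier byte */

/* Hypothetical view of what the driver learns about the module. */
struct sketch_module {
	unsigned char identifier; /* byte read from the module EEPROM */
	bool active_cable;
	bool vendor_supported;    /* module is on the driver's supported list */
};

/* Innermost step: the early checks return before the override is read. */
int sketch_identify(const struct sketch_module *m, bool allow_unsupported)
{
	if (m->identifier != SKETCH_ID_QSFP_PLUS)
		return SKETCH_ERR_SFP_NOT_SUPPORTED; /* override never consulted */
	if (!m->active_cable)
		return SKETCH_ERR_SFP_NOT_SUPPORTED; /* override never consulted */
	if (!m->vendor_supported && !allow_unsupported)
		return SKETCH_ERR_SFP_NOT_SUPPORTED;
	return SKETCH_SUCCESS;
}

/* The outer layers only pass the status back up unchanged. */
int sketch_init_phy(const struct sketch_module *m, bool allow_unsupported)
{
	return sketch_identify(m, allow_unsupported);
}

int sketch_reset_hw(const struct sketch_module *m, bool allow_unsupported)
{
	assert(m != NULL);
	return sketch_init_phy(m, allow_unsupported);
}
```

If the real code is shaped like this, forcing the option on would only
help modules that pass the identifier and cable checks, which would match
what I'm seeing.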

Alex Forster


