[dpdk-dev] [PATCH v2 0/3] ethdev: configure SR-IOV VF from host
Ilya Maximets
i.maximets at ovn.org
Mon Nov 4 15:58:31 CET 2019
On 04.11.2019 15:30, Asaf Penso wrote:
>
>
> Regards,
> Asaf Penso
>
>> -----Original Message-----
>> From: dev <dev-bounces at dpdk.org> On Behalf Of Ilya Maximets
>> Sent: Monday, November 4, 2019 12:28 PM
>> To: Shahaf Shuler <shahafs at mellanox.com>; Ilya Maximets
>> <i.maximets at ovn.org>; Thomas Monjalon <thomas at monjalon.net>
>> Cc: dev at dpdk.org; Jerin Jacob <jerinjacobk at gmail.com>; Andrew
>> Rybchenko <arybchenko at solarflare.com>; Ferruh Yigit
>> <ferruh.yigit at intel.com>; Stephen Hemminger
>> <stephen at networkplumber.org>; Roni Bar Yanai <roniba at mellanox.com>;
>> Rony Efraim <ronye at mellanox.com>; declan.doherty at intel.com;
>> bernard.iremonger at intel.com; ajit.khaparde at broadcom.com; Ananyev,
>> Konstantin <konstantin.ananyev at intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v2 0/3] ethdev: configure SR-IOV VF from
>> host
>>
>> On 03.11.2019 7:48, Shahaf Shuler wrote:
>>> Friday, November 1, 2019 11:33 AM, Ilya Maximets:
>>>> Subject: Re: [dpdk-dev] [PATCH v2 0/3] ethdev: configure SR-IOV VF from
>>>> host
>>>>
>>>> On 30.10.2019 22:42, Thomas Monjalon wrote:
>>>>> 30/10/2019 17:09, Ilya Maximets:
>>>>>> On 30.10.2019 16:49, Thomas Monjalon wrote:
>>>>>>> 30/10/2019 16:07, Ilya Maximets:
>>>>>>>> On 29.10.2019 19:50, Thomas Monjalon wrote:
>>>>>>>>> In a virtual environment, the network controller may have to
>>>>>>>>> configure some SR-IOV VF parameters for security reasons.
>>>>>>>>>
>>> [...]
>>>
>>>>> If we consider what Intel did, i.e. configure VF in place of
>>>>> representor for some operations, there are two drawbacks:
>>>>> - confusing that some ops apply to representor, others apply to VF
>>>>> - some ops are not possible on representor (because targetted to VF)
>>>>>
>>>>> I still feel that the addition of one single bit in the port ID is an
>>>>> elegant solution to target either the VF or its representor.
>>>>
>>>> Since we already have a confusion about what is configured when
>> operations
>>>> are performed on a representor port we have 2 options:
>>>
>>> I don't agree we have. I don't think there is any design note or API doc that
>> says the ethdev configuration on representor should be applied on VF
>> (please share if I missed it).
>>> The fact that there are some drivers that implemented it doesn't mean it is
>> correct.
>>>
>>>> 1. Have this proposed API to configure representor itself while
>>>> setting config to representor and configuring VF if special
>>>> bit enabled.
>>>> 2. Reverse the logic of current proposal, i.e. always apply
>>>> configuration to VF while working with representor and apply
>>>> configuration to representor itself if special bit is set.
>>>>
>>>> I'd probably prefer option #2, because:
>>>> - From the OVS and OpenStack point of view, I think, we don't
>>>> really need to configure representor itself in most cases.
>>>> And OVS really should not know if it works with representor
>>>> or some real port.
>>>
>>> I don't thinks OVS can be really agnostic to the fact it runs on top of
>> representors:
>>> 1. probing of representor has different command line -w
>> <bdf>,representor=XXX
>>
>> OVS doesn't care about content of devargs. It just passes them to hotplug
>> engine without any parsing (except a single case that must be eliminated
>> with a proper device iterators, not an OVS issue).
>>
>>> 2. the whole acceleration framework based on insertion of flow rules for
>> direct forward from the VF to target entity. Rules are applied on the
>> representor and would not work if port is not such.
>>
>> OVS tries to offload rules to the netdev from which packet was received.
>> That's it. If it succeeds - OK. If not, OVS doesn't care.
>>
>>> 3. some multi-port devices cannot do direct fwd between its different port.
>> This is why rep has switch_id and application should query it and act upon.
>>
>> This is part of offloading engine that doesn't affect the generic code.
>> If needed, OVS could request switch_id for netdev it tries to offload rules on.
>> OVS should not know if it representor port or not. If this operation will not
>> succeed for non-representors, OVS should not care because we can't offload
>> anything for non-representors anyway.
>>
>>> 4. representor carry the VF port id. This is how application know to which
>> VF (or vport) they associated with on their other side.
>>
>> This is just part of devargs, i.e. part of device unique identifier.
>> Once again, OVS doesn't parse devargs and should not do that.
>>
>>>
>>>> - It seems that most of the existing code in DPDK already works
>>>> like this, i.e. applying configs to VF itself. Intel drivers
>>>> works like this and Mellanox drivers, as Thomas said, doesn't
>>>> have this functionality at all.
>>>
>>> As I said above, I don't think we need to refer to specific driver behavior,
>> rather the API guidelines.
>>> To me, it is a bit strange and not natural that ethdev configuration is applied
>> to different port w/o any explicit request from the application.
>>> This is why I would prefer #1 above.
>>
>> IMHO, the whole concept of representors is that representor is a
>> way of attaching same port both to VM and vSwitch/hypervisor.
>>
>> If you're looking at representors as a separate ports on a switch, well..
>> In this case, for me VF configuration looks like something that
>> vSwitch should not do at all, because it should not configure ports
>> that doesn't attached to it. It's like configuring the other
>> side of veth pair, which is nonsense.
>>
>>
>> BTW, I don't know a way to find out if port is a representor of something
>> or not in Linux kernel.
>
> Specifically in OVS, in function dpdk_eth_dev_init, you can do this to detect:
> *info.dev_flags & RTE_ETH_DEV_REPRESENTOR
Sure, but I meant a way to do that for Linux network interface, not DPDK.
>
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>>> The this is that this new API will produce conceptual fragmentation
>>>>>>>> between DPDK and the Linux kernel, because to do the same thing
>>>>>>>> you'll have to use different ways. I mean, to change mac of VF in
>>>>>>>> kernel you need to set mac to the representor, but in DPDK changing
>>>>>>>> setting mac to representor will lead to changing the mac of the
>>>>>>>> representor itself, not the VF. This will be really confusing for users.
>>>>>>>
>>>>>>> I am not responsible of the choices in Linux.
>>>>>>> But I agree it would be interesting to check the reason of such
>> decision.
>>>>>>> Rony, please could you explain?
>>>>>
>>>>> I looked at few Linux drivers:
>>>>>
>>>>> bnxt_vf_rep_netdev_ops has no op to set MAC
>>>>> bnxt_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
>>>>>
>>>>> lio_vf_rep_ndev_ops has no op to set MAC
>>>>> lionetdevops.ndo_set_vf_mac = set VF MAC from PF
>>>>>
>>>>> mlx5e_netdev_ops_rep has no op to set MAC
>>>>> mlx5e_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
>>>>> mlx5e_netdev_ops_uplink_rep.ndo_set_vf_mac = set VF MAC from
>>>> PF
>>>>>
>>>>> nfp_repr_netdev_ops.ndo_set_mac_address = set representor
>>>> MAC
>>>>> nfp_repr_netdev_ops.ndo_set_vf_mac = set VF MAC from
>>>> representor
>>>>> nfp_net_netdev_ops.ndo_set_vf_mac = set VF MAC from PF
>>>>>
>>>>> There is a big chance that the behaviour is not standardized in Linux
>>>>> (as usual). So it is already confusing for users of Linux.
>>>>>
>>>>>
More information about the dev
mailing list