[PATCH v1] ethdev: add direction info when creating the transfer table

Andrew Rybchenko andrew.rybchenko at oktetlabs.ru
Fri Sep 23 09:25:40 CEST 2022


Hi Ori,

On 9/22/22 16:00, Ori Kam wrote:
> Hi Andrew,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
>> Sent: Thursday, 22 September 2022 13:31
>>
>> On 9/22/22 13:06, Ori Kam wrote:
>>> Hi Andrew,
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
>>>> Sent: Thursday, 22 September 2022 10:39
>>>>
>>>> On 9/21/22 15:51, Morten Brørup wrote:
>>>>>> From: Ori Kam [mailto:orika at nvidia.com]
>>>>>> Sent: Wednesday, 21 September 2022 14.41
>>>>>>
>>>>>>> From: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
>>>>>>>
>>>>>>> On 9/21/22 12:40, Thomas Monjalon wrote:
>>>>>>>> 21/09/2022 11:04, Ivan Malov:
>>>>>>>>> Now it's clear to me that your intention is to match on exact
>>>>>> ports,
>>>>>>>>> as usual, but this time with a hint for the flow table. Got it.
>>>>>>>>>
>>>>>>>>> In your response, you say that matching on ALL vports is not what
>>>>>>>>> the use case needs. OK, I understood. But please note that the
>>>>>>>>> item name does not say "ALL", it says "ANY".
>>>>>>>>>
>>>>>>>>> OK. Say, "ANY" is also confusing. Let's then name it
>> "VPORTS_ONLY"
>>>>>>>>> and "PHY_PORTS_ONLY". This way, if user provides item
>>>> VPORTS_ONLY
>>>>>>>>> and then  provides item REPRESENTED_PORT, these two items do
>> not
>>>>>>>>> contradict each other. Item VPORTS_ONLY defines the scope of
>> some
>>>>>>>>> kind, then the following item, REPRESENTED_PORT, makes it
>>>>>> narrower.
>>>>>>>>>
>>>>>>>>> And, in documentation, one can say clearly that the user *may*
>>>>>>>>> omit item VPORTS_ONLY in the exact rule pattern provided that
>>>>>>>>> they have already submitted this item as part of the template.
>>>>>>>>
>>>>>>>> I think the problem that Rongwei & Ori are trying to solve
>>>>>>>> is to allocate resources for the templates table in the right
>>>>>> place.
>>>>>>>> A table can have multiple templates.
>>>>>>>> If all rules/templates for this table are dedicated to virtual
>>>>>> ports,
>>>>>>>> then the table will be allocated in a place managing only virtual
>>>>>> ports.
>>>>>>>> This allocation decision must be taken at table creation,
>>>>>>>> whereas rules will be created later.
>>>>>>>> In order to do this specific table allocation for vports,
>>>>>>>> we need to restrict all templates of the table to be "vports only".
>>>>>>>>
>>>>>>>> I hope it makes things clearer.
>>>>>>>> Now the question is how to achieve this? Solutions are:
>>>>>>>>
>>>>>>>> 1/ give a hint to the table allocation
>>>>>>>> 2/ insert a pattern item in all templates of the table
>>>>>>>>
>>>>>>>> I don't see any other solution. Please propose if there are more
>>>>>> options.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> See my mail
>>>>>>>
>>>>>>> 3/ use jump rule which ensures that all traffic meets out
>>>>>>>        expectations
>>>>>>>
>>>>>>> It means that the table creation could be postponed. Or the
>>>>>>> table could be per-configured at the point of creation and
>>>>>>> finalized when we know that all traffic will be from wires
>>>>>>> or from vports. Yes, it complicates internals to achieve
>>>>>>> the optimization.
>>>>>>
>>>>>> Sorry Andrew your suggestion is not a valid one for the following
>>>>>> reasons:
>>>>>> 1. table creation can't be postponed this is a key idea of the rte_flow
>>>>>> template API.
>>>>
>>>> I guess nobody cares if it delays insertion on the first rule
>>>> only. Anyway, see below.
>>>>
>>>>>> 2. we can never know what rules will be inserted if the application
>>>>>> doesn't tell us.
>>>>>>         how can we know this is the last rule? What do we do with the
>>>>>> first rule?
>>>>>> 3. I don't see how jumping helps since it worsens the issue when you
>>>>>> jump to a table,
>>>>>>        how does the PMD know if this table should have only wire or only
>>>>>> vports?
>>>>
>>>> Jump rules say so. PMD can analyze there rules.
>>>> May be just need an attribute saying that all jump rules
>>>> to the table are configured and further attempts to reconfigure
>>>> will be rejected?
>>>>
>>>
>>> The idea is the PMD will not analyze rules. That is why we have the table
>>> and template.
>>> Sorry, I don't understand what attribute can be in jump? The jump is just
>>> to table. It can't say anything about the table destination table.
>>> This is all this patch adds the attribute to a table to say where this
>>> table should be located.
>>>
>>>>>>
>>>>>> I agree with Thomas, there are two valid options, I vote for the hint
>>>>>> since this is the
>>>>>> feature idea to tell the PMD where this resource should be allocated.
>>>>>
>>>>> This is an optimization; I agree with Ori that a hint is appropriate, like the
>>>> MBUF_FAST_FREE hint on TX queues.
>>>>>
>>>>> No need to add more complexity by requiring the driver to recognize
>> that
>>>> the pattern is present in all templates. (And perhaps also remove that
>>>> pattern when applying the templates.)
>>>>
>>>> What does the part of the matching criteria so special
>>>> that it is allowed to have dedicated hint attribute?
>>>>
>>>> May be we can have really generic solution when any
>>>> part of the matching criteria could provide such hints?
>>>
>>> That is the point I keep returning to, it is not matching!
>>> This is on which HW resource the table should be allocated.
>>
>> Sorry, but it is just your HW details that you have different
>> location/resources for rules which apply on packets coming
>> from wire and coming from host (vports).
>>
> 
> 
> Right, maybe other HW may have this issue, and this patch
> can help them but currently, this patch solves something in Nvidia HW.
> Template API is all about giving hints, some of the hints can be used
> only buy some PMDs.
> I promise that any vendor that has some way to optimize its PMD
> I will support, may differently name or different place but not all PMD
> are equal, each one needs its hints.
> 
> 
>>> Think about ingress/egress/transfer why are they not in the  pattern?
>>
>> We have no ingress/egress in transfer domain any more because
>> it is ambiguous.
>>
> 
> Yes that is why the name is wire and non wire,

My question here is why application really needs to know it.
Why does it make the difference?
IMHO for a VM which uses some function everything coming to it
is from the logical wire.
Of course since it is a transfer layer, we are talking about an
application like OvS. May be OvS knows the difference...

> 
>> Transfer itself is really a different domain. Logically and
>> from privileges point of view. That's why it is important to
>> distinguish it.
>>
>> Ingress and egress in non-transfer case are natively bound
>> to two main functions of the driver: transmit (egress rules)
>> and receive (ingress rules). In general, it is a matching
>> criteria as well, but because of its nature (explained
>> above) it is simply handy to distinguish it from the very
>> beginning.
>>
> 
> I agree with you, but this is just to show that even if something can be treated
> as matching it is not the best way to look at it that way.
> 
>>> They are where rules should be offloaded, they are different domain.
>>
>> We have just two domains: transfer and non-transfer.
>>
>>> Like we have elsewhere for example in action create we can state on which
>>> domain the action should be created. If the application selects a number of
>> domains
>>> it may mean that extra resources will be allocated.>
>>
>> Two more points:
>>
>> 1/ If it is just a hint, it is optional for PMD to
>>     support/handle it. It means that it MUST NOT impose any
>>     limitations on matching. If so, if you want a rule to
>>     be applied on packets coming from wire, you still MUST
>>     specify it in the pattern.
>>     So, it does not sound like a hint in your case.
> 
> Right it is optional if the application doesn't give this hint
> the PMD will create just like it does now tables for both wire and
> non wire.

Let's make it clear here and in the attribute documentation.
You're taking about one side, but there is an another one.
If it is just a hint which is optional to specify/interpret,
it does not impose any matching criteria. So, if a rule does
not specify source, the rule must be applied on traffic
coming from both wire and not-wire. In order to limit
souses, we still need matching criteria anyway - i.e. pattern
item. Moreover, if a PMD supports the hint and matching
criteria contradicts it, flow rule insertion must fail.

Could you please confirm that we share our understanding here.

> 
>>
>> 2/ struct rte_flow_attr is used for really all rules.
>>      How a new attribute should be interpreted in non-transfer
>>      rules? Similar to ingress/egress? Duplication?
>>      Or even harder (if it is NOT a hint): should it really
>>      enforce matching of packets coming from wire (i.e. not
>>      a different vport)? Not sure that it is doable or even
>>      make sense.
>>      We can say that the attribute may be used for the transfer
>>      rules only. If so, it MUST be checked on ethdev level
>>      since it is a generic rule.
>>
> 
>  From my point of view, it should be treated only in case of transfer,
> I think it also stated in the original commit this way.
> Why should we validate it?

Because it is a generic limitation. Otherwise each PMD must
check it. The check is required to avoid misusage.

> We don't validate if the application
> set transfer + ingress/egress or just ingress+egress or non.

We're going to add corresponding checks on ethdev when we
finalize deprecation of ingress/egress in transfer rules.

> 
> 
>> 3/ struct rte_flow_attr is used for sync and async rules.
>>      As I understand you're using it for async rules only.
>>      Does it make sense for sync rules?
> 
> Yes, it can save insertion. Since even the sync API since it doesn't have this bit
> duplicate the rule.

Sorry I don't understand above.

> But if pressed we can move it to the table attribute, do you think it will be better?

If it is not a generic thing, IMHO it is better to put in
table attributes.

Andrew.


More information about the dev mailing list