[PATCH v3 0/7] introduce per-queue limit watermark and host shaper
Spike Du
spiked at nvidia.com
Wed May 25 15:59:28 CEST 2022
> -----Original Message-----
> From: Morten Brørup <mb at smartsharesystems.com>
> Sent: Wednesday, May 25, 2022 9:40 PM
> To: Spike Du <spiked at nvidia.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL) <thomas at monjalon.net>
> Cc: Matan Azrad <matan at nvidia.com>; Slava Ovsiienko
> <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; dev at dpdk.org;
> Raslan Darawsheh <rasland at nvidia.com>; stephen at networkplumber.org;
> andrew.rybchenko at oktetlabs.ru; ferruh.yigit at amd.com;
> david.marchand at redhat.com
> Subject: RE: [PATCH v3 0/7] introduce per-queue limit watermark and host
> shaper
>
> External email: Use caution opening links or attachments
>
>
> > From: Spike Du [mailto:spiked at nvidia.com]
> > Sent: Wednesday, 25 May 2022 15.15
> >
> > > From: Morten Brørup <mb at smartsharesystems.com>
> > > Sent: Wednesday, May 25, 2022 3:00 AM
> > >
> > > > From: Thomas Monjalon [mailto:thomas at monjalon.net]
> > > > Sent: Tuesday, 24 May 2022 17.59
> > > >
> > > > +Cc people involved in previous versions
> > > >
> > > > 24/05/2022 17:20, Spike Du:
> > > > > LWM(limit watermark) is per RX queue attribute, when RX queue
> > > > fullness reach the LWM limit, HW sends an event to dpdk
> > application.
> > > > > Host shaper can configure shaper rate and lwm-triggered for a
> > host
> > > > port.
> > >
> > > Please ignore this comment, it is not important, but I had to get it
> > out of my
> > > system: I assume that the "LWM" name is from the NIC datasheet;
> > otherwise
> > > I would probably prefer something with "threshold"... LWM is easily
> > > confused with "low water mark", which is the opposite of what the
> > > LWM does. Names are always open for discussion, so I won't object to it.
> > >
> > > > > The shaper limits the rate of traffic from host port to wire
> > port.
> > >
> > > From host to wire? It is RX, so you must mean from wire to host.
> >
> > The host shaper is quite private to Nvidia's BlueField 2 NIC. The NIC
> > is inserted In a server which we call it host-system, and the NIC has
> > an embedded Arm-system Which does the forwarding.
> > The traffic flows from host-system to wire like this:
> > Host-system generates traffic, send it to Arm-system, Arm sends it to
> > physical/wire port.
> > So the RX happens between host-system and Arm-system, and the traffic
> > is host to wire.
> > The shaper also works in a special way: you configure it on
> > Arm-system, but it takes effect On host-sysmem's TX side.
> >
> > >
> > > > > If lwm-triggered is enabled, a 100Mbps shaper is enabled
> > > > automatically when one of the host port's Rx queues receives LWM
> > event.
> > > > >
> > > > > These two features can combine to control traffic from host port
> > to
> > > > wire port.
> > >
> > > Again, you mean from wire to host?
> >
> > Pls see above.
> >
> > >
> > > > > The work flow is configure LWM to RX queue and enable lwm-
> > triggered
> > > > flag in host shaper, after receiving LWM event, delay a while
> > > > until
> > RX
> > > > queue is empty , then disable the shaper. We recycle this work
> > > > flow
> > to
> > > > reduce RX queue drops.
> > >
> > > You delay while RX queue gets drained by some other threads, I
> > assume.
> >
> > The PMD thread drains the Rx queue, the PMD receiving as normal, as
> > the PMD Implementation uses rte interrupt thread to handle LWM event.
> >
>
> Thank you for the explanation, Spike. It really clarifies a lot!
>
> If this patch is intended for DPDK running on the host-system, then the LWM
> attribute is associated with a TX queue, not an RX queue. The packets are
> egressing from the host-system, so TX from the host-system's perspective.
>
> Otherwise, if this patch is for DPDK running on the embedded ARM-system,
> it should be highlighted somewhere.
The host-shaper patch is running on ARM-system, I think in that patch I have some explanation in mlx5.rst.
The LWM patch is common and should work on any Rx queue(right now mlx5 doesn't support Hairpin Rx queue and shared Rx queue).
On ARM-system, we can use it to monitor traffic from host(representor port) or from wire(physical port).
LWM can also work on host-system if there is DPDK running, for example it can monitor traffic from Arm-system to host-system.
>
> > >
> > > Surely, the excess packets must be dropped somewhere, e.g. by the
> > shaper?
>
> I guess the shaper doesn't have to drop any packets, but the host-system will
> simply be unable to put more packets into the queue if it runs full.
>
When LWM event happens, the host-shaper throttles traffic from host-system to Arm-system. Yes, the shaper doesn't drop pkts.
Normally the shaper is small and if PMD thread on Arm keeps working, Rx queue is dropless.
But if PMD thread doesn't receive fast enough, or even with a small shaper but host-system is sending some burst, Rx queue may still drop on Arm.
Anyway even sometimes drop still happens, the cooperation of host-shaper and LWM greatly reduce the Rx drop on Arm.
More information about the dev
mailing list