[dpdk-dev] [PATCH v4 0/5] integrate librte_ipsec SAD into ipsec-secgw

Thomas Monjalon thomas at monjalon.net
Thu Jan 23 14:33:25 CET 2020


23/01/2020 13:56, Akhil Goyal:
> Hi Konstantin,
> > 
> > Hi Akhil,
> > 
> > > > > > Hi Vladimir,
> > > > > > The SA lookup logic and management are purely requirement-based for the
> > > > > > application. The application may only cater to <128 SAs, which can
> > > > > > be handled by the current logic.
> > > > >
> > > > > Not always: the current implementation can handle < 128 SAs only if
> > > > > their SPI%128 values never collide (let's say, it can't handle SPI=1
> > > > > and SPI=129 together).
> > > > > Yes, what we have right now has nearly zero overhead,
> > > > > and might be ok for some really simple show-cases.
> > > > > But for the majority of production IPsec implementations,
> > > > > I believe that definitely wouldn't be enough.
> > > > >
> > > > > > --single-sa option cannot handle this.
> > > > > > Sample applications in DPDK are there to showcase the best a hardware can
> > > > > > deliver.
> > > > >
> > > > > My thought was - that's the reason we have the single-sa option -
> > > > > to demonstrate the best possible HW perf with minimal SW intervention.
> > > > > For something more serious than that, we use a generic SAD implementation.
> > > > >
> > > > > > IMO, we cannot allow this logic on NXP hardware. We
> > > > > > give performance numbers based on the IPsec app to customers and we cannot
> > > > > > allow a 15% degradation.
> > > > >
> > > > > As Vladimir said, we are looking at how to improve the current SAD numbers
> > > > > and minimize the drop.
> > > > > But all else being equal, a plain array will always be faster than a hash table,
> > > > > so I am not sure we will be able to match the existing performance.
> > > > > So two questions:
> > > > > 1. What exact case do you use for perf testing
> > > > >     (total number of SAs, packets per burst belong to the same/different SAs)?
> > > > >     There might be a way to speed it up.
> > > > > 2. Again, if 10-15% is not an affordable drop, which one is: zero or ...?
> > > >
> > > > We should add features judiciously; we cannot drop the performance of a
> > > > benchmarking application for the sake of adding functionality. We should
> > > > only add features which do not impact the performance significantly.
> > > > Every vendor may have different cases. We cannot tune for everybody.
> > > > However, I see the drop with 64 outbound and 64 inbound SAs, all with
> > > > different SPIs and IPs.
> > > > Packets per burst = 32, all with different SAs.
> > > >
> > >
> > > We can have two modes of lookup similar to l3fwd - EM and LPM.
> > > LPM is O(1) while EM is more realistic. Similar logic can be added here as well.
> > > With l3fwd we also showcase performance for the best case (LPM) and the worst
> > > case (EM).
> > > What do you say?
> > 
> > We discussed it off-line with Vladimir and came up with a similar idea:
> > Have a proper/generic SAD implementation and add a limited-size plain array
> > on top of it as a 1-way associative cache.
> > So for the case when all active SAs fit into the cache and there are no SPI collisions,
> > we should have the same performance as now (with the plain array).
> > On the other hand, we'll still have a generic/scalable/RFC-compliant implementation.
> > Sort of the best of both worlds.
> > Plans are to submit v4 with such an approach in the next few days.
> 
> OK, let's check the v4 before moving the discussion to the techboard.
> @Thomas: Do you have more thoughts on this? Should we get it added to the agenda
> or wait for the v4?

If v4 is good for both cases, it lowers the priority of the discussion.

But still, it would be interesting to state the objectives of the examples:
	- show API usage?
	- show feature performance?
	- show best hardware performance?
	- what else?
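
For reference, the "plain array as a 1-way associative cache on top of a
generic SAD" idea quoted above could be sketched roughly as below. This is
only an illustration under assumptions, not the code from the patch series:
the struct and function names (sa, sad, sad_add, sad_lookup) are
hypothetical, the generic SAD is replaced by a simple linear scan instead of
the librte_ipsec SAD, and 128 slots is taken from the SPI%128 figure
mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SA_CACHE_SZ 128   /* assumed: same size as the SPI%128 array */
#define SAD_MAX     1024  /* assumed capacity of the stand-in SAD */

struct sa {               /* hypothetical, reduced to the lookup key */
	uint32_t spi;
};

struct sad {
	struct sa *cache[SA_CACHE_SZ]; /* direct-mapped (1-way) cache */
	struct sa *full[SAD_MAX];      /* stand-in for the generic SAD */
	size_t n;                      /* number of SAs in full[] */
	unsigned long slow_lookups;    /* how often the slow path ran */
};

/* Register an SA in the stand-in generic SAD. */
static void
sad_add(struct sad *d, struct sa *sa)
{
	assert(d->n < SAD_MAX);
	d->full[d->n++] = sa;
}

/* Slow path: linear scan, standing in for a generic hash-based lookup. */
static struct sa *
sad_lookup_slow(struct sad *d, uint32_t spi)
{
	size_t i;

	d->slow_lookups++;
	for (i = 0; i < d->n; i++)
		if (d->full[i]->spi == spi)
			return d->full[i];
	return NULL;
}

/*
 * Fast path: index the plain array by SPI % SA_CACHE_SZ and verify the
 * full SPI. On a miss or a collision, fall back to the slow path and
 * let the found SA take over the slot.
 */
static struct sa *
sad_lookup(struct sad *d, uint32_t spi)
{
	uint32_t idx = spi % SA_CACHE_SZ;
	struct sa *sa = d->cache[idx];

	if (sa != NULL && sa->spi == spi)
		return sa;

	sa = sad_lookup_slow(d, spi);
	if (sa != NULL)
		d->cache[idx] = sa;
	return sa;
}
```

With SPI=1 and SPI=129 both present, each lookup still returns the right SA,
but the two keep evicting each other from slot 1 and pay for the slow path;
when no SPIs collide, every lookup after the first one stays in the array.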



