[dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue

Wang, Zhihong zhihong.wang at intel.com
Thu Sep 22 04:11:23 CEST 2016



> -----Original Message-----
> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
> Sent: Wednesday, September 21, 2016 8:54 PM
> To: Wang, Zhihong <zhihong.wang at intel.com>
> Cc: Maxime Coquelin <maxime.coquelin at redhat.com>; dev at dpdk.org;
> yuanhan.liu at linux.intel.com
> Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
> 
> On 21 September 2016 at 17:27, Wang, Zhihong <zhihong.wang at intel.com>
> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
> >> Sent: Wednesday, September 21, 2016 4:50 PM
> >> To: Maxime Coquelin <maxime.coquelin at redhat.com>
> >> Cc: Wang, Zhihong <zhihong.wang at intel.com>; dev at dpdk.org;
> >> yuanhan.liu at linux.intel.com
> >> Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
> >>
> >> Hi Maxime,
> >>
> >> On 22 August 2016 at 16:11, Maxime Coquelin
> >> <maxime.coquelin at redhat.com> wrote:
> >> > Hi Zhihong,
> >> >
> >> > On 08/19/2016 07:43 AM, Zhihong Wang wrote:
> >> >>
> >> >> This patch set optimizes the vhost enqueue function.
> >> >>
> >> ...
> >>
> >> >
> >> > My setup consists of one host running a guest.
> >> > The guest generates as many 64-byte packets as possible using
> >>
> >> Have you tested with other packet sizes?
> >> My testing shows that performance drops when the packet size is more
> >> than 256 bytes.
> >
> >
> > Hi Jianbo,
> >
> > Thanks for reporting this.
> >
> >  1. Are you running the vector frontend with mrg_rxbuf=off?
> >
> >  2. Could you please specify what CPU you're running? Is it Haswell
> >     or Ivy Bridge?
> >
> >  3. What percentage of drop are you seeing?
> >
> > This is expected, because I've already found the root cause and
> > the way to optimize it, but since it missed the v0 deadline and
> > requires changes in eal/memcpy, I'll postpone it to the next release.
> >
> > After the upcoming optimization the performance for packets larger
> > than 256 bytes will be improved, and the new code will be much faster
> > than the current code.
> >
> 
> Sorry, I tested on an ARM server, but I wonder if the same issue
> exists on the x86 platform.


For the mrg_rxbuf=off path there might be a slight drop for packets larger
than 256B (~3% for 512B and ~1% for 1024B); there is no drop in other cases.

This is not a bug or issue; we only need to enhance memcpy to complete
the whole optimization, which should be done in a separate patch.
Unfortunately, it misses this release window.
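
To illustrate (not part of this patch set): for the mrg_rxbuf=off path the
remaining cost for larger packets is dominated by the payload copy from the
mbuf into the guest buffer, which is exactly what the planned eal/memcpy
enhancement targets. A minimal sketch of that copy, using existing DPDK
APIs (the helper name below is hypothetical):

    #include <rte_memcpy.h>
    #include <rte_mbuf.h>

    /* Hypothetical, simplified helper: copy one mbuf's payload into the
     * guest buffer mapped from a vring descriptor. For packets above
     * ~256B this rte_memcpy() call dominates the per-packet cycle cost,
     * so the remaining gain has to come from eal/memcpy itself. */
    static inline void
    copy_mbuf_to_desc_sketch(uint8_t *desc_addr, struct rte_mbuf *m)
    {
            /* virtio-net header setup omitted for brevity */
            rte_memcpy((void *)desc_addr,
                       rte_pktmbuf_mtod(m, void *),
                       rte_pktmbuf_data_len(m));
    }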


> 
> >> > pktgen-dpdk. The host forwards received packets back to the guest
> >> > using testpmd on the vhost PMD interface. The guest's vCPUs are pinned
> >> > to physical CPUs.
> >> >
> >> > I tested it with and without your v1 patch, with and without
> >> > the rx-mergeable feature turned ON.
> >> > Results are the average of 8 runs of 60 seconds:
> >> >
> >> > Rx-Mergeable ON : 7.72Mpps
> >> > Rx-Mergeable ON + "vhost: optimize enqueue" v1: 9.19Mpps
> >> > Rx-Mergeable OFF: 10.52Mpps
> >> > Rx-Mergeable OFF + "vhost: optimize enqueue" v1: 10.60Mpps
> >> >
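
(For readers who want to reproduce the setup described above: the host side
can be launched with testpmd on the vhost PMD roughly as follows. This is an
illustrative command, not from the original thread; the socket path, core
list and queue counts are made-up values using the vdev syntax of that DPDK
release.)

    testpmd -l 2-4 -n 4 --socket-mem 1024,0 \
        --vdev 'eth_vhost0,iface=/tmp/vhost-net0,queues=1' \
        -- -i --nb-cores=2 --rxq=1 --txq=1

The guest then runs pktgen-dpdk on the corresponding virtio-net port, with
its vCPUs pinned to host physical cores.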

