[dpdk-dev] [dpdk-stable] [PATCH 1/2] net/virtio: fix performance regression due to TSO enabling
Jan Viktorin
viktorin at rehivetech.com
Thu Jan 12 16:02:56 CET 2017
On Thu, 12 Jan 2017 10:30:58 +0800
Yuanhan Liu <yuanhan.liu at linux.intel.com> wrote:
> On Wed, Jan 11, 2017 at 03:51:22PM +0100, Thomas Monjalon wrote:
> > 2017-01-11 12:27, Yuanhan Liu:
> > > The fact that the virtio net header is initialized to zero in the
> > > PMD init stage means that these costly writes are unnecessary and
> > > could be avoided:
> > >
> > >     if (hdr->csum_start != 0)
> > >         hdr->csum_start = 0;
> > >
> > > And that's what the macro ASSIGN_UNLESS_EQUAL does. With this, the
> > > performance drop introduced by TSO enabling is recovered: it could
> > > be up to 20% in micro benchmarking.
> >
> > This patch is adding a condition to assignments.
> > We need a benchmark on other architectures like ARM. Please anyone?
>
> I think the cost of the condition should be far lower than the penalty
> introduced by the cache issue, so I don't see how it would perform
> badly on other platforms.
>
> But, of course, testing is always welcome!
>
> --yliu
Hello,
we've done a synthetic measurement. The principle, briefly:
== Without condition check ==
start = gettimeofday();
for (i = 0; i < 1024*1024*128; ++i) {
    hdr->csum_start = 0;
    hdr->csum_offset = 0;
    hdr->flags = 0;
}
end = gettimeofday();
== With condition check ==
start = gettimeofday();
for (i = 0; i < 1024*1024*128; ++i) {
    ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
    ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
    ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
}
end = gettimeofday();
== Results ==
Computed as total time of all threads:
for i = 1..THREAD_COUNT:
    result += end[i] - start[i]
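
For anyone who wants to repeat this, here is a minimal single-thread
sketch of the two loops and the summation in C. It is not the exact
test program: struct hdr_sketch, elapsed_ms(), run_without_check(),
run_with_check() and total_ms() are hypothetical names, the struct is
only a stand-in for the three header fields, and volatile is an
assumption added so the compiler cannot drop the stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Macro as quoted from the patch. */
#define ASSIGN_UNLESS_EQUAL(var, val) do { \
    if ((var) != (val))                    \
        (var) = (val);                     \
} while (0)

/* Hypothetical stand-in for the three fields touched by the test;
 * this is NOT the real struct virtio_net_hdr. */
struct hdr_sketch {
    uint8_t  flags;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Milliseconds between two gettimeofday() samples. */
static double elapsed_ms(struct timeval start, struct timeval end)
{
    return (end.tv_sec - start.tv_sec) * 1000.0 +
           (end.tv_usec - start.tv_usec) / 1000.0;
}

/* Unconditional stores; volatile (an assumption) keeps them from being
 * optimized away. Returns the elapsed wall-clock time in ms. */
static double run_without_check(volatile struct hdr_sketch *hdr, long iters)
{
    struct timeval start, end;

    assert(hdr != NULL && iters >= 0);
    gettimeofday(&start, NULL);
    for (long i = 0; i < iters; ++i) {
        hdr->csum_start = 0;
        hdr->csum_offset = 0;
        hdr->flags = 0;
    }
    gettimeofday(&end, NULL);
    return elapsed_ms(start, end);
}

/* Same loop, but each field is read first and stored only if it differs. */
static double run_with_check(volatile struct hdr_sketch *hdr, long iters)
{
    struct timeval start, end;

    assert(hdr != NULL && iters >= 0);
    gettimeofday(&start, NULL);
    for (long i = 0; i < iters; ++i) {
        ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
        ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
        ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
    }
    gettimeofday(&end, NULL);
    return elapsed_ms(start, end);
}

/* Total over all threads: result += end[i] - start[i]. */
static double total_ms(const double *per_thread_ms, int thread_count)
{
    double result = 0.0;

    for (int i = 0; i < thread_count; ++i)
        result += per_thread_ms[i];
    return result;
}
```

Each thread would call one of the run_* functions with 1024*1024*128
iterations and store its return value; total_ms() then adds them up.
How the header is placed relative to the threads (shared or one per
thread) is left open in this sketch.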
cpu                threads  without-check (ms)  with-check (ms)
Xeon E5-2670             1                 516              529
Xeon E5-2670             2                1155              953
Xeon E5-2670             8                8947             5044
Xeon E5-2670            16               23335            16836
Zynq-7020 (armv7)        1                6735             7205
Zynq-7020 (armv7)        2               13753            14418
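The advantage for Intel is evident when increasing the number of
threads: with the check, the total time is about 17% lower at 2
threads, 44% lower at 8 and 28% lower at 16. With a single thread
the check costs about 2.5%.
However, on 32-bit ARMs we might expect some performance drop: here
the check was about 7% slower with 1 thread and 5% slower with 2.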
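
The relative change per row can be recomputed from the table with a
few lines of C. This is only a helper sketch; struct result_row,
pct_change() and print_changes() are hypothetical names, and the
numbers are copied from the results above.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the results table (times in ms). */
struct result_row {
    const char *cpu;
    int threads;
    double without_ms;
    double with_ms;
};

/* Values copied from the table above. */
static const struct result_row results[] = {
    { "Xeon E5-2670",       1,   516,   529 },
    { "Xeon E5-2670",       2,  1155,   953 },
    { "Xeon E5-2670",       8,  8947,  5044 },
    { "Xeon E5-2670",      16, 23335, 16836 },
    { "Zynq-7020 (armv7)",  1,  6735,  7205 },
    { "Zynq-7020 (armv7)",  2, 13753, 14418 },
};

/* Change of with-check relative to without-check, in percent.
 * Negative means the version with the check was faster. */
static double pct_change(double without_ms, double with_ms)
{
    assert(without_ms > 0.0);
    return (with_ms - without_ms) / without_ms * 100.0;
}

/* Print one line per table row. */
static void print_changes(void)
{
    for (size_t i = 0; i < sizeof(results) / sizeof(results[0]); ++i)
        printf("%-18s %2d thread(s): %+6.1f%%\n",
               results[i].cpu, results[i].threads,
               pct_change(results[i].without_ms, results[i].with_ms));
}
```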
Regards
Jan
> >
> >
> > [...]
> > > +/* avoid write operation when necessary, to lessen cache issues */
> > > +#define ASSIGN_UNLESS_EQUAL(var, val) do { \
> > > +        if ((var) != (val)) \
> > > +                (var) = (val); \
> > > +} while (0)
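
A small usage sketch of the macro, for illustration only. The struct
and clear_csum_fields() are hypothetical and not part of the patch.
The do { } while (0) wrapper makes the macro behave as a single
statement, so it is safe in an unbraced if/else. Note that var is
evaluated twice, so it should not have side effects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Macro as quoted from the patch. */
#define ASSIGN_UNLESS_EQUAL(var, val) do { \
    if ((var) != (val))                    \
        (var) = (val);                     \
} while (0)

/* Hypothetical stand-in for the header fields; NOT the real
 * struct virtio_net_hdr. */
struct hdr_sketch {
    uint8_t  flags;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Clear the checksum-related fields, storing only to fields that are
 * not already zero. Returns how many fields were non-zero on entry,
 * i.e. how many stores the macro had to perform. */
static int clear_csum_fields(struct hdr_sketch *hdr)
{
    int dirty;

    assert(hdr != NULL);
    dirty = (hdr->flags != 0) + (hdr->csum_start != 0) +
            (hdr->csum_offset != 0);

    ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
    ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
    ASSIGN_UNLESS_EQUAL(hdr->flags, 0);

    return dirty;
}
```

When the header is already zero, as after PMD init, the function only
reads the fields and the cache line is never written.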