<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div>
<div class="elementToProof" style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<span style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; color: rgb(0, 0, 0);">Thank you, Morten, for your understanding.</span></div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Morten Brørup <mb@smartsharesystems.com><br>
<b>Sent:</b> 12 December 2023 23:39<br>
<b>To:</b> Varghese, Vipin <Vipin.Varghese@amd.com>; Bruce Richardson <bruce.richardson@intel.com><br>
<b>Cc:</b> Yigit, Ferruh <Ferruh.Yigit@amd.com>; dev@dpdk.org <dev@dpdk.org>; stable@dpdk.org <stable@dpdk.org>; honest.jiang@foxmail.com <honest.jiang@foxmail.com>; P, Thiyagarajan <Thiyagarajan.P@amd.com><br>
<b>Subject:</b> RE: [PATCH] app/dma-perf: replace pktmbuf with mempool objects</font>
<div> </div>
</div>
<style>
<!--
@font-face
{font-family:"Cambria Math"}
@font-face
{font-family:Calibri}
@font-face
{font-family:Tahoma}
p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif"}
a:link, span.x_MsoHyperlink
{color:blue;
text-decoration:underline}
a:visited, span.x_MsoHyperlinkFollowed
{color:purple;
text-decoration:underline}
p
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif"}
span.x_EmailStyle18
{font-family:"Calibri","sans-serif";
color:#1F497D}
.x_MsoChpDefault
{font-size:10.0pt}
@page WordSection1
{margin:72.0pt 72.0pt 72.0pt 72.0pt}
div.x_WordSection1
{}
-->
</style>
<div lang="EN-US" link="blue" vlink="purple">
<div>
<div class="x_WordSection1">
<p class="x_MsoNormal"><a name="x__MailEndCompose"><span lang="DA" style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></a></p>
<div style="border:none; border-left:solid blue 1.5pt; padding:0cm 0cm 0cm 4.0pt">
<p class="x_MsoNormal"><b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif""> Varghese, Vipin [mailto:Vipin.Varghese@amd.com]
<br>
<b>Sent:</b> Tuesday, 12 December 2023 18.14<br>
<br>
</span></p>
<div>
<div>
<p class="x_MsoNormal"><span style="font-family:"Calibri","sans-serif"; color:black">Sharing a few critical points based on my exposure to the dma-perf application below</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-family:"Calibri","sans-serif"; color:black"> </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black"><Snipped><br>
<br>
On Tue, Dec 12, 2023 at 04:16:20PM +0100, Morten Brørup wrote:<br>
> +TO: Bruce, please stop me if I'm completely off track here.<br>
><br>
> > From: Ferruh Yigit [<a href="mailto:ferruh.yigit@amd.com" id="OWAf90557d8-150f-cb2d-7de8-3c6a7c2889ad">mailto:ferruh.yigit@amd.com</a>] Sent: Tuesday, 12<br>
> > December 2023 15.38<br>
> ><br>
> > On 12/12/2023 11:40 AM, Morten Brørup wrote:<br>
> > >> From: Vipin Varghese [<a href="mailto:vipin.varghese@amd.com" id="OWA98c00610-3cd4-e752-037a-17ca12dfc14c">mailto:vipin.varghese@amd.com</a>] Sent: Tuesday,<br>
> > >> 12 December 2023 11.38<br>
> > >><br>
> > >> Replace the pktmbuf pool with a mempool; this increases MOPS,<br>
> > >> especially for smaller buffer sizes. Using a mempool reduces the<br>
> > >> extra CPU cycles.<br>
> > ><br>
> > > I get the point of this change: It tests the performance of copying<br>
> > raw memory objects using respectively rte_memcpy and DMA, without the<br>
> > mbuf indirection overhead.<br>
> > ><br>
> > > However, I still consider the existing test relevant: The performance<br>
> > of copying packets using respectively rte_memcpy and DMA.<br>
> > ><br>
> ><br>
> > This is DMA performance test application and packets are not used,<br>
> > using pktmbuf just introduces overhead to the main focus of the<br>
> > application.<br>
> ><br>
> > I am not sure if pktmbuf was selected intentionally for this test<br>
> > application, but I assume it is there because of historical reasons.<br>
><br>
> I think pktmbuf was selected intentionally, to provide more accurate<br>
> results for application developers trying to determine when to use<br>
> rte_memcpy and when to use DMA. Much like the "copy breakpoint" in Linux<br>
> Ethernet drivers is used to determine which code path to take for each<br>
> received packet.</span><span style="font-family:"Calibri","sans-serif"; color:black"></span></p>
</div>
<div>
<p class="x_MsoNormal"> </p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">Yes Ferruh, this is the right understanding. The DPDK examples already include the </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">dma-forward application, which copies the payload of one pktmbuf into the payload area </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">of a new pktmbuf. </span></p>
</div>
<div>
<p class="x_MsoNormal"> </p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">By moving to mempool, we now focus purely on the source and destination buffers. </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">This allows creating mempool objects with 2MB and 1GB src-dst areas, so the test </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">can focus on the src-to-dst copy. With pktmbuf we were not able to achieve the same.</span></p>
</div>
<div>
<p class="x_MsoNormal"> </p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt"><br>
><br>
> Most applications will be working with pktmbufs, so these applications<br>
> will also experience the pktmbuf overhead. Performance testing with the<br>
> same overhead as the application will be better to help the application<br>
> developer determine when to use rte_memcpy and when to use DMA when<br>
> working with pktmbufs.</span></p>
</div>
<div>
<p class="x_MsoNormal"> </p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Morten, thank you for the input, but as shared above, the DPDK example dma-fwd already does </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">justice to such a scenario. In line with test-compress-perf and test-crypto-perf, IMHO test-dma-perf </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">should focus on getting the best values for the DMA engine and memcpy comparison.</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt"><br>
><br>
> (Furthermore, for the pktmbuf tests, I wonder if copying performance<br>
> could also depend on IOVA mode and RTE_IOVA_IN_MBUF.)<br>
><br>
> Nonetheless, there may also be use cases where raw mempool objects are<br>
> being copied by rte_memcpy or DMA, so adding tests for these use cases<br>
> is useful.<br>
><br>
><br>
> @Bruce, you were also deeply involved in the DMA library, and probably<br>
> have more up-to-date practical experience with it. Am I right that<br>
> pktmbuf overhead in these tests provides more "real life use"-like<br>
> results? Or am I completely off track with my thinking here, i.e. the<br>
> pktmbuf overhead is only noise?<br>
><br>
I'm actually not that familiar with the dma-test application, so can't<br>
comment on the specific overhead involved here. In the general case, if we<br>
are just talking about the overhead of dereferencing the mbufs then I would<br>
expect the overhead to be negligible. However, if we are looking to include<br>
the cost of allocation and freeing of buffers, I'd try to avoid that as it<br>
is a cost that would have to be paid for both SW copies and HW copies, so<br>
should not count when calculating offload cost.</span></p>
</div>
<div>
<p class="x_MsoNormal"> </p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Bruce, in test-dma-perf there is no repeated pktmbuf-alloc or pktmbuf-free. </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Hence the pktmbuf overhead discussed here is not related to alloc and free. </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">As per my investigation, the cost goes into fetching the cacheline and performing mtod on</span></p>
</div>
<div>
<p class="x_MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:11.0pt">each iteration.<br>
<br>
/Bruce</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">I can rewrite the logic to make use of pktmbuf objects by sending the src and dst with pre-computed </span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">mtod to avoid the overhead. But this will not resolve the 2MB and 1GB huge page copy alloc failures.</span></p>
</div>
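<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">A minimal sketch of the pre-computed mtod idea, for illustration only. It does not use DPDK and is not the code from the patch: struct mock_mbuf, mock_mtod(), copy_resolve_each_iter(), resolve_addrs() and copy_preresolved() are hypothetical names, and mock_mtod() only mimics what rte_pktmbuf_mtod() does (buffer address plus data offset). The point it tries to show is that the descriptor dereference can be moved out of the measured copy loop.</span></p>
</div>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer descriptor.
 * This is NOT the real struct rte_mbuf; it keeps only the two
 * fields needed to locate the payload. */
struct mock_mbuf {
    void    *buf_addr;  /* start of the backing buffer */
    uint16_t data_off;  /* offset of the payload inside the buffer */
};

/* Mimics the idea of rte_pktmbuf_mtod(): descriptor -> payload pointer.
 * Reading the descriptor touches its cacheline. */
static inline void *mock_mtod(const struct mock_mbuf *m)
{
    return (uint8_t *)m->buf_addr + m->data_off;
}

/* Variant A: resolve the payload address inside the copy loop,
 * so every iteration pays for two descriptor reads. */
static void copy_resolve_each_iter(struct mock_mbuf *const *src,
                                   struct mock_mbuf *const *dst,
                                   size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++)
        memcpy(mock_mtod(dst[i]), mock_mtod(src[i]), len);
}

/* Setup step, done once outside the measured region:
 * flatten the descriptors into plain payload addresses. */
static void resolve_addrs(struct mock_mbuf *const *bufs,
                          void **addrs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        addrs[i] = mock_mtod(bufs[i]);
}

/* Variant B: the copy loop only walks flat pointer arrays,
 * with no descriptor access per iteration. */
static void copy_preresolved(void *const *src, void *const *dst,
                             size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++)
        memcpy(dst[i], src[i], len);
}
```

<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">Both variants copy the same bytes; they differ only in where the address lookup happens. Variant B removes the per-iteration descriptor access, but, as noted above, it would not by itself address the 2MB and 1GB allocation failures.</span></p>
</div>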
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">IMHO, along similar lines to the other perf applications, the dma-perf application should focus on actual device</span></p>
</div>
<div>
<p class="x_MsoNormal"><span style="font-size:11.0pt">performance over application performance.</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D">[MB:]</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D">OK, Vipin has multiple good arguments for this patch. I am convinced, let’s proceed with it.</span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D">Acked-by: Morten Brørup <mb@smartsharesystems.com></span></p>
<p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>