<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    I just received an update marking this patch as `<span style="color: rgb(51, 51, 51); font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; font-size: 14px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: left; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; white-space: normal; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial; display: inline !important; float: none;">Superseded`.
      I will resend it with the `ACK` included.<br>
    </span>
    <p><br>
    </p>
    <p><span style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; color: rgb(0, 0, 0);">Thank
        you, Morten, for your understanding.</span></p>
    <blockquote type="cite" cite="mid:MN2PR12MB30850F386E8FC5EE6DC066FF828EA@MN2PR12MB3085.namprd12.prod.outlook.com">
      <div>
        <div>
          <div>
            <hr style="display:inline-block;width:98%" tabindex="-1">
            <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Morten Brørup
                <a class="moz-txt-link-rfc2396E" href="mailto:mb@smartsharesystems.com"><mb@smartsharesystems.com></a><br>
                <b>Sent:</b> 12 December 2023 23:39<br>
                <b>To:</b> Varghese, Vipin
                <a class="moz-txt-link-rfc2396E" href="mailto:Vipin.Varghese@amd.com"><Vipin.Varghese@amd.com></a>; Bruce Richardson
                <a class="moz-txt-link-rfc2396E" href="mailto:bruce.richardson@intel.com"><bruce.richardson@intel.com></a><br>
                <b>Cc:</b> Yigit, Ferruh <a class="moz-txt-link-rfc2396E" href="mailto:Ferruh.Yigit@amd.com"><Ferruh.Yigit@amd.com></a>;
                <a class="moz-txt-link-abbreviated" href="mailto:dev@dpdk.org">dev@dpdk.org</a> <a class="moz-txt-link-rfc2396E" href="mailto:dev@dpdk.org"><dev@dpdk.org></a>; <a class="moz-txt-link-abbreviated" href="mailto:stable@dpdk.org">stable@dpdk.org</a>
                <a class="moz-txt-link-rfc2396E" href="mailto:stable@dpdk.org"><stable@dpdk.org></a>; <a class="moz-txt-link-abbreviated" href="mailto:honest.jiang@foxmail.com">honest.jiang@foxmail.com</a>
                <a class="moz-txt-link-rfc2396E" href="mailto:honest.jiang@foxmail.com"><honest.jiang@foxmail.com></a>; P, Thiyagarajan
                <a class="moz-txt-link-rfc2396E" href="mailto:Thiyagarajan.P@amd.com"><Thiyagarajan.P@amd.com></a><br>
                <b>Subject:</b> RE: [PATCH] app/dma-perf: replace
                pktmbuf with mempool objects</font>
              <div> </div>
            </div>
            <style>@font-face
        {font-family:"Cambria Math"}@font-face
        {font-family:Calibri}@font-face
        {font-family:Tahoma}p.x_MsoNormal, li.x_MsoNormal, div.x_MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif"}a:link, span.x_MsoHyperlink
        {color:blue;
        text-decoration:underline}a:visited, span.x_MsoHyperlinkFollowed
        {color:purple;
        text-decoration:underline}p
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif"}span.x_EmailStyle18
        {font-family:"Calibri","sans-serif";
        color:#1F497D}.x_MsoChpDefault
        {font-size:10.0pt}div.x_WordSection1
        {}</style>
            <div link="blue" vlink="purple" lang="EN-US">
              <br>
              <div>
                <div class="x_WordSection1">
                  <p class="x_MsoNormal"><a name="x__MailEndCompose" moz-do-not-send="true"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D" lang="DA"> </span></a></p>
                  <div style="border:none; border-left:solid blue 1.5pt; padding:0cm 0cm 0cm 4.0pt">
                    <p class="x_MsoNormal"><b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt; font-family:"Tahoma","sans-serif"">
                        Varghese, Vipin [<a class="moz-txt-link-freetext" href="mailto:Vipin.Varghese@amd.com">mailto:Vipin.Varghese@amd.com</a>]
                        <br>
                        <b>Sent:</b> Tuesday, 12 December 2023 18.14<br>
                        <br>
                      </span></p>
                    <div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-family:"Calibri","sans-serif"; color:black">Below
                            are a few critical points based on my experience
                            with the dma-perf application.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-family:"Calibri","sans-serif"; color:black"> </span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black"><Snipped><br>
                            <br>
                            On Tue, Dec 12, 2023 at 04:16:20PM +0100,
                            Morten Brørup wrote:<br>
                            > +TO: Bruce, please stop me if I'm
                            completely off track here.<br>
                            ><br>
                            > > From: Ferruh Yigit [<a href="mailto:ferruh.yigit@amd.com" id="OWAf90557d8-150f-cb2d-7de8-3c6a7c2889ad" moz-do-not-send="true">mailto:ferruh.yigit@amd.com</a>]
                            Sent: Tuesday, 12<br>
                            > > December 2023 15.38<br>
                            > ><br>
                            > > On 12/12/2023 11:40 AM, Morten
                            Brørup wrote:<br>
                            > > >> From: Vipin Varghese [<a href="mailto:vipin.varghese@amd.com" id="OWA98c00610-3cd4-e752-037a-17ca12dfc14c" moz-do-not-send="true">mailto:vipin.varghese@amd.com</a>]
                            Sent: Tuesday,<br>
                            > > >> 12 December 2023 11.38<br>
                            > > >><br>
                            > > >> Replace the pktmbuf pool with a
                            mempool. This allows an increase in MOPS,<br>
                            > > >> especially at lower
                            buffer sizes. Using a mempool reduces
                            the<br>
                            > > >> extra CPU cycles.<br>
                            > > ><br>
                            > > > I get the point of this
                            change: It tests the performance of copying<br>
                            > > raw memory objects using
                            respectively rte_memcpy and DMA, without the<br>
                            > > mbuf indirection overhead.<br>
                            > > ><br>
                            > > > However, I still consider the
                            existing test relevant: The performance<br>
                            > > of copying packets using
                            respectively rte_memcpy and DMA.<br>
                            > > ><br>
                            > ><br>
                            > > This is a DMA performance test
                            application and packets are not used,<br>
                            > > using pktmbuf just introduces
                            overhead to the main focus of the<br>
                            > > application.<br>
                            > ><br>
                            > > I am not sure if pktmbuf was selected
                            intentionally for this test<br>
                            > > application, but I assume it is
                            there because of historical reasons.<br>
                            ><br>
                            > I think pktmbuf was selected
                            intentionally, to provide more accurate<br>
                            > results for application developers
                            trying to determine when to use<br>
                            > rte_memcpy and when to use DMA. Much
                            like the "copy breakpoint" in Linux<br>
                            > Ethernet drivers is used to determine
                            which code path to take for each<br>
                            > received packet.</span><span style="font-family:"Calibri","sans-serif"; color:black"></span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"> </p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">Yes,
                            Ferruh, this is the right understanding. The
                            DPDK examples already include the </span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">dma-forward
                            application, which copies the pktmbuf
                            payload over to a</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">new
                            pktmbuf payload area. </span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"> </p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">By
                            moving to mempool, we now focus on just the
                            source and destination buffers.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">This
                            allows creating mempool objects with 2MB
                            and 1GB src-dst areas, which makes it possible</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:black">to
                            focus on the src to dst copy. With pktmbuf we were
                            not able to achieve the same.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"> </p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt"><br>
                            ><br>
                            > Most applications will be working with
                            pktmbufs, so these applications<br>
                            > will also experience the pktmbuf
                            overhead. Performance testing with the<br>
                            > same overhead as the application will
                            be better to help the application<br>
                            > developer determine when to use
                            rte_memcpy and when to use DMA when<br>
                            > working with pktmbufs.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"> </p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">Morten, thank you
                            for the input, but as shared above, the DPDK
                            example dma-fwd does </span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">justice to such
                            a scenario. In line with test-compress-perf &
                            test-crypto-perf, IMHO test-dma-perf</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">should focus on
                            getting the best values for the DMA engine and memcpy
                            comparison.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt"><br>
                            ><br>
                            > (Furthermore, for the pktmbuf tests, I
                            wonder if copying performance<br>
                            > could also depend on IOVA mode and
                            RTE_IOVA_IN_MBUF.)<br>
                            ><br>
                            > Nonetheless, there may also be use
                            cases where raw mempool objects are<br>
                            > being copied by rte_memcpy or DMA, so
                            adding tests for these use cases<br>
                            > is useful.<br>
                            ><br>
                            ><br>
                            > @Bruce, you were also deeply involved
                            in the DMA library, and probably<br>
                            > have more up-to-date practical
                            experience with it. Am I right that<br>
                            > pktmbuf overhead in these tests
                            provides more "real life use"-like<br>
                            > results? Or am I completely off track
                            with my thinking here, i.e. the<br>
                            > pktmbuf overhead is only noise?<br>
                            ><br>
                            I'm actually not that familiar with the
                            dma-test application, so can't<br>
                            comment on the specific overhead involved
                            here. In the general case, if we<br>
                            are just talking about the overhead of
                            dereferencing the mbufs then I would<br>
                            expect the overhead to be negligible.
                            However, if we are looking to include<br>
                            the cost of allocation and freeing of
                            buffers, I'd try to avoid that as it<br>
                            is a cost that would have to be paid for
                            both SW copies and HW copies, so<br>
                            should not count when calculating offload
                            cost.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"> </p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">Bruce, in
                            test-dma-perf there is no repeated
                            pktmbuf-alloc or pktmbuf-free. </span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">Hence the
                            overhead discussed for pktmbuf here
                            is not related to alloc and free.</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">Rather, as per
                            my investigation, the cost goes into fetching the
                            cache line and performing mtod on</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:11.0pt">each iteration.<br>
                            <br>
                            /Bruce</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">I can rewrite the
                            logic to make use of pktmbuf objects by passing
                            the src and dst with pre-computed </span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">mtod to avoid the
                            overhead, but this will not resolve the alloc
                            failures for 2MB and 1GB huge page copies.</span></p>
                      </div>
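                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">As a rough
                            illustration of the pre-computed mtod idea above, here is a
                            minimal sketch in plain C. It is not DPDK code and not the
                            test-dma-perf implementation: fake_mbuf, FAKE_MTOD,
                            copy_with_lookup, copy_precomputed and demo_copy are
                            hypothetical names, the 128-byte payload offset is an
                            assumption, and memcpy stands in for the copy engine. It only
                            shows address resolution moving out of the copy loop and
                            measures nothing.</span></p>
<pre>
```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: metadata that points at the payload. */
struct fake_mbuf {
    void *buf_addr;     /* start of the underlying buffer */
    uint16_t data_off;  /* offset of the payload inside the buffer */
};

/* Hypothetical mtod-style macro: loads the metadata, then adds the offset. */
#define FAKE_MTOD(m) ((void *)((char *)(m)->buf_addr + (m)->data_off))

/* Variant A: resolve the payload address from the metadata on every copy. */
static void copy_with_lookup(struct fake_mbuf **src, struct fake_mbuf **dst,
                             size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++)
        memcpy(FAKE_MTOD(dst[i]), FAKE_MTOD(src[i]), len);
}

/* Variant B: the copy loop only sees addresses that were resolved at setup. */
static void copy_precomputed(void **src, void **dst, size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++)
        memcpy(dst[i], src[i], len);
}

/* Runs both variants on n buffers of len bytes.
 * Returns 0 if every copy matches, 1 on a mismatch, -1 if allocation fails. */
int demo_copy(size_t n, size_t len)
{
    const uint16_t headroom = 128; /* assumed payload offset, illustration only */
    struct fake_mbuf *meta = calloc(2 * n, sizeof(*meta));
    struct fake_mbuf **src_m = calloc(n, sizeof(*src_m));
    struct fake_mbuf **dst_m = calloc(n, sizeof(*dst_m));
    void **src_p = calloc(n, sizeof(*src_p));
    void **dst_p = calloc(n, sizeof(*dst_p));
    char *mem = calloc(2 * n, headroom + len);
    int rc = -1;

    if (!meta || !src_m || !dst_m || !src_p || !dst_p || !mem)
        goto out;

    for (size_t i = 0; i < n; i++) {
        src_m[i] = &meta[2 * i];
        dst_m[i] = &meta[2 * i + 1];
        src_m[i]->buf_addr = mem + (2 * i) * (headroom + len);
        dst_m[i]->buf_addr = mem + (2 * i + 1) * (headroom + len);
        src_m[i]->data_off = headroom;
        dst_m[i]->data_off = headroom;
        memset(FAKE_MTOD(src_m[i]), (int)(i + 1), len);
        /* Setup-time resolution: done once, outside the copy loop. */
        src_p[i] = FAKE_MTOD(src_m[i]);
        dst_p[i] = FAKE_MTOD(dst_m[i]);
    }

    rc = 0;
    copy_with_lookup(src_m, dst_m, n, len);
    for (size_t i = 0; i < n; i++)
        rc |= memcmp(src_p[i], dst_p[i], len) != 0;

    /* Clear the destinations so the second variant is checked on its own. */
    for (size_t i = 0; i < n; i++)
        memset(dst_p[i], 0, len);
    copy_precomputed(src_p, dst_p, n, len);
    for (size_t i = 0; i < n; i++)
        rc |= memcmp(src_p[i], dst_p[i], len) != 0;
out:
    free(meta);
    free(src_m);
    free(dst_m);
    free(src_p);
    free(dst_p);
    free(mem);
    return rc;
}
```
</pre>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">Both variants
                            copy the same bytes. The only difference is whether the
                            metadata is touched inside the loop, which is where the extra
                            cache line fetch would come from.</span></p>
                      </div>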
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">IMHO, along
                            similar lines to the other perf applications,
                            the dma-perf application should focus on actual
                            device</span></p>
                      </div>
                      <div>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt">performance over
                            application performance.</span></p>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></p>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D">[MB:]</span></p>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D">OK,
                            Vipin has multiple good arguments for this
                            patch. I am convinced, let’s proceed with
                            it.</span></p>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></p>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D">Acked-by:
                            Morten Brørup
                            <a class="moz-txt-link-rfc2396E" href="mailto:mb@smartsharesystems.com"><mb@smartsharesystems.com></a></span></p>
                        <p class="x_MsoNormal"><span style="font-size:11.0pt; font-family:"Calibri","sans-serif"; color:#1F497D"> </span></p>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>