<!DOCTYPE html><html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Hi Sunyuechi,</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 04/06/2025 12:39, 孙越池 wrote:<br>
</div>
<blockquote type="cite" cite="mid:26aab158.28484.1973abd6624.Coremail.sunyuechi@iscas.ac.cn">
> why is it done in a scalar way instead of using
`__riscv_vsrl_vx_u32m1()?` I assume you're relying on the compiler
here?<br>
<br>
I don't know the exact reason, but based on experience, using
indexed loads tends to be slower for small-scale and
low-computation cases. So I've tried both methods.<br>
In this case, if using `vsrl`, it would require
`__riscv_vluxei32_v_u32m1`, which is much slower.<br>
<br>
```<br>
vuint32m1_t vip_shifted =
__riscv_vsll_vx_u32m1(__riscv_vsrl_vx_u32m1(__riscv_vle32_v_u32m1((const
uint32_t *)&ip, vl), 8, vl), 2, vl);<br>
vuint32m1_t vtbl_entry = __riscv_vluxei32_v_u32m1(<br>
(const uint32_t *)(lpm->tbl24), vip_shifted, vl);<br>
```<br>
<br>
> have you redefined the xmm_t type for proper index
addressing?<br>
<br>
It is in `eal/riscv/include/rte_vect.h:`<br>
<br>
```<br>
typedef int32_t xmm_t __attribute__((vector_size(16)));<br>
```<br>
<br>
> I'd recommend that you use FIB to select an implementation at
runtime. All the rest LPM vector x4 implementations are done this
way, and their code is inlined.<br>
> Also, please consider writing a slightly more informative and
explanatory commit message.<br>
</blockquote>
<p>The commit message still looks uninformative to me:</p>
<p>>lpm_perf_autotest on BPI-F3</p>
<p>we have no idea what's that</p>
<p>> scalar: 5.7 cycles</p>
<p>I'm not sure we want to have this information in commit message
as well, because it is useless. Cycles depends on so much variable
parts - what freq of the CPU was, what speed of memory, size of
caches, and so on. This information is irrelevant and become
obsolete pretty fast.</p>
<p>From the latest commit:</p>
<p>>The best way ... However, ... Therefore, ... this commit does
not modify</p>
<p>>Unifying the code style between lpm and fib may be worth
considering in the future.</p>
<p>I don't think this is a good idea to put into the commit message
information about what was NOT done.</p>
<p>You should put all this information (platform you were running,
performance, implementation considerations and thoughts) into the
patch notes.</p>
<blockquote type="cite" cite="mid:26aab158.28484.1973abd6624.Coremail.sunyuechi@iscas.ac.cn">
<br>
I agree that the FIB approach is clearly better here, but adopting
this method would require changing the function initialization
logic for all architectures in LPM, as well as updating the
relevant structures.<br>
<br>
I'm not sure it's worth doing right now, since this commit is
intended to be just a small change for RISC-V. I'm more inclined
to follow the existing structure and avoid touching other
architectures' code.<br>
Would it be acceptable to leave this kind of refactoring for the
future?<br>
<br>
If you're certain it should be done now, I'll make the changes.
For now, I've only updated the commit message to include this idea
(v2).<br>
<div style="white-space:nowrap;"> <br>
</div>
<br>
</blockquote>
<p>I'm not talking about adopting the FIB approach to the LPM.
Instead, I suggested keeping LPM code consistent and leaving your
implementation as a static inline function. And if you want to
have runtime CPU flags check - you're welcome to do so in the FIB.</p>
<blockquote type="cite" cite="mid:26aab158.28484.1973abd6624.Coremail.sunyuechi@iscas.ac.cn">
<br>
<br>
<blockquote name="replyContent" class="ReferenceQuote" style="font-family:SimSun;padding-left:5px;margin-left:5px;border-left:2px solid #B6B6B6;margin-right:0px;">
-----原始邮件-----<br>
<b>发件人:</b><span id="rc_from">"Medvedkin, Vladimir"
<a class="moz-txt-link-rfc2396E" href="mailto:vladimir.medvedkin@intel.com"><vladimir.medvedkin@intel.com></a></span><br>
<b>发送时间:</b><span id="rc_senttime">2025-05-30 21:13:57 (星期五)</span><br>
<b>收件人:</b> <a class="moz-txt-link-abbreviated" href="mailto:uk7b@foxmail.com">uk7b@foxmail.com</a>, <a class="moz-txt-link-abbreviated" href="mailto:dev@dpdk.org">dev@dpdk.org</a><br>
<b>抄送:</b> sunyuechi <a class="moz-txt-link-rfc2396E" href="mailto:sunyuechi@iscas.ac.cn"><sunyuechi@iscas.ac.cn></a>, "Thomas
Monjalon" <a class="moz-txt-link-rfc2396E" href="mailto:thomas@monjalon.net"><thomas@monjalon.net></a>, "Bruce Richardson"
<a class="moz-txt-link-rfc2396E" href="mailto:bruce.richardson@intel.com"><bruce.richardson@intel.com></a>, "Stanislaw Kardach"
<a class="moz-txt-link-rfc2396E" href="mailto:stanislaw.kardach@gmail.com"><stanislaw.kardach@gmail.com></a><br>
<b>主题:</b> Re: [PATCH 2/3] lib/lpm: R-V V rte_lpm_lookupx4<br>
<br>
<p> Hi c, </p>
<p> <br>
</p>
<div class="moz-cite-prefix"> On 28/05/2025 18:00, <a class="moz-txt-link-abbreviated moz-txt-link-freetext" href="mailto:uk7b@foxmail.com" moz-do-not-send="true">uk7b@foxmail.com</a>
wrote:<br>
</div>
<blockquote type="cite" cite="mid:tencent_A2BD85C9F3B0AA4658640DA91443CDF5630A@qq.com">
<pre wrap="" class="moz-quote-pre">From: sunyuechi <a class="moz-txt-link-rfc2396E" href="mailto:sunyuechi@iscas.ac.cn" moz-do-not-send="true"><sunyuechi@iscas.ac.cn></a> bpi-f3:
scalar: 5.7 cycles
rvv: 2.4 cycles
Maybe runtime detection in LPM should be added for all architectures,
but this commit is only about the RVV part.
</pre>
</blockquote>
<p> <span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="0:1" style="white-space:pre-wrap;">I</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="2:2" style="white-space:pre-wrap;">would</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="5:11" style="white-space:pre-wrap;">advise</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="17:3" style="white-space:pre-wrap;">you</span><span style="white-space:pre-wrap;"> to </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="21:9" style="white-space:pre-wrap;">look</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="31:1" style="white-space:pre-wrap;">into</span><span style="white-space:pre-wrap;"> the </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="44:3" style="white-space:pre-wrap;">FIB</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="33:10" style="white-space:pre-wrap;">library</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="47:1" style="white-space:pre-wrap;">,</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="49:3" style="white-space:pre-wrap;">it</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="53:4" style="white-space:pre-wrap;">has</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="58:6" style="white-space:pre-wrap;">exactly</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="69:3" style="white-space:pre-wrap;">what</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="73:2" style="white-space:pre-wrap;">you</span><span style="white-space:pre-wrap;"> are </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="76:5" style="white-space:pre-wrap;">looking</span><span style="white-space:pre-wrap;"> for</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="81:1" style="white-space:pre-wrap;">.</span> </p>
<p> <span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="0:5" style="white-space:pre-wrap;">Also</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="5:1" style="white-space:pre-wrap;">,</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="7:10" style="white-space:pre-wrap;">please</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="19:9" style="white-space:pre-wrap;">consider</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="42:8" style="white-space:pre-wrap;">writing</span><span style="white-space:pre-wrap;"> a </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="51:7" style="white-space:pre-wrap;">slightly</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="59:5" style="white-space:pre-wrap;">more</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="65:14" style="white-space:pre-wrap;">informative</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="80:1" style="white-space:pre-wrap;">and</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="82:10" style="white-space:pre-wrap;">explanatory</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="81:1" style="white-space:pre-wrap;"> commit message.</span> </p>
<blockquote type="cite" cite="mid:tencent_A2BD85C9F3B0AA4658640DA91443CDF5630A@qq.com">
<pre wrap="" class="moz-quote-pre">Signed-off-by: sunyuechi <a class="moz-txt-link-rfc2396E" href="mailto:sunyuechi@iscas.ac.cn" moz-do-not-send="true"><sunyuechi@iscas.ac.cn></a> ---
MAINTAINERS | 2 +
lib/lpm/meson.build | 1 +
lib/lpm/rte_lpm.h | 2 +
lib/lpm/rte_lpm_rvv.h | 91 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 96 insertions(+)
create mode 100644 lib/lpm/rte_lpm_rvv.h
</pre>
</blockquote>
<snip>
<blockquote type="cite" cite="mid:tencent_A2BD85C9F3B0AA4658640DA91443CDF5630A@qq.com">
<pre wrap="" class="moz-quote-pre">+static inline void rte_lpm_lookupx4_rvv(
+ const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
+{
+ size_t vl = 4;
+
+ const uint32_t *tbl24_p = (const uint32_t *)lpm->tbl24;
+ uint32_t tbl_entries[4] = {
+ tbl24_p[((uint32_t)ip[0]) >> 8],
+ tbl24_p[((uint32_t)ip[1]) >> 8],
+ tbl24_p[((uint32_t)ip[2]) >> 8],
+ tbl24_p[((uint32_t)ip[3]) >> 8],
+ };</pre>
</blockquote>
<p> <span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="0:1" style="white-space:pre-wrap;">I</span><span style="white-space:pre-wrap;">'m </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="2:2" style="white-space:pre-wrap;">not</span><span style="white-space:pre-wrap;"> an </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="5:7" style="white-space:pre-wrap;">expert</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="13:1" style="white-space:pre-wrap;">in</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="15:4" style="white-space:pre-wrap;">RISC</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="19:1" style="white-space:pre-wrap;">-</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="20:1" style="white-space:pre-wrap;">V</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="21:1" style="white-space:pre-wrap;">,</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="23:2" style="white-space:pre-wrap;">but</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="26:6" style="white-space:pre-wrap;">why</span><span style="white-space:pre-wrap;"> is </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="33:3" style="white-space:pre-wrap;">it</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="37:8" style="white-space:pre-wrap;">done</span><span style="white-space:pre-wrap;"> in a </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="46:9" style="white-space:pre-wrap;">scalar</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="56:8" style="white-space:pre-wrap;">way</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="66:1" style="white-space:pre-wrap;">instead</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="71:1" style="white-space:pre-wrap;">of</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="73:7" style="white-space:pre-wrap;">using</span><span style="white-space:pre-wrap;"> __riscv_vsrl_vx_u32m1</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="102:1" style="white-space:pre-wrap;">(</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="103:1" style="white-space:pre-wrap;">)</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="104:1" style="white-space:pre-wrap;">? </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="0:1" style="white-space:pre-wrap;">I</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="2:7" style="white-space:pre-wrap;">assume</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="11:2" style="white-space:pre-wrap;">you</span><span style="white-space:pre-wrap;">'re </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="14:11" style="white-space:pre-wrap;">relying</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="32:2" style="white-space:pre-wrap;">on</span><span style="white-space:pre-wrap;"> the </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="35:10" style="white-space:pre-wrap;">compiler</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="26:5" style="white-space:pre-wrap;">here</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="45:1" style="white-space:pre-wrap;">?</span> </p>
<p> <span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="0:5" style="white-space:pre-wrap;">Also</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="10:1" style="white-space:pre-wrap;">,</span><span style="white-space:pre-wrap;"> have </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="12:2" style="white-space:pre-wrap;">you</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="15:14" style="white-space:pre-wrap;">redefined</span><span style="white-space:pre-wrap;"> the xmm_t </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="30:3" style="white-space:pre-wrap;">type</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="40:3" style="white-space:pre-wrap;">for</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="44:10" style="white-space:pre-wrap;">proper</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="55:9" style="white-space:pre-wrap;">index</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="65:9" style="white-space:pre-wrap;">addressing</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="74:1" style="white-space:pre-wrap;">?</span> </p>
<blockquote type="cite" cite="mid:tencent_A2BD85C9F3B0AA4658640DA91443CDF5630A@qq.com">
<pre wrap="" class="moz-quote-pre">+ vuint32m1_t vtbl_entry = __riscv_vle32_v_u32m1(tbl_entries, vl);
+
+ vbool32_t mask = __riscv_vmseq_vx_u32m1_b32(
+ __riscv_vand_vx_u32m1(vtbl_entry, RTE_LPM_VALID_EXT_ENTRY_BITMASK, vl),
+ RTE_LPM_VALID_EXT_ENTRY_BITMASK, vl);
</pre>
</blockquote>
<snip>
<blockquote type="cite" cite="mid:tencent_A2BD85C9F3B0AA4658640DA91443CDF5630A@qq.com">
<pre wrap="" class="moz-quote-pre">+
+static inline void rte_lpm_lookupx4(
+ const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
+{
+ lpm_lookupx4_impl(lpm, ip, hop, defv);
+}
+
+RTE_INIT(rte_lpm_init_alg)
+{
+ lpm_lookupx4_impl = rte_cpu_get_flag_enabled(RTE_CPUFLAG_RISCV_ISA_V)
+ ? rte_lpm_lookupx4_rvv
+ : rte_lpm_lookupx4_scalar;
+}</pre>
</blockquote>
<span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="0:3" style="white-space:pre-wrap;">As</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="4:1" style="white-space:pre-wrap;">I</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="6:8" style="white-space:pre-wrap;">mentioned</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="15:5" style="white-space:pre-wrap;">earlier</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="20:1" style="white-space:pre-wrap;">,</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="22:1" style="white-space:pre-wrap;">I</span><span style="white-space:pre-wrap;">'d </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="24:10" style="white-space:pre-wrap;">recommend</span><span style="white-space:pre-wrap;"> that </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="35:3" style="white-space:pre-wrap;">you</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="39:12" style="white-space:pre-wrap;">use</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="52:3" style="white-space:pre-wrap;">FIB</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="56:3" style="white-space:pre-wrap;">to</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="60:6" style="white-space:pre-wrap;">select</span><span style="white-space:pre-wrap;"> an </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="67:10" style="white-space:pre-wrap;">implementation</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="78:2" style="white-space:pre-wrap;">at</span><span style="white-space:pre-wrap;"> </span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="81:16" style="white-space:pre-wrap;">runtime</span><span class="aNeGP0gI0B9AV8JaHPyH" data-src-align="97:1" style="white-space:pre-wrap;">. All the rest LPM vector x4 implementations are done this way, and their code is inlined.</span>
<blockquote type="cite" cite="mid:tencent_A2BD85C9F3B0AA4658640DA91443CDF5630A@qq.com">
<pre wrap="" class="moz-quote-pre">+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LPM_RVV_H_ */
</pre>
</blockquote>
<pre class="moz-signature" cols="72">--
Regards,
Vladimir</pre>
</blockquote>
</blockquote>
<pre class="moz-signature" cols="72">--
Regards,
Vladimir</pre>
</body>
</html>