<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Wed, Oct 1, 2025 at 1:25 PM Thomas Monjalon <<a href="mailto:thomas@monjalon.net">thomas@monjalon.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">29/09/2025 18:28, Shreesh Adiga:<br>
> On Wed, Sep 24, 2025 at 8:28 PM Thomas Monjalon <<a href="mailto:thomas@monjalon.net" target="_blank">thomas@monjalon.net</a>> wrote:<br>
> <br>
> > Hello,<br>
> ><br>
> > 16/07/2025 12:34, Shreesh Adiga:<br>
> > > Replace the clearing of lower 32 bits of XMM register with blend of<br>
> > > zero register.<br>
> > > Replace the clearing of upper 64 bits of XMM register with<br>
> > _mm_move_epi64.<br>
> > > Clang is able to optimize away the AND + memory operand with the<br>
> > > above sequence, however GCC is still emitting the code for AND with<br>
> > > memory operands which is being explicitly eliminated here.<br>
> > ><br>
> > > Additionally replace the 48 byte crc_xmm_shift_tab with the contents of<br>
> > > shf_table which is 32 bytes, achieving the same functionality.<br>
> > ><br>
> > > Signed-off-by: Shreesh Adiga <<a href="mailto:16567adigashreesh@gmail.com" target="_blank">16567adigashreesh@gmail.com</a>><br>
> ><br>
> > Sorry I'm not following.<br>
> > Please could you start with defining the goal of this patch?<br>
> > Is it a code simplification or a performance optimization?<br>
> <br>
> It is intended to be a minor performance optimization.<br>
<br>
Please could you give some performance numbers in the commit log?<br></blockquote><div>I don't think this change can be measured reliably. It only affects</div><div>the final 64-bit to 32-bit CRC fold and the computation on the last 16 bytes, so the gain</div><div>is a couple of clock cycles at best. I also doubt that the reduced static array usage</div><div>can be measured reliably, especially since it does not affect the main loop.</div><div>This patch can be ignored if minor incremental changes are not desirable.</div><div> </div></div></div>