[PATCH 4/5] dts: add `show port info` command to TestPmdShell

Luca Vizzarro Luca.Vizzarro at arm.com
Thu Apr 18 12:52:25 CEST 2024


On 18/04/2024 07:41, Juraj Linkeš wrote:
>> The equivalent /\n\*.+?(?=\n\*|$)/gs (but slightly more optimised) takes
>> approximately 3*input_length steps to run (according to regex101 at
>> least). If that's reasonable enough, I can do this:
>>
>>     iter = re.finditer(input, "\n\*.+?(?=\n\*|$)", re.S)
>>     return [TestPmdPortInfo.parse(match.group(0)) for match in iter]
>>
>> Another optimization is artificially adding a `\n*` delimiter at the end
>> before feeding it to the regex, thus removing the alternative case (|$),
>> and making it 2*len steps:
>>
>>     input += "\n*"
>>     iter = re.finditer(input, "\n\*.+?(?=\n\*)", re.S)
>>     return [TestPmdPortInfo.parse(match.group(0)) for match in iter]
>>
> 
> I like this second one a bit more. How does the performance change if
> we try to match four asterisks "\n\****.+?(?=\n\****)"? Four asterisks
> shouldn't randomly be in the output as that's basically another
> delimited.

The difference is negligible as the regex walks every character anyways 
– while there is a match. Either \* or \*{4} will result in a match. The 
problem is that if an attempt of match fails, the regex backtracks, this 
is where the problem is. Of course if we already matched 4 asterisks, 
then the backtracking can skip the whole sequence at once (compared to 
1), but it's only going to be 3 steps less per **** found. It's a bigger 
difference if we attempt to match all the asterisks. The lookahead 
construct also increases backtracking.

In the meantime, still by amending the output, I've got a solution that 
doesn't perform any look aheads:

   \*{21}.+?\n{2}

instead of treating blocks as they are (\n******<snip>\n), we can add an 
extra \n at the end and treat the blocks as: ******<snip>\n\n. Basically 
this assumes that an empty line is the end delimiter. This takes 
input_length/2 steps!

Of course in reality every \n is \r\n, as I've discovered that when 
shells are invoked using paramiko, the stream becomes CRLF for some 
reason I haven't explored. I think this was worth mentioning for 
everybody, in case surprise carriage returns may reveal disruptive.

> And we should document this in the docstring - sample output, then
> explain the extra characters and the regex itself. We shouldn't forget
> this in the other commit as well (show port stats).

Ack.


More information about the dev mailing list