Re: [PATCH 4/5] dts: add `show port info` command to TestPmdShell

Luca Vizzarro Thu, 18 Apr 2024 03:52:34 -0700

On 18/04/2024 07:41, Juraj Linkeš wrote:

The equivalent /\n\*.+?(?=\n\*|$)/gs (but slightly more optimised) takes
approximately 3*input_length steps to run (according to regex101 at
least). If that's reasonable enough, I can do this:


    iter = re.finditer(input, "\n\*.+?(?=\n\*|$)", re.S)
    return [TestPmdPortInfo.parse(match.group(0)) for match in iter]

Another optimization is artificially adding a `\n*` delimiter at the end
before feeding it to the regex, thus removing the alternative case (|$),
and making it 2*len steps:

    input += "\n*"
    iter = re.finditer(input, "\n\*.+?(?=\n\*)", re.S)
    return [TestPmdPortInfo.parse(match.group(0)) for match in iter]


I like this second one a bit more. How does the performance change if
we try to match four asterisks "\n\****.+?(?=\n\****)"? Four asterisks
shouldn't randomly be in the output as that's basically another
delimited.

The difference is negligible as the regex walks every character anyways– while there is a match. Either \* or \*{4} will result in a match. Theproblem is that if an attempt of match fails, the regex backtracks, thisis where the problem is. Of course if we already matched 4 asterisks,then the backtracking can skip the whole sequence at once (compared to1), but it's only going to be 3 steps less per **** found. It's a biggerdifference if we attempt to match all the asterisks. The lookaheadconstruct also increases backtracking.

In the meantime, still by amending the output, I've got a solution thatdoesn't perform any look aheads:


  \*{21}.+?\n{2}

instead of treating blocks as they are (\n******<snip>\n), we can add anextra \n at the end and treat the blocks as: ******<snip>\n\n. Basicallythis assumes that an empty line is the end delimiter. This takesinput_length/2 steps!

Of course in reality every \n is \r\n, as I've discovered that whenshells are invoked using paramiko, the stream becomes CRLF for somereason I haven't explored. I think this was worth mentioning foreverybody, in case surprise carriage returns may reveal disruptive.

And we should document this in the docstring - sample output, then
explain the extra characters and the regex itself. We shouldn't forget
this in the other commit as well (show port stats).


Ack.

Re: [PATCH 4/5] dts: add `show port info` command to TestPmdShell

Reply via email to