On Thu, Aug 1, 2024 at 4:41 AM Luca Vizzarro <luca.vizza...@arm.com> wrote: > > Great work Jeremy! Just a couple of minor passable improvement points. > > On 30/07/2024 14:34, jspew...@iol.unh.edu wrote: > > > +@dataclass > > +class TestPmdVerbosePacket(TextParser): > > + """Packet information provided by verbose output in Testpmd. > > + > > + The "receive/sent queue" information is not included in this dataclass > > because this value is > > + captured on the outer layer of input found in > > :class:`TestPmdVerboseOutput`. > > + """ > > + > > + #: > > + src_mac: str = > > field(metadata=TextParser.find(r"src=({})".format(REGEX_FOR_MAC_ADDRESS))) > Just a(n optional) nit: TextParser.find(f"src=({REGEX_FOR_MAC_ADDRESS})") > The raw string is only needed to prevent escaping, which we don't do here.
Ack. I really just left it this way because it also adjusts highlighting in some IDEs, but there isn't much to see here anyway. > > + #: > > + dst_mac: str = > > field(metadata=TextParser.find(r"dst=({})".format(REGEX_FOR_MAC_ADDRESS))) > As above. Ack. > > + #: Memory pool the packet was handled on. > > + pool: str = field(metadata=TextParser.find(r"pool=(\S+)")) > > + #: Packet type in hex. > > + p_type: int = > > field(metadata=TextParser.find_int(r"type=(0x[a-fA-F\d]+)")) > > + #: > > <snip> > > > + @staticmethod > > + def extract_verbose_output(output: str) -> list[TestPmdVerboseOutput]: > > + """Extract the verbose information present in given testpmd output. > > + > > + This method extracts sections of verbose output that begin with > > the line > > + "port X/queue Y: sent/received Z packets" and end with the > > ol_flags of a packet. > > + > > + Args: > > + output: Testpmd output that contains verbose information > > + > > + Returns: > > + List of parsed packet information gathered from verbose > > information in `output`. > > + """ > > + iter = re.finditer(r"(port \d+/queue \d+:.*?(?=port \d+/queue > > \d+|$))", output, re.S) > > How about using a regex that matches what you described? ;) Keeping re.S: > > (port \d+/queue \d+.+?ol_flags: [\w ]+) > > Would spare you from using complex lookahead constructs and 4.6x less > steps. Maybe it doesn't work with every scenario? Looks like it works > well with the sample output I have. Let me know if it works for you. > I tried using something like this actually which is probably why the docstring reflects that type of language, but I didn't use it because it doesn't match one specific case. I'm not sure how common it is, and I haven't seen it happen in my recent testing, but since the verbose output specifically states that it sent/received X number of packets, I presume there is a case where that number will be more than 1, and there will be more than one set of packet information after that first line. I think I've seen it happen in the past, but I couldn't recreate it in testing. For context to this next section, if it wasn't clear, I consider the `port \d+/queue \d+` line to be the header line and the start of a "block" of verbose output. Basically though the problem with this is that if there are more than one packet under that header line, the lazy approach will only consume up until the ol_flags of the first packet of a block, and the greedy approach consumes everything until the last packet of the entire output. You could use the lazy approach with the next port/queue line as your delimiter, but then the opening line of the next block of output is included in the previous block's group. The only way I could find to get around this and go with the idea of "take everything from the start of this group until another group starts" but without capturing the opening of the next block was a look ahead. Again though, I definitely don't love the regex that I wrote and would love to find a better alternative. > Best, > Luca >