+
+    """============ BEGIN PARSER FUNCTIONS ============"""
+
+    @staticmethod
+    def compose(f: Callable, parser_fn: ParserFn) -> ParserFn:
+        """Makes a composite parser function.
+
+        The parser function is run and if a non-None value was returned, f is called with it.

Let's use `parser_fn` instead of "The parser function". Let's also put f into backticks. It's also peculiar that we're passing the functions in the reverse order - first the second one is applied and then the first one is applied.


Ack.

compose(f, g) is equivalent of f(g(..)). It follows a syntactical order, rather than a calling one. I don't have a preference on ordering either way, so anything is fine by me.


I'd prefer to have the calling order. You could also include that the resulting function is f(g()).

+        Otherwise the function returns early with None.
+
+        Metadata modifier for :func:`dataclasses.field`.

This sentence is all alone here and I don't understand what it's saying without more context.

Similarly to what was previously done with the Params class, this is just to signify that this function is modifies the metadata argument of dataclasses.field. It's just a note. I guess this could actually be in the Returns section, and it'd be clearer that way.


Ok, let's see how that reads.


+    @staticmethod
+    def find(
+        pattern: str | re.Pattern[str],
+        flags: re.RegexFlag = re.RegexFlag(0),
+        named: bool = False,
+    ) -> ParserFn:
+        """Makes a parser function that finds a regular expression match in the text.
+
+        If the pattern has capturing groups, it returns None if no match was found. If the pattern +        has only one capturing group and a match was found, its value is returned. If the pattern +        has no capturing groups then either True or False is returned if the pattern had a match or
+        not.
+

This description would be a better fit in a Returns: section. The body could explain the various scenarios in which the function can be used. At least I imagine there are various scenarios based on the different things the function returns, but I don't really want to go around digging in the code to verify that. :-)

Returns is about what the `find` returns, but it's not what it actually returns. `find` returns a dictionary compatible with `dataclasses.field`'s metadata argument, which contains the actual _find function. Hence why I wrote that segment in the body. I think it would be confusing to put this description in the returns. I agree that this should be clarified though.


They way it's phrased it looks like you're describing what find() returns, but I get that it returns a function that returns what you described. It could be easily clarified in the returns section though:

Returns:
    A function that returns <the whole description>.

+    @classmethod

Is there a reason why find_int() is a classmethod while the rest are staticmethods? Looks like it could also be a staticmethod.

Class methods are pretty much static methods that receive a reference to the class. In this case it's a class method as we are calling another class method: cls.find(..). The @staticmethod equivalent would be explicitly specifying the class and the method we want to use.


I was wondering about this because it doesn't need to be a classmethod, as we're calling a static method from within and as you mention, we can just specify the class with the method (i.e. TextParser.find()), so my thoughts were why is find() deviating from the rest of the methods when it doesn't need to.

This doesn't really matter though, it's going to be called the same from outside the class, so I'll leave this up to you. :-)

+    def find_int(
+        cls,
+        pattern: str | re.Pattern[str],
+        flags: re.RegexFlag = re.RegexFlag(0),
+        int_base: int = 0,
+    ) -> ParserFn:
+        """Makes a parser function that converts the match of :meth:`~find` to int.
+
+        This function is compatible only with a pattern containing one capturing group.
+
+        Metadata modifier for :func:`dataclasses.field`.
+
+        Args:
+            pattern: the regular expression pattern
+            flags: the regular expression flags
+            int_base: the base of the number to convert from
+        Raises:
+            RuntimeError: if the pattern does not have exactly one capturing group
+        """
+        if isinstance(pattern, str):
+            pattern = re.compile(pattern, flags)
+
+        if pattern.groups != 1:
+            raise RuntimeError("only one capturing group is allowed with this parser function")
+

Have you considered using a subclass of DTSError so that we can add a severity to this error? The severity is there to "rank" DTS errors from worst to least worst.

No, I actually didn't think of it. This error should only occur out of development error, and it should never be triggered in practice. Do you think it should be used?


I err on the side of using DTSErrors, as it gives automated tools a possibly useful way of telling the severity of the error (from the return code). A development error like this could be easily identifiable in CI then and we could maybe produce a helpful message based on that.

But maybe we can talk about this Patrick and his team (how usable the return code is and what the severities should be).

+        return cls.compose(partial(int, base=int_base), cls.find(pattern))
+
+    """============ END PARSER FUNCTIONS ============"""
+
+    @classmethod
+    def parse(cls, text: str) -> Self:

Would converting this into __init__(self, text: str) work? Sounds like we could just use "for field in fields(self)" and then setattr() to populate the variables.

I am not in favour of overriding the constructor, as I see the parsing ability as an extension to the class.

Well, the class should primarily do the parsing (based on its name), so everything else is an extension in my mind.

Nonetheless, this would make sense if we think that this would be used exclusively for objects that need to be parsed in order to be constructed. I purposefully left the flexibility open, but if we don't think it'll ever be needed, then it's not a game changer to me.


Everything about the class indicates (or outright says) that parsing should be the only purpose. I can't really imagine what else we could add (in this state, an instance before calling parse() would be just the text with fields containing function in the metadata needed for parsing - what else is there to do with this data?). Can you elaborate if you can think of something?

And again, that said, it doesn't matter much, but I like the constructor as it really doesn't look like the class could (or should) do anything else than parsing text.

Reply via email to