On Tue, Feb 15, 2022 at 11:51:41PM +0900, Stephen J. Turnbull wrote:
> scanf just isn't powerful enough. For example, consider parsing user
> input dates: scanf("%d/%d/%d", &year, &month, &day). This is nice and
> simple, but handling "2022-02-15" as well requires a bit of thinking
> and several extra statements in C. In Python, I guess it would
> probably look something like
>
>     year, sep1, month, sep2, day = scanf("%d%c%d%c%d")
>     if not ('/' == sep1 == sep2 or '-' == sep1 == sep2):
>         raise DateFormatUnacceptableError
>     # range checks for month and day go here
Assuming that scanf raises if there is no match, I would probably go
with:
    try:
        # Who writes ISO-8601 dates using slashes?
        day, month, year = scanf("%d/%d/%d")
        if ALLOW_TWO_DIGIT_YEARS and len(year) == 2:
            year = "20" + year
    except ScanError:
        year, month, day = scanf("%d-%d-%d")
> which isn't too bad, though. But
>
>     year, month, day = re.match(r"(\d+)[-/](\d+)[-/](\d+)").groups()
>     if not sep1 == sep2:
>         raise DateFormatUnacceptableError
>     # range checks for month and day go here
Doesn't that raise an exception?
    NameError: name 'sep1' is not defined
I think that
    year, sep1, month, sep2, day = re.match(r"(\d+)([-/])(\d+)([-/])(\d+)").groups()
might do it (until Tim or Chris tell me that actually is wrong).
Or use \2 as you suggest later on.
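
A rough sketch of the \2 variant, for illustration only. It assumes the
input string is passed in explicitly (the snippets above leave it out),
and both parse_date and the DateFormatUnacceptableError class are
hypothetical names, not anything that exists in the stdlib:

```python
import re


class DateFormatUnacceptableError(ValueError):
    """Hypothetical exception, named after the one in the quoted snippet."""


# Group 2 captures the first separator; \2 then requires the second
# separator to be the very same character, so no sep1 == sep2 check is needed.
DATE_RE = re.compile(r"(\d+)([-/])(\d+)\2(\d+)")


def parse_date(text):
    """Return (year, month, day) as ints, or raise on a malformed date."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        raise DateFormatUnacceptableError(text)
    year, _sep, month, day = m.groups()
    # range checks for month and day would go here
    return int(year), int(month), int(day)
```

With this, "2022-02-15" and "2022/02/15" both parse, while a mixed
"2022-02/15" is rejected by the pattern itself.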
> expresses the intent a lot more clearly, I think.
Noooo, I don't think it does. The scanf (hypothetical) solution is a lot
closer to my intent.
But yes, regexes are more powerful: you can implement scanf using
regexes, but you can't implement regexes using scanf.
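
To make the "scanf using regexes" direction concrete, here is a minimal
sketch. It is an assumption-laden toy, not a proposal: this scanf takes
the text as a second argument (the hypothetical one above does not), it
only knows %d, %c, %s and %%, it requires the whole string to match, and
unlike C's scanf it does no whitespace skipping. ScanError is the
hypothetical exception from the try/except example.

```python
import re


class ScanError(ValueError):
    """Hypothetical exception raised when the text does not match the format."""


# Each supported conversion maps to (regex fragment, converter).
_CONVERSIONS = {
    "d": (r"([-+]?\d+)", int),
    "c": (r"(.)", str),
    "s": (r"(\S+)", str),
}


def scanf(fmt, text):
    """Translate a tiny subset of scanf formats into a regex and apply it."""
    parts, casts = [], []
    # Split the format into literal runs and %-conversions.
    for literal, spec in re.findall(r"([^%]+)|%(.?)", fmt):
        if literal:
            parts.append(re.escape(literal))
        elif spec == "%":
            parts.append("%")  # "%%" matches a literal percent sign
        elif spec in _CONVERSIONS:
            regex, cast = _CONVERSIONS[spec]
            parts.append(regex)
            casts.append(cast)
        else:
            raise ValueError(f"unsupported conversion: %{spec}")
    m = re.fullmatch("".join(parts), text)
    if m is None:
        raise ScanError(f"{text!r} does not match {fmt!r}")
    return tuple(cast(group) for cast, group in zip(casts, m.groups()))
```

Under those assumptions, scanf("%d/%d/%d", "2022/02/15") returns
(2022, 2, 15), and scanf("%d-%d-%d", "2022/02/15") raises ScanError, so
the try/except fallback shown earlier works as written once the text
argument is added.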
--
Steve