Why don't you use re.findall?
re.findall(r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b', txt)
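As a rough sketch of how that call behaves, here it is run against some made-up text. The sample string and the variable names `pattern` and `matches` are my own assumptions for illustration; only the pattern itself comes from the line above.

```python
import re

# Hypothetical sample text, invented for illustration only.
txt = "Codes 12-34-56 and 1234567-89-01 appear here, but 1-23-45 and 12-345-67 do not."

# Pattern from the post: 2 to 7 digits, then two 2-digit groups,
# separated by hyphens and bounded by word boundaries.
pattern = r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b'

# findall returns every non-overlapping match as a list of strings.
matches = re.findall(pattern, txt)
print(matches)  # ['12-34-56', '1234567-89-01']
```

Note that anything shaped like digits-2-2 matches, so an ISO-style date such as 2024-05-17 would be picked up as well.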
I think I can see what you did there but it won't make sense to me - or
whoever looks at the code - in future.
That answers your specific question. However, I am in awe of people who
can just "do" regular expressions and I thank you very much for what
would have been a monumental effort had I tried it.
I feel the same way about regex. If I can find a way to write something
without regex, I much prefer to, as regex usually adds complexity and
hurts readability.
You might find https://regex101.com/ useful for testing your regex. You
can enter sample data and see whether it matches.
If I understood what your regex was trying to do, I might be able to
suggest some Python to do the same thing. Is it just removing numbers
from text?
The for loop, "for bit in bits" etc, could be written as a list
comprehension.
pieces = [bit if len(bit) > 6 else "" for bit in bits]
For devs familiar with other languages but new to Python, this will look
like gibberish, so arguably the original for loop is clearer, depending
on your team.
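For comparison, here is a sketch of the two forms side by side. The original loop isn't shown in this message, so the loop body below is my guess at an equivalent, and the contents of `bits` and the name `pieces_loop` are invented for the example.

```python
# Hypothetical input; the thread does not show what `bits` contains.
bits = ["short", "1234567", "abc", "longer-piece"]

# Explicit loop (an assumed reconstruction, not the original code):
# keep a bit if it is longer than 6 characters, else an empty string.
pieces_loop = []
for bit in bits:
    if len(bit) > 6:
        pieces_loop.append(bit)
    else:
        pieces_loop.append("")

# The list comprehension from the post, which builds the same list.
pieces = [bit if len(bit) > 6 else "" for bit in bits]

print(pieces)  # ['', '1234567', '', 'longer-piece']
```

Both versions keep one output element per input element, which is why the conditional sits before the "for" rather than after it as a filter.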
It's worth making the effort to get into list comprehensions though
because they're awesome.
That little re.sub() came from ChatGPT and I can understand it without
too much effort because it came documented.
I suppose ChatGPT is the answer to this thread. Or everything. Or will be.
I am doubtful. We'll see!
R
--
https://mail.python.org/mailman/listinfo/python-list