2008/12/8 Robocop <[EMAIL PROTECTED]>:
> I'm having a little text parsing problem that i think would be really
> quick to troubleshoot for someone more versed in python and Regexes.
> I need to write a simple script that parses some arbitrarily long
> string every 50 characters, and does not parse text in the middle of
> words (but ultimately every parsed string should be 50 characters,
> ...

Hi,
I'm not sure I understand the task completely, but maybe one of the
variants below using re will help (depending on what should be done
further with the resulting text segments). In the first two variants the
resulting lines are 50 characters long, plus 1 for the "\n"; a width of
49 could be used instead if needed.

import re

input_txt = """I'm having a little text parsing problem that i think would be really
quick to troubleshoot for someone more versed in python and Regexes.
I need to write a simple script that parses some arbitrarily long
string every 50 characters, and does not parse text in the middle of
words (but ultimately every parsed string should be 50 characters,
so adding in white spaces is necessary). So i immediately came up with
something along the lines of:"""

# Variant 1: re.sub using a function
# print(re.sub(r"(?s)(.{1,50}\b)", lambda m: m.group().ljust(50) + "\n", input_txt))

# Variant 2: adjusting the matches via finditer
# for m in re.finditer(r"(?s)(.{1,50}\b)", input_txt):
#     print(m.group().ljust(50))

# Variant 3: adjusting the matched parts in findall
print([chunk.ljust(50) for chunk in re.findall(r"(?s)(.{1,50}\b)", input_txt)])

hth,
  vbr
-- 
http://mail.python.org/mailman/listinfo/python-list
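
P.S. For anyone on Python 3, here is a minimal sketch of the same findall
idea wrapped in a function. The helper name split_padded and the sample
string are assumptions made for illustration; they are not part of the
original question. Two caveats: anything after the last word boundary
(such as trailing punctuation) is not matched, and a word longer than the
width loses its leading characters, so treat this as a starting point only.

```python
import re


def split_padded(text, width=50):
    """Split text into chunks of at most `width` characters that end on a
    word boundary, then pad each chunk with spaces to exactly `width`.

    Hypothetical helper, for illustration only.
    """
    # (?s) lets "." match newlines as well; the greedy quantifier
    # backtracks until the chunk ends at a word boundary (\b).
    pattern = r"(?s).{1,%d}\b" % width
    return [chunk.ljust(width) for chunk in re.findall(pattern, text)]


# Illustrative input (an assumption, shortened from the question text).
sample = ("I need to write a simple script that parses some "
          "arbitrarily long string every 50 characters")

chunks = split_padded(sample, 50)
for chunk in chunks:
    # repr() makes the padding spaces visible.
    print(repr(chunk))
```

Every returned chunk is exactly `width` characters long, and no word is
split across two chunks, because each match has to end at a \b.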