409220000003 Life Fitness Products $1 (12-13-08) (CVS)
546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
each
518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
Cinnamon Rolls, etc. .40 (2-14-09)
into something like this:
"409220000003","Life Fitness Products $1","12-13-08"
"546500181141","Oust Air Sanitizer, any B1G1F up to $3.49","1-17-09"
"518000159258","Pillsbury Crescent Dinner Rolls, any .25","2-14-09"
"518000550406","Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
Cinnamon Rolls, etc. .40","2-14-09"
Any help, pseudocode, or any other push in the right direction would
be most appreciated. I am a novice Python programmer but I do have a
good bit of PHP programming experience.
A regexp should be able to split this fairly neatly:
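import re

# Groups: leading digits (the UPC), the description, and a date in parentheses.
# Anything after the date, such as "(CVS)", is ignored.
r = re.compile(r"^(\d+)\s+(.*)\((\d{1,2}-\d{1,2}-\d{2,4})\).*")

with open('in.txt') as infile, open('out.csv', 'w') as out:
    for i, line in enumerate(infile):
        m = r.match(line)
        if not m:
            print("Line %i is malformed" % (i + 1))
            continue
        # Quote every field, doubling any embedded double quotes.
        out.write(','.join(
            '"%s"' % item.strip().replace('"', '""')
            for item in m.groups()
            ))
        out.write('\n')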
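
One caveat with reading line by line: in the sample, two records wrap onto a second line (the trailing "each" and the "Cinnamon Rolls, etc." line). If that wrapping is in the real input and not just an artifact of the email, those lines would be reported as malformed. Here is a rough sketch of one way around it, gluing continuation lines back onto their record before matching. It assumes every record starts with a 12-digit code, as in the sample; the names join_wrapped, convert, RECORD_START and RECORD are made up for illustration, and the csv module is swapped in for the hand-rolled quoting.

```python
import csv
import io
import re

# Assumption from the sample data: a new record starts with a 12-digit code.
RECORD_START = re.compile(r"^\d{12}\s")
# Code, description, and a date in parentheses; trailing text is ignored.
RECORD = re.compile(r"^(\d+)\s+(.*)\((\d{1,2}-\d{1,2}-\d{2,4})\)")

def join_wrapped(lines):
    """Yield one string per record, appending continuation lines to it."""
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line):
            if current is not None:
                yield current
            current = line
        elif current is not None:
            # Continuation line: joined with a single space.
            current += " " + line
    if current is not None:
        yield current

def convert(lines, out):
    """Write each well-formed record to `out` as a fully quoted CSV row."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for record in join_wrapped(lines):
        m = RECORD.match(record)
        if m:
            writer.writerow([item.strip() for item in m.groups()])
        else:
            print("Malformed record: %r" % record)

sample = """\
409220000003 Life Fitness Products $1 (12-13-08) (CVS)
546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
each
518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
Cinnamon Rolls, etc. .40 (2-14-09)
"""
buf = io.StringIO()
convert(sample.splitlines(), buf)
print(buf.getvalue())
```

Note that this joins a wrapped description onto one line, whereas the desired output in the question keeps the line break inside the quoted field. For a real file, pass an open input file and an output file opened with newline='' in place of the StringIO buffer.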
-tkc
--
http://mail.python.org/mailman/listinfo/python-list