Chuck Connors wrote: > Hey guys. I'm working on a little program to help my wife catalog her/ > our coupons. I found a good resource but need help formatting the > text data so that I can import it into a mysql database. Here's the > data format: > > 409220000003 Life Fitness Products $1 (12-13-08) (CVS) > 546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35 > each > 518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09) > 518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini > Cinnamon Rolls, etc. .40 (2-14-09) > > The first value (large number) is the UPC, the next element is the > coupon description, followed by a date in parenthesis. Those are the > only three elements I am concerned with. Can someone help me in > reformatting this: > > 409220000003 Life Fitness Products $1 (12-13-08) (CVS) > 546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35 > each > 518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09) > 518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini > Cinnamon Rolls, etc. .40 (2-14-09) > > into something like this: > > "409220000003","Life Fitness Products $1","12-13-08" > "546500181141","Oust Air Sanitizer, any B1G1F up to $3.49","1-17-09" > "518000159258","Pillsbury Crescent Dinner Rolls, any .25","2-14-09" > "518000550406","Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini > Cinnamon Rolls, etc. .40","2-14-09" > > Any help, pseudo code, or whatever push in the right direction would > be most appreciated. I am a novice Python programmer but I do have a > good bit of PHP programming experience. > > Thanks for your time....
Hi Chuck, Don't be put off, not everyone here is rude and what you are asking isn't so trivial as to justify that flaming RTFM horsecrap. You are moving from php to python - this is cause for celebration! Welcome :-) Anyway I can see two ways of attacking this, you could either write a finite state machine or use regular expressions. My preference would be regular expressions. The following should do what you want although I haven't tested it much and you might want to lookout for irregularities in your source text like linebreaks etc... import re regex = re.compile("([0-9]{3,13}) ([^(]*) \(([^)]*)") for each in regex.findall(source_data): print '"'+each[0]+'","'+each[1]+'","'+each[2]+'"' I don't know if you already know regular expressions from some other language so I won't bother explaining this one other than to say you may note Pythons regex syntax is generally more liberal than many other languages implementations. If this is all new to you and you do need this explaining just let me know and I will happily run you through it. Also, you say your eventual target is a SQL database. If the database is local (or reachable via a network) then you are probably going to be better off using python to connect to it directly using the MySQLdb module... import MySQLdb db = MySQLdb.connect(host="localhost", user="uname", passwd="pwd", db="dbname") cursor = db.cursor() cursor.execute("INSERT YOU SQL QUERY HERE") db.commit () Hope this helps, Roger. -- http://mail.python.org/mailman/listinfo/python-list