On Mar 8, 4:20 pm, [EMAIL PROTECTED] wrote:
> I have been searching all over for a solution to this. I am new to
> Python, so I'm a little lost. Any pointers would be a great help. I
> have a couple hundred emails that contain data I would like to
> incorporate into a database or CSV file. I want to search the email
> for specific text.
>
> The emails basically look like this:
>
> random text _important text:_15648 random text random text random text
> random text
> random text random text random text _important text:_15493 random text
> random text
> random text random text _important text:_11674 random text random text
> random text
> ===============Date: Wednesday March 5, 2008================
> name1: 15 name5: 14
>
> name2: 18 name6: 105
>
> name3: 64 name7: 2
>
> name4: 24 name8: 13
>
> I want information like "name1: 15" to be placed into the CSV with the
> name "name1" and the value "15". The same goes for the date and
> "_important text:_15493".
>
> I would like to use this CSV or database to plot a graph with the
> data.
>
> Thanks!

This kind of work can be done using pyparsing. Here is a starting
point for you:

    from pyparsing import Word, oneOf, nums, Combine
    import calendar

    text = """
    random text _important text:_15648 random text random text random text
    random text
    random text random text random text _important text:_15493 random text
    random text
    random text random text _important text:_11674 random text random text
    random text
    ===============Date: Wednesday March 5, 2008================
    name1: 15 name5: 14
    name2: 18 name6: 105
    name3: 64 name7: 2
    name4: 24 name8: 13
    """

    # a run of digits; reused for every numeric field
    integer = Word(nums)

    # "_important text:_" followed by its number
    IMPORTANT_TEXT = "_important text:_" + integer("value")

    # the date line, e.g. "Wednesday March 5, 2008", wrapped in "=" runs
    monthName = oneOf( list(calendar.month_name) )
    dayName = oneOf( list(calendar.day_name) )
    date = dayName("dayOfWeek") + monthName("month") + integer("day") + \
        "," + integer("year")
    DATE = Word("=").suppress() + "Date:" + date("date") + Word("=").suppress()

    # "nameN: value" pairs
    NAMEDATA = Combine("name" + integer)("name") + ':' + integer("value")

    # scan the text for any of the three patterns and show what was captured
    for match in (IMPORTANT_TEXT | DATE | NAMEDATA).searchString(text):
        print(match.dump())

Prints:

    ['_important text:_', '15648']
    - value: 15648
    ['_important text:_', '15493']
    - value: 15493
    ['_important text:_', '11674']
    - value: 11674
    ['Date:', 'Wednesday', 'March', '5', ',', '2008']
    - date: ['Wednesday', 'March', '5', ',', '2008']
      - day: 5
      - dayOfWeek: Wednesday
      - month: March
      - year: 2008
    - day: 5
    - dayOfWeek: Wednesday
    - month: March
    - year: 2008
    ['name1', ':', '15']
    - name: name1
    - value: 15
    ['name5', ':', '14']
    - name: name5
    - value: 14
    ['name2', ':', '18']
    - name: name2
    - value: 18
    ['name6', ':', '105']
    - name: name6
    - value: 105
    ['name3', ':', '64']
    - name: name3
    - value: 64
    ['name7', ':', '2']
    - name: name7
    - value: 2
    ['name4', ':', '24']
    - name: name4
    - value: 24
    ['name8', ':', '13']
    - name: name8
    - value: 13

Find out more about pyparsing at http://pyparsing.wikispaces.com.

-- Paul

--
http://mail.python.org/mailman/listinfo/python-list
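
For the CSV half of the question, here is a rough sketch of one way to get from an email body to rows in a file. To keep it free of third-party dependencies it swaps the pyparsing grammar for plain `re` patterns that only approximate the same three expressions; the helper names (`extract_rows`, `rows_to_csv`), the column layout, and the shortened sample text are all assumptions made for illustration, not anything from the original poster's data.

```python
import csv
import io
import re

# Shortened stand-in for one email body (assumed layout, taken from the example).
SAMPLE = """\
random text _important text:_15648 random text random text random text
random text
random text random text random text _important text:_15493 random text
random text
===============Date: Wednesday March 5, 2008================
name1: 15 name5: 14

name2: 18 name6: 105
"""

# Hypothetical regexes approximating the three patterns being searched for.
IMPORTANT_RE = re.compile(r"_important text:_(\d+)")
DATE_RE = re.compile(r"=+Date:\s*(\w+) (\w+) (\d+), (\d+)=+")
NAME_RE = re.compile(r"\b(name\d+):\s*(\d+)")


def extract_rows(text):
    """Return a list of (date, field, value) tuples for one email body."""
    date_match = DATE_RE.search(text)
    # Groups 2-4 are month, day, year; an email without a date line gets "".
    date = "%s %s, %s" % date_match.group(2, 3, 4) if date_match else ""
    rows = [(date, "important text", value)
            for value in IMPORTANT_RE.findall(text)]
    rows += [(date, name, value) for name, value in NAME_RE.findall(text)]
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "field", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE)))
```

Calling `extract_rows` once per email and concatenating the results should give a single table that a spreadsheet or plotting tool can read. Storing the date on every row is just one possible layout; a separate column per name would work as well.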