On Tue, 21 Jan 2014 16:06:56 -0800, Shane Konings wrote:
> The following is a sample of the data. There are hundreds of lines that
> need to have an automated process of splitting the strings into headings
> to be imported into Excel with these headings
>
> ID Address StreetNum StreetName SufType Dir City Province
> PostalCode

OK, the following general method seems to work:

First, use a regex to capture two numeric groups and the rest of the line, separated by whitespace. If you can't find all three fields, you have an unexpected data format.

    re.search( r"(\d+)\s+(\d+)\s+(.*)", data )

Second, split the rest of the line on a regex of comma + 0 or more whitespace.

    re.split( r",\s*", data )

Check that the rest of the line has 3 or 4 bits; otherwise you have an unexpected lack or excess of data fields.

Split the first bit of the rest of the line into street name and suffix/type. If you can't split it, use it as the street name and set the suffix/type to blank.

    re.search( r"(.*)\s+(\w+)", data )

If there are 3 bits in the rest of the line, set direction to blank; otherwise set direction to the second bit.

Set the city to the last but one bit of the rest of the line.

Capture one word followed by two words in the last bit of the rest of the line, and use these as the province and postcode.

    re.search( r"(\w+)\s+(\w+\s+\w+)", data )

Provided none of the searches or the split failed, you should now have the data fields you need to write. The easiest way to write them might be to assemble them as a list and use the csv module.

I'm assuming you're capable of working out from the help on the python re module what to use for each data, and how to access the captured results of a search and the results of a split.

I'm also assuming you're capable of working out how to use the csv module from the documentation.

If you're not, then either go back and ask your lecturer for help, or tell your boss to hire a real programmer for his quick and easy coding jobs.
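For reference, the steps above can be put together roughly as follows. This is a minimal sketch and has not been run against the real data. The sample lines, the names parse_line and write_rows, and the choice to raise ValueError on bad input are all assumptions made for illustration. The sketch also leaves out the "Address" column, since the method above does not say how that field is derived.

```python
import csv
import re
import sys

# Column headings taken from the question, minus "Address" (not derived here).
HEADINGS = ["ID", "StreetNum", "StreetName", "SufType", "Dir",
            "City", "Province", "PostalCode"]

# Hypothetical sample lines; the real data may be laid out differently.
SAMPLE = [
    "1 1067 Niagara Stone Rd, W, Niagara-On-The-Lake, ON L0S 1J0",
    "2 4260 Mountainview Rd, Lincoln, ON L0R 1B2",
]


def parse_line(line):
    """Return the fields of one record as a list, in HEADINGS order."""
    # Step 1: two numeric groups, then the rest of the line.
    m = re.search(r"(\d+)\s+(\d+)\s+(.*)", line)
    if m is None:
        raise ValueError("unexpected data format: %r" % line)
    rec_id, street_num, rest = m.groups()

    # Step 2: split the rest on comma + optional whitespace.
    bits = re.split(r",\s*", rest.strip())
    if len(bits) not in (3, 4):
        raise ValueError("expected 3 or 4 fields, got %d: %r" % (len(bits), line))

    # Step 3: street name and suffix/type; blank suffix if it won't split.
    m = re.search(r"(.*)\s+(\w+)", bits[0])
    if m:
        street_name, suf_type = m.groups()
    else:
        street_name, suf_type = bits[0], ""

    # Step 4: direction is only present when there are 4 bits.
    direction = bits[1] if len(bits) == 4 else ""

    # Step 5: city is the last but one bit.
    city = bits[-2]

    # Step 6: one word (province) followed by two words (postcode).
    m = re.search(r"(\w+)\s+(\w+\s+\w+)", bits[-1])
    if m is None:
        raise ValueError("cannot find province and postcode: %r" % line)
    province, postal_code = m.groups()

    return [rec_id, street_num, street_name, suf_type,
            direction, city, province, postal_code]


def write_rows(lines, out):
    """Write a heading row plus one parsed row per input line as CSV."""
    writer = csv.writer(out)
    writer.writerow(HEADINGS)
    for line in lines:
        writer.writerow(parse_line(line))


if __name__ == "__main__":
    write_rows(SAMPLE, sys.stdout)
```

Raising on a malformed line is only one option; for hundreds of lines it may be more practical to catch the ValueError, log the offending line, and carry on with the rest.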
-- 
Denis McMahon, denismfmcma...@gmail.com