Shane Konings <shane.koni...@gmail.com> writes: ...
> The following is a sample of the data. There are hundreds of lines
> that need to have an automated process of splitting the strings into
> headings to be imported into excel with these headings
> ID Address StreetNum StreetName SufType Dir City Province PostalCode
>
>
> 1 1067 Niagara Stone Rd, W, Niagara-On-The-Lake, ON L0S 1J0
> 2 4260 Mountainview Rd, Lincoln, ON L0R 1B2
> 3 25 Hunter Rd, Grimsby, E, ON L3M 4A3
> 4 1091 Hutchinson Rd, Haldimand, ON N0A 1K0
> 5 5172 Green Lane Rd, Lincoln, ON L0R 1B3
> 6 500 Glenridge Ave, East, St. Catharines, ON L2S 3A1
> 7 471 Foss Rd, Pelham, ON L0S 1C0
> 8 758 Niagara Stone Rd, Niagara-On-The-Lake, ON L0S 1J0
> 9 3836 Main St, North, Lincoln, ON L0R 1S0
> 10 1025 York Rd, W, Niagara-On-The-Lake, ON L0S 1P0

The input doesn't look consistent to me. Is Dir supposed to be an
optional value? If that is the only optional field, it can be worked
around. But if the missing direction is (I'm guessing) due to
malformed input data, you have a hell of a job in front of you.

What do you want to do with incomplete or malformed data? Try to
parse it as a "best effort", or simply spew out an error message for
an operator to look at?

In the latter case, I suggest a stepwise approach:

* Split input by ',' -> res0
* Split the first result by ' ' -> res
  -> Id = res[0]
  -> Address = res[1:]
  -> StreetNum = res[1]
  -> StreetName = res[2:-1]
  -> SufType = res[-1]
* Check if res0[1] looks like a cardinal direction.
  If so, Dir = res0[1].
  Otherwise, croak or use the default direction. Insert an element
  in the list, so the remainder is shifted to match the following
  steps.
  -> City = res0[2]
* Split res0[3] by ' ' -> respp
  respp[0] -> Province
  respp[1:] -> Postcode

And put in some basic sanity checks on the resulting values before
committing them as a parsed result. Provinces and postal codes should
be easy enough to validate against a fixed list.

--
/Wegge

Looking for redundant peering of dk.*,linux.debian.*
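
The stepwise approach above could be sketched in Python roughly as
follows. This is only an illustration under some assumptions: the
function name parse_address, the default_dir parameter, the set of
accepted direction spellings and the postal code pattern are all made
up for this example and do not come from the original post. Dir is
treated as optional, StreetName is taken to exclude the suffix, and
anything else that does not fit raises ValueError for an operator to
look at.

```python
import re

# Assumed spellings of a cardinal direction (hypothetical list).
DIRECTIONS = {"N", "S", "E", "W", "NORTH", "SOUTH", "EAST", "WEST"}
# Canadian province and territory abbreviations.
PROVINCES = {"AB", "BC", "MB", "NB", "NL", "NS", "NT",
             "NU", "ON", "PE", "QC", "SK", "YT"}
# Assumed postal code shape: letter digit letter, space, digit letter digit.
POSTAL_RE = re.compile(r"^[A-Z]\d[A-Z] \d[A-Z]\d$")


def parse_address(line, default_dir=""):
    """Parse one input line into a dict; raise ValueError if it does not fit."""
    # Step 1: split the input by ','.
    res0 = [part.strip() for part in line.split(",")]

    # Step 2: split the first field by whitespace.
    res = res0[0].split()
    if len(res) < 4:
        raise ValueError(f"cannot find id, number, name and suffix: {line!r}")

    # Step 3: if the second field is not a direction, insert the default
    # so that the remaining fields line up with the steps that follow.
    if len(res0) > 1 and res0[1].upper() not in DIRECTIONS:
        res0.insert(1, default_dir)
    if len(res0) != 4:
        raise ValueError(f"unexpected number of fields: {line!r}")

    # Step 4: split the last field into province and postal code.
    respp = res0[3].split()
    if not respp or respp[0] not in PROVINCES:
        raise ValueError(f"unknown province: {line!r}")
    postal_code = " ".join(respp[1:])
    if not POSTAL_RE.match(postal_code):
        raise ValueError(f"bad postal code: {line!r}")

    return {
        "ID": res[0],
        "Address": " ".join(res[1:]),
        "StreetNum": res[1],
        "StreetName": " ".join(res[2:-1]),
        "SufType": res[-1],
        "Dir": res0[1],
        "City": res0[2],
        "Province": respp[0],
        "PostalCode": postal_code,
    }
```

With the sample data, lines 1 and 2 parse with and without a
direction, while line 3 (where the direction follows the city) ends
up with five fields and is rejected with ValueError instead of being
guessed at.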