Satyajit Sarangi wrote:
>
> data = "GEOMETRYCOLLECTION (POINT (-8.9648437500000000
> -4.1308593750000000), POINT (2.0214843750000000 -2.6367187500000000),
> POINT (-1.4062500000000000 -11.1621093750000000), POINT
> (-11.9531250000000000,-10.8984375000000000), POLYGON
> ((-21.6210937500000000 1.8457031250000000,2.4609375000000000
> 2.1972656250000000, -18.9843750000000000 -3.6914062500000000,
> -22.6757812500000000 -3.3398437500000000, -22.1484375000000000
> -2.6367187500000000, -21.6210937500000000
> 1.8457031250000000)),LINESTRING (-11.9531250000000000
> 11.3378906250000000, 7.7343750000000000 11.5136718750000000,
> 12.3046875000000000 2.5488281250000000, 12.2167968750000000
> 1.6699218750000000, 14.5019531250000000 3.9550781250000000))"
>
> This is my string .
> How do I traverse through it and form 3 dicts of Point , Polygon and
> Linestring containing the co-ordinates ?
Except for those space-separated number pairs, it could be a job for
some well-crafted classes (e.g. `class GEOMETRYCOLLECTION ...`,
`class POINT ...`) and eval.

My approach would be to use a loop with regexes to recognize the
leading element and pick out its arguments, then use the string split
and strip methods beyond that point. Like (untested):

    import re

    # The trailing comma is optional so that the last element,
    # which has nothing after it, still matches.
    recognizer = re.compile(
        r'\s*(POINT|POLYGON|LINESTRING)\s*\(+(.*?)\)+\s*,?(.*)', re.DOTALL)

    # regex is not good with nested brackets,
    # so kill off outer nested brackets..
    s1 = 'GEOMETRYCOLLECTION ('
    if data.startswith(s1):
        data = data[len(s1):-1]

    while data:
        match = recognizer.match(data)
        if not match:
            break   # nothing usable in data
        ## now the matched groups will be:
        ## 1: the keyword
        ## 2: the arguments inside the smallest bracketed sequence
        ## 3: the rest of data
        ## so use str.split and str.strip to pull out the individual arguments,
        ## and lastly
        data = match.group(3)

This is all from memory. I might have got some details wrong in
recognizer.

        Mel.

-- 
http://mail.python.org/mailman/listinfo/python-list
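For what it's worth, here is a minimal sketch of how the regex idea above could be carried through to the grouped coordinates the question asks for. It is one possible reading, not a definitive implementation. The helper name `parse_geometries`, the shortened `sample` string, and the output layout (a single dict with one key per geometry type, instead of three separate dicts) are all assumptions made for illustration. It also pulls the numbers out with `re.findall` rather than `str.split`/`str.strip`, because one POINT in the posted data separates its two numbers with a comma instead of a space.

```python
import re

# One geometry: a keyword, then the text inside its brackets.
# [^()]* cannot cross a bracket, so this sketch assumes each POLYGON
# has a single ring (true for the data in the question).
GEOMETRY = re.compile(r'(POINT|POLYGON|LINESTRING)\s*\(+([^()]*)\)+')
# Plain decimal numbers only; exponent notation is not handled.
NUMBER = re.compile(r'-?\d+(?:\.\d+)?')


def parse_geometries(wkt):
    """Group (x, y) pairs by geometry keyword (hypothetical helper)."""
    result = {'POINT': [], 'POLYGON': [], 'LINESTRING': []}
    for keyword, args in GEOMETRY.findall(wkt):
        numbers = [float(n) for n in NUMBER.findall(args)]
        # Pair consecutive numbers as (x, y), whatever separated them.
        coords = list(zip(numbers[0::2], numbers[1::2]))
        result[keyword].append(coords)
    return result


# Shortened, made-up sample in the same shape as the posted string.
sample = ("GEOMETRYCOLLECTION (POINT (-8.96 -4.13), POINT (-11.95,-10.89), "
          "POLYGON ((-21.6 1.8, 2.4 2.1, -21.6 1.8)), "
          "LINESTRING (-11.9 11.3, 7.7 11.5))")
shapes = parse_geometries(sample)
print(shapes['POINT'])
```

Each key maps to a list with one entry per geometry found, and each entry is a list of `(x, y)` tuples, so `shapes['POINT']`, `shapes['POLYGON']` and `shapes['LINESTRING']` can be used as the three collections. Using `findall` over the whole string also removes the need to strip the outer `GEOMETRYCOLLECTION (` wrapper first.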