On Thursday, August 15, 2002, at 07:27, Paul Tremblay wrote:

> I am writing a script to convert RTF to XML, and my output looks like
> this: (I explain this ugliness below)
>
> <id1listlevel1>
> text
> <id1listlevel2>
> text
> </id1listlevel2>
> </id1listlevel1>
> text
> <id1listlevel1>
> text
> <id1listlevel2>
> text
> </id1listlevel2>
> </id1listlevel1>
>
> I know that is ugly to read, but I'm just pointing out that the tags
> repeat themselves. It should look like this:
>
> <id1listlevel1>
> text
> <id1listlevel2>
> text
> text
> text
> text
> </id1listlevel2>
> </id1listlevel1>

Actually, let's start with the first part of the problem I see here by
redefining that first list so that it is more obvious:

<dict>
  <id1listlevel1>
    text_1
    <id1listlevel2> text_2 </id1listlevel2>
  </id1listlevel1>
  text_3
  <id1listlevel1>
    text_4
    <id1listlevel2> text_5 </id1listlevel2>
  </id1listlevel1>
</dict>

Here you will notice that 'text_3' is "outside" of any "tag" structure
you offered and is only a part of the 'dictionary' itself. So your
'should look like' structure may be what you want, but it is not what
the original showed, although you might have wanted it that way; it
could have been a typo....

{ sometimes white space can be your friend... 8-) }

So let us assume for the moment that what you meant was that it really
did have, say, the structure

<dict>
  <id1listlevel1>
    text_1
    <id1listlevel2> text_2 </id1listlevel2>
  </id1listlevel1>
  <id1listlevel1>
    <id1listlevel2> text_3 </id1listlevel2>
  </id1listlevel1>
  <id1listlevel1>
    text_4
    <id1listlevel2> text_5 </id1listlevel2>
  </id1listlevel1>
</dict>

Then you might get towards what you want... but you still have a
problem with 'text_4' being at level1, not at level2....

So a part of the problem that you have is the simple 'how to do a
look ahead/look behind' problem.... Your "SLURP" idea has the virtue
that you can walk around the whole of the text in some large
@all_our_lines
rather than walk into the text line by line and build up the appropriate
tree structure and then prune the tree to what you want....

ciao
drieux
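
For what it is worth, here is a minimal sketch of the "build the tree, then prune it" route. The thread does not show the script itself, so this is in Python; the names parse, merge_adjacent and render are made up for illustration, and the synthetic 'dict' root simply mirrors the wrapper used in the examples. It assumes the input holds only simple <name> and </name> tags plus text, and that every closing tag is spelled exactly like its opening tag. The "look behind" is nothing more than comparing each element with the sibling that came just before it.

```python
import re

# One token is an opening tag, a closing tag, or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z0-9_]+)>|([^<]+)")


def parse(text):
    """Build a tree of [tag, children] nodes; text children are plain strings."""
    root = ["dict", []]          # synthetic root, an assumption of this sketch
    stack = [root]
    for closing, tag, chunk in TOKEN.findall(text):
        if tag and not closing:
            node = [tag, []]
            stack[-1][1].append(node)
            stack.append(node)
        elif tag:
            if len(stack) == 1 or stack[-1][0] != tag:
                raise ValueError("unexpected closing tag: " + tag)
            stack.pop()
        elif chunk.strip():
            stack[-1][1].append(chunk.strip())
    if len(stack) != 1:
        raise ValueError("unclosed tag: " + stack[-1][0])
    return root


def merge_adjacent(node):
    """Prune the tree: fold each element into the previous sibling when
    both are elements with the same tag (the 'look behind' step)."""
    merged = []
    for child in node[1]:
        prev = merged[-1] if merged else None
        if (isinstance(child, list) and isinstance(prev, list)
                and prev[0] == child[0]):
            prev[1].extend(child[1])
        else:
            merged.append(child)
    node[1] = merged
    # Merging may have placed same-tag children side by side, so recurse
    # only after this level is finished.
    for child in merged:
        if isinstance(child, list):
            merge_adjacent(child)
    return node


def render(node, depth=0):
    """Return the tree as a list of indented lines."""
    pad = "  " * depth
    lines = [pad + "<" + node[0] + ">"]
    for child in node[1]:
        if isinstance(child, list):
            lines.extend(render(child, depth + 1))
        else:
            lines.append(pad + "  " + child)
    lines.append(pad + "</" + node[0] + ">")
    return lines


if __name__ == "__main__":
    sample = (
        "<id1listlevel1> text_1 <id1listlevel2> text_2</id1listlevel2>"
        " </id1listlevel1> <id1listlevel1> <id1listlevel2> text_3"
        "</id1listlevel2> </id1listlevel1> <id1listlevel1> text_4"
        " <id1listlevel2> text_5 </id1listlevel2> </id1listlevel1>"
    )
    print("\n".join(render(merge_adjacent(parse(sample)))))
```

Run on the "assumed" structure, this folds the three level1 blocks into one and joins text_2 and text_3 under a single level2, but text_4 still sits at level1 between two level2 blocks, which is the leftover problem noted above. With a stray text_3 between two level1 blocks, nothing is merged at all, because the previous sibling is text rather than an element.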