On Friday, April 12, 2002, at 12:38, Raghupathy, Ramesh wrote:

> I am sorry I was not clear in my question.
>
>    The word1 and word2 may occur on different lines of the file and may
> occur in different combinations.
>
> for e.g :
>
> (not showing the new lines..)
>
>
> ....word1.....word1.....word1....word2....word1...word2....word2....word2.
> ...
> .....
>
>   In this example I would like to extract the text which is between the
> 3rd word1 and the 1st word2 and also the text between the 4th word1 and
> the 2nd word2.
>
> Thanks.

this looks like you would want an XML DTD and then just use the
XML parser with that DTD as the reference.....

allow me to reconstruct what I think you are asking for

m1:     word1 <lA>
m2:             word1 <lB>
m3:                     word1 <LC> word2
                        <LD>
m4:                     word1 <LE> word2
                        <LF>
                word2
                <LG>
        word2

if we can assume then that the set {lA,lB,LD,LF,LG} will only
contain white space elements - and that we are only interested
in the information nested at LC and LE - you have one set of issues.
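
for the simple case, where only the innermost text (LC and LE above)
matters, something along these lines might do. This is only a sketch in
Python, assuming the delimiters are the literal strings 'word1' and
'word2' and that the whole file fits in memory; the name
innermost_spans is made up for illustration:

```python
import re

# An opening word1, then text that contains neither delimiter, then word2.
# DOTALL lets the text in between run across line breaks.
INNERMOST = re.compile(r"word1((?:(?!word1|word2).)*)word2", re.DOTALL)

def innermost_spans(text):
    """Return the text of every word1...word2 pair that has nothing nested in it."""
    return [m.group(1) for m in INNERMOST.finditer(text)]

sample = "..word1 a word1 b word1 C\nmore word2 d word1 E word2 f word2 g word2"
spans = innermost_spans(sample)
```

with the sample above, spans holds the text after the 3rd word1 and the
text after the 4th word1, which is the pairing asked for in the question.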

It's the moment that you decide that all of the <XX> 'messages' have
to be 'preserved' that things get messy....

using the 'm?' side annotation I put up, message m1 would need to
get out of the data stream

        m1: lA , m2, LG

and we then have to parse out of m2 its information and nested data....
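
if everything has to be preserved, one way to picture it is a stack that
opens a new message at each word1 and closes it at the matching word2.
Again only a sketch in Python, assuming literal 'word1'/'word2'
delimiters that are properly balanced; parse_nested and its list-of-lists
result are my own invention, not anything from a library:

```python
import re

def parse_nested(text, open_tok="word1", close_tok="word2"):
    """Build nested lists: each message is a list of text pieces and sub-messages."""
    root = []
    stack = [root]  # stack[-1] is the message currently being filled
    pattern = f"({re.escape(open_tok)}|{re.escape(close_tok)})"
    for piece in re.split(pattern, text):
        if piece == open_tok:
            child = []
            stack[-1].append(child)  # nest the new message in its parent
            stack.append(child)
        elif piece == close_tok:
            if len(stack) == 1:
                raise ValueError("closing delimiter without a matching opener")
            stack.pop()
        elif piece:
            stack[-1].append(piece)  # plain text belongs to the open message
    if len(stack) != 1:
        raise ValueError("unclosed opening delimiter")
    return root

tree = parse_nested("word1 A word1 B word1 C word2 D word1 E word2 F word2 G word2")
m1 = tree[0]
```

here m1 comes back as its leading text, then the nested m2, then its
trailing text, which is the 'lA , m2, LG' shape sketched above.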

ciao
drieux
