On Saturday 19 July 2008 14:51, Christian Ridderström wrote:
> On Sat, 19 Jul 2008, Steve Litt wrote:
> >> My only suggestion would be to use one of the programming tools rather
> >> than do your replacement in Vim. Awk, sed, perl - It would allow you to
> >> gradually build up some regexps through trial and error.
> >>
> >> Alan
> >
> > The RTF code is too weird and complex to do programmatically (what you
> > get is what you mean). I need to do it a little at a time, interactively
> > (what you see is what you get).
>
> I agree with Alan... if you use one or more separate scripts that you
> apply, it's easy for you to do the changes separately.

Believe me, it's not easy at all. RTF is much too weird. The style identifier
occurs in the middle of a complex string. The text to which to apply it
occurs at the end, but there's no reasonable, consistent way to identify
where the markup ends and the content begins.

If converting Word docs were a weekly occurrence, it might be worth it to sit 
down with the RTF specification and write code to do the job. But this is my 
only MS Word book. I have one more non-LyX book -- a 1990 classic 
called "Troubleshooting: Tools, Tips and Techniques", written in WordPerfect 
5.1. 

So these conversions are one-off affairs, ill-suited to the complexities of
decoding proprietary formats.

> And more
> importantly, to start over if you later in the process discover a
> problem with the sequence of regexp replacements you've used.

I used a series of files, each of which contains one tweak type, so that 
shouldn't happen. At the end of each tweak type I verify that it still loads 
and looks right in MS Word.
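
For anyone who wants a scripted analogue of the one-file-per-tweak approach, 
here is a minimal sketch in Python. It is not what I actually ran: the tweak 
names, the patterns, the file naming and the apply_stages() helper are all 
hypothetical, and real RTF would need far more careful patterns. The point is 
only that each stage writes its own file, so a bad tweak can be redone from 
the previous stage's output.

```python
import re
from pathlib import Path

# Hypothetical tweaks: (stage name, regexp pattern, replacement).
# These patterns are illustrative only, not the ones used on the book.
TWEAKS = [
    ("strip_fontsize", r"\\fs\d+ ?", ""),
    ("collapse_spaces", r" {2,}", " "),
]


def apply_stages(text, tweaks, out_dir=None, stem="book"):
    """Apply each tweak in order and return [(stage name, text after stage)].

    If out_dir is given, the text after each stage is also written to its
    own numbered file, so any stage can be inspected or restarted from.
    """
    results = []
    for number, (name, pattern, replacement) in enumerate(tweaks, start=1):
        text = re.sub(pattern, replacement, text)
        if out_dir is not None:
            path = Path(out_dir) / f"{stem}.{number:02d}.{name}.rtf"
            # Assumption: the document text is representable in Latin-1.
            path.write_text(text, encoding="latin-1")
        results.append((name, text))
    return results


if __name__ == "__main__":
    sample = r"{\fs24 Hello   world}"
    for stage_name, stage_text in apply_stages(sample, TWEAKS):
        print(stage_name, "->", stage_text)
```

Each intermediate file would still have to be opened in MS Word by hand to 
confirm it loads and looks right; the script does not check that.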

>
> What if you prototype/test out the regexps in vim, and then move them to a
> script?  (The script would perform a sequence of regexp replacements on
> the document).

The RTF markup is too complex and is inconsistent if you try to look at it in 
a simple way.

>
> I'd never do all the regexp-replacements manually, too much risk of making
> a mistake somewhere in the middle and having to redo everything
> manually...

Hence the multiple stages with multiple files.

Writing a program would have been an excellent idea, but RTF is MUCH too weird
to write that program in anything resembling a reasonable timeframe.

Thanks

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US
