> -----Original Message-----
> From: Southworth, Harry [mailto:[EMAIL PROTECTED]]
> Sent: Friday, January 24, 2003 10:38 PM
> To: [EMAIL PROTECTED]
> Subject: MS Word question
>
>
> I'm running Perl on Cygwin on top of Windows 2000.
>
> I have a lot of ASCII text files that someone has thoughtfully saved as
> Microsoft Word documents. If I open them in a text editor, I can see the
> ASCII text, but there is some junk at the top and bottom. Testing the files
> with -T tells me they're not text.
>
> grep seems able to search through the text in the files, but I'd like to
> strip the junk out and save them as simple text files. I tried a few things
> in Perl, but didn't get anywhere.
>
> I'd be grateful if someone were able to provide some pointers.
>
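
Well, you could use Win32::OLE to automate saving the docs as text. The
process is fairly simple: start Word, open the document, save it in
plain-text format (format code 2), and close it. Word has to be installed
on the machine for this to work.

<snip>
use strict;
use warnings;
use Win32::OLE;

# Make any OLE error fatal and report the last OLE error message.
Win32::OLE->Option(Warn => sub { error() });

use constant TEXTFORMAT => 2;    # Word's plain-text save format

my $docin  = 'c:\temp\doccy.doc';
my $docout = 'c:\temp\doccy.txt';

# The second argument makes Word quit when $wd is released;
# without it, a hidden Word process is left running.
my $wd = Win32::OLE->new('Word.Application', 'Quit')
    or die 'Cannot start Word: ', Win32::OLE->LastError;

my $doc = $wd->Documents->Open($docin);
$doc->SaveAs($docout, TEXTFORMAT);
$doc->Close();
undef $wd;

sub error {
    die Win32::OLE->LastError;
}
</snip>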
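
Since the question mentions "a lot of" files, the same calls can be wrapped in a loop. What follows is only a rough sketch, not a tested tool: the helper names `txt_name_for` and `convert_dir`, the script name `doc2txt.pl`, and the idea of passing the folder on the command line are all assumptions made for illustration. It reuses one Word instance for every file and writes each `.txt` next to its `.doc`.

```perl
use strict;
use warnings;

use constant TEXTFORMAT => 2;    # same plain-text format code as the script above

# Hypothetical helper: derive the output name, e.g. doccy.doc -> doccy.txt.
sub txt_name_for {
    my ($name) = @_;
    (my $out = $name) =~ s/\.doc$/.txt/i;
    return $out;
}

# Hypothetical helper: convert every .doc file in $dir with one Word instance.
# Returns the number of .doc files found.
sub convert_dir {
    my ($dir) = @_;

    # Loaded at run time so the file still compiles where Win32::OLE is absent.
    require Win32::OLE;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @docs = grep { /\.doc$/i } readdir $dh;
    closedir $dh;

    my $wd = Win32::OLE->new('Word.Application', 'Quit')
        or die 'Cannot start Word: ', Win32::OLE->LastError();

    for my $name (@docs) {
        my $doc = $wd->Documents->Open("$dir\\$name");
        unless ($doc) {
            warn "Skipping $name: ", Win32::OLE->LastError(), "\n";
            next;
        }
        $doc->SaveAs("$dir\\" . txt_name_for($name), TEXTFORMAT);
        $doc->Close();
    }

    undef $wd;    # releasing the object triggers the 'Quit' destructor
    return scalar @docs;
}

# Usage (assumed): perl doc2txt.pl c:\temp
convert_dir($ARGV[0]) if @ARGV;
```

Pass a full path for the folder, because Word resolves relative paths against its own working directory rather than the script's. Unlike the single-file script, this version does not install a fatal `Warn` handler, so a document that fails to open is reported and skipped instead of aborting the whole run.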