Hi all,

on the one hand, I want to thank John Krahn very much for his invaluable help;

on the other hand, I would like to post in the body of this email the script I use (thanks again, John!) to clean all the text files in a directory. I don't know whether this kind of mail is permitted on this list, but I thought the script could be useful to someone;

on yet another hand, I'm not sure about the meaning of ?: in a regexp. Is the ?: at the beginning of a group the same thing as the ? at the end of a pattern?
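
To make the question concrete, here is a small sketch of how I currently understand the two constructs, so please correct me if it is wrong: (?:...) seems to group alternatives without capturing them, while a ? placed after an item makes that item optional. The sample strings and variable names below are made up for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line, similar to the headers the script removes.
my $line = 'Subject: hello';

# (...) groups AND captures: the matched alternative is returned (and set in $1).
my ($captured) = $line =~ /^(Subject: |Date: )/;

# (?:...) only groups: nothing is captured, so in list context a
# successful match simply returns the list (1).
my @groups = $line =~ /^(?:Subject: |Date: )/;

# A trailing ? is a quantifier: the preceding item may occur 0 or 1 times.
# Here the whole non-capturing group (?:http://) is optional.
my @optional = grep { m{^(?:http://)?www\.} }
               ('http://www.a.it', 'www.a.it', 'ftp.a.it');

print "captured: '$captured'\n";   # captured: 'Subject: '
print "groups:   @groups\n";       # groups:   1
print "optional: @optional\n";     # optional: http://www.a.it www.a.it
```

If that is right, the two are unrelated: the ?: right after an opening parenthesis only switches off capturing, and the ? after the closing parenthesis is the usual "zero or one" quantifier.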

Greetings,

all'adr



~~~~~~~~~~~~~~~~~THE SCRIPT~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


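#!/usr/bin/perl -w
# Cleans all the text files in a directory, in place.

use strict;

my $testo = '';   # accumulates the cleaned text of the current file
my $var   = 1;    # counter of the files processed so far

@ARGV = </Users/pes/Desktop/Testi40M/*.txt>;
while (<>) {
    tr/\015\012/\n/s;    # map CR and LF to \n and squeeze runs of them
    tr/"*^_\-+' //s;     # squeeze repeats of each of these characters to one
    # drop the news header lines
    $_ = '' if /^(?:Newsgroups: it.|Subject: |Date: |Message-ID: |References: |From: )/;
    s/(\w\')/$1 /g;      # put a space after an apostrophe that follows a word character
    $testo .= $_;
    if ( eof ) {         # end of the current file
        # replace anything that looks like a URL with the placeholder " URL"
        $testo =~ s!(?:http://)?\w{3,}(?:\.(?:\w-?)+)+\.\w{2,3}(?:/ \w+(?:\.\w{1,5})*(?:\?\w{1,32}=\w{1,32})*)*! URL!g;
        $testo =~ tr/\n\r\f\t /\n\r\f\t /s;    # squeeze repeated whitespace
        close ARGV;      # close the ARGV filehandle ($ARGV is only the file name)
        open ATTUALE, ">$ARGV" or die "I can't open $ARGV because $!\n";
        print ATTUALE $testo;
        close ATTUALE;
        print "$var - I have cleaned $ARGV\n";
        $testo = '';
        $var++;
    }
}

print "DONE.\n";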

