--- On Fri, 2/10/09, Manvendra Bhangui <[email protected]> wrote:
> Converting Thirukkural to fortune cookies ultimately turned out to be
> quite interesting.
> Shakthi updated me that Thirukkural's trademark is its two-line
> representation of the kurals, with four words on the first line and
> three on the second.
>
> So I had to write a shell script to do this:
>
> #!/bin/sh
> cat single.txt | while read line
> do
>     # fold all English sentences blindly
>     echo $line | grep "[a-zA-Z]" > /dev/null
>     if [ $? -eq 0 ] ; then
>         echo $line | fold -s -w 72
>     else
>         # Fold only the lines other than the first
>         # two lines (which have 5 words or 3 words)
>         wc=`echo $line | wc -w`
>         if [ $wc -gt 5 ] ; then
>             echo $line | fold -s -w 72
>         else
>             echo $line
>         fi
>     fi
> done
>
> The script above is not efficient, but it did the job. How would one
> do that in Perl/Python?

You are tempting me too much with the Perl + Thirukkural combination.
Here it is, but wrap may require some tuning for Unicode:

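use strict;
use warnings;
use Text::Wrap;
$Text::Wrap::columns = 72;

# Decode input and encode output as UTF-8 so that wrap counts
# characters rather than bytes.
binmode( STDOUT, ':encoding(UTF-8)' );
open( my $fh, '<:encoding(UTF-8)', 't.txt' )
    or die "Cannot open t.txt: $!";
while ( my $line = <$fh> ) {
    if ( $line =~ m/[a-zA-Z]/ ) {
        # fold all English sentences
        print wrap( '', '', $line );
    }
    else {
        # count the whitespace-separated words on the line
        my @words = split( ' ', $line );
        if ( @words > 4 ) {
            print wrap( '', '', $line );
        }
        else { print $line; }
    }
}
close($fh);
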
As usual, there is more than one way to do it in Perl, and indeed better ways.
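
Since the question also asked about Python, here is a minimal sketch of
the same logic, assuming Python 3 and UTF-8 text on standard input. The
names format_line, WIDTH and MAX_WORDS are made up for illustration. Like
wrap, textwrap counts characters and not display columns, so Tamil lines
with combining signs may still need some tuning.

```python
import re
import sys
import textwrap

WIDTH = 72      # same column limit as the scripts above
MAX_WORDS = 4   # assumed: a kural line never has more than four words

ENGLISH = re.compile(r"[a-zA-Z]")


def format_line(line):
    """Fold a line to WIDTH unless it is a short non-English (kural) line."""
    text = line.rstrip("\n")
    # English lines are folded blindly; other lines only when they
    # have more words than a kural line can have.
    if ENGLISH.search(text) or len(text.split()) > MAX_WORDS:
        return textwrap.fill(text, width=WIDTH)
    return text


if __name__ == "__main__":
    # Usage (hypothetical file names): python3 fold.py < t.txt
    for line in sys.stdin:
        print(format_line(line))
```

Reading from standard input keeps the file name out of the script, so the
same command works for single.txt or t.txt.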
Raman.P
blog:http://ramanchennai.wordpress.com/