On Fri, May 20, 2005 at 12:52:05PM +0300, Tzafrir Cohen wrote:
> On Fri, May 20, 2005 at 11:51:41AM +0300, avraham wrote:
>>...skip
> Any chance you could write those commands as a script, for reference?
> ...skip
> 
> =================================================================
> To unsubscribe, send mail to [EMAIL PROTECTED] with
> the word "unsubscribe" in the message body, e.g., run the command
> echo unsubscribe | mail [EMAIL PROTECTED]

Hi Tzafrir,
The lull I had at work has ended, so I think it is better to
sum up what I have learned so far rather than report after
finishing. Some may benefit from it, and others are much
better qualified than I am to improve on it.
My experience comes from a small number of exercises in which
I took existing texts and transformed them into tex -> dvi -> pdf
files. As exercises in LaTeX these saved me the typing,
allowed me to concentrate on specific aspects of LaTeX and, by
yielding a final product of interest, maintained my motivation.
The first, Brett Hamilton: Installing debian Woody, to practice
the \verb+  + command and the verbatim environment.
Main conclusion: In order to get a legible dvi file from the
start, double-space the original document:
$ cat source_file | sed G > preprocessed_file
Then, when applying the correct LaTeX formatting methods, it is
very easy to get rid of the empty lines.
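
The double-spacing step and its later undoing can be sketched as two small filters. The function names are my own, and strip_blank assumes that every empty line may go, including any that were already in the original text:

```shell
#!/bin/sh
# double_space: append an empty line after every input line (sed G),
# so that LaTeX treats each source line as its own paragraph.
double_space() {
    sed G "$@"
}

# strip_blank: delete all empty lines again once the text has been
# wrapped in proper LaTeX formatting (assumption: none must be kept).
strip_blank() {
    sed '/^$/d' "$@"
}
```

Typical use would be `double_space source_file > preprocessed_file`, and later `strip_blank` on the part of the text that no longer needs the extra lines.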
Second, Sven Guckes: VIM Editing Intro, where I tried to produce
all those symbols and commands of VIM with no, or minimal, use of
verbatim.
Conclusion: To get a smooth translation to LaTeX it is worthwhile
to apply an additional filter which first replaces the characters
&, ^ and $, which have special meanings in regular expressions, with
placeholders, and then replaces \, ~, #, %, _, {, }, |, <, >, ^, $
and & with their LaTeX symbols in text context. All this is, of
course, prefixed by a #! line, written in the file lpreproc and made
executable.
#!/bin/sh
# Park &, ^ and $ behind placeholders, escape the other special
# characters, then turn the placeholders into their LaTeX forms.
sed -e 's/&/ personalET /g' \
    -e 's/\^/ personalHAT /g' \
    -e 's/\$/ personalDOLLAR /g' \
    -e 's/\\/ \\textbackslash /g' \
    -e 's/~/ \\textasciitilde /g' \
    -e 's/#/ \\# /g' -e 's/%/ \\% /g' -e 's/_/ \\_ /g' \
    -e 's/{/ \\{ /g' -e 's/}/ \\} /g' \
    -e 's/|/ \\textbar \\  /g' \
    -e 's/</ \\textless /g' -e 's/>/ \\textgreater /g' \
    -e 's/personalHAT/ \\textasciicircum /g' \
    -e 's/personalDOLLAR/ \\$/g' \
    -e 's/personalET/ \\\& /g' "$1"

Clearly, if any of the placeholders appears by chance in the file to
be processed, the script has to be modified.
(I know that some of these symbols have better-looking LaTeX
implementations among the mathematical symbols. I refrained from
using them because I have not yet learned that chapter.)
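
The caveat about the three marker words turning up by chance can be checked before running the script. This is only a sketch; the function name has_placeholders is my own invention, and the words it looks for are the ones used in lpreproc:

```shell
#!/bin/sh
# has_placeholders: succeed (exit status 0) if any of the marker words
# used by lpreproc already occurs in the input, fail otherwise.
has_placeholders() {
    grep -q -e 'personalET' -e 'personalHAT' -e 'personalDOLLAR' "$@"
}
```

For example: `if has_placeholders source_file; then echo "choose other marker words" >&2; fi`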

Third, Chekhov: The Sea-Gull, from Project Gutenberg (this is
just because I saw on TV the French film "La petite Lili", a modern
version of the play, and discovered that I had forgotten most of it).
Conclusion: LaTeX may misinterpret text in square brackets (in
this case stage directions) as malformed or mistaken command
arguments. It is worth replacing the brackets to avoid trouble.
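
One possible replacement, offered as an assumption rather than as the method I used, is to wrap each bracket in braces, so that LaTeX can no longer read it as the start of an optional argument. The function name protect_brackets is made up for this sketch. It should run after lpreproc, because lpreproc would otherwise escape the new braces:

```shell
#!/bin/sh
# protect_brackets: turn [ into {[} and ] into {]} so that a bracket
# following a command is printed as text instead of being parsed as
# an optional argument.
protect_brackets() {
    sed -e 's/\[/{[}/g' -e 's/]/{]}/g' "$@"
}
```

Example: `lpreproc play.txt | protect_brackets > play.tex` (the file names are placeholders).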
Fourth, my wife's cake recipes, to practice the multi-language
environment and indexing.
Conversion from DOS Hebrew:
cat source_file | tr '\200-\232' '\340-\372' (from the Hebrew-HOWTO)
When the source file originates from WP 5.2 for DOS (I have no
experience with other sources) the numbers and parentheses come
out the wrong way round. The simplest way I found to treat the
multitude of possible cases was to reverse everything left-right
with:
sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
(from a collection of sed one-liners; sorry, I did not write down
the source) and then reverse back the text only, with bidiv.
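
The two filter steps above (everything except bidiv) can be written as functions. This is a sketch: the function names are mine, and setting LC_ALL=C is my assumption, made so that tr and sed treat the 8-bit Hebrew text byte by byte:

```shell
#!/bin/sh
# dos2iso_hebrew: map the DOS Hebrew letters (octal 200-232) to the
# ISO-8859-8 positions (octal 340-372), as in the Hebrew-HOWTO.
dos2iso_hebrew() {
    LC_ALL=C tr '\200-\232' '\340-\372'
}

# reverse_lines: reverse the characters of every line, using the sed
# one-liner quoted above.
reverse_lines() {
    LC_ALL=C sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'
}
```

Used as `dos2iso_hebrew < source_file | reverse_lines > reversed_file`, after which the text is reversed back with bidiv as described.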

In a multi-language environment, the end result of this treatment
has the English lines reversed. As I had only a few lines of English
text, I separated them out with grep before the treatment and
recombined them afterwards with vimdiff. This is not the proper
solution: TODO.

After all these treatments, add a preamble at the beginning and
an \end{document} at the end, and try to run latex (elatex, for text
containing Hebrew).
Simple preamble for English-only text:
\documentclass[12pt]{article}
\pagestyle{empty}
\begin{document}
And for a multi-language environment:
\documentclass[letterpaper,12pt]{article} 
\pagestyle{plain}
\usepackage[english,german,hebrew]{babel}
\usepackage{hebcal}
\usepackage{hebfont}
\begin{document}
This is best done in the editor which you will use to refine the
LaTeX source and deal with the problems caused by non-printable
control characters carried over from the source.
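
For those who would rather use a filter than an editor for this step, the English-only preamble and the closing line can be added as sketched below. The function name wrap_latex is made up for the illustration:

```shell
#!/bin/sh
# wrap_latex: print the simple English-only preamble, then the input
# unchanged, then the closing \end{document}.
wrap_latex() {
    printf '%s\n' '\documentclass[12pt]{article}' \
                  '\pagestyle{empty}' \
                  '\begin{document}'
    cat "$@"
    printf '%s\n' '\end{document}'
}
```

Example: `wrap_latex preprocessed_file > document.tex`, followed by `latex document.tex` (document.tex is a placeholder name).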

Most, if not all, of the sed and tr commands I mentioned have
equivalents in the major editors. I wished to steer away from the
holy war between the adherents of one or another of them. I hope
that people will not find it very hard to use a pipe of filters, or
to translate them into the form understood by their editor.

Still other TODOs:
1-An efficient way to find non-printable characters in the middle
of Hebrew text.
2-Specific to WP: Dashes between Hebrew text and numbers/English
text are consistently misplaced.
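
For the first TODO, one untested idea is to let grep list the lines that contain control characters. This is a sketch only: find_ctrl is a made-up name, LC_ALL=C is assumed so that the 8-bit Hebrew letters are not themselves counted as control characters, and tabs will be reported too:

```shell
#!/bin/sh
# find_ctrl: print, with line numbers, every line that contains an
# ASCII control character (tab included). -a makes grep treat the
# input as text even when it looks binary.
find_ctrl() {
    LC_ALL=C grep -a -n '[[:cntrl:]]' "$@"
}
```

Example: `find_ctrl recipes.tex` (placeholder file name) prints the numbers of the suspect lines, which can then be inspected in the editor.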

Hoping it will be of use to someone, cheers,
Avraham
