On Mon, Apr 10, 2006 at 02:25:23PM +0200, Helge Hafting wrote:

> Enrico Forestieri wrote:
> 
> >TeX removes space from the beginning or end of each line, except at
> >the beginning and end of a paragraph.
> >  
> >
> Not in this case:
> \begin{document}
> 
> Hspace at line end:\hspace{13cm}\hspace{1mm}protect bla bla bla bla dfg
> sdg
>  dg
>  dgf
>  wWdfgdfgdfgklgj ksjg  dghsh asgdf haskdfhksah 
> adhsahfhgahfsjddsfgdsfgdsgdsfgdsfg j gfdlkglsdkflgs dfgdksdll glkd kfgds
> 
> \end{document}
> 
> Use margins (default A4) so the "13 cm" goes well into the right margin.
> The word "protect" is divided as "pro-", instead of moving it to the 
> next line.
> Increase the 1mm hspace, and see how text moves further into the margin.
> The 1mm of hspace is not removed, even though it is issued past the margin.
> Well, technically the space is not removed because we are not at the 
> "line end"
> but why not?  Why not break the line between the two hspaces, or at least
> after the second one?  Looks like tex need text to break the line, and 
> in that case,
> will we ever see space removed?

Before we embark on a lengthy discussion, let me set something
straight by quoting the TeXbook directly:

<texbook>
Roughly speaking, TeX breaks paragraphs into lines in the following
way: Breakpoints are inserted between words or after hyphens so as to
produce lines whose badnesses do not exceed the current \tolerance. If
there's no way to insert such breakpoints, an overfull box is set.
Otherwise the breakpoints are chosen so that the paragraph is
mathematically optimal, i.e., best possible, in the sense that it has
no more "demerits" than you could obtain by any other sequence of
breakpoints. Demerits are based on the badnesses of individual lines
and on the existence of such things as consecutive lines that end with
hyphens, or tight lines that occur next to loose ones.

But the informal description of line breaking in the previous
paragraph is an oversimplification of what really happens. The
remainder of this chapter explains the details precisely, for people
who want to apply TeX in nonstandard ways. TeX's line-breaking
algorithm has proved to be general enough to handle a surprising
variety of different applications; this, in fact, is probably the most
interesting aspect of the whole TeX system. However, every paragraph
from now on until the end of the chapter is prefaced by at least one
dangerous bend sign, so you may want to learn the following material
in easy stages instead of all at once.

Before the lines have been broken, a paragraph inside of TeX is
actually a "horizontal list", i.e., a sequence of items that TeX has
gathered while in horizontal mode. We have been saying informally that
a horizontal list consists of boxes and glue; the truth is that boxes
and glue aren't the whole story. Each item in a horizontal list is one
of the following types of things:

* a box (a character or ligature or rule or hbox or vbox);
* a discretionary break (to be explained momentarily);
* a "whatsit" (something special to be explained later);
* vertical material (from \mark or \vadjust or \insert);
* a glob of glue (or \leaders, as we will see later);
* a kern (something like glue that doesn't stretch or shrink);
* a penalty (representing the undesirability of breaking here);
* "math-on" (beginning a formula) or "math-off" (ending a formula).

The last four types (glue, kern, penalty, and math items) are called
discardable, since they may change or disappear at a line break; the
first four types are called non-discardable, since they always remain
intact. Many of the things that can appear in horizontal lists have
not been touched on yet in this manual, but it isn't necessary to
understand them in order to understand line breaking. Sooner or later
you'll learn how each of the gismos listed above can infiltrate a
horizontal list; and if you want to get a thorough understanding of
TeX's internal processes, you can always use \showlists with various
features of the language, in order to see exactly what TeX is doing.

...

In order to save time, TeX tries first to break a paragraph into lines
without inserting any discretionary hyphens. This first pass will
succeed if a sequence of breakpoints is found for which none of the
resulting lines has a badness exceeding the current value of
\pretolerance. If the first pass fails, the method of Appendix H is
used to hyphenate each word of the paragraph by inserting
discretionary breaks into the horizontal list, and a second attempt is
made using \tolerance instead of \pretolerance. When the lines are
fairly wide, as they are in this manual, experiments show that the
first pass succeeds more than 90% of the time, and that fewer than
2 words per paragraph need to be subjected to the hyphenation algorithm,
on the average. But when the lines are very narrow the first pass
usually fails rather quickly. Plain TeX sets \pretolerance=100 and
\tolerance=200 as the default values. If you make \pretolerance=10000,
the first pass will essentially always succeed, so hyphenations will
not be tried (and the spacing may be terrible); on the other hand if
you make \pretolerance=-1, TeX will omit the first pass and will try
to hyphenate immediately.

Line breaks can occur only in certain places within a horizontal list.
Roughly speaking, they occur between words and after hyphens, but in
actuality they are permitted in the following five cases:

a) at glue, provided that this glue is immediately preceded by a
non-discardable item, and that it is not part of a math formula (i.e.,
not between math-on and math-off). A break ``at glue'' occurs at the
left edge of the glue space.

b) at a kern, provided that this kern is immediately followed by
glue, and that it is not part of a math formula.

c) at a math-off that is immediately followed by glue.

d) at a penalty (which might have been inserted automatically in a
formula).

e) at a discretionary break.

Notice that if two globs of glue occur next to each other, the second
one will never be selected as a breakpoint, since it is preceded by
glue (which is discardable).

Each potential breakpoint has an associated "penalty," which
represents the "aesthetic cost" of breaking at that place. In cases
(a), (b), (c), the penalty is zero; in case (d) an explicit penalty
has been specified; and in case (e) the penalty is the current value
of \hyphenpenalty if the pre-break text is nonempty, or the current
value of \exhyphenpenalty if the pre-break text is empty. Plain TeX
sets \hyphenpenalty=50 and \exhyphenpenalty=50.
</texbook>

I think the above contains all the answers to your questions.
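
To apply these rules to your example, here is a minimal sketch. It is
my own annotation, assuming the article class with default A4 margins;
the \showbox settings are only there so that \showlists prints the
whole list. The comments mark which items of the horizontal list are
legal breakpoints according to rules (a) and (e) quoted above.

```latex
\documentclass[a4paper]{article}
\begin{document}
% Let \showlists print the complete list, on the terminal as well.
\showboxbreadth=100 \showboxdepth=100 \tracingonline=1

% Relevant items of the horizontal list built by the next line:
%   ":"            character box, non-discardable
%   \hspace{13cm}  glue preceded by a box: legal breakpoint, rule (a)
%   \hspace{1mm}   glue preceded by glue: never a breakpoint
%   "protect"      character boxes; discretionary breaks (rule (e))
%                  appear inside it only if the second pass hyphenates
Hspace at line end:\hspace{13cm}\hspace{1mm}protect bla bla bla bla dfg
\showlists % writes the list collected so far (pauses in interactive runs)
\end{document}
```

A break at the first \hspace would drop both globs of glue, but it
would leave "Hspace at line end:" alone on a line that cannot stretch
to the full width. Its badness then exceeds \tolerance, so, as far as
I can tell, that breakpoint is rejected. The next candidates are the
hyphenation points in "protect", which would explain the "pro-" you
see.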

As for whether we will ever see space removed, please try the
attached LaTeX files, to be used with the default A4 settings. I think
they also answer your question.
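
In case the attachments do not come through, here is a separate
minimal sketch of the same point. It is not the content of removed.tex
or not_removed.tex. It relies on two things: glue that directly follows
a chosen breakpoint is discarded, and LaTeX's \hspace* puts an
invisible non-discardable rule in front of its glue so that the glue
survives the break.

```latex
\documentclass[a4paper]{article}
\begin{document}
% \\ forces a break; discardable items right after it are dropped.
\noindent First line\\
\hspace{3cm}this glue follows the break and is discarded\\
% \hspace* starts with a zero-width rule (non-discardable),
% so the 3cm of glue after it stays on the new line.
\hspace*{3cm}this glue is kept
\end{document}
```

The second line should start at the left margin, while the third one
should start 3cm in.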

-- 
Enrico

Attachment: removed.tex
Description: TeX document

Attachment: not_removed.tex
Description: TeX document
