Hi Oliver,

At 2024-06-12T22:12:34+0200, Oliver Corff via wrote:
> Absolutely reasonable.
[...]
> Do you know whether your scan is scaled 1:1? In this case only, direct
> measures could be taken from the image, assuming that the paper size
> is letter.

I attached the scan I used to my earlier email.  I think a quick glance
is enough to reveal that such hopes are much too high for this document.

The pages aren't even scanned _straight_.  This made it tedious to
repair the OCR generated from it.  The OCR engine appears to have
imposed resolutely horizontal baselines on the page images and quantized
the word position to them, which scrambled the word order with high
reliability.  I've attached the OCR text output; you may find it
amusing.

(There _was_ an "unskew" option in the OCR UI.  I clicked it.  It seems
to have done little or nothing.)

The pages also appear not to be cropped consistently, which annoys me
for another reason: that makes it impossible for me to judge the sizes of
the page margins used by the formatter/macro package.

With these problems I think the geometry of the page scans is pretty far
from a rectangle with a consistent aspect ratio.

I think it's more likely than not that the paper format was U.S. letter.
If this had been a journal article, that bet would be off.

> Which, in return, makes visual identity an ideal tool to check for
> undiscovered glitches which have the potential to cause inconsistent
> line breaks.
> 
> When working in a negotiation team quite a few years ago, we would
> take pages of claimed-to-be identical copies or transcripts of text,
> superimpose them and check them against the light of a strong lamp for
> gray areas --- mismatches in print. This helped us discover a good few
> issues like altered digits etc., and the method was *much* faster than
> reading side by side.

Right.  I think I've mentioned this on the list before, but this is the
principle behind the blink comparator, the tool that helped Clyde
Tombaugh discover the dwarf planet Pluto.

And I do in fact use that technique in groff development (including
today while comparing nroff mode output for various memorandum types
between DWB 3.3 and groff).  Since I run terminals maximized to the
screen geometry anyway, such comparison is always a keyboard chord away.

I simply don't have any hope of applying the technique here.  Not unless
a much superior scan of an authenticated original document turns up.

Regards,
Branden
Bell Laboratories

Subject: A UNIX™ Operating System for the DEC VAX-11/780 Computer
Case-39394-21

date: July 7, 1978

from: Thomas B. London
      John F. Reiser
78-1353-4

MEMORANDUM FOR FILE

1.  Introduction

The VAX-11/780 [1] is a new, general-purpose, stored-program electronic
digital computer manufactured by Digital Equipment Corporation.  At
minicomputer prices it provides addresses and data which are 32 bits
wide; the traditional minicomputer address space bound of 64K is gone.
This memorandum describes the VAX-11/780 and the implementation of a
UNIX operating system and complete user environment for it.  Section 2
contains an overview suitable for general consumption; details normally
of interest only to devotees of computer system architecture appear in
Section 3.  The authors comment on software portability in Section 4.

2.  Overview

Environment.  A user of UNIX and C software on the PDP-11 will find that
the VAX-11/780 provides a very similar environment.  There are no
apparent differences in the command language or the vast majority of
programs which are customarily invoked directly from the shell.  A
casual user probably will not be able to distinguish the hardware,
except by issuing the command "who am i" (which identifies the hardware
and the current user) or by noting that one of the columns printed by
the process status command ps is in hexadecimal rather than octal.  The
C language programmer will find that int, long, and pointer data types
all occupy 4 bytes (a short still occupies 2 bytes), and that a long has
its two halves stored in a different order on the PDP-11 than on the
VAX-11.  Characters still suffer sign extension when converted to longer
integer types, but one may use the declaration unsigned char.

Hardware.  The VAX-11 is a follow-on computer to the PDP-11.  The
architecture seen by the user-mode assembly-language programmer of a
VAX-11 is "culturally compatible" with the PDP-11.  Specific details
differ, but a programmer familiar with the PDP-11 can quickly understand
the differences.  The VAX-11 provides UNIBUS and MASSBUS interfaces and
uses the same input/output peripheral devices as a PDP-11.

Significant new features of the VAX-11 include an extended virtual
address space, intelligent console, and dramatically improved physical
packaging.  The address space of a process is divided into a few
gigantic segments.  Each segment is further divided into a large number
of small pages.  Sufficient hardware exists to make demand paging a
viable memory management strategy.  All console functions are handled by
an LSI-11 microcomputer through a standard ASCII terminal.  The terminal
may be remotely located from the processor and can still halt, boot, or
diagnose the VAX-11.  The mechanical and physical design of the
VAX-11/780 is well done.  The processor contains no sliding drawers or
moving cables.  All parts are easily accessible for servicing.  Adequate
airflow is maintained even under maintenance conditions.

Configuration.  The actual configuration purchased by Department 1353
is:

    VAX-11/780 cpu
    0.5 megabytes memory with battery backup
    floating-point accelerator
    12Kbyte user-writeable control store
    UNIBUS adaptor with DZ11 (8 RS-232C lines)
    MASSBUS adaptor with TE16 tape drive (800/1600 bpi)
    MASSBUS adaptor with two RP06 disk spindles (176M bytes per spindle)
    additional BA11KE UNIBUS box

The list price of the above configuration in February 1978 was $241,255;
the price including a DEC discount to a Bell Labs purchaser was
$200,242.

Software.  We have implemented a UNIX operating system [2] and complete
user software environment on the VAX-11/780.  The operating system is
Research version 7 as of April 15, 1978.  The environment includes the
Bourne shell, C compiler, code improver c2, assembler, loader, debugger,
standard I/O subroutine library libS, C subroutine library libc, source
code control system SCCS, nroff/troff, and more than 130 commands.
Maintenance programs for file system checking, bootstrapping, and
physical disk pack handling have also been implemented.

We began with the C language code of Research version 7 of the UNIX
operating system, and a PDP-11/45 running UNIX as a bootstrap machine.
Creating a C compiler which produced VAX-11 native-mode assembly code
was the first task.  The code generator portion of the portable C
compiler was rewritten to do this.  An assembler and loader, based on
similar code for the Interdata 8/32, completed the basic support
software.  Existing PDP-11/70 device drivers for disk, tape, and
terminal communication lines were adapted to the VAX-11/780.  Assembly
language interfaces (trap handlers, hardware initialization, etc.) were
completely rewritten.  We then created magnetic tapes in the proper
format for an initial file system and for deadstart load, and physically
carried these tapes from the PDP-11/45 to the VAX-11/780.  Work on the C
compiler began in mid-December 1977.  The hardware arrived on March 3.
We held a party on May 19 to celebrate successful multiuser operation of
the system.

Performance.  Identical documents were formatted by nroff on our
VAX-11/780 and on a PDP-11/70 running Research version 7 UNIX; both
systems used RP06 disks.  Identical C programs were compiled and
assembled on the VAX-11/780 and on the PDP-11/70.  As reported by the
time command, the results (converted to seconds) were:

    nroff -ms -e -1T450-12 ios.c >/dev/null

                                         real    user     sys
    VAX-11/780                           47.0    28.6     8.7
    PDP-11/70                            54.0    36.9     7.9

    cc -c -O pftn.c

                                         real    user     sys
    PDP-11/70 (Ritchie compiler)         86.0    43.5    11.8
    VAX-11/780 (portable compiler)       82.0    64.0    10.5
    PDP-11/70 (portable compiler
      for Interdata 8/32)               153.0   114.6    16.6

From the statistics on nroff one should conclude that, based on
user-mode CPU time, the VAX-11/780 can execute the code produced by the
VAX-11 C compiler approximately 22% faster than the PDP-11/70 can
execute the code produced by the PDP-11 C compiler.  This is a measure
of the combined power of the hardware and efficiency of the code
generated by the compiler.  Except as an upper limit, the figures give
no indication as to the throughput, response time, or efficiency of the
operating system.  The differences in real time and system time between
the VAX-11/780 and the PDP-11/70 are not significant.

The times given for compilation of the file pftn.c are an attempt at a
"black box" comparison of apples and oranges.  The black box is any
program (compiler) which takes C language input and produces executable
instructions.  The black-box comparison is that the current VAX-11 C
compiler running on the VAX-11/780 and compiling code for the VAX-11
requires 49% more user-mode CPU time than the current PDP-11 C compiler
running on the PDP-11/70 and compiling code for the PDP-11.  The apples
and oranges aspect arises because the two compilers, while equivalent
from the black box viewpoint, are (on the inside) totally different
pieces of software.  The PDP-11 compiler is a production compiler
written by D. M. Ritchie; the VAX-11 compiler is a portable compiler
based on work by S. C. Johnson.  The figures for the portable compiler
running on the PDP-11/70 and compiling for the Interdata 8/32 are
included for those who wish to compare two portable compilers.  We have
no VAX-11 equivalent to the Ritchie compiler, and thus cannot run the
tests which would enable comparison of two production compilers.

The loaded size in bytes of the operating system and seven other
programs appears in Table 1.  One should note the general similarity
between the text (instructions) sizes on the PDP-11 and on the VAX-11,
and between the bss (uninitialized data) sizes on the VAX-11 and on the
Interdata 8/32.  The particular PDP-11 UNIX system chosen has several
more input/output device drivers and experimental multiplexing software
not in the VAX-11 system, which accounts for its larger text size.  If
many global integer variables (or large arrays) are used, there is a
tendency for the data and bss portions to double in size when going from
a PDP-11 to a VAX-11 or an Interdata 8/32 because an int occupies two
bytes on the PDP-11 and four bytes on the other machines.  However,
character arrays occupy the same amount of space on all machines.  An
unusually large number of references to global variables in the nroff
program accounts for its increase in text size on the VAX-11 compared
with the PDP-11.  A program can be written to automatically change the
addressing modes used in the VAX-11 code so that most references to
global data become shorter than at present, but this has not been done.

Evaluation.  We believe that the VAX-11/780 provides an excellent
hardware environment for running UNIX and C software.  With the software
in its current state, we view the system as operationally equivalent to
a PDP-11/70 running UNIX software, except that the 64K limit on process
address space is gone and programs run faster.  We believe that the
advanced memory management and user/system communication capabilities of
the VAX-11/780 offer an opportunity to construct future UNIX-like
systems with substantially higher throughput than provided by today's
UNIX on a PDP-11/70.

3.  Details

Hardware

Four main subsystems -- the central processor, console, main memory, and
input/output -- constitute the VAX-11/780 computer system.  The central
processor, memory, and input/output subsystems are connected by the
Synchronous Backplane Interconnect (SBI), an internal synchronous bus
with a maximum data throughput of 13.3 megabytes per second.  The SBI
deals in physical addresses which are 30 bits wide.  Half of the SBI
address space is reserved for memory addresses, and half for
input/output device registers.  Arbitration for bus cycles on the SBI is
distributed; each subsystem decides if it will use the next bus cycle.

The central processor is a microprogrammed 32-bit general-register
computer.  The architecture seen by the user-mode assembly-language
programmer is "culturally compatible" with the PDP-11; an expert
programmer familiar with the PDP-11 can learn and understand the
differences in one day or less.  The processor handles binary integers
of 8, 16, and 32 bits, single precision (32 bit) and double precision
(64 bit) floating-point numbers, character strings up to 65535 bytes
long, bit fields up to 32 bits wide, and IBM-style packed decimal
strings up to 31 digits long.  Bit fields have no alignment restrictions
whatsoever; all other data types require alignment only to a byte
(8 bit) boundary.  The central processor provides sixteen 32-bit general
registers.  Register 15 is the program counter pc.  Software operating
in one of the privileged access modes (see below) must use register 14
as a stack pointer sp.  The instructions which implement high-level
procedure call and return (pushl, calls, callg, ret) assume a convention
about the use of sp, register 13 (fp, the frame pointer) and register 12
(ap, the argument pointer).  The instructions which handle character and
packed decimal strings use registers 0 through 5 to hold pointers and
counters, so as to be interruptible.  Floating-point operations may use
the general registers; there are no separate floating-point registers.
Instructions take from zero to six operands.  The operation code
occupies one byte and is followed by the operands, which require from
one to nine bytes each.  Nine addressing modes (including all the PDP-11
modes except *-(r)) are allowed, and the addressing modes are
independent of the operation code.  When the central processor is
executing in the context of a process, there are four access privilege
modes (user, supervisor, executive, kernel), each with its own stack
pointer; software which desires a per-process kernel stack is easy to
implement.  A fifth stack pointer is used when executing in a special
system-wide interrupt context.  The VAX-11/780 processor includes an
eight kilobyte, two-way set associative, write-through, memory data
cache; an eight-byte instruction stream buffer; and a 128-address
virtual address translation buffer.  Most of the processor is
implemented in Schottky TTL MSI logic.  A programmable realtime clock
and a time-of-year clock (battery operated during loss of line voltage)
are standard equipment.  Options include a hardwired floating-point
accelerator and user-writeable control store.

The console subsystem consists of an LSI-11 computer, local memory,
floppy disk, DECwriter terminal, and remote-access communications port.
The console is connected directly to the central processor and performs
all the functions of a conventional "lights and switches" front panel.
The floppy disk serves as the initial bootstrap device for normal
operation and holds special microcode for diagnostic operation.  When
activated by a key switch on the central processor, the remote-access
port becomes the console.  A terminal connected through the
remote-access port can halt the central processor, boot it, diagnose it,
etc.

The virtual address space of a process running on the VAX-11/780
consists of 2**32 8-bit bytes.  The two high-order bits of a 32-bit
address determine one of four segments.  Two of these segments are
system segments common to the address space of all processes.  One of
the system segments is reserved for future use.  The other two segments
are separately defined for each process and are automatically managed by
the context switching instructions.  One of the per-process segments is
designed for a stack which grows towards lower-numbered memory
addresses.  Segments are divided into pages of 512 bytes.  Memory
mapping hardware translates virtual addresses into physical addresses
using page tables.  A page table contains one four-byte entry for each
page mapped; the entry contains a valid bit, a four-bit field which
encodes access privileges, a modify bit, and the physical page-frame
number where the page is mapped.  (There is no reference bit which is
maintained by hardware!)  A base register and a limit register describe
the page table of each segment.  The base register of a per-process
segment contains a virtual address within the system segment; the base
register for the system segment contains a physical memory address.  The
VAX-11/780 central processor contains a virtual address translation
buffer holding 128 virtual address-page frame number pairs which
eliminates the need for extra memory references during address
translation for (typically) 98% of all memory references.

The memory is implemented using MOS semiconductor RAMs with an error
correcting code which corrects all single-bit errors and detects all
double-bit errors and 70% of all greater-than-double bit errors.  A
memory controller can handle 8 memory boards; using 4K chips each board
can hold 128K bytes.  There can be two memory controllers, thus the
maximum amount of physical memory is currently 2 megabytes.  When 16K
chips are used (forecasted for late 1978), each board will hold 512K,
and physical memory can be 8 megabytes.  There is a battery backup
option for maintaining data in the event of a power failure.  Each
optional battery will maintain 1 megabyte for 10 minutes.

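
The address layout and memory capacity figures in the memo can be
checked with a few lines of C.  This is only a sketch under stated
assumptions: the function names are hypothetical, and the split of an
address into segment, page, and offset is inferred from the memo's
statements that the two high-order bits select a segment and that pages
are 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 512u   /* bytes per page, as stated in the memo */

/* Segment (0..3) selected by the two high-order bits of the address. */
unsigned segment_of(uint32_t va) { return va >> 30; }

/* Page number within the segment: low 30 bits divided by the page size. */
uint32_t page_of(uint32_t va) { return (va & 0x3fffffffu) / PAGE_SIZE; }

/* Byte offset within the page. */
unsigned offset_of(uint32_t va) { return va % PAGE_SIZE; }

/* Page table entries needed to map nbytes, rounding up to whole pages. */
uint32_t pages_needed(uint32_t nbytes)
{
    return (nbytes + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Maximum physical memory in bytes: controllers x boards x board size. */
uint32_t max_memory(uint32_t controllers, uint32_t boards,
                    uint32_t board_bytes)
{
    return controllers * boards * board_bytes;
}
```

With these definitions, max_memory(2, 8, 128 * 1024) gives the memo's
2 megabytes, max_memory(2, 8, 512 * 1024) gives 8 megabytes, and
pages_needed(32 * 1024) gives 64 entries for a 32K-byte process.
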
The input/output subsystem consists of UNIBUS adaptors and MASSBUS
adaptors.  A UNIBUS adaptor (UBA) is an interface between a standard
UNIBUS and the SBI.  The UBA does the bus arbitration and everything
else necessary to administer the UNIBUS.  It also contains a set of
registers for mapping UNIBUS addresses to and from SBI addresses.  The
maximum throughput on a UBA is 1.5 megabytes per second.  A MASSBUS
adaptor (MBA) is an interface between the SBI and MASSBUS devices (RP06
disk, TE16 tape, etc.).  An MBA would be more properly called an RH-780
controller, analogous to the RH-11 controller on a PDP-11/70 MASSBUS;
only one unit may transfer data at a time, although several similar
units connected to the same MBA can execute control functions
simultaneously.  The MBA contains the device control registers normally
found in an RH controller.  The registers lie in the I/O section of SBI
addresses.  An MBA also contains a set of mapping registers which
translate device byte addresses to and from SBI addresses.  The maximum
throughput on a MBA is 2.0 megabytes per second.  The published limits
are 1 UBA and 4 MBAs per system.  Theoretically one could have any
number of either kind as long as the sum of the number of central
processors, memory controllers, MBAs, and twice the number of UBAs were
15 or less, since the SBI has 15 "ports".

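
The port-counting rule at the end of that paragraph is simple enough to
write down directly.  A minimal sketch, with hypothetical function
names; the only facts used are the memo's rule (each UBA counts twice)
and its figure of 15 ports.

```c
#include <assert.h>
#include <stdbool.h>

#define SBI_PORTS 15u   /* number of SBI "ports" given in the memo */

/* Ports consumed: one per CPU, memory controller, and MBA; two per UBA. */
unsigned sbi_ports_used(unsigned cpus, unsigned mem_controllers,
                        unsigned mbas, unsigned ubas)
{
    return cpus + mem_controllers + mbas + 2u * ubas;
}

/* True if the configuration satisfies the memo's theoretical limit. */
bool sbi_config_fits(unsigned cpus, unsigned mem_controllers,
                     unsigned mbas, unsigned ubas)
{
    return sbi_ports_used(cpus, mem_controllers, mbas, ubas) <= SBI_PORTS;
}
```

For instance, one processor, two memory controllers, four MBAs, and one
UBA (the published maxima) use 9 of the 15 ports.
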
The physical packaging of the system has been dramatically improved
compared with the PDP-11.  The VAX-11/780 processor cabinet contains no
drawers or moving cables.  The SBI is fixed and rigid.  Three one-third
horsepower squirrel-cage blowers provide sufficient air flow even while
servicing the CPU.  Any logic card, power supply, or blower can be
replaced within twenty minutes by one person using only a screwdriver.
The CPU stands 1.53m x 1.17m x 0.77m (HWD); cabinets housing the CPU,
UNIBUS devices, and tape drive are usually bolted together to form a
single unit 1.53m x 2.51m x 0.77m.  Our configuration (see section 2)
weighs 3452 pounds and requires 42050 BTU/hr cooling.

C Compiler

A VAX-11 "native mode" C compiler was constructed using S. C. Johnson's
portable compiler as a base.  After one month, a reasonable version
began to evolve: it produced code which was good enough to exercise the
assembler, loader, and debugger (on the bootstrap PDP-11/45).  This
initial version did not make use of VAX-11 indexed addressing (which
does single-level array subscripting including appropriate index
shifts), bit field instructions, or autoincrement/decrement addressing.
It contained its share of bugs, particularly since the hardware had not
arrived and could not be used to actually run the generated code.

Substantial effort has been subsequently directed towards improving all
aspects of the compiler: bugs have been corrected, routines have been
made to execute more efficiently, and the quality of the generated code
has been improved.  All addressing modes are supported, bit-field
instructions are used for programmer-defined bit fields, and
autoincrement and autodecrement addressing as well as three-address
instructions are used.

Overall, our experience with the compiler has been very favorable.  When
the VAX-11/780 was delivered, the compiler worked well enough to compile
itself, the UNIX kernel, and many user-level commands.  In fact, since
the delivery of the machine, only about a half-dozen serious bugs have
been detected.  Additionally, the framework of the compiler has proven
itself to be flexible: a compiler for the Interdata 8/32 was transformed
into a compiler for the VAX-11/780, some improvements and extensions
were easily added, and, in general, a quickly evolving compiler has
remained stable and productive.  The authors feel that, with a few
extensions to the model of the compiler and a certain amount of tuning,
the current VAX-11 compiler could easily remain as the production VAX-11
compiler.

There are still some deficiencies in the current version of the
compiler, as well as in the basic "product" itself.  The compiler is
slow and quite large; see the statistics in section 2 and Table 1.  Some
of the blame for the size and lethargy of the first pass can be
attributed to the use of lex for the scanner and yacc for the parser,
and to the use of ASCII to communicate information between passes.  Both
lex and yacc produce large routines: the scanner is 17K bytes in length
(over 4.5K bytes of instructions), and the parser is 16K bytes long
(over 5.5K bytes of instructions).  On the average, the first pass
spends 20% of its time in the lexical scanner yylook, and 9% of its time
in the parser yyparse.  Using ASCII to communicate between the two
passes causes an additional speed penalty for character conversion.  On
typical programs, the first pass (parser) spends roughly 30% of its time
performing output services (i.e., calls to _doprnt (18%), _strout (8%),
and printf (4%)), while the second pass (code generator) spends roughly
21% of its time reading it back in (i.e., calls to read (18%) and rdin
(3%)).  (Additionally, the routine used to convert from ASCII to binary
contained a bug which caused -2147483648 (which is -(2**31)) to be read
as zero on our PDP-11/45.)  The above problems are not inherent to the
compiler model.  To speed up compilation, the scanner can be hand-coded
(as in the standard PDP-11 compiler), and the interpass data can be
formatted in binary (or the two passes can be combined).  With these
simple modifications (some are already in progress), it should be
possible to produce a compiler almost twice as fast as the current one.
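
The memo does not say what went wrong in the ASCII-to-binary routine, so
the following is not a reconstruction of that bug.  It is a minimal
sketch of one well-known way to convert decimal text so that
-2147483648 comes out right: accumulate the value as a negative number,
because the magnitude 2147483648 does not fit in a signed 32-bit
integer.  The function name is hypothetical, and the input is assumed to
be in range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Parse an optional '-' followed by decimal digits into a 32-bit integer.
 * The running total is kept non-positive, so the most negative value is
 * reached without ever forming +2147483648.  No range checking is done:
 * the caller is assumed to pass text that fits in an int32_t.
 */
int32_t parse_i32(const char *s)
{
    int negative = (*s == '-');
    int32_t acc = 0;                      /* always <= 0 */

    if (negative)
        s++;
    while (*s >= '0' && *s <= '9') {
        acc = acc * 10 - (*s - '0');      /* step toward more negative */
        s++;
    }
    return negative ? acc : -acc;
}
```

A routine that instead builds the magnitude as a positive number and
negates it afterwards has to overflow on this one input, which is the
usual reason the most negative integer gets special mention.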

Two features of the VAX-11 architecture -- three-address instructions
and indexed addressing mode -- were difficult to model within the basic
structure of the compiler.  The full implementation of three-address
instructions proved to be so difficult that it was not really attempted.
Instead, c2, the assembly language code improver, tries to merge several
instructions into an appropriate three-address instruction.  For
example, the statement a = b+c compiles to

    addl3   b,c,r0
    movl    r0,a

which the improver can change to:

    addl3   b,c,a

for a savings of three bytes and over 400 nanoseconds.  However, c2 will
not always succeed in this shortening.  It cannot tell the difference
between

    a = b + c;
    return;

and

    return(a = b + c);

since register r0 must be considered "live" (i.e., contains a value
which may be required later) across the return statement.
The VAX-11 has six indexed addressing modes which yield the address of
an element of a one-dimensional array of a base type (char, short, int,
long, pointer, float, or double).  The statement

    a[i] = b[j] * c[k];

where i, j, and k are declared register int and a, b, and c are double
arrays (either external or local), can be compiled into the single
instruction:

    muld3   b[j],c[k],a[i]

Although the index specifier (e.g. i in the above example) must be a
register, the base address specifier can be any addressing mode except
register, literal, or another indexed mode.  For example, the C-language
constructs a[i], (*p)[i], (--p)[i], (p++)[i], and (*p++)[i] (or their
equivalents *(a+i), *(*p+i), *(--p+i), *(p++ +i), and *(*p++ +i),
respectively) all can be done with a single VAX-11 address (where a is
an array of base type, p is a pointer to the same type, and i is of type
register int).  It is usually difficult to recognize or conveniently
represent such constructs (e.g., (*p++)[i] is fun), or generate the
possible cases (e.g., a[i] where a is not readily addressable).
The fact that the code generator can easily recognize only expression
trees of height one (two if OREG and UNARY MUL nodes are taken into
account) causes substantial difficulty in making use of indexed mode,
three address instructions, and indirect addressing.  Expression trees
of non-trivial height occur not infrequently (e.g. as a worst case, the
statement

    a = b + (*p++)[i];

has an expression tree of height six, but can be compiled into the
single instruction

    addl3   b,*(p)+[i],a

if p and i are register variables).  The complexity of the code
generator is raised by forcing the compression of subtrees into single
nodes which are then treated with special checks, special code, etc.
The size and alignment attributes of data objects are logically
independent, even though previous hardware architectures (IBM 360,
PDP-11, Interdata 8/32, ...) have imposed alignment restrictions based
on size.  The VAX-11/780 has no such restrictions, although programs run
faster with data aligned on natural boundaries.  The C language has
little notion of alignment; because of run-time penalties, the VAX-11 C
compiler aligns all the basic data types on address boundaries which are
a multiple of sizeof the basic type.  Due to questions about alignment,
both the language and the compiler have difficulty with the declaration
char c:10;.

The decision to naturally align most data items has undesirable side
effects which cannot be ignored.  Consider the structure declaration

    struct foo {
        char c;
        float f;
    } bar;

On the PDP-11, sizeof (foo) is 6 bytes while on the VAX-11, sizeof (foo)
is currently 8 bytes (the offset of f within bar is 2 and 4
respectively).  sizeof (foo) could be 5 bytes in each case.  Although
both machines use the same data formats for chars and floats, the
differing alignment imposed by the VAX-11 C compiler means that the two
machines cannot speak directly to one another using media which record
structures containing binary information.  Since alignment is important,
we feel that it ought to be specifiable in the C language.
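
The layout question can be examined on any current compiler with sizeof
and offsetof.  This is a sketch, not part of the memo: the function name
is hypothetical, and the values printed are implementation-defined, so
no particular output is claimed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The structure from the memo. */
struct foo {
    char  c;
    float f;
};

/* Print the compiler's chosen layout next to the unpadded minimum. */
void report_layout(void)
{
    printf("sizeof(struct foo)      = %zu\n", sizeof(struct foo));
    printf("offsetof(struct foo, f) = %zu\n", offsetof(struct foo, f));
    printf("sum of member sizes     = %zu\n",
           sizeof(char) + sizeof(float));
}
```

Any gap between the first and third figures is padding inserted for
alignment, which is exactly what makes binary records written on one
machine unreadable on another with a different alignment rule.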

Operating system conversion

A UNIX system running on a PDP-11/45 was used as the base for
transporting software to the VAX-11/780.  The software itself originated
with the code produced by members of Center 127, Computing Science
Research, for the Interdata 8/32.  Programs were cross-compiled,
assembled, loaded, and put on magnetic tape in tp format; absolute
bit-string files were put on tape in dd format.  Tapes were then carried
across the room to the VAX-11/780.  An absolute tape boot (in machine
language), tp boot and primary disk boot (in assembly language),
secondary disk boot (in C), and stand-alone utilities (disk formatter,
disk verifier, tape-to-disk, disk-to-tape, disk-to-disk, and
disk-to-console, all in C) were then used to bring up the system.

Establishing an initial file system on the disk took longer than
expected.  The PDP-11/45 was running USG issue 3 of the UNIX operating
system with a "16-bit" file system and the VAX-11/780 was to have a
Research version 7 "32-bit" file system.  Also, C-language code on the
VAX-11 expects the bytes of a 32-bit integer to be stored in a different
order than C-language code on the PDP-11.  We swallowed these two red
herrings hard, and suffered.  We now know that the proper way to create
an initial file system is to modify the program mkfs so that its output
(on the bootstrap machine) is a file containing the proper bits, put
that file on tape, and use the tape-to-disk utility on the target
machine.

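
The byte-order difference can be made concrete.  The sketch below
assumes the usual PDP-11 C convention for a 32-bit long (high-order
16-bit word first, each word stored low byte first) and the VAX
convention of four bytes stored least significant first; the function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Value of four bytes laid out PDP-11 style: high word first,
   low byte first within each word. */
uint32_t from_pdp11_long(const unsigned char b[4])
{
    return ((uint32_t)b[1] << 24) | ((uint32_t)b[0] << 16) |
           ((uint32_t)b[3] << 8)  |  (uint32_t)b[2];
}

/* Value of four bytes laid out VAX style: least significant byte first. */
uint32_t from_vax_long(const unsigned char b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}

/* Convert in place from PDP-11 order to VAX order by exchanging the
   two 16-bit halves; the bytes within each half keep their order. */
void pdp11_to_vax(unsigned char b[4])
{
    unsigned char t;

    t = b[0]; b[0] = b[2]; b[2] = t;
    t = b[1]; b[1] = b[3]; b[3] = t;
}
```

Under these assumptions the value 0x12345678 is stored as the bytes
34 12 78 56 on the PDP-11 and as 78 56 34 12 on the VAX, so a file
system image written by one and read by the other needs every 32-bit
field converted.
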
Mapping the software architecture of the UNIX operating system onto the
hardware architecture of the VAX-11 required a number of decisions.
Commentary on these decisions follows.

The SCB (system context base) processor register contains a page-aligned
physical memory address which is the base of the hardware interrupt
vector.  The UNIX system puts this vector at physical memory address
zero.  Operating system code, data, kernel stacks, and interrupt stack
occupy the VAX-11/780 system segment (virtual addresses 80000000 to
bfffffff).  User code and data are loaded into segment zero (0 to
3fffffff) and the user stack is initialized in segment one (7fffffff to
40000000).

User processes pass arguments to system service code using the ordinary
calls subroutine calling sequence.  The chmk instruction is then used to
gain kernel privileges.  The chmk instruction switches the stack pointer
sp from the user stack to the kernel stack, but does not change the
argument pointer ap or the frame pointer fp.  The kernel uses the value
in ap to copy the arguments into u.u_arg.  The VAX-11 hardware allows
the values to be directly addressed, but the kernel software requires
the copy.
The u area is a per-process data structure in which the operating system
keeps swappable information about a process.  The kernel virtual address
of the u area must be a constant across all processes.  The PDP-11
implementation puts the u area at kernel address 0160000; when process
switching occurs the u area is switched by changing a kernel data space
segmentation register.  Since the operating system can address user
memory on a VAX-11, the u area could be placed in (protected) user
memory, say at address 0 or at 7fffe000.  However, it was desirable for
the first implementation to make the page tables for user segments part
of the u area, which creates timing problems unless the u area lies in
system space.  The base of the u area was assigned kernel virtual
address 80020000.  When process switching occurs, the u area is changed
by changing the system-space page table and invalidating the page-table
translation cache for the appropriate pages.

Since the operating system can directly address the memory of the
current user process, the procedures fubyte, subyte, fuword, etc., are
unnecessary and could be made into macros which would merely do the
appropriate load or store.  However, these procedures (along with copyin
and copyout) were kept to ensure that each access to user space is
valid.

A VAX-11/780 internal processor register called the PCB (process context
base) points to an area in which the VAX-11/780 saves the hardware state
of the machine (96 bytes) when switching context.  This save area was
put in the u area as u_rsav.
The implementation of context switching required major effort. The
VAX-11 has two very nice instructions (svpctx, save process context, and
ldpctx, load process context) which facilitate context switching.
Unfortunately, they do not implement the mechanism which the UNIX system
expects. (The mechanism used by UNIX is so dispersed and intricately
detailed that it is hard to imagine any hardware which implements it
directly.) The temptation to drastically change the UNIX code has been
resisted so far. The savu/retu/aretu tar pit was VAX-inated, but it took
more than a week. The newer save/restore primitive does make the
C-language code prettier, but the assembly-language side (at least for
the VAX-11) is just as dirty as ever.

The UNIX context switching mechanism requires three state save areas,
u.u_rsav, u.u_ssav, and u.u_qsav, because the same mechanism is also
used for abnormal returns. The VAX-11 context switching instructions use
only a single state save area. To make use of the VAX-11 instructions,
the software simulates a great deal of microcode and bastardizes call
frames in a most ugly manner. Context switching is certainly high on the
list of things to rewrite in the second implementation (even for the
PDP-11!).

The procedures sureg and estabur were also tricky to implement. They
were designed with the assumption that only a small number (16 or fewer)
of registers would be needed to map the address space of a user process,
while on the VAX-11 a 32K process requires 64 page table entries.
Furthermore, the memory map of a process is diddled in tricky ways,
particularly in expand and getxfile.

Handling DMA I/O hardware was the other major implementation bottleneck.
The UBA and MBA mapping registers contain physical memory page numbers,
and physical addresses are hard to handle. It is not pleasant to deal
with the hardware which implements the mapping registers. If an I/O
transfer is in progress then the mapping registers may be neither read
nor written; this applies even to registers which would not be used by
the transfer. As a result, the map for the next I/O operation cannot be
set up during the current I/O operation. Furthermore, a single transfer
is limited to 64K bytes because the byte counter is only 16 bits wide.
Thus swapping a process to the disk can require multiple I/O operations.
The solution to these problems involved permanently reserving the last
129 registers in each map to service both swap and physical I/O
operations. The remaining map registers are available to map the system
buffers, and are loaded at system initialization time. Disk ECC error
correction is currently done only for I/O involving the system buffers.
Disk errors on raw I/O cause process termination; the swap area on disk
had better be error-free.
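
The 64K-byte ceiling means that a large swap has to be issued as several
transfers. A minimal sketch of that arithmetic in C follows. The name
count_transfers and the constant MAX_XFER_BYTES are illustrative
assumptions, not names from the kernel, and the limit is taken to be
exactly 65536 bytes as the text states.

```c
#include <assert.h>

/* Assumed per-transfer ceiling: the text gives 64K bytes. */
#define MAX_XFER_BYTES 65536UL

/* Number of separate I/O operations needed to move nbytes when each
 * operation may carry at most max_xfer bytes (hypothetical helper). */
unsigned long count_transfers(unsigned long nbytes, unsigned long max_xfer)
{
    assert(max_xfer > 0);
    /* Whole chunks, plus one more if a partial chunk remains. */
    return nbytes / max_xfer + (nbytes % max_xfer != 0);
}
```

Under these assumptions, swapping out a 192K-byte process image takes
three operations.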

Like the UNIX system for the PDP-11, the current implementation for the
VAX-11/780 maintains each process in contiguous physical memory and
swaps processes to disk when there is not enough physical memory to
contain them all. Reducing external memory fragmentation to zero by
utilizing the VAX-11/780 memory mapping hardware for scatter loading is
high on the list of things to do in the second implementation pass. To
simplify kernel memory allocation, the size of the user-segment memory
map is an assembly parameter which currently allows three pages of page
table or 192K bytes total for text, data, and stack. This also deserves
to be rewritten, both to allow varying process size, and to allow
processes larger than physical memory through demand paging. Dynamic
page table size would mean dynamic u area size if the page table
remained part of the u area.

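
The page-table figures quoted in this section (64 entries for a 32K
process, three pages of page table for 192K bytes) follow from the
VAX-11 page size of 512 bytes and a 4-byte page table entry. A small C
sketch of the arithmetic, with helper names made up for illustration:

```c
#include <assert.h>

#define VAX_PAGE_BYTES 512UL  /* VAX-11 page size */
#define VAX_PTE_BYTES    4UL  /* one page table entry is a longword */

/* Page table entries needed to map a process of nbytes (rounded up). */
unsigned long ptes_needed(unsigned long nbytes)
{
    return nbytes / VAX_PAGE_BYTES + (nbytes % VAX_PAGE_BYTES != 0);
}

/* Bytes of address space covered by pt_pages pages of page table. */
unsigned long bytes_mapped(unsigned long pt_pages)
{
    unsigned long entries_per_page = VAX_PAGE_BYTES / VAX_PTE_BYTES;

    assert(VAX_PAGE_BYTES % VAX_PTE_BYTES == 0);
    return pt_pages * entries_per_page * VAX_PAGE_BYTES;
}
```

Here ptes_needed(32 * 1024) gives 64 and bytes_mapped(3) gives 196608,
which is 192K.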
The code in sendsig for sending a signal to a process involves a tedious
simulation of the calls instruction due to the problem of "inward
return" across privilege modes upon termination of the routine which
handles the signal. Making a portion of the kernel code readable by a
user-mode process would simplify sendsig. Motivated by a problem with
the Bourne shell, the signal number is passed as a parameter to the
signalled routine.

Interprocess communication via signals (signal and kill) uses the
low-order bit of a machine address for something other than addressing.
This implies that a procedure which handles signals must start on an
even byte boundary, which means that every procedure must start on an
even byte boundary. The C compiler thus issues a pseudo-op to the
assembler to align the beginning of each procedure. This can waste
memory on a VAX-11. It also imposes a nontrivial requirement on the
assembler, since if the resolution of conditional jump instructions can
change the parity of the length of a procedure then the alignment
directive must also be handled like a conditional jump. In hindsight, it
would have been better if a distinct value (say +1 or -1) were used for
ignore, rather than multiplexing the bottom bit.

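
To see why multiplexing the bottom bit forces alignment, here is a
hedged sketch in C. It assumes the convention that a stored handler
value of 0 means the default action and that any value with the
low-order bit set means ignore; the enum and function names are invented
for the example and do not come from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

enum sig_action { SIG_ACT_DEFAULT, SIG_ACT_IGNORE, SIG_ACT_CATCH };

/* Classify a stored handler value under the assumed convention:
 * 0 = default, low bit set = ignore, anything else = handler address. */
enum sig_action classify_handler(uintptr_t value)
{
    if (value == 0)
        return SIG_ACT_DEFAULT;
    if (value & 1u)
        return SIG_ACT_IGNORE;  /* an odd handler address lands here too */
    return SIG_ACT_CATCH;
}

/* The alternative suggested in the text: one distinct value for ignore,
 * so every other nonzero value is usable as a handler address. */
enum sig_action classify_handler_distinct(uintptr_t value,
                                          uintptr_t ignore_value)
{
    assert(ignore_value != 0);
    if (value == 0)
        return SIG_ACT_DEFAULT;
    if (value == ignore_value)
        return SIG_ACT_IGNORE;
    return SIG_ACT_CATCH;
}
```

With the first scheme a handler placed at an odd address such as 0x1001
is silently treated as ignore; with the second it is caught normally.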
The VAX-11/780 provides a (non-maskable) trap for integer division by
zero. The system would like to turn this into a signal to the process. A
similar situation exists for subscript range trap. Integer overflow,
floating overflow, floating underflow, and reserved operand also need
signal numbers. Perhaps only one "error" signal is needed with some
other means for determining the true fault. The whole business of
interrupts, signals, asynchronous I/O, and the use of the hardware AST
mechanism deserves more attention.

A bug was discovered in the UNIX code for process termination involving
the proc and xproc structures. (The problem also existed on the PDP-11,
but it would only be noticed if a process had accumulated more than
65535 ticks of system time, which is highly unlikely.) When a process
dies its resource utilization statistics (currently only exit status,
system, and process CPU time) are temporarily saved so that they can be
added to the totals for the descendants of the parent process. The
actual accumulation is done by the kernel when the parent process issues
a wait system call; the child process is then completely erased. The
kernel was overlaying the statistics in a part of the proc structure
normally used by the scheduler to contain the pointer p_textp.
Ordinarily the exit was processed immediately, causing no harm. But if
the system was loaded so that swapping was necessary, then the scheduler
could sneak in after the child exited and before the parent read the
statistics, and would interpret the timing data in the zombie xproc
structure as a pointer. This invariably caused an illegal memory
reference from kernel mode on the VAX-11/780.

One of the greatest disappointments with the current system stems from a
design quirk in the FP-11 floating-point processor for the PDP-11. When
converting between floating-point and 32-bit integer, the FP-11 expects
the high-order 16 bits of the integer to be stored at the lower memory
address; this is not in line with the general "right to left" design of
the PDP-11, which would place the low-order 16 bits in the lower memory
address. C code for the PDP-11 uses the FP-11 convention for storing
long integers. The VAX-11 hardware stores the least significant bit of
any integer data type in the lowest addressed byte. C code for the
VAX-11 uses the hardware convention. This means that files containing
long integers represented in the local convention are not binary
compatible between a UNIX system on the VAX-11 and a UNIX system on the
PDP-11. This is the only exception for data types common to both
machines: char, short, float, and double all have a common
representation. Except for this (and the structure alignment problem
noted earlier), disk packs containing 32-bit file systems, tapes, etc.,
would have been interchangeable. The fact that DEC's Fortran-IV Plus for
the PDP-11 avoided the FP-11 convention, and that RSX-11 files are
binary compatible between the VAX-11 and the PDP-11, is only salt on an
open wound!
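
The two long-integer layouts can be made concrete with a short C sketch.
It decodes four bytes taken from a file under each convention; the
function names are assumptions chosen for the example, and the PDP-11
layout relies on the fact that each 16-bit word is itself stored low
byte first.

```c
#include <assert.h>
#include <stdint.h>

/* Decode 4 bytes laid out in the FP-11 (PDP-11 C) long format:
 * high-order 16-bit word first, each word low byte first. */
uint32_t long_from_pdp11(const unsigned char b[4])
{
    uint32_t high, low;

    assert(b != 0);
    high = (uint32_t)b[0] | ((uint32_t)b[1] << 8);
    low  = (uint32_t)b[2] | ((uint32_t)b[3] << 8);
    return (high << 16) | low;
}

/* Decode 4 bytes laid out in the VAX-11 format:
 * least significant byte at the lowest address. */
uint32_t long_from_vax11(const unsigned char b[4])
{
    assert(b != 0);
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

The bytes 34 12 78 56 (hex) decode to 0x12345678 under the PDP-11
convention but to 0x56781234 under the VAX-11 convention, so converting
a file amounts to swapping the two 16-bit halves of every long.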

Subroutine libraries

libc. Conversion of the system-call interface routines was
straightforward but tedious. Most routines are merely

        .word   0x0000
        chmk    $nn
        bcc     L1
        jmp     cerror
    L1: ret

The routines printf, ecvt, and fcvt were left to libS and were not
implemented in libc.

libS. Conversion of the standard input/output library libS posed no
problems except for _doprnt, the routine which constructs character
representations of other datatypes for the printing routines printf,
fprintf, and sprintf. Since many programs spend 15% to 20% of their
execution time within _doprnt, it pays to code the routine for speed in
assembly language. Packed-decimal instructions handle decimal, unsigned,
and floating-point conversions. The algorithm chosen for converting from
floating-point to character string revealed a microcode bug in the
VAX-11/780's ashp (arithmetic shift and round packed) instruction. Under
certain conditions a carry from the rounded digit propagated both to the
adjacent digit and to the digit eight places further left. This usually
caused an overflow, since the destination packed-decimal string was
typically not long enough to represent the spurious carry. DEC claims to
have a fix for the bug, but the FCO has not arrived. In the meantime a
five-instruction patch detects and corrects the spurious overflow.

Commands

as, ld. Code developed by Center 127 for the Interdata 8/32 was the
model for an interpretation by a VAX-11/780 artist. The assembler uses
an algorithm described in [3] with heuristic improvement of [4] to
resolve conditional jump pseudoinstructions.

Variable-length, unaligned instructions and address constants forced the
relocation information in object files to include the explicit
segment-relative address for each relocatable datum, rather than
deducing the address from a one-to-one correspondence between the
position in the segment and the corresponding position in the relocation
table. This caused a slight change in the header information within
object files.

c2. The code improver for the assembly language generated by the VAX-11
C compiler is based on a similar program for the PDP-11. A "backwards"
register usage pass, performed once and before anything else, was a
major addition. Knowing that no temporary register is live across a
backwards jump, the register usage pass introduces three-address
instructions wherever possible. It also recognizes situations where jump
on bit (jbc, jbs, jlbc, jlbs), extract field (extzv, movzbl), and move
address (moval, movab, pushal, pushab) instructions can be used. The
code for insertion of fancy loop control instructions sob, aob, acb was
also extended.

adb. The most significant change to the symbolic debugging routine was
the writing of a disassembler for VAX-11 native-mode instructions.
Additionally, the character input and output routines were modified to
use a default radix for all numeric values. The radix is initialized to
sixteen.

sh. The (Bourne) shell is the standard user command interpreter. It
required by far the largest conversion effort of any supposedly portable
program, for the simple reason that it is not portable. Critical
portions are coded in assembly language and had to be painstakingly
rewritten. The shell uses its own sbrk which is functionally different
from the standard routine in libc. The shell wants the routine which
fields a signal to be passed a parameter giving the number of the signal
being caught; signal was also a private routine. This was handled by
having the operating system provide the parameter in the first place,
doing away with the private code for signal. The code in fixargs (for
constructing the argument list to an exec system call) had to be
diddled.

ps, iostat. The process and input/output status commands consistently
referenced /dev/mem (physical memory) when they should have referred to
/dev/kmem (kernel virtual memory). iostat also assumed that certain
variables maintained by the kernel were allocated contiguously, even
though they were not declared as part of a structure.

pr. The command which formats and prints files had a bug that caused a
division by zero when it was asked to print several files and the first
file in the list did not exist. On a PDP-11 division by zero returns the
dividend, but on a VAX-11 it gives an unmaskable trap.

cat, du. These two commands did not count their arguments using the
first parameter argc, but rather assumed that an additional argument
(argv[argc], initialized as -1) could be used as a pointer. On the
PDP-11 the resulting address references the fixed end of the stack; on
the VAX-11, -1 is an illegal address.

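
The portable alternative is to walk the argument vector by count. A
minimal sketch in C, assuming nothing about what is stored at
argv[argc]; the function name total_operand_length is hypothetical and
stands in for whatever per-argument work a command does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Total length of the operands argv[1] .. argv[argc-1]. The loop is
 * bounded by argc, so argv[argc] is never read or dereferenced. */
size_t total_operand_length(int argc, char *const argv[])
{
    size_t total = 0;
    int i;

    assert(argc >= 0);
    for (i = 1; i < argc; i++)
        total += strlen(argv[i]);
    return total;
}
```

Because the loop never touches argv[argc], it behaves the same whether
that slot holds -1, a null pointer, or anything else.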
nroff, troff. The source code for the document preparation and
phototypesetter commands is not portable; several weeks were required to
produce properly running versions of these commands. Use of the explicit
(or worse, implicit) constant "2" instead of sizeof(int) was quite
common. The code assumes that variables which are adjacent in external
declarations occupy contiguous memory at execution time. Several tables
are initialized by assembly-language programs. Converting the tables was
merely tedious; changing the code which thought it knew the format of an
a.out file required some effort. This memorandum was created using the
converted nroff/troff programs on the VAX-11/780.

SCCS. Version 4 of the Source Code Control System [5] is used to provide
version backup for software in case disastrous bugs are introduced. The
source for SCCS itself had not quite been converted to version 7 UNIX,
and the header files required some massaging. The PWB routines logname
and pexec had to be simulated. The utility procedures for dynamic
storage allocation required some work to integrate them with libS and to
remove PDP-11 dialect. The exit status of the diff command changed in
version 7, causing delta to bomb. The code implicitly assumed that all
checksums were computed modulo 65536. The documentation is incorrect:
everywhere "99999" appears it should really say "65535". The procedure
satoi returns two values, storing one of them indirectly through a
pointer parameter. Naturally, satoi and its callers did not agree on
sizeof the stored value; this took a day to track down.

4. Software portability

We thank the members of Center 127, Computing Science Research, for
their efforts in producing the basic software and for their recent
efforts towards making the software more portable. The fact that people
other than the original developers can quickly create a running system
for a new machine is a tribute to how well the original work was done.

Yet in our effort to transport a complete UNIX system to the VAX-11/780
we stumbled across a large number of nonportable constructions and were
dismayed by the seeming lack of appropriate facilities to detect and
prevent them. Based on our experience, we strongly recommend that the C
language and its compiler be enhanced so that:

1. The actual arguments in a procedure call are type checked against the
   procedure declaration, and a "dummy" declaration which specifies
   types is permitted even if the called procedure is not actually
   declared in the same compilation.

2. The '->' operator is checked to ensure that the structure element on
   the right is a member of a structure to which the pointer on the left
   may point.

3. A structure element may be declared with any name as long as the name
   is unique within the immediately surrounding structure. (The current
   requirement that a structure element name must uniquely correspond to
   an offset from the beginning of the structure, across all structures
   in a compilation, creates naming problems and frequently leads to
   errors of the type noted in item 2 above.)

4. The issue of alignment to an even-byte (or other) boundary is brought
   into the open, so that arbitrary data structures can be accurately
   described.

There is a program called lint [6] which, if conscientiously used
throughout the life of a piece of software, provides type checking which
partially addresses the first two points in the above list. The problem
is that lint is big, noisy, relatively recent and unknown, and
(partially as a result) infrequently used. There is little incentive for
the average programmer to use lint as a matter of course. The authors
believe that type checking belongs in the everyday compiler as the
default, where it is very inexpensive to implement. Those who wish to do
"dirty" work may request that type checking be disabled; those who wish
to bless their dirty work may use type casts.

We believe that these four enhancements would go a long way towards
making C language software portable as a rule rather than as an
exception, thus preserving Bell Laboratories' investment in present and
future C software.

Acknowledgments. Thank you, D. M. Ritchie and S. C. Johnson, for
answering questions at key moments; G. K. Swanson, for assistance with
boot procedures and stand-alone utilities; J. F. Jarvis, for the
mathematical function library; and D. K. Sharma, for help in bringing up
user-level commands. Additional thanks go to many other members of
Centers 127 and 135, and Department 8234, for helpful comments and
suggestions.

Thomas B. London
John F. Reiser

HO-1353-tbl/jfr

Att:
References
Table 1

References

1. VAX-11/780 Architecture Handbook. Digital Equipment Corporation,
   Maynard, Massachusetts, 1977.
2. D. M. Ritchie and K. Thompson, The UNIX Time-Sharing System, CACM 17,
   7 (July 1974), 365-375. See also BSTJ 57, 6 (July-August 1978),
   1905-1929.
3. W. Wulf, R. K. Johnsson, C. B. Weinstock, S. O. Hobbs, and C. M.
   Geschke, The Design of an Optimizing Compiler. American Elsevier, New
   York, 1975.
4. J. F. Reiser, Common Instances of Pathological Span-dependent
   Instructions, TM 78-1353-3.
5. SCCS/PWB User's Manual, The Source Code Control System.
6. S. C. Johnson, lint, a C Program Checker. Computing Science Technical
   Report #65, Bell Laboratories, December 1977.

Table 1. Loaded Program Sizes (in bytes)

Columns: Program, System, Text, Data, Bss, Total.
Programs: /unix, C pass1, C pass2, grep, ls, nroff, sort.
Systems: PDP-11, VAX-11, Interdata 8/32.
[numeric entries illegible in the scan]


