Compiler Construction before 1980
Literature references and annotations by Dick Grune, dick@dickgrune.com.
Last update: Tue Nov 29 11:15:38 2022.
These references and annotations were originally intended
for personal use and are presented here only in the hope
that they may be useful to others.
There is no claim to completeness or even correctness.
Each annotation represents my understanding of the text
at the moment I wrote the annotation.
No guarantees given; comments and content criticism welcome.
Peter Calingaert,
Assemblers, Compilers and Program Translation,
Computer Science Press/Springer-Verlag,
Potomac, Maryland 20854/Berlin,
1979,
pp. 270.
The only book on compiler construction that gives serious treatment to
assemblers, linkers and loaders.
P.J. Brown,
Writing Interactive Compilers and Interpreters,
John Wiley,
Chichester,
1979,
pp. 265.
A realistic and often amusing account of the writing of an interactive
Basic compiler.
Makes a strong case for recursive descent parsers (at least in
interactive compilers), where the author interprets "recursive
descent" very loosely.
Richard Bornat,
Compilers,
MacMillan,
Hampshire,
1979,
pp. 396.
A still very readable book at a medium-advanced level, written by someone
with ample practical experience.
The only book to treat interpreters in any depth.
William A. Barrett,
John D. Couch,
Compiler Construction: Theory and Practice,
Science Research Associates,
Chicago,
1979,
pp. 661.
An extensive, readable account of compiler construction.
Possibly the choice text for someone who actually wants to write his own
parser.
Lots about error recovery.
See second edition, 1986.
C. Hoffmann,
M. O'Donnell,
An interpreter generator using tree pattern matching,
in Sixth Annual ACM Symposium on Principles of Programming Languages,
January 1979,
pp. 169-179.
Author's abstract:
Equations provide a rich, intuitively understandable notation for describing
nonprocedural computing languages such as LISP and Lucid. In this paper, we
present techniques for automatically generating interpreters from equations,
analogous to well-known techniques for generating parsers from context-free
grammars. The interpreters so generated are exactly faithful to the simple
traditional mathematical meaning of the equations - no lattice-theoretic or
fixpoint ideas are needed to explain the correspondence. The main technical
problem involved is the extension of efficient practical string matching
algorithms to trees. We present some new efficient table-driven matching
techniques for a large class of trees, and point out unsolved problems in
extending this class. We believe that the techniques of this paper form the
beginnings of a useful discipline of interpreting, comparable to the existing
discipline of parsing.
C.W. Fraser,
A compact machine-independent peephole optimizer,
in Sixth ACM Symposium on Principles of Programming Languages,
ACM,
New York,
1979,
pp. 1-6.
Author's abstract:
Object code optimizers pay dividends but are usually ad hoc and
machine-dependent. They would be easier to understand if, instead of
performing many ad hoc optimizations, they performed a few general
optimizations that give the same effect. They would be easier to implement if
they were machine-independent and parametrized by symbolic machine
descriptions. This paper describes such a compact, machine-independent
peephole optimizer.
Richard L. Sites,
Machine-independent register allocation,
in ACM SIGPLAN '79 Symposium on Compiler Construction,
ACM SIGPLAN Notices,
vol. 14,
#8,
pp. 221-225.
Aug. 1979,
Provides an algorithm with two interfaces, one static, accepting the
description of a multi-level storage system, and one dynamic, accepting
a set of variables with their live ranges.
The output is a good assignment of variables to storage.
Is capable of handling Pascal variant records, FORTRAN common blocks
etc.
Also does register allocation.
Uwe F. Pleban,
The use of transition matrices in a recursive-descent compiler,
in ACM SIGPLAN '79 Symposium on Compiler Construction,
ACM SIGPLAN Notices,
vol. 14,
#8,
Aug. 1979,
pp. 144-151.
Describes a compiler for PL/85, a structured assembly language for the
Interdata 85.
The Interdata 85 features a transition matrix instruction TM, and the
compiler is built as a set of 21 mutually recursive transition matrices
with 223 action routines.
It is basically a transduction grammar implemented by LL(1) parsing
using transition matrices; the resulting compiler is very small and
fast.
Definitely an original design.
J. Fabri,
Automatic storage optimization,
in ACM SIGPLAN '79 Symposium on Compiler Construction,
ACM SIGPLAN Notices,
vol. 14,
#8,
pp. 83-91.
Aug. 1979,
Extension of the work of Yershov (1971): overlapping of variables by
graph colouring, this time applied to PL/I.
A renaming transformation and several loop transformations are
described.
S.L. Graham,
W. Joy,
O. Roubine,
Hashed symbol tables for languages with explicit scope control,
in ACM SIGPLAN '79 Symposium on Compiler Construction,
ACM SIGPLAN Notices,
vol. 14,
#8,
Aug. 1979,
pp. 50-57.
[[ preliminary abstract ]]
"Explicit scope control" means things like Modula-2 IMPORT,
EXPORT, Ada USE, etc. and even hypothetical constructs like
USE ALL_BUT Smurf FROM Module.
Uses assorted trickery to achieve all required operations in an
efficient fashion; this includes for example rearranging the hash chains
to encode visibility.
As far as I understand it, the authors do not mind scanning the entire
hash table at scope entry and exit; since they use a hash table size of
64 (who said the size of a hash table should be a prime not close to a
power of 2?), it might indeed not be a problem.
Since there is only one diagram, considerable trickery and no explicit
code, it's a bit of a puzzle.
J.P. Banâtre,
J.P. Routeau,
L. Trilling,
An event-driven compiling technique,
Commun. ACM,
vol. 22,
#1,
pp. 34-42.
Jan. 1979,
An n-pass compilation process is turned into a one-pass process
through the following expedient.
Each piece of code that uses a value which is not yet available is
encapsulated in a syntactic construct called an "activity";
each activity waits for one or more "events",
corresponding to the values that are not yet available.
When a value becomes available, the corresponding event is generated
(explicitly); this can be in normal code, or in an activity as the
result of another event.
All this is in effect data-flow programming.
Kuo-Chung Tai,
On the implementation of parsing tables,
ACM SIGPLAN Notices,
vol. 14,
#1,
Jan. 1979,
pp. 100-101.
Points out the obvious possibility of using hashing (with the <state,
token> pair as key) to implement a parsing table, but also points out
the less obvious fact that this can implement parsing tables with
variable-length look-ahead strings.
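A minimal sketch of the basic idea as I read it (the table size and the hash function are mine, not the paper's): only the non-error entries are stored, keyed on the <state, token> pair.
/* Sketch of a hashed parse table keyed on <state, token> (illustration only). */
#include <stdio.h>

#define TABLE_SIZE 211               /* prime, to spread the keys */

struct entry {
    int state, token;                /* the key */
    int action;                      /* shift/reduce code; 0 = empty slot */
};

static struct entry table[TABLE_SIZE];

static unsigned hash(int state, int token) {
    return (unsigned)(state * 31 + token) % TABLE_SIZE;
}

void enter(int state, int token, int action) {    /* open addressing */
    unsigned h = hash(state, token);
    while (table[h].action != 0)
        h = (h + 1) % TABLE_SIZE;
    table[h].state = state;
    table[h].token = token;
    table[h].action = action;
}

int lookup(int state, int token) {   /* 0 means the (absent) error entry */
    unsigned h = hash(state, token);
    while (table[h].action != 0) {
        if (table[h].state == state && table[h].token == token)
            return table[h].action;
        h = (h + 1) % TABLE_SIZE;
    }
    return 0;
}

int main(void) {
    enter(3, 17, 42);                /* e.g. "in state 3 on token 17, do action 42" */
    printf("%d %d\n", lookup(3, 17), lookup(3, 18));    /* prints 42 and 0 */
    return 0;
}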
Wilf R. LaLonde,
A finite state approach to operator classification,
ACM SIGPLAN Notices,
vol. 14,
#1,
Jan. 1979,
pp. 46-54.
Describes the construction of a FSA for distinguishing between unary
prefix, unary suffix and binary infix operators, where some of these may
have identical representations.
To be incorporated in a lexical analyser, to aid the syntax analyser.
Will D. Gillett,
Sandra Leach,
Embedding semantics in LR parser tables,
Softw. Pract. Exper.,
vol. 8,
1978,
pp. 731-753.
To avoid having to modify the grammar to allow semantic actions to be
performed in convenient places, the entries in the LR parser table are
extended to contain pointers to lists of semantic actions, in addition
to the parser actions.
This considerably reduces the size of the parser table.
No indication is given of how to specify a semantic action so it will be
associated with the correct LR parser table entry.
R.L. Sites,
Programming tools -- Statement counts and procedure timings,
ACM SIGPLAN Notices,
vol. 13,
#12,
Dec. 1978,
pp. 98-101.
Plea for including profiling facilities even in the first version of a
compiler: statement counts and procedure timings.
Shows many uses of these.
Claims they can be implemented efficiently but shows no simple and cheap
ways to do so.
V. Sinha,
K. Vinod Kumar,
Efficient evaluation of Boolean expressions,
ACM SIGPLAN Notices,
vol. 13,
#12,
Dec. 1978,
pp. 88-97.
Explicit treatment of code generation for the Boolean operators by
passing on and manipulating true and false labels.
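A sketch of how I understand the technique (the tree representation and the pseudo-instructions are invented for the illustration): each Boolean (sub)expression is compiled by a routine that receives a true label and a false label, so the Boolean value itself is never materialized.
/* Sketch of jump-code generation with true/false labels (illustration only). */
#include <stdio.h>

enum kind { AND, OR, NOT, REL };          /* REL = a relational leaf, e.g. a < b */

struct node {
    enum kind kind;
    struct node *left, *right;            /* unused for REL */
    const char *rel;                      /* text of the relational test, for REL */
};

static int label_count = 0;
static int new_label(void) { return ++label_count; }

/* Emit code that jumps to true_lab if e holds, to false_lab otherwise. */
void bool_code(struct node *e, int true_lab, int false_lab) {
    switch (e->kind) {
    case REL:
        printf("    test %s; branch-true L%d; branch-false L%d\n",
               e->rel, true_lab, false_lab);
        break;
    case NOT:                             /* negation just swaps the labels */
        bool_code(e->left, false_lab, true_lab);
        break;
    case AND: {                           /* left false => whole expression false */
        int mid = new_label();
        bool_code(e->left, mid, false_lab);
        printf("L%d:\n", mid);
        bool_code(e->right, true_lab, false_lab);
        break;
    }
    case OR: {                            /* left true => whole expression true */
        int mid = new_label();
        bool_code(e->left, true_lab, mid);
        printf("L%d:\n", mid);
        bool_code(e->right, true_lab, false_lab);
        break;
    }
    }
}

int main(void) {                          /* code for (a<b && !(c==d)) */
    struct node r1 = { REL, 0, 0, "a<b" }, r2 = { REL, 0, 0, "c==d" };
    struct node n  = { NOT, &r2, 0, 0 };
    struct node e  = { AND, &r1, &n, 0 };
    int t = new_label(), f = new_label();
    bool_code(&e, t, f);
    printf("L%d: ...then-part...\n", t);
    printf("L%d: ...else-part...\n", f);
    return 0;
}
Running it for (a<b && !(c==d)) shows how NOT costs nothing: it merely swaps the two labels.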
O. Lecarme,
Parsing and parser generators,
in Le point sur la compilation,
IRIA,
Le Chesnay, France,
1978,
pp. 191-246.
[[ lots and lots of references ]]
M. Amirchahy,
D. Néel,
Le point sur la compilation,
(title in French: The State of Affairs in Compilation),
IRIA,
Le Chesnay, France,
1978,
pp. 365.
Collection of 10 lectures on compiler construction out of 13 given in
January 1978 in Montpellier.
The lectures cover the full width of compiler construction.
Unfortunately, the papers are presented in alphabetical order of the
first speakers, which is awkward.
There seems to be another edition around called 'State of the Art and
Future Trends in Compiler Construction' which features different page
numbers.
Jay M. Spitzen,
Karl N. Levitt,
Lawrence Robinson,
An example of hierarchical design and proof,
Commun. ACM,
vol. 21,
#12,
pp. 1064-1075.
Dec. 1978,
Pages and pages of formalism to prove correct what seems to be a simple
algorithm for common subexpression recognition or for keeping
memo-tables.
J.B. Phillips,
M.F. Burke,
G.S. Wilson,
Threaded code for laboratory computers,
SP&E,
vol. 8,
#3,
May-June 1978,
pp. 257-263.
An HP2100 is microprogrammed to work as a threaded-code interpreter.
This allows the entire (experimental) operating system, HISS, to be
written as threaded routines.
The source code is compiled by a macro-processor.
As it says in the text:
"More complex syntax is handled by more complicated macros".
Thomas G. Szymanski,
Assembling code for machines with span-dependent instructions,
Commun. ACM,
vol. 21,
#4,
April 1978,
pp. 300-308.
In addition to elaborating on the necessity of optimistic length
assignment (each span-dependent instruction starts in its shortest form)
for obtaining an optimal-size arrangement, the paper covers the
following.
1.
Even optimistic assignment fails if address fields can contain the
results of arbitrary expressions on labels: for example, if a label
address occurs negatively in an instruction operand, lengthening
another instruction can make the operand value drop below the
threshold, allowing an already lengthened instruction to shrink again.
An algorithm for catching such pathological cases is given.
2.
A two-pass implementation of the optimistic algorithm is given in detail,
using dependency graphs (the naive implementation requires three passes).
3.
The general problem is proven to be NP-complete.
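A sketch of the optimistic core for the simple case without label arithmetic (my illustration, not Szymanski's two-pass algorithm; the toy program and the range limit are invented):
/* Sketch: optimistic sizing of span-dependent branches; every branch starts
   short and is only ever lengthened. */
#include <stdio.h>
#include <stdlib.h>

#define N 4                          /* instructions in the toy program */
#define SHORT_RANGE 4                /* largest displacement a short branch holds */

struct insn {
    int is_branch;                   /* 1 if span-dependent */
    int target;                      /* index of the target instruction */
    int size;                        /* current size: 1 (short) or 3 (long) */
};

int main(void) {
    /* toy program: insn 0 branches forward to 3, insn 3 branches back to 0 */
    struct insn prog[N] = { {1, 3, 1}, {0, 0, 2}, {0, 0, 2}, {1, 0, 1} };
    int addr[N + 1];

    for (int changed = 1; changed; ) {
        changed = 0;
        addr[0] = 0;                 /* assign addresses under the current sizes */
        for (int i = 0; i < N; i++)
            addr[i + 1] = addr[i] + prog[i].size;
        for (int i = 0; i < N; i++) {
            if (!prog[i].is_branch || prog[i].size != 1)
                continue;
            if (abs(addr[prog[i].target] - addr[i + 1]) > SHORT_RANGE) {
                prog[i].size = 3;    /* lengthen; a branch is never shortened */
                changed = 1;
            }
        }
    }
    for (int i = 0; i < N; i++)
        printf("insn %d at address %d, size %d\n", i, addr[i], prog[i].size);
    return 0;
}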
Forest Baskett,
The best simple code generation technique for WHILE, FOR, and DO loops,
ACM SIGPLAN Notices,
vol. 13,
#4,
April 1978,
pp. 31-32.
Shows how to avoid the unconditional jump at the end of naive code by
duplicating the test before the loop (does not show the jump into the
loop).
Argues why this is better.
Suggests generating special labels for loops, to aid code optimization.
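The two code shapes, written out with gotos in C for a simple counting loop (my rendering of the point, not code from the note):
/* Sketch of the two loop shapes for "while (i < n) body". */
#include <stdio.h>

int sum_naive(int n) {               /* naive shape: test at the top, jump back */
    int i = 0, s = 0;
loop:
    if (!(i < n)) goto done;
    s += i; i++;                     /* the loop body */
    goto loop;                       /* unconditional jump executed every iteration */
done:
    return s;
}

int sum_baskett(int n) {             /* the test is duplicated before the loop */
    int i = 0, s = 0;
    if (!(i < n)) goto done;
loop:
    s += i; i++;                     /* the loop body */
    if (i < n) goto loop;            /* single conditional branch per iteration */
done:
    return s;
}

int main(void) {
    printf("%d %d\n", sum_naive(10), sum_baskett(10));    /* both print 45 */
    return 0;
}
Both functions compute the same sum; the second executes one branch per iteration where the first executes a conditional branch plus an unconditional jump.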
Andrew S. Tanenbaum,
Implications of structured programming for machine architecture,
Commun. ACM,
vol. 21,
#3,
March 1978,
pp. 237-246.
The demise of the goto and the advent of structured programming have
altered the mix of machine instructions generated by compilers, but
machine architecture has not followed suit.
To find the new mix, a very structured simple language SAL was
designed and many programs were written in it.
These programs were then analysed for the frequency of their features.
These features were then expressed as machine instructions of a new
machine, EM-1.
This resulted in a set of machine instructions with estimated
frequencies of static use; tables are given in detail.
The results confirm Knuth's observation that almost all operations done
by a program are very simple.
Huffman encoding of the operations based on the frequencies would lead
to opcodes of 3 bits, 4 bits, 5 bits, etc., which is unsuitable
hardware-wise (and software-wise).
To obtain byte-sized opcodes, the smaller opcodes are combined with
codes for frequent operands (since the frequencies of the operands are
also known).
For example, the opcode for PUSH_LOCAL occupies 4 bits, and the
remaining 4 bits of the same byte encode the offsets 0 through 11; the
4 unused bit patterns serve as additional 8-bit opcodes for rarer
operations.
Code size is compared to that for the PDP-11 and the Cyber and in both
cases the EM-1 code is a factor 3 smaller.
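A sketch of such a packed encoding (the opcode values and the cut-off at 12 are invented; EM-1's actual encoding may differ):
/* Illustration of packing a frequent opcode with small operands into one byte. */
#include <stdio.h>

#define PUSH_LOCAL 0x40              /* high nibble 4 = PUSH_LOCAL in this sketch */

void decode(unsigned char byte) {
    if ((byte & 0xF0) == PUSH_LOCAL) {
        int low = byte & 0x0F;
        if (low < 12)                /* offsets 0..11 are encoded in the same byte */
            printf("PUSH_LOCAL %d\n", low);
        else                         /* the 4 leftover values are separate 8-bit opcodes */
            printf("rare opcode 0x%02X\n", byte);
    } else {
        printf("other opcode 0x%02X\n", byte);
    }
}

int main(void) {
    decode(0x43);                    /* PUSH_LOCAL 3 */
    decode(0x4E);                    /* one of the reclaimed 8-bit opcodes */
    decode(0x91);                    /* something else entirely */
    return 0;
}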
R.S. Glanville,
S.L. Graham,
A new method for compiler code generation,
in Fifth ACM Symposium Principles of Programming Languages,
ACM,
New York,
Jan. 1978,
pp. 231-240.
Author's abstract:
An algorithm is given to translate a relatively low-level intermediate
representation of a program into assembly code or machine code for a target
computer. The algorithm is table driven. A construction algorithm is used to
produce the table from a functional description of the target machine. The
method produces high quality code for many commercially available
computers. By replacing the table, it is possible to retarget a compiler for
another kind of computer. In addition techniques are given to prove the
correctness of the translator.
S.C. Johnson,
A portable compiler -- Theory and practice,
in Fifth ACM Symposium on Principles of Programming Languages,
ACM,
New York,
pp. 97-104.
Jan. 1978,
The title is easily misread: not
"Theory and practice of a portable compiler" but
"A portable compiler -- An essay on theory and practice of compiler
construction".
The gist of the paper is that hardly any theory was directly applicable
to the design and construction of the portable C compiler, but that the
insights provided by this theory were invaluable.
The usual manifestation of this was that for a given problem the theory
gave a clean algorithm that was inapplicable because of overly naive
assumptions, machine idiosyncrasies, size limitations, etc., but which
could be patched so as to yield a workable algorithm.
Several examples are given.
For example, the Sethi-Ullman algorithm for code generation for
expressions requires the resource requirements to be known exactly.
This is infeasible for real-world instructions that use several
registers from several classes.
So the compiler uses approximations, which almost always work, but
occasionally don't.
In those cases the compiler just gives up.
The compiler consists of about 8000 lines, of which about 1600 are
machine-dependent, and of which about 800 have to be changed to port the
compiler to a different machine.
Knowing which 800 and how to change them requires considerable
understanding of the issues involved.
A. Aho,
J. Ullman,
Principles of Compiler Design,
Addison-Wesley,
1977,
The "Green Dragon Book". Annotate!!!!???
R.S. Glanville,
A Machine Independent Algorithm for Code Generation and its Use in Retargetable Compilers,
PhD thesis,
University of California,
Berkeley,
Dec. 9, 1977,
pp. 124.
The program is considered as a sequence of arithmetic expressions, and
only the unary and binary arithmetic and the move instructions of the
machine are considered.
Both the instructions and the code to be translated (the intermediate
code) are initially represented in tree form; next, these trees are
linearized in prefix form.
The instructions are considered as rules of a grammar and the
intermediate code as a string to be parsed.
Note that this grammar has a very restricted form: its LHSs and RHSs
obey the following grammar:
<LHS> ::= <register> | <epsilon>
<RHS> ::= <constant> | <unary operator> <RHS> | <binary operator> <RHS> <RHS>
<constant> ::= <register> | <numeric constant> | <address>
A LHS <register> means that the instruction leaves a result in that
register, <epsilon> means that it does not produce a result, i.e., it is
an assignment operator.
Productions of <register> (r0, r1, etc.) are the only nonterminals in
the instruction grammar.
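A hypothetical fragment of such an instruction grammar, in the restricted form given above (the instructions are invented for illustration and are not from the thesis):
<register> ::= + <register> <register>           (register-register add)
<register> ::= + <register> <numeric constant>   (add immediate)
<epsilon> ::= := <address> <register>            (store a register into memory)
Code generation then amounts to parsing the prefix-linearized intermediate code with such rules, emitting the corresponding instruction at each reduction.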
First an SLR(1) parsing table is constructed; shift-reduce conflicts are
resolved by shifting, reduce-reduce by taking the longest pattern.
Next the author proves that if application of this table allows object
code to be produced, that code is correct.
Two worries remain: the code generation can loop and it can get stuck
(block).
Looping can only occur through productions of the form
<register> ::= <register>,
since looping can only be caused by looping through reductions, and all
other rules either reduce the size of the stack upon reduction or are
not recursive.
Now the SLR(1) table is scanned for states leading to a
<register> ::= <register>
reduction, and a directed graph of these states is constructed.
Cycles in this graph (indications of looping) are removed by judiciously
modifying the reduce-reduce conflict resolution decisions involved in
those states.
Note that this process has to be performed on the parsing table rather
than on the original grammar, since only those applications of the
<register> ::= <register>
rules that have made it into states in the table, past the conflict
resolution, have to be examined.
The grammar itself is full of loops anyway.
The blocking check in essence requires one to show that the grammar
expressed by the parse table is equivalent to that of a general
arithmetic expression (the intermediate code); this asks for the
equivalence of two grammars to be established.
Fortunately the grammars are very simple and have many special
properties, which are exploited.
The possibility of blocking is checked by a simple but powerful device:
for each state in the table in which an arithmetic expression is
expected it is checked that any arithmetic expression is acceptable.
(The author shows that this check is actually slightly too strong.)
The states that expect an expression are identified as those containing
an item of one of the following forms:
<RHS> ::= <unary operator> • <RHS>
<RHS> ::= <binary operator> • <RHS> <RHS>
<RHS> ::= <binary operator> <RHS> • <RHS>
(and their composites),
and the acceptability of any expression is tested by checking that all
tokens (operators) in FIRST(<expression>) are accepted.
The consideration that each such token again leads to a state which has
one of the above forms closes the proof.
Precise algorithms are given.
Unlike the loop check, the blocking check does not modify the parse
table; it is a straight yes-or-no test.
The case of an LR(0) parser is also considered; in that case, the
blocking check can profitably modify the parse table.
The remainder of the thesis contains examples for the PDP-11 and the IBM
370.
The (initially preferred) alternative of using postfix notation is also
examined and found wanting: not enough information is known early enough
to generate good code from postfix notation by doing left-to-right
parsing.
John Cocke,
Ken Kennedy,
An algorithm for reduction of operator strength,
Commun. ACM,
vol. 20,
#11,
pp. 850-856.
Nov. 1977,
Detailed algorithms for the reduction of multiplication to addition on
induction variables in strongly connected regions.
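The classical transformation in a small C illustration (mine, not the paper's algorithm): the multiplication of the induction variable is replaced by an extra variable that is kept up to date by addition.
/* Sketch of strength reduction on an induction variable. */
#include <stdio.h>

#define N 5

int main(void) {
    int before[N], after[N];

    /* Before: i*4 is recomputed by multiplication in every iteration. */
    for (int i = 0; i < N; i++)
        before[i] = i * 4;

    /* After reduction: a new variable t tracks i*4 and is updated by addition. */
    int t = 0;
    for (int i = 0; i < N; i++) {
        after[i] = t;
        t += 4;                      /* replaces the multiplication */
    }

    for (int i = 0; i < N; i++)
        printf("%d %d\n", before[i], after[i]);    /* identical columns */
    return 0;
}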
Guy Lewis Steele Jr.,
Debunking the `Expensive Procedure Call' myth, or, Procedure call implementations considered harmful,
in ACM National Conference,
Oct. 1977,
pp. 153-162.
High-paced, opinionated and readable oration for the implementation of
procedure calls in the MacLISP compiler: do tail recursion elimination
and do not use the hardware subroutine call instruction but rather roll
your own procedure calling sequence.
Callers should save their registers before calling, since they know what
is important.
After having shot to pieces the Boehm-Jacopini theorem that any program
can be implemented with structured constructs plus variables only, the
author comes up with another theorem, attributed to McCarthy [1960], saying
that any program can be implemented with structured constructs without
adding variables.
The trick is to replace the GOTO by a procedure call (but there is no
trace of a continuation in the paper).
This is, of course, known as the stack-overflow method, but the author
relies on the compiler to do tail recursion elimination, and then the
problem goes away.
A tail-recursive implementation of a finite-state automaton serves as an
example, but the author admits that the new dress is only an optical
illusion.
A diatribe against eliminating unwanted concepts by outlawing constructs
closes the paper.
Lots of material for thought and more opinion.
R. Morrison,
A method for implementing procedure entry and exit in block structured high level languages,
Software -- Pract. Exper.,
vol. 7,
#4,
pp. 537-539.
July-Aug. 1977,
I think the author says that using displays for static links is never
worth the effort and that a global base pointer, a local base pointer, a
top of stack pointer + occasional chasing down the static chain is much
simpler and at least as fast.
G. Yuval,
The utility of the CDC 6000 registers,
Software -- Pract. Exper.,
vol. 7,
#4,
pp. 535-536.
July-Aug. 1977,
Experimenting with the Pascal compiler for the CDC 6000, the author
finds that reducing the number of available registers from 8 to 5 causes
virtually no increase in code size (0 to 2 percent).
B. Houssais,
Verification of an ALGOL 68 implementation,
ACM SIGPLAN Notices,
vol. 12,
#6,
pp. 117-128.
June 1977,
The implementation is tested by running a series of generated test
programs.
Each program addresses a specific issue, for example assignments, and is
generated by a specific affix grammar.
The grammar is processed to yield a test program containing all
combinations allowed by the grammar.
For example, assignments are generated with all possible syntactic
constructs for its left-hand side combined with all possible constructs
for its right-hand side.
The affixes are used to keep track of the semantics of the generated
forms, which are restricted to integer values wherever possible.
30 grammars have been written, together about 1500 lines of code,
generating about 20000 lines of compact test code.
58 errors were detected this way in the Rennes Algol 68 implementation.
An attempt (actually preceding the presented research) was made to apply
this technique to the entire Algol 68 grammar, using probabilistic
choices rather than exhaustive combination, but the results proved
uncontrollable (not explained further).
B.A. Wichmann,
How to call procedures, or second thoughts on Ackermann's function,
SP&E,
vol. 7,
#3,
May-June 1977,
pp. 317-329.
The efficiency of about 60 procedure calling sequences from various
compilers is compared.
The calling sequences themselves are not given.
John Couch,
Terry Hamm,
Semantic structures for efficient code generation on a stack machine,
IEEE Computer,
vol. 10,
#5,
May 1977,
pp. 42-48.
Part of a set of articles in this issue of IEEE Computer about stack
machines.
Explains simple code generation for same, using recursive descent
routines.
Includes one optimization for the compilation of Boolean expressions: when
the result of a Boolean expression (or subexpression) is used for a jump
decision, the true and / or false labels are passed to the recursive
routines.
This avoids constructing the Boolean value when not needed.
G. Yuval,
Is your register really necessary?,
Software -- Pract. Exper.,
vol. 7,
#2,
pp. 295.
March-April 1977,
Experimenting with the BLISS/11 compiler for the PDP/11, the author
finds that reducing the number of available registers from 8 to 4 causes
an increase in code size of some 12 percent.
But even a reduction to 7 registers makes a difference of 1 to 2 percent.
A.V. Aho,
S.C. Johnson,
J.D. Ullman,
Code generation for expressions with common subexpressions,
J. ACM,
vol. 24,
#1,
pp. 146-160.
Jan. 1977,
[[ preliminary abstract ]]
First proves that generating code for common subexpressions is
NP-complete by showing it equivalent to the FNS (Feedback Node Set)
problem.
Next examines a number of heuristics: ....
W. Harrison,
A new strategy for code generation -- The general-purpose optimizing compiler,
in Fourth ACM Symposium on Principles of Programming Languages,
ACM,
New York,
Jan. 1977,
pp. 29-37.
Author's abstract:
This paper presents a systematic approach to the problem of generating good
code with a compiler that is easy to construct. A compiler structure is
proposed which relies on interprocedural data flow analysis, global
optimization, and an intermediate language schema to simplify the task of
writing the code generating portions of a compiler without sacrificing code
quality. This structure is contrasted with a more conventional structure to
explore the reasons why the new structure solves several problems inherent in
the conventional structure. Further advantages which accrue from the new
structure are also presented.
P.M. Lewis II,
D.J. Rosenkrantz,
R.E. Stearns,
Compiler design theory,
Systems Programming Series,
Addison-Wesley,
Reading, Mass,
1976,
pp. 647.
Elementary verbose introduction.
The book covers the theory of formal languages and parsers in a
slow-paced fashion.
Grammars are treated separately from parsers, which gives the book an
odd structure.
Precedence parsing is explained without mentioning the word
"precedence".
It has a valuable appendix on grammar transformations
and an extensive bibliography.
P. Branquart,
J.-P. Cardinael,
J. Lewi,
J.-P. Delescaille,
M. Vanbegin,
An Optimized Translation Process and Its Application to ALGOL 68,
Lecture Notes in Computer Science #38,
Springer-Verlag,
Berlin,
1976,
pp. 334.
Detailed description of code generation for Algol 68 for the EL-X8.
The version of Algol 68 is that of MR101, 1969.
The code includes the garbage collector, and many other details.
H. Schorr,
Compiler writing techniques and problems,
in Software Engineering -- Concepts and Techniques,
ed. by Peter Naur, Brian Randell and J. N. Buxton,
Petrocelli/Charter,
1976,
pp. 252-260.
Compiler writing techniques are divided in four classes: ad-hoc,
table-driven, compiler writing machines, and large primitives.
These terms are to be understood as follows.
Table-driven: a formal description of an aspect of the input is
specified in the form of a table, and supplied to the compiler.
Compiler writing machines: the use of a special compiler writing
language, better suited to express the special problems of compiler
writing.
Large primitives: the use of separate, pre-existing programs to perform
steps in the compiler, for example the use of the assembler as a
primitive.
Ad-hoc: all programming that is not covered by one of the above.
All this is explained in the context of a specific compiler construction
problem: compiling one line of FORTRAN.
P. Henderson,
J.H. Morris,
A lazy evaluator,
in Third ACM Symposium on Principles of Programming Languages,
ACM,
New York,
1976,
pp. 95-103.
Author's abstract:
A different way to execute pure LISP programs is presented. It delays the
evaluation of parameters and list structures without ever having to perform
more evaluation steps than the usual method. Although the central idea can be
found in earlier work this paper is of interest since it treats a rather
well-known language and works out an algorithm which avoids full
substitution. A partial correctness proof using Scott-Strachey semantics is
sketched in a later section.
Andrew S. Tanenbaum,
A general purpose macro processor as a poor man's compiler-compiler,
IEEE Trans. Softw. Eng.,
vol. SE-2,
#2,
June 1976,
pp. 121-125.
Report of how the ML/I macro processor was used to compile SAL programs.
ML/I is a recursive macro processor that can do $epsilon$-free LL(1) parsing,
and possesses attribute lists which can be used as symbol tables.
SAL is a systems programming language.
John Bruno,
Ravi Sethi,
Code generation for a one-register machine,
J. ACM,
vol. 23,
#3,
pp. 502-510.
July 1976,
Prove that optimal code generation from a DAG for a one-register machine
is NP-complete, by proving it equivalent to SAT3.
J. King,
Symbolic execution and program testing,
Commun. ACM,
vol. 19,
#7,
July 1976,
pp. 385-394.
Symbolic execution is motivated and described as an intermediate form
between debugging by testing and program proving, in that it extends the
input sets used in testing by making some or all input values symbolic.
Such a symbolic input set represents an infinite number of actual test
sets.
Next a fairly thorough approach to symbolic execution is described, in
which the values that circulate in the program graph are accompanied by
a so-called "path condition", which tells under which conditions the
assembled value holds.
Since this leads to infinitely large value expressions, an interactive
symbolic executor called EFFIGY is introduced, which allows the user to
interactively simplify the path expressions, for example by indicating
which of the two branches of a particular if-statement is to be
investigated.
A moderate amount of mathematics is supplied to support all this.
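A tiny worked example of the idea (mine, not from the paper): executing the fragment below with the symbolic input X produces two paths, each summarized by a path condition and a symbolic result.
/* A fragment to be executed symbolically with input x = X (a symbol). */
int f(int x) {
    int y;
    if (x > 0)
        y = x + 1;   /* path 1: path condition X > 0,  symbolic result X + 1 */
    else
        y = -x;      /* path 2: path condition X <= 0, symbolic result -X   */
    return y;        /* each path condition stands for infinitely many tests */
}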
A.V. Aho,
S.C. Johnson,
Optimal code generation for expression trees,
J. ACM,
vol. 23,
#3,
July 1976,
pp. 488-501.
Describe code generation by tree rewriting and dynamic programming, and
prove its optimality.
Lots of maths.
F.E. Allen,
J. Cocke,
A program data flow analysis procedure,
Commun. ACM,
vol. 19,
#3,
March 1976,
pp. 137-147.
Describes algorithms about reducible and irreducible flow graphs in
terms of intervals rather than in terms of the newer notion of dominators.
Many (necessarily older) literature references.
William A. Wulf,
Richard K. Johnsson,
Charles B. Weinstock,
Steven O. Hobbs,
Charles M. Geschke,
The Design of an Optimizing Compiler,
American Elsevier,
New York,
1975,
pp. 165.
Detailed report of the construction of the Bliss/11 compiler for the
PDP-11.
Ben Wegbreit,
Property extraction in well-founded property sets,
IEEE Trans. Software Eng.,
vol. SE-1,
#3,
Sept. 1975,
pp. 270-285.
A property set is a partially ordered set with semantic extensions: it
has predicates on the elements and a semantic interpretation function.
It is well-founded if all increasing chains have the same upper limit.
They are used to model semantic properties of the variables and flow of
control in each position in a program.
The paper shows techniques to extract specifications of such
well-founded property sets automatically from the program.
Lots of math.
R. Dewar,
Indirect threaded code,
Commun. ACM,
vol. 18,
#6,
June 1975,
pp. 330-331.
Direct threaded code (Bell, "Threaded code", Commun. ACM, Vol. 16, No.
6, June 1973, pp. 370-372) consists of addresses to generated routines.
In particular, a separate routine is generated for loading and
storing each variable in the program, for each loop, etc.
These routines have much code in common, which can be factored out and
stored in the library, and replaced by the addresses of that code.
These addresses now constitute threaded code, and the addresses of these
stretches of threaded code now form the program, as indirect threaded
code.
There are several advantages: 1. the generated code is smaller; 2. the
generated code consists of addresses only and is machine-independent; 3.
if the directly threaded routines are allocated dynamically, dead code
is removed by the garbage collector, since no other routine points to
it.
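My sketch in C of the extra level of indirection (the original works with machine addresses rather than function pointers): the program is a sequence of word addresses, and each word starts with the address of the shared code that interprets it.
/* Indirect threaded code sketched with function pointers (illustration only). */
#include <stdio.h>

typedef void (*code_t)(void);

/* A "word": its first field is the address of shared library code; the
   second field is what makes this particular word unique. */
struct word { code_t code; int operand; };

static struct word *current;             /* word being executed */
static int stack[16], sp = 0;

static void do_push(void)  { stack[sp++] = current->operand; }
static void do_add(void)   { sp--; stack[sp - 1] += stack[sp]; }
static void do_print(void) { printf("%d\n", stack[--sp]); }
static void do_halt(void)  { }

/* Many words share the same code; only their operands differ. */
static struct word push2 = { do_push, 2 }, push3 = { do_push, 3 };
static struct word add   = { do_add,  0 }, prnt  = { do_print, 0 };
static struct word halt  = { do_halt, 0 };

int main(void) {
    /* The indirect threaded code: a sequence of word addresses. */
    struct word *program[] = { &push2, &push3, &add, &prnt, &halt };
    struct word **ip = program;          /* the "position counter" */
    for (;;) {
        current = *ip++;                 /* fetch the next word address */
        if (current->code == do_halt)
            break;
        current->code();                 /* jump through the word's code field */
    }
    return 0;                            /* the example prints 5 */
}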
John A.N. Lee,
The Anatomy of a Compiler,
(2nd Edn.),
Van Nostrand Reinhold,
New York,
1974,
pp. 470.
Compiler construction the literary way.
Engaged, fascinating and annoying account of compiler construction in
the late sixties and early seventies, with references to Edgar
Allan Poe, the Rubaiyat, Sherlock Holmes and, you guessed it, Alice in
Wonderland, written in a rich and occasionally perhaps overly ripe prose.
The techniques are mostly geared to compiling Fortran and PL/I, but
Algol with its call by name is not forgotten; no word about Algol 68.
The subjects and the terminology deviate so much from modern usage that
the modern student will require a translation.
For example, hash tables are explained but are called "posting systems"
(huh?), the top-down parsing algorithm is like nothing I've ever seen (it
seems deterministic CYK with a top-down component); the bottom-up
parsing, called "reductive analysis", is Dömölki's algorithm, called
Domelki's algorithm here.
All very interesting, but unusual.
For the connoisseur.
F.L. Bauer,
Historical remarks on compiler construction,
in Compiler Construction: An Advanced Course,
ed. by F.L. Bauer and J. Eickel,
Lecture Notes in Computer Science #21,
Springer-Verlag,
Berlin,
pp. 603-621.
1974,
Treasure trove of facts, factoids, literature references, etc.
Names Rutishauser as the first to fully describe the compilation process.
J.J. Horning,
What the compiler should tell the user,
in Compiler Construction: An Advanced Course,
ed. by F.L. Bauer and J. Eickel,
Lecture Notes in Computer Science #21,
Springer-Verlag,
Berlin,
pp. 525-548.
1974,
Detailed analysis of the communication between compiler and programmer,
in a batch environment, in which the compiler is the best tool for the
programmer to inspect his program.
C.H.A. Koster,
Using the CDL compiler-compiler,
in Compiler Construction, An Advanced Course,
ed. by F.L. Bauer and J. Eickel,
Lecture Notes in Computer Science #21,
Springer-Verlag,
Berlin,
1974,
pp. 366-426.
Extensive description of the affix grammar processor CDL and its application to
compiling ALGOL 68.
W.M. McKeeman,
Symbol table access,
in Compiler Construction, An Advanced Course,
ed. by F.L. Bauer and J. Eickel,
Lecture Notes in Computer Science #21,
Springer-Verlag,
New York,
pp. 253-301.
1974,
Four name list algorithms with scope handling are compared: lists with
linear search, sorted lists with binary search, trees and hash tables.
Each algorithm is programmed in full in XPL and runs are compared both
in theory and in practice.
The hash tables win and the sorted lists lose badly due to resorting at
scope exit.
M. Griffiths,
Run-time storage management,
in Compiler Construction, An Advanced Course,
ed. by F.L. Bauer and J. Eickel,
Lecture Notes in Computer Science #21,
Springer-Verlag,
New York,
pp. 195-221.
1974,
Clear summary of the subject, including:
activation records, static and dynamic linking, displays, parameter
addressing, implementation of labels, array indexing, field addressing
in structures, linked lists and garbage collection.
Jean Vuillemin,
Correct and optimal implementations of recursion in a simple programming language,
J. Comput. Syst. Sci.,
vol. 9,
#3,
Dec. 1974,
pp. 332-354.
Lots of theory about recursive functions and computation lattices.
Seems to propose lazy evaluation (1974!).
R.A. Freiburghouse,
Register allocation via usage counts,
Commun. ACM,
vol. 17,
#11,
Nov. 1974,
pp. 638-642.
Usage counts are established by a modification of an algorithm from
Gries' book for eliminating redundant computations.
Next it is shown how these usage counts can be used for register
allocation.
An extension to "non-linear regions" (more than one basic block) is
considered.
Frederick W. Weingarten,
Translation of Computer Languages,
Holden-Day,
San Francisco, Calif.,
1973,
pp. 180.
Actually only about parsing, but rather extensively so.
One of the few books to cover practical implementations of non-linear
parsing techniques, for example Earley's.
Alfred V. Aho,
Jeffrey D. Ullman,
The Theory of Parsing, Translation and Compiling: Volume II: Compiling,
Prentice-Hall, Inc.,
Englewood Cliffs, N.J.,
1973,
pp. 543-1002.
In spite of the subtitle of the volume, the first 180 pages are a
continuation of volume I and are concerned with parsing.
Precedence functions, bounded-context, LR(k), parsing automata and the
underlying theory of deterministic parsing are treated in detail.
Like volume I, even after 20 years the book has lost nothing of its
original freshness.
D.E. Knuth,
F.R. Stevenson,
Optimal measurement points for program frequency counts,
BIT,
vol. 13,
1973,
pp. 313-322.
[ Kirchhoff's laws ]
Peter F. Stockhausen,
Adapting optimal code generation for arithmetic expressions to the instruction sets available on present-day computers,
Commun. ACM,
vol. 16 / 17,
#6 / 10,
June 1973 / Nov. 1974,
pp. 353-354 / 591.
Restricts Sethi and Ullman's algorithm (JACM Oct. 1970) so that it does not
generate instructions of the form $ R sub 1 # R sub 2 → R sub 2 $ for any
operator #, since machines do not normally have such operations.
Arnold L. Rosenberg,
Transitions in extendible arrays,
in ACM Symposium on Principles of Programming Languages,
ACM,
New York,
Oct. 1973,
pp. 218-225.
[ what is this about? ]
F.L. Morris,
Advice on structuring compilers and proving them correct,
in Symposium on Principles of Programming Languages,
ACM,
New York,
Oct. 1973,
Boston, Massachusetts,
pp. 144-152.
Author's abstract:
The purpose of this paper is to advise an approach (and to support that advice
by discussion of an example) towards achieving a goal first announced by John
McCarthy: that compilers for higher-level programming languages should be made
completely trustworthy by proving their correctness. The author believes that
the compiler-correctness problem can be made much less general and
better-structured than the unrestricted program-correctness problem; to do so
will of course entail restricting what a compiler may be. The essence of the
present advice is that a proof of compiler correctness should be the proof
that a diagram of the form commutes. (The symbols in the diagram refer to the
example to be given below.) It is of course not very startling to regard a
compiler as a function assigning target language programs to source language
programs; however, the rest of the diagram does have a non-vacuous content: it
insists that 'semantics' should be an actual function assigning definite
mathematical objects ('meanings') to programs--usually partial functions of
some fairly complicated type. That is, it rejects interpretive forms of
semantic definition, such as those provided by the Vienna definition method,
or by sets of reduction rules of some form, which only provide for any given
pair of program and input data some prescription for finding the output (if
any). The bottom, 'decoding' side of the diagram is merely an allowance for
the highly probable eventuality that the meanings of target language programs
are not identical with, but only represent in some adequate way the meanings
of the source language programs from which they were compiled.
Daniel G. Bobrow,
Ben Wegbreit,
A model and stack implementation of multiple environments,
Commun. ACM,
vol. 16,
#10,
pp. 591-603.
Oct. 1973,
The paper consists of three parts: primitives, implementation and
extensions.
Blocks and procedures are unified for this purpose in that both have an
activation record when activated, called a frame here.
A frame consists in principle of two parts: a basic frame, constructed by
the caller, which contains the parameters and a reference count, and a
frame extension, constructed by the callee, which contains
a "control link" (CLINK, dynamic link),
an "access link" (ALINK, static link),
a "binding link" (BLINK, link to the basic frame),
a return type indication,
the address of a routine to return a return value,
a control point pointer (for backtracking), and
an in-use flag.
On this data structure, seven operations ("primitive control functions")
are defined, which are then used to write a number of more usual
non-primitive control functions in Algol-like languages, LISP-like
languages, languages with coroutines and languages with backtracking.
Several examples are given, including the 8 queens problem.
The implementation section first explains the use of the reference and
in-use counts.
Next an optimistic implementation on a stack is described: each frame is
allocated on the stack, but is freed if its reference or use count
becomes zero; if the freed frame is on the top of the stack, the stack
shrinks, otherwise there will be a hole in the stack.
This requires garbage collection, and compactification of the stack.
Several techniques are proposed to postpone the garbage collection.
The model is then extended for shallow binding, label variables,
cooperating sequential processes and multiple processors.
J.R. Bell,
Threaded code,
Commun. ACM,
vol. 16,
#6,
June 1973,
pp. 370-372.
The source program is translated to a sequence of subroutine addresses,
and a machine register R is designated as "position counter" in this
sequence.
Each of the subroutines ends in the "threaded-code instruction", an
instruction which sets the program counter to the address under R and
then increments R by the size of a pointer.
The corresponding PDP-11 instruction is JMP @(R)+.
The subroutines can be passed parameters by including them or their
addresses in the sequence and picking them up using the same addressing
mode @(R)+.
The linker can be used to select only those subroutines from the library
that are actually used by the sequence.
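A rendering of the scheme in C (mine; the original relies on the JMP @(R)+ addressing mode instead of a dispatch loop):
/* Direct threaded code modelled in C (illustration only). */
#include <stdio.h>

typedef void (*routine)(void);

static int acc;
static void load5(void)   { acc = 5; }
static void double_(void) { acc *= 2; }
static void print(void)   { printf("%d\n", acc); }
static void halt(void)    { }

int main(void) {
    /* The "program": just a sequence of subroutine addresses. */
    routine program[] = { load5, double_, print, halt };
    routine *r = program;                /* plays the role of the register R */
    for (;;) {                           /* each step is the software JMP @(R)+ */
        routine next = *r++;
        if (next == halt)
            break;
        next();
    }
    return 0;                            /* the example prints 10 */
}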
J.D. Ullman,
Fast algorithms for the elimination of common subexpressions,
Acta Informatica,
vol. 2,
#3,
1973,
pp. 191-213.
"Common subexpressions" here means loop-invariant expressions.
Several algorithms to obtain these are described in detail and proven
correct.
Their complexity is also assessed: $ O ( n sup 2 ) $ for the general
case and $ O ( n ~ ln ~ n ) $ for reducible flow graphs.
Alfred V. Aho,
Jeffrey D. Ullman,
The Theory of Parsing, Translation and Compiling: Volume I: Parsing,
Prentice Hall,
Englewood Cliffs, N.J.,
1972,
pp. 542.
The book describes the parts of formal languages and automata theory
relevant to parsing in a strict mathematical fashion.
Since a considerable part of the pertinent theory of parsing had already been
developed in 1972, the book is still reasonably up to date and is a veritable
trove of definitions, theorems, lemmata and proofs.
The required mathematical apparatus is first introduced, followed by a
survey of compiler construction and by properties of formal languages.
The rest of the book confines itself to CF and regular languages.
General parsing methods are treated in full: backtracking
top-down and bottom-up, CYK parser and Earley.
Directional non-backtracking methods are explained in detail, including
general LL(k), LC(k) and LR(k),
precedence parsing, and various other approaches.
A last chapter treats several non-grammatical methods for language
specification and parsing.
Many practical matters concerning parser construction are treated in
volume II, where the theoretical aspects of practical parser
construction are covered; recursive descent is not mentioned, though.
B.W. Pollack,
Compiler techniques,
Auerbach Publishers,
Philadelphia,
1972,
pp. 558.
A compilation of papers on compiler construction from between 1960 and
1970.
Ralph E. Griswold,
The macro implementation of SNOBOL4 -- A Case Study of Machine-Independent Software Development,
W.H. Freeman and Company,
San Francisco,
1972,
pp. 310.
Engaging description of the implementation of SNOBOL4 through the macro
implementation language SIL, SNOBOL4 Implementation Language.
SNOBOL4 is used as a pattern matching language, but its basis is deeper.
Fundamentally it consists of a flow of control mechanism and a pattern
matching mechanism.
A SNOBOL4 program consists of a list of instructions, one to a line.
Each instruction line carries three items (in principle): the
instruction, a success label and a failure label.
Each instruction can succeed or fail; if it succeeds, execution
continues at the success label, if it fails, execution continues at the
failure label.
Success and failure labels that point to the textually following line
can be omitted.
The pattern matching mechanism consists of a number of operators that
compose, decompose and match strings, strings being the only data type
in SNOBOL4.
These actions can succeed or fail, thereby controlling the execution
flow.
The SNOBOL4 system is an interpreter for SNOBOL4 programs.
It translates a SNOBOL4 program into a single binary tree stored in a
single array.
Each node consists of a prefix field describing the type of the node,
and two further fields which may hold pointers and/or data, as
determined by the prefix.
Unsurprisingly, this internal code is called prefix code.
The translation is done using a naive top-down parser; expressions are
handled by moving each operator node encountered up the tree to the
place where its precedence makes it fit.
The SNOBOL4 system is written in SIL, a relatively simple SNOBOL4-aware
macro language.
Portability was a major concern in the design of SIL.
The macros are in a relatively universal format and can be expanded
relatively easily into local assembly code on various machines.
Implementations on the IBM 360 and the Cyber 6000 are treated in
considerable detail.
The possibility of a microprogrammed SNOBOL4 machine is also discussed;
the tentative conclusion is that a simpler, lower-level SIL would be
more suitable for that purpose.
J.S. Rohl,
J.A. Linn,
A note on compiling arithmetic expressions,
Computer J.,
vol. 15,
#1,
Feb. 1972,
pp. 13-14.
The authors point out that there are only two binary operators, + and
$times$.
Their inverses, - and /, are unary operators: $ a - b $ is
actually $ a + ( - b ) $.
Next they make + and $times$ unary too, so
$ a + b $ is actually $ + a + b $ and
$ a - b $ is actually $ + a - b $.
This allows rearranging the terms, within certain rules.
Some of these rearrangements are more profitable than others, given the
available set of machine instructions, and rules are given to choose the
one that uses minimal storage on a single accumulator machine.
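A worked illustration (mine, not from the paper): on a single-accumulator machine, evaluating $ a + b times c $ from left to right requires storing $ a $ in a temporary while $ b times c $ is computed; read as the unary terms $ + a $ and $ + ( b times c ) $, the expression may be reordered to $ b times c + a $, which needs no temporary store.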
James B. Maginnis,
Elements of Compiler Construction,
Appleton-Century-Crofts,
New York,
1971,
Interesting for its naive descriptions and parsing methods.
Covers the translation of a simple FORTRAN-like language to a simple
MIX-like imaginary machine called SHAM.
Treats lexical analysis by hand-constructed FS automaton,
expression analysis by hand-constructed precedence parsing and
syntax analysis using a general backtracking bottom-up parser
(called "bottoms-up" here!).
The efficiency of the bottom-up parser is increased by precomputing (by
hand) a table for the FIRST sets.
The code produced is described by "production descriptors", in what is
equivalent to a transduction grammar.
There are chapters on data structures, macros, subroutine calling and
input/output.
D. Gries,
Compiler Construction for Digital Computers,
John Wiley,
New York,
1971,
pp. 480.
For a long time, this was the only serious book about compiler
construction; entire generations of compiler constructors have grown up
with it and they have not regretted it.
The book looks old-fashioned, but is still quite worth reading.
John Levine writes: "Contains one of the best available descriptions of the
IBM card image object format."
A.P. Yershov,
The Alpha Automatic Programming System,
Academic Press,
London,
1971,
pp. 247.
Detailed and often candid report on the construction of the Alpha
compiler for the Input language for the Russian M-20 computer, developed
in the years 1959-1964.
Input is a somewhat watered-down version of Algol-60, the main
difference being that recursion is not allowed.
The M-20 is a three-address machine with 4096 words of 45 bits each for
core memory.
The text consists of 13 chapters, written by 12 participants.
Except for the terminology, it reads as a modern report on a large
software project.
In spite of all the good effort by the translator, the text regularly
reads easier when some words are replaced by near-equivalents, for
example "various" by "different", "cumbersome" by "complex", etc.
The compiler consists of 24 passes, called blocks, which reside on a
magnetic tape.
Each block is read in from tape, reads the intermediate program text +
tables from drum where they were left by the previous block, does its
work and writes the new program text + the new tables back to drum for
the next block.
The first block reads the source program from punched cards, the last
block punches the object code on cards (or leaves the object loaded in
memory).
The compiler produces about 100 machine instructions per minute.
Hashing is used for the symbol table, dynamic memory allocation is used
to avoid size limitations, common subexpressions are recognized, graph
colouring is used to allocate variables, and much more.
The allocation of variables is especially significant, since all
variables are global, there being no recursion.
An interference graph (called a graph of incompatibilities here) is used
to find out which variables can share a location.
Since the primary purpose of the system is number crunching, at least
20 percent of the compiler effort was spent on optimizing for-loops and
multi-dimensional array indexing.
The following graph colouring algorithm is proposed (Yershov-Kozhukhin).
The idea is to successively merge pairs of nodes (which will eventually
obtain the same colour) until a complete (fully connected) graph of, say,
k nodes remains.
This graph can be coloured in only one way: all nodes different,
requiring k colours.
Next the merged nodes pairs are unmerged, resulting in the original
graph, k-coloured.
The question is, which node pairs to merge.
One cannot merge a node with one of its direct neighbours, since that
would result in a graph in which nodes were connected to themselves;
such a graph cannot be coloured.
The next best is to merge a node A with one of its neighbours'
neighbours B that is not also a direct neighbour (B is A's NN).
Usually, this still leaves many choices for A and B.
Three variants are discussed:
1.
A is the node with the lowest ordinal number and
B is its NN with the lowest ordinal number;
2.
A is the node with the lowest ordinal number and
B is its NN with which A has the largest number of neighbours in
common; such a merge removes the largest number of arcs for A;
3.
The entire graph is searched for the pair of NNs A and B that
share the largest number of neighbours;
such a merge removes the largest number of arcs from the graph.
Although the methods are listed above in order of increasing
effectiveness, method 1 was used since the difference in effectiveness
is not large and the difference in ease of programming is.
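A loose reconstruction of variant 1 in C (the graph and all details of the search are mine; the book gives no code):
/* Simplified sketch of the Yershov-Kozhukhin merge-and-colour idea. */
#include <stdio.h>

#define N 6

int adj[N][N] = {                    /* interference graph: 1 = cannot share */
    {0,1,1,0,0,0},
    {1,0,1,1,0,0},
    {1,1,0,0,1,0},
    {0,1,0,0,1,1},
    {0,0,1,1,0,1},
    {0,0,0,1,1,0},
};
int rep[N];                          /* representative ("merged into") of each node */

int find(int i) { while (rep[i] != i) i = rep[i]; return i; }

/* Merge node b into node a: a inherits all of b's interferences. */
void merge(int a, int b) {
    for (int k = 0; k < N; k++)
        if (adj[b][k] || adj[k][b]) adj[a][k] = adj[k][a] = 1;
    rep[b] = a;
}

int main(void) {
    for (int i = 0; i < N; i++) rep[i] = i;

    for (;;) {                       /* merge until the remaining graph is complete */
        int a = -1, b = -1;
        for (int i = 0; i < N && a < 0; i++) {
            if (rep[i] != i) continue;
            /* look for a neighbour's neighbour of i that is not a neighbour */
            for (int j = 0; j < N && a < 0; j++) {
                if (rep[j] != j || j == i || adj[i][j]) continue;
                for (int k = 0; k < N; k++)
                    if (rep[k] == k && adj[i][k] && adj[k][j]) { a = i; b = j; break; }
            }
        }
        if (a < 0) break;            /* no candidate pair left */
        merge(a, b);
    }

    int colour[N], c = 0;            /* the surviving nodes get distinct colours */
    for (int i = 0; i < N; i++)
        if (rep[i] == i) colour[i] = c++;
    for (int i = 0; i < N; i++)      /* unmerge: each node takes its representative's colour */
        printf("variable %d gets colour %d\n", i, colour[find(i)]);
    return 0;
}
On this 6-node interference graph the merges leave a complete graph of 3 nodes, so the 6 variables end up in 3 locations.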
Jay Earley,
Howard Sturgis,
A formalism for translator interactions,
Commun. ACM,
vol. 13,
#10,
pp. 607-617.
Oct. 1970,
Explains the T-diagram notation for program activation composition and
provides considerably more theorems about it than I have ever seen used
in practice.
H.L. Morgan,
Spelling correction in systems programs,
Commun. ACM,
vol. 13,
#2,
Feb. 1970,
pp. 90-93.
Uses different vocabularies for different contexts, to do context-dependent
error correction.
For example, UF will be corrected to IF in keyword context, but not elsewhere.
J. Cocke,
Global common subexpression elimination,
ACM SIGPLAN Notices,
vol. 5,
#7,
July 1970,
pp. 20-24.
Proceedings of a Symposium on Compiler Optimization.
R. Sethi,
J.D. Ullman,
The generation of optimal code for arithmetic expressions,
J. ACM,
vol. 17,
#4,
pp. 715-728.
Oct. 1970,
Full theory of code generation for arithmetic expressions, with proofs.
Sethi-Ullman numbering,
extension for the use of commutative operators,
extension for the use of commutative and associative operators,
extension for doing stores optimally when out of registers.
J. Cocke,
J.T. Schwartz,
Programming languages and their compilers,
Courant Institute, NYU,
New York,
1970,
pp. 767.
Unfinished notes.
Over 700 pages of almost unreadable text (or did I get a preprint?).
F.R.A. Hopgood,
Compiling Techniques,
Elsevier Publ. Comp.,
Amsterdam,
1969,
pp. 126.
A classic.
Explains the state of the art in 1969 in the simplest possible terms.
Edward S. Lowry,
C.W. Medlock,
Object code optimization,
Commun. ACM,
vol. 12,
#1,
Jan. 1969,
pp. 13-22.
Thoroughly modern treatment of object code optimization in the Fortran H
compiler for OS/360.
Features include
basic blocks,
dominance relations,
loop analysis,
data flow analysis,
common subexpression elimination,
code motion,
induction variables,
subsumption, and
register allocation,
each explained in less than a page.
Walter L. Johnson,
James H. Porter,
Stephanie I. Ackley,
Douglas T. Ross,
Automatic generation of efficient lexical processors using finite state techniques,
Commun. ACM,
vol. 11,
#12,
Dec. 1968,
pp. 805-813.
Identifies and recognizes lexical analysis as a separate unit, amenable to
automation.
A regular-expression-like language is defined, which allows non-deterministic
automata to be defined, which are then made deterministic by the system.
The regular expressions include bounded repetition: *5 for a maximum
of 5 characters.
Semantic actions are attached to some rules of the FS grammar.
A variant of the subset construction is described that requires the
unique determination of the states in which a semantic action is required.
Jerome Feldman,
David Gries,
Translator writing systems,
Commun. ACM,
vol. 11,
#2,
Feb. 1968,
pp. 77-113.
Grand summary of the work done on parsers (with semantic actions)
before 1968.
T.E. Cheatham Jr.,
The Theory of Construction of Compilers,
Computer Associates,
Wakefield, Mass.,
1967,
pp. ~480.
The copy I saw was a glued stack of lecture notes, type-written,
interspersed with photo-copies of well-known papers.
The book is actually about parsing; Chapter X, about "models hosting the
syntactic and semantic specification of programming languages",
"representations of computation sequences", and "code selection",
promised in the Table of Contents, was absent from the copy I saw.
The book gives an interesting first-hand account of the trains of
thought that were involved in the first advanced program text handling.
All steps are made very explicit and are illustrated with many, often
entertaining approaches.
Chapter I includes Floyd's paper on Non-deterministic Algorithms, and an
introduction to AMBIT/G, a flowchart language, used extensively in the
book as pseudocode.
Chapter II covers all the ins and outs of context-free grammars, with
many examples of linear, left-recursive and right-recursive grammars.
An algebra of regular expressions is given, drawings of automata,
Greibach normal form, algebraic formulation of a grammar, FIRST, LAST,
INCLUDES, and other sets.
Chapter III treats the production process using CF grammars.
The grammars can be "augmented", and then include conditions,
interpretations (semantics), and code to be output.
This naturally leads to transduction grammars, and to Knuth's paper
"Semantics of Context-free Languages".
Chapter IV takes the program-generating algorithms from Chapter III, and
turns them into non-deterministic program-analyzing algorithms.
This leads naturally to non-deterministic parsers.
The top-down, left-corner and bottom-up parsing techniques are explained
using dominoes which derive from the grammar and which are used to
"pave" parse trees, and pages and pages are filled with games played
according to rules and heuristics; even tournaments between the various
techniques are organized.
Next the (non-deterministic) algorithms are laid out ("written" would be
the wrong word) picturesquely in AMBIT/G.
The chapter closes with more algorithmic specifications of the same.
Chapter IV+ takes on deterministic syntax analysis, based on DPDAs in
the traditional way.
The top-down and bottom-up parsing techniques are treated uniformly,
using virtually the same kind of automata; left-corner is omitted here.
Much formal language theory about LL(0) (!), LL(1), LR(1) and RL(1) (!)
follows.
The construction of DPDAs for these grammars is treated extensively.
Chapter V shows a recursive-descent-like implementation of a top-down
parser, still using the DPDA.
The recursion is all out in the open, using an explicit stack.
Full construction of the DPDA is given.
The second part of the chapter shows how to handle left-recursive
non-terminals.
Again everything is explained from first principles.
Chapter VI gives the same treatment to simple precedence and operator
precedence grammars.
Chapter VII introduces what is called "reductions analysis" and what
turns out to be Floyd productions.
Floyd productions are considered a programming language for writing
parsers, and a machine that processes Floyd productions is presented.
The Floyd productions for a given grammar are derived by hand,
systematically and using insight for optimizations.
Chapter VIII considers grammars for and parsing of two-dimensional
languages, languages in which the grammar includes up and down
movements; there also exist movements like "undo last movement" and
"back to base".
The author thinks mainly of mathematical formulas with sub- and
superscripts, big sigmas, integrals, etc.
The movements do not show in the input to be parsed: they have to be
retrieved by the parsing process, apparently for prettyprinting.
The movements are essentially ε productions and cause much trouble in
creating the Floyd productions.
Chapter IX presents the insight that the grammar of lexical tokens is
often simpler than that of programming languages (no Chomsky hierarchy
anywhere) and that tokens can be recognized by transition diagrams.
A lexical analyzer for a FORTRAN-like language is shown.
Chapter X was missing, unfortunately; it may not exist at all.
A book rich in ideas, presented in a very fresh form.
Slow-paced, requires thorough reading.
The book will still surprise the modern reader.
Ikuo Nakata,
On compiling algorithms for arithmetic expressions,
Commun. ACM,
vol. 10,
#8,
Aug. 1967,
pp. 492-494.
[ Register allocation during compilation of arithmetic expressions. ]
R.M. McClure,
TMG -- A syntax-directed compiler,
in 1965 20th National Conference,
ACM,
1965,
pp. 262-274.
The syntax is described by a sequence of parsing routines, which can
succeed, and then may absorb input and produce output, or fail, and then
nothing has happened; this requires backtracking.
Each routine consists of a list of possibly labeled calls to other
parsing routines of the form $ routine_name / failure_label $.
If the called routine succeeds, the next call in the list is performed;
if it fails, control continues at the $failure_label$.
An idiom for handling left recursion is given.
This allows concise formulation of most compiler input.
Basic symbol table handling and arithmetic facilities are available, but
the idea is to have TMG produce unsophisticated output, to be processed
further by ad-hoc programs.
Rumor has it that TMG stands for "transmogrify".
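The control structure is easy to mimic today; the following is a minimal
sketch in Python (invented names and toy grammar, not TMG's syntax or
semantics) of parsing routines that succeed or fail and backtrack by
restoring the input position; expr uses the right-recursive formulation
rather than the left-recursion idiom the paper gives.
    # A toy parser in the TMG style: a routine either succeeds, possibly
    # absorbing input, or fails, in which case the input position is restored.
    class Input:
        def __init__(self, text):
            self.text, self.pos = text, 0
        def literal(self, s):
            # parsing routine for a fixed string
            if self.text.startswith(s, self.pos):
                self.pos += len(s)
                return True
            return False
    def term(inp):                          # term = 'x'
        return inp.literal('x')
    def expr(inp):                          # expr = term '+' expr / term
        save = inp.pos
        if term(inp) and inp.literal('+') and expr(inp):
            return True
        inp.pos = save                      # fail: "nothing has happened"
        return term(inp)
    print(expr(Input('x+x+x')))             # True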
Peter Naur,
Checking of operand types in ALGOL compilers,
BIT,
vol. 5,
pp. 151-163.
1965,
Uses symbolic interpretation to do type checking in ALGOL 60, without
calling it that, of course.
W.M. McKeeman,
Peephole optimization,
Commun. ACM,
vol. 8,
#7,
July 1965,
pp. 443-444.
Considerable improvement was achieved in the Gogol translator by peephole
optimization, the principles of which are explained briefly.
No earlier references.
Alan Batson,
The organization of symbol tables,
Comm. ACM,
vol. 8,
#2,
pp. 111-112.
Feb. 1965,
Stores the identifiers with their info blocks in an open hash table
(called a scattered table here).
Nested redefinitions are entered as if they are new identifiers.
When retrieving, scanning of the hash table continues until a blank
entry is found; the last match is then the identifier info you want.
The paper does not explain how to exit a block.
This would seem to be a problem.
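A minimal sketch of the scheme as I read it (Python, invented
representation): linear probing in an open hash table, a nested
redefinition is simply entered again further along the probe sequence,
and retrieval scans until a blank entry, keeping the last match.
    # Open hash table with linear probing; each slot holds (identifier, info)
    # or None.  SIZE is assumed large enough that the table never fills up.
    SIZE = 8
    table = [None] * SIZE
    def enter(name, info):
        i = hash(name) % SIZE
        while table[i] is not None:
            i = (i + 1) % SIZE
        table[i] = (name, info)
    def lookup(name):
        i, found = hash(name) % SIZE, None
        while table[i] is not None:         # scan until a blank entry ...
            if table[i][0] == name:
                found = table[i][1]         # ... remembering the last match
            i = (i + 1) % SIZE
        return found
    enter('x', 'outer int')
    enter('x', 'inner real')                # nested redefinition: a new entry
    print(lookup('x'))                      # inner real
    # Exiting a block would require removing the inner entry without breaking
    # the probe sequences -- exactly the part the paper leaves open.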
Stephen Warshall,
Robert M. Shapiro,
A general-purpose table-driven compiler,
in AFIPS 1964 Spring Joint Computer Conference,
vol. 25,
pp. 59-65.
1964,
They mean it: the entire compiler is table-driven (except for the
assembler).
The compiler has five phases: syntax analysis, which creates trees;
macro generation, which creates a stream of macro instructions from the
trees; in-sequence optimization on the macro stream; code selection from
the optimized macro stream; and assembly.
The first four phases are table driven, and are language- and
machine-independent; the assembler is ad-hoc, language-independent and
machine-dependent.
All in all, a very modern set-up.
The syntax analyser is a top-down backtracking analyser fed with a table
containing the grammar of the input language in BNF; it produces a
correct parse tree for the grammar, except that left- and
right-recursive repetition results in a "bushy" tree: all repeating
nodes are on the same level.
The macro generator is fed a table of code segments in the language GSL,
Generator Strategy Language.
This table specifies for each tree type (= grammar symbol) which actions
must be performed to generate macros.
Statements in GSL specify a node type, a number of conditions on
parents and children, and one or more actions to be performed.
Each piece of code may be visited multiple times for each node, each
next visit performing the next action, thus allowing multi-visit
attribute grammar programming.
The in-sequence optimizer and the code selector are fed tables with
information about each of the macros, written in MDL, Macro Description
Language.
The in-sequence optimizer is a table-driven peephole optimizer.
The code selector has a code segment in CSL, Code Selection Language,
for each macro.
CSL statements can test the operands, the operands of the operands, etc.
Essentially they do bottom-up rewriting of the macro stream.
The mode of operation of the compiler is that for each user program the
proper BNF table is loaded (= put in the card reader), followed by the
proper GSL table; next comes the MDL code, selected for the desired
compromise between compilation and run-time speed, and then the user
program is loaded.
The authors report that the compiler "is not as fast as some we have
seen, but does not seem prohibitively slow", and that the code is "quite
alarmingly good".
R.M. Graham,
Bounded context translation,
in AFIPS 1964 Spring Joint Computer Conference,
AFIPS Press,
Montvale, N.J.,
1964,
pp. 184-205.
Explains operator precedence, both with tables and functions, and
Floyd-Evans productions, and shows how the parser code can be extended
with translation code, so that the parser immediately produces one
of several kinds of parse trees.
Likewise a machine code translation of the program being parsed can be
produced.
James P. Anderson,
A note on some compiling algorithms,
Comm. ACM,
vol. 7,
#3,
pp. 149-150.
March 1964,
This very small note describes two code generation algorithms.
One is table-driven and uses two stacks, an operand stack and an
operator stack; the operand stack holds where the operand can be found,
for example in a named variable, in a register, etc.
The table is indexed by a (top, top-1, operator) triple; the entries
specify modifications to both stacks and code to be generated.
The table is (apparently) created by hand; an example is given.
The second algorithm does a simple kind of tree rewriting, using entries
from the same table at well-chosen nodes in the tree.
The table is then indexed by (left operand, right operand, operator)
triples, and the entries specify modifications to the tree and code to
be generated.
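A hedged sketch of the flavour of the first algorithm (Python, invented
names; simplified to postfix input, so the operator stack disappears):
the operand stack holds descriptors saying where each operand lives, and
a hand-made table indexed by the operand kinds and the operator says
what code to emit.
    # Operand descriptors: ('mem', name) for a variable in memory,
    # ('acc', None) for a value in the single accumulator.  The result of
    # every operation is left in the accumulator.
    TABLE = {
        ('mem', 'mem', '+'): ['LOAD {l}', 'ADD {r}'],
        ('acc', 'mem', '+'): ['ADD {r}'],
        ('mem', 'mem', '*'): ['LOAD {l}', 'MUL {r}'],
        ('acc', 'mem', '*'): ['MUL {r}'],
    }
    def generate(postfix):
        operands, code = [], []
        for sym in postfix:
            if sym in '+*':
                right, left = operands.pop(), operands.pop()
                for templ in TABLE[(left[0], right[0], sym)]:
                    code.append(templ.format(l=left[1], r=right[1]))
                operands.append(('acc', None))    # result is in the accumulator
            else:
                operands.append(('mem', sym))     # a named variable
        return code
    print(generate(['a', 'b', '*', 'c', '+']))    # ['LOAD a', 'MUL b', 'ADD c']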
R.A. Brooker,
I.R. MacCallum,
D. Morris,
J.S. Rohl,
The compiler compiler,
Annu. Rev. Autom. Program.,
vol. 3,
1963,
pp. 229-322.
One of the first extensive descriptions of a compiler-compiler.
Parsing is done by a backtracking non-exhaustive top-down parser using a
transduction-like grammar.
This grammar is kept in an integrated form and modifications can be made
to it while parsing.
Melvin E. Conway,
Design of a separable transition-diagram compiler,
Commun. ACM,
vol. 6,
#7,
July 1963,
pp. 396-408.
Describes a Cobol compiler written in assembler, for the Burroughs 220.
The compiler consists of a pipeline of six coroutines:
"basic symbol reducer" (lexical analyser),
"name reducer" (name list),
"diagrammer" (syntax analysis),
"data structure interpreter" (Data Division checker),
"data property interpreter" (constructs data layout),
"generator" (code generation),
all described in reasonable detail.
The coroutine mechanism is explained, but the idea and the name are
assumed to be known; no literature reference to earlier use is given.
The parser is Irons', made deterministic by a No-Loop Condition and a
No-Backup Condition.
It follows transition diagrams rather than grammar rules.
Symbolic Languages in Data Processing,
(Symposium International Computation Centre),
Gordon and Breach,
New York,
1962,
pp. 849.
Incredible collection of early papers and panel discussions about syntax,
symbol manipulation, compiler construction, programming language design,
portability, etc.
With transcripts of panel discussions with the great names:
Dijkstra, Strachey, Naur, van Wijngaarden, Ingerman, Samelson, Wilkes,
to name a few.
The archaic terminology is sometimes an obstacle to understanding,
though.
Robert S. Ledley,
James B. Wilson,
Automatic-programming-language translation through syntactical analysis,
Commun. ACM,
vol. 5,
#3,
March 1962,
pp. 145-155.
A tutorial on automatic-programming-language translation.
First, an English-to-Korean (!) translation system is described in
detail, in which parts of the Korean translation are stored in
attributes in the parse tree, to be reordered and interspersed with
Korean syntactic markers on output.
The parser is a variant of Irons', for which a full non-recursive flow
chart is given.
The technique is then transferred to the translation of Algol 60 to a
simple assembler code.
Bruce W. Arden,
Bernard A. Galler,
Robert M. Graham,
An algorithm for translating Boolean expressions,
J. ACM,
vol. 9,
#2,
Feb. 1962,
pp. 222-239.
Produces optimized translations, taking into account the fact that, for
example, the second operand of AND need not be computed when the first
evaluates to false.
The implementation is performed by jumping to the next relevant place
in the code.
H.H. Bottenbruch,
A.A. Grau,
On Translation of Boolean Expressions,
Commun. ACM,
vol. 5,
#7,
1962,
pp. 384-386.
Four techniques are presented for the translation of Boolean expressions,
1. straightforward, as with arithmetic expressions;
2. using jumps on conditions;
3. using a decision tree;
4. indexing the precomputed truth table.
The decision tree is computed as follows.
The Boolean expression is instantiated to two new expressions, by setting
the first Boolean variable in it to true and false, respectively.
Then code is generated that computes the first expression if the variable is
true, etc.
This process is repeated recursively until a single variable remains.
The authors also point out that on a machine with a word length of $2^n$, all
$2^n$ results of a Boolean expression E with at most n variables (the
truth table) can be computed in one computation over E.
To this end the ith variable is replaced by an integer in which the kth
bit is on if the ith bit is on in k.
In this way each bit in the result represents the value of E for one
combination of values for its variables.
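In Python the word-length trick looks like this (an illustration, not the
paper's notation): variable i becomes the integer whose bit k is set
exactly when bit i of k is set, and one bitwise evaluation of E delivers
the whole truth table, bit k being the value of E in row k.
    def truth_table(expr, n):
        rows = 1 << n                       # 2**n rows, one result bit per row
        masks = [sum(1 << k for k in range(rows) if (k >> i) & 1)
                 for i in range(n)]         # variable i as a 2**n-bit constant
        return expr(*masks) & ((1 << rows) - 1)
    # E = a AND NOT b is true only in row k = 1 (a true, b false), so only
    # bit 1 of the result is set:
    print(bin(truth_table(lambda a, b: a & ~b, 2)))     # 0b10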
R.A. Brooker,
D. Morris,
A general translation program for phrase structure languages,
J. ACM,
vol. 9,
Jan. 1962,
pp. 1-10.
In spite of its title, it describes a compiler compiler; no word about
parsing.
All examples are with regular grammars (translates Autocode, which is
regular).
Robert W. Floyd,
An algorithm for coding efficient arithmetic operations,
Commun. ACM,
vol. 4,
#1,
Jan. 1961,
pp. 42-51.
A heuristic algorithm for compiling arithmetic expressions into efficient code
is developed and expressed in 7 mostly full-page flowcharts.
Basically a subexpression is found by scanning for a closing parenthesis, and
then code is generated backwards for the section until the matching opener.
The run-time result is stored in a temporary variable, which is then
inserted in the expression as a replacement, and the action is repeated.
Several considerations go into deciding exactly to which closing parenthesis
to scan.
Considering that machines have subtraction and division instructions that
require the right operand to be in memory, it is advantageous to compile the
right operand first.
Array indexes, however, must be computed immediately.
Also, a method is provided to expand exponentiations into sequences of two
multiplications.
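A much reduced sketch of the outer loop only (Python; Floyd's heuristics
for choosing the parenthesis, the backwards code generation and the
operand-order considerations are all omitted): find a closing
parenthesis, compile the section back to its matching opener into a
fresh temporary, and substitute that temporary into the expression.
    temps = 0
    def reduce_once(expr):
        global temps
        close = expr.index(')')                 # a closing parenthesis
        open_ = expr.rindex('(', 0, close)      # its matching opener
        temps += 1
        print('t%d := %s' % (temps, expr[open_ + 1:close]))  # stand-in for code
        return expr[:open_] + 't%d' % temps + expr[close + 1:]
    e = '((a+b)*(c-d))'
    while '(' in e:
        e = reduce_once(e)
    print('result in', e)           # t1 := a+b, t2 := c-d, t3 := t1*t2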
Kirk Sattley,
Allocation of storage arrays in Algol 60,
Commun. ACM,
vol. 4,
#1,
Jan. 1961,
pp. 60-64.
Memory is considered as an array Memory[].
Array dope vectors are described.
Non-own arrays have non-own dope vectors and are allocated under a stack
regime at one end of memory.
"Own" arrays have "own" dope vectors, allocated in global store, and are
themselves allocated at the other end of memory.
They are left there at block and routine exit, and since their "own" dope
vectors survive, they are found again upon reentry.
Data structures and routines for handling all this are given.
P.Z. Ingerman,
Dynamic declarations,
Commun. ACM,
vol. 4,
#1,
1961,
pp. 59-60.
A routine for resizing and reallocating a possibly multi-dimensional array,
using row-major order in source and target array.
E.T. Irons,
A syntax-directed compiler for Algol 60,
Commun. ACM,
vol. 4,
#1,
Jan. 1961,
pp. 51-55.
The first to describe a full parser. It is essentially a full
backtracking recursive descent left-corner parser.
The published program is corrected in a Letter to the Editor by
B.H. Mayoh, Commun. ACM, vol. 4, no. 6, p. 284, June 1961.
A.A. Grau,
Recursive processes and ALGOL translation,
Commun. ACM,
vol. 4,
#1,
pp. 10-15.
Jan. 1961,
Recognizes that since the syntax of ALGOL is recursive, the translator
must also be recursive.
Implements this using push-down lists (since he has no ALGOL
compiler!).
H.D. Huskey,
W.H. Wattenburg,
A basic compiler for arithmetic expressions,
Comm. ACM,
vol. 4,
#1,
pp. 3-9.
Jan. 1961,
An 8 x 7 precedence parsing table + 5 pages of undocumented FORTRAN
program.
Seems very ad-hoc.
E.W. Dijkstra,
Recursive programming,
Numer. Math.,
vol. 2,
pp. 312-318.
1960,
In his bid for "recursive programming", i.e., the unlimited use of a routine
before the previous call of it has finished, the author extends the idea of a
stack, pioneered by Bauer & Samelson ("Sequential formula translation",
Commun. ACM, 1960) to the calling of routines.
The parameter values are put on the stack, including room for a return
value, "link information" is put on the stack as the last parameter, and
the routine is called.
The author then points out that this routine can again evaluate
expressions, for which it can use the same stack, and even call
routines, for which it can again use the same stack, etc.
Based on the thought that it should not make any difference if a value
was put on the stack from a memory location or from a call of a routine,
the link data must contain a "dump" of the CPU state, so that the state
can be restored completely upon return from the routine.
This dump then includes the previous stack pointer which can be used as
a parameter pointer, to reach the stacked parameters (no offset -N yet)
and a form of return address.
Algol 60 makes further demands, since it allows access to non-local
variables.
This necessitates passing the parameter pointer of the lexically
enclosing routine as a parameter as well, the static link.
Since the static link provides access to the parameter list of the
lexically enclosing routine, which again contains a static link, a chain
is formed.
An alternative implementation is suggested in the form of an array of
these static links, the display.
The author also claims that the nesting depth must be stored in the link
information area, but this is not necessary, since the compiler knows
the nesting depth and can incorporate it in the generated code.
The author suggests that the required instructions (called "orders"
here) should be part of a machine's hardware, as was subsequently
implemented in the Electrologica X-8 (and then in the PDP-45, etc.).
All this is described in running prose, without the use of diagrams,
which explains why the paper was not immediately understood by everyone.
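A small illustration of the two access mechanisms (Python, invented
representation, not the paper's): a variable of a routine at lexical
depth d is reached either by following the static-link chain or, with a
display, by one indexed access.
    class Frame:
        def __init__(self, static_link, variables):
            self.static_link = static_link      # frame of the enclosing routine
            self.variables = variables
    def via_chain(frame, hops, name):
        for _ in range(hops):                   # follow the static-link chain
            frame = frame.static_link
        return frame.variables[name]
    def via_display(display, depth, name):
        return display[depth].variables[name]   # one indexed access
    outer = Frame(None, {'x': 1})               # lexical depth 0
    inner = Frame(outer, {'y': 2})              # lexical depth 1, inside outer
    display = [outer, inner]
    print(via_chain(inner, 1, 'x'), via_display(display, 0, 'x'))   # 1 1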
R.A. Brooker,
D. Morris,
An assembly program for a phrase-structure language,
Computer J.,
vol. 3,
Oct. 1960,
pp. 168-174.
A simple compiler compiler.
J. McCarthy,
Recursive functions of symbolic expressions and their computation by machine -- Part I,
Commun. ACM,
vol. 3,
#4,
April 1960,
pp. 184-195.
Description of LISP as a functional language, including interpreter and
implementation, starting from basic (mathematical) principles.
Basing his language on functions and expressions, the author first
introduces the conditional expression, and shows that it allows all kinds
of functions to be defined as expressions, including recursive ones.
To obtain expressions that are functions themselves, rather than
expressions used as the right-hand side in the definition of one, one
needs "lambda" notation.
This leaves one with an anonymous expression, which cannot refer to
itself, so recursive functions cannot yet be represented as expressions.
This is remedied by introducing the "label" notation label(a, E),
which is the expression E with the proviso that any occurrence of
a in E refers to E.
Next the building blocks of S-expressions (Symbolic expressions) are
introduced: atoms and cons pairs (no name given in the paper), plus
their list notation.
The functions on (the paper writes "of") S-expressions are written as
M-expressions (Meta-expressions), in a notation similar to but visually
different from S-expressions.
They are the usual atom, eq, car, cdr, and cons.
Examples of recursive M-functions are given: subst, equal,
append, etc.
Then the step is taken to write M-expressions in the same notation as
S-expressions, thus obtaining (recursive) S-functions.
This is where the parentheses hit the fan.
The author writes: "This notation is writable and somewhat readable".
Prime examples are apply and eval; an S-expression interpreter
is displayed in less than one column of text, and explained in two.
By then higher-order functions become obvious; the example is
maplist.
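For the modern reader: maplist applies f to successive tails of the
list, not to its elements; a tiny sketch with Python tuples as cons
cells:
    def cons(a, d): return (a, d)
    def car(x): return x[0]
    def cdr(x): return x[1]
    NIL = None
    def maplist(x, f):
        return NIL if x is NIL else cons(f(x), maplist(cdr(x), f))
    lst = cons(1, cons(2, cons(3, NIL)))
    print(maplist(lst, car))    # with f = car the result is a copy of the list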
S-expressions are implemented as trees, or actually as DAGs.
The author argues extensively that they should be implemented with
pointers rather than in arrays: 1. unpredictable size; 2. single nodes
can be freed; 3. DAGs are possible.
The implementation of association lists is special, and contains more
info than is required by the above model.
(Nodes are called "registers" in the paper.)
Free nodes are kept in a free-storage list.
When the system runs out of space, a mark & scan garbage collector tries
to find more memory.
(The term "garbage collection" is not used, and neither is "mark &
scan"; the algorithm is described as a side issue.
It is not explained how the marking is done, but the text seems to
suggest that memory is scanned iteratively, marking more and more nodes
until nothing changes any more.)
Handling of recursive functions was found to be a problem, which was
solved by the introduction of a "public push-down list" for storing the
temporary data of a routine when it is called while still active.
(Again the word "stack" does not appear.)
As an alternative formalism for computation, strings are examined, with
basic operations first and rest (the head and tail of
today), but it is briefly shown that (structured) S-expressions are more
convenient.
Finally the author shows how S-expressions can be converted to "the
usual form of computer programs"; this "usual form" turns out to be ...
flowcharts!
There doesn't seem to be a Part II.
K. Samelson,
F.L. Bauer,
Sequential formula translation,
Commun. ACM,
vol. 3,
#2,
pp. 76-83.
Feb. 1960,
The authors show that the information gathered in Rutishauser's
algorithm by moving forward and backward through the text can be managed
more conveniently and efficiently by putting it stackwise in a cellar.
They show this for expressions and statements alike, and translation
schemata are supplied.
The authors indicate that a second cellar, called the numbers cellar,
will be needed during execution of the expressions.
Next, they want to translate array indexes in for loops while doing
strength reduction, and find it impossible to do with a stack, which is
hardly surprising today, since it is obviously not a context-free
operation.
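The essence of the operator cellar is easily shown in Python (a sketch
only; the paper's transition tables also handle parentheses, statements
and error cases): operands pass through, operators wait in the cellar
until an operator of no higher precedence arrives.
    PRIO = {'+': 1, '*': 2}
    def translate(tokens):
        cellar, out = [], []
        for t in tokens:
            if t in PRIO:
                while cellar and PRIO[cellar[-1]] >= PRIO[t]:
                    out.append(cellar.pop())    # unstack higher-priority operators
                cellar.append(t)
            else:
                out.append(t)                   # operands are emitted directly
        while cellar:
            out.append(cellar.pop())
        return out
    print(translate(['a', '+', 'b', '*', 'c']))     # ['a', 'b', 'c', '*', '+']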
P.B. Sheridan,
The arithmetic translator-compiler of the IBM FORTRAN automatic coding system,
Commun. ACM,
vol. 2,
Feb. 1959,
pp. 9-21.
Amazingly compact description of an optimizing Fortran compiler;
concentrates on the translation of arithmetic expressions.
The expression is first turned into a fully parenthesized one, through a
precedence-like scheme ('+' is turned into ')))+(((', etc.).
This leads to a list of triples (node number, operator, operand).
This list is then reduced in several sweeps to eliminate copy operations
and common subexpressions; these optimizations are machine-independent.
Next several machine-dependent (for the IBM 704) optimizations are
performed.
Etc.
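A toy version of the full-parenthesization step (Python; only the
substitution for '+' is taken from the text, the other counts are one
plausible choice and depend on the number of precedence levels):
    def parenthesize(expr):
        subst = {'+': ')))+(((', '-': ')))-(((',
                 '*': '))*((',   '/': '))/((',
                 '(': '((((',    ')': '))))'}
        return '((((' + ''.join(subst.get(c, c) for c in expr) + '))))'
    print(parenthesize('a+b*c'))
    # ((((a)))+(((b))*((c)))) -- the '+' ends up at the shallowest level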
J. Strong,
J. Wegstein,
A. Tritter,
J. Olsztyn,
O. Mock,
T. Steel,
The problem of programming communication with changing machines, Part 1,
Commun. ACM,
vol. 1,
#8,
pp. 12-18.
Aug. 1958,
UNCOL
Eldridge S. Adams,
Stewart I. Schlesinger,
Simple automatic coding systems,
Communications of the ACM,
vol. 1,
#7,
pp. 5-9.
July 1958,
Proposes a simple translator for arithmetic expressions, as an add-on to an
assembler.
The parsing and translation algorithms used are not explained explicitly,
since the authors think "anyone experienced in programming can make a simple
automatic coder based on the foregoing discussion."
I can't without guessing a lot of information, and the sample code the
translator produces is much better than follows from the scant explanation
given.
Maurice V. Wilkes,
David J. Wheeler,
Stanley Gill,
The Preparation of Programs for an Electronic Digital Computer,
(1st/2nd Edn.),
Addison-Wesley,
Reading, Mass.,
1951/1957,
pp. ???/238.
The book consists of three parts and numerous appendices.
Part one describes the machine and its programming, part two describes
the standard library routines prepared for it, and part three shows many
programs using these routines.
This is the pre-assembler era.
Programs are typed in a form very close to base 32, on five-hole
punched tape, using a teletype; the numbers they contain (mostly
addresses) can be typed in decimal; this makes the programs more or less
readable.
The resulting tape is read into the machine by pushing the Start Button.
This loads a fixed 40-word loader into memory from a fixed piece of
hardware, called the uniselectors, and activates it.
This loader reads the tape, converts the decimal numbers and puts the
instructions (called "orders") on the tape into memory, in 17-bit
words; it knows a number of pseudo-instructions to control the placement
process, including one that stops the loading process and starts the
program.
The book describes mainly the EDSAC, used at the University Mathematical
Laboratory, Cambridge, England.
An instruction consists of
a 5-bit opcode, typed by its numerically corresponding teletype letter
key,
an indirection bit, typed as S,
a 10-bit memory address, typed decimally using the figure shift of the
teletype, and
a Special-Indication bit, typed as $pi$.
(The letter shift of their teletype contains the 26 capital letters
roughly in qwerty arrangement, $pi$, $theta$, $DELTA$, $phi$, Erase (31)
and Blank Tape (0)).
The machine knows two interpretations of the contents of a word: as an
instruction and as a 16-bit+sign-bit floating-point number.
Floating point numbers are between $-1$ and $ 1 - 2 sup { - 16 } $; all
others have to be handled by explicit scaling by the programmer.
The loader only supports the instruction format; numbers must be entered
(when occurring inside a program) as instructions with the same bit
pattern.
Additionally, integers are needed, of course; an integer i is
represented as a floating point number $ i times 2 sup { - 16 } $, and
the corresponding allowances must be made by the programmer.
The machine has three registers, the (addition) accumulator $Acc$, the
(multiplication) register R and the (index) register B; memory
locations are also called registers.
The main instructions are (with $M[]$ representing the 1024-words memory):
opcode address semantics
A n $ Acc +:= M[n]; $
S n $ Acc -:= M[n]; $
T n $ M[n] := Acc; ~ Acc := 0; $
U n $ M[n] := Acc; $
H n $ R := M[n]; $
V n $ Acc +:= R * M[n]; $
F n goto n
G n if $Acc$ < 0 goto n
E n if $Acc$ >= 0 goto n
J n if B /= 0 goto n
B n $ B := n; $
K n $ M[n] := Opcode("B") + B ; $
(The semantics of the instructions are expressed in words in the book,
since programming languages had not been invented yet.)
Most of these instructions are affected by the Special-Indication bit.
As for G and E, when the bit is on, they clear $Acc$ when the jump
is taken; as for A, S, T and U, when the bit is on, they act on
double words in memory.
All instructions are affected by the indirection bit (written S); if
it is on, the contents of the B register is added to the address in
the instruction before it is executed, for example
AS n $ Acc +:= M[n+B]; $
US n $ M[n+B] := Acc; $
FS n goto $n+B$
BS n $ B := n+B; $
Note, especially, the effect of the last example; neat.
The authors emphasize the usefulness of the B register by showing that
it avoids the practice of modifying an instruction in a loop in order to
access a sequence of memory locations (which was how loops were
programmed before).
Each instruction punched on the input tape must be terminated by a
letter; the normal terminator is F; D can be used as an abbreviation
of $ pi F $, setting the Special-Indication bit to one.
The terminators are interpreted by the loader.
Pseudo-instructions can have other terminators.
One such terminator is $theta$, which causes the loader to add the
contents of location $M[42]$ to the instruction being loaded.
This is used in relocation, as follows.
The pseudo-code $ G ~ K $ is typed just before the body of a subroutine
on the tape; it causes the loader to store the present loading address
in $M[42]$.
This provides relative addressing in the subsequent routine: for
example, a jump $ F ~ 13 ~ theta $ will jump to location 13 in the
present routine.
Other bases for relocation are provided with the terminators H, N,
etc, which will add $M[45]$, $M[46]$, etc. to the instruction being
assembled.
In fact, almost any letter $ alpha $ with a teletype code
$ T ( alpha ) $ can be used as a terminator, and will add
$ M [ 24 + T ( alpha ) ] $.
(F as standard terminator was chosen because $ T ( 'F' ) = 17 $ and
location $M[41]$ is always zero in the standard!
Tricks of the time!)
Standard subroutine linkage uses the B register.
The calling sequence, starting from absolute location p, of a routine
at absolute location q is:
B p F
F q F
Now, if the called routine does not modify the B register, it can
return any time by simply doing $ FS ~ 2 ~ F $, which returns to the
location $ p + 2 $.
Otherwise, the routine has to save the value of B.
The standard uses the K instruction, as follows.
Upon entry to the routine, it executes $ K ~ r ~ F $, to store the
instruction $ B ~ b ~ F $ in location r, where b is the value of
B, and r is the address of the last instruction but one of the
routine.
When the routine exits, it meets the exit sequence (at locations r and
$r+1$):
B b F
FS 2 F
the first of which was planted there by the K instruction.
Needless to say, this does not work for recursive routines.
The authors distinguish two kinds of parameters to routines: preset
parameters and program parameters.
The first are generic parameters and modify the routine as it is being
loaded.
This uses the relocation mechanism described above, with the H, N,
etc. terminators.
This allows the routine being loaded to be instantiated for, for example,
the length of the array it handles; in fact, relocation is just
instantiation for a specific location!
Normal (program) parameters are conventionally located just after the
call, and are picked up from there by the called routine, using the B
register.
Program parameters are usually constants; a run-time variable parameter
is passed in $Acc$.
If more than one variable parameter is present, the transfer has to be
designed specifically.
A remarkable form of threaded code is explained in Section 2-22.
In it, one or more instructions are passed as constant program
parameters.
The called routine, called an "interpreter", then stores the instruction
in one or more locations in its body, thus modifying itself, as if a
routine had been passed as a parameter.
This is used to implement a complex number manipulation routine, which
can handle $CAcc+:=c;$, $CAcc-:=c;$, $M[n..n+3] := CAcc;$, etc., where
$CAcc$ is a (software) complex accumulator, all this in 13 instructions.
When using the routine, single length instructions are passed as
parameters, which are then used repeatedly to construct the complex
operations.
To construct a complete program, the user writes or collects (from the
library) a number of subroutines and a master routine (driver).
These subroutines are stand-alone in that they do not (usually) call
other subroutines, there being no facility for external references.
There is one sub-subroutine, $R9$, which should always be loaded to
location 56 and which is freely called by the subroutines; there is no
need for external reference resolution of $R9$, since its location is
fixed by convention.
The user then prefixes each of these subroutines by a load
pseudo-instruction $ PZT ~ n ~ K $, where n is the address of the
first location at which the routine should be loaded.
The subsequent values of n should be chosen so that routines do not
overlap; this is the responsibility of the programmer.
The master routine consists of calls of the subroutines, whose addresses
have just been fixed by the programmer.
A loader instruction $EZPF$ terminates the tape and starts the program.
Note that there are only three levels of routines, the master routine,
the subroutines, and $R9$.
It is not unusual for routines to have multiple entry points.
The chapter on "automatic programming" is an eye-opener.
The notion of a programming language does not yet exist, and "automatic
programming" is a set of techniques that simplifies the preparation of
programs.
One example is a loader extension, called the "assembler subroutine".
It stores the addresses of the start locations of (numbered) routines in
a table while they are being loaded.
This obviates the need to supply load positions by manual computation.
The table can then be consulted at run time by the master routine to
find and call its subroutines.
A more advanced subroutine patches the code with the correct addresses,
using a form of back-patching; this does away with the table.
It is similar to the "floating addresses" of Wilkes (1953), adapted to
the fact that bit 6 is no longer available; hence the back-patching.
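The general back-patching idea, not the book's actual conversion
routine, can be sketched as follows (Python, invented representation):
a call to a routine not yet loaded leaves a hole and is remembered; when
the routine's start address becomes known, the holes are patched.
    code = []                   # the loaded program, as (opcode, address) pairs
    starts = {}                 # routine number -> start address, once known
    fixups = {}                 # routine number -> locations still to be patched
    def call(routine):
        if routine in starts:
            code.append(('CALL', starts[routine]))
        else:                   # target not loaded yet: leave a hole, remember it
            fixups.setdefault(routine, []).append(len(code))
            code.append(('CALL', None))
    def define(routine):
        starts[routine] = len(code)
        for loc in fixups.pop(routine, []):     # back-patch earlier references
            code[loc] = ('CALL', starts[routine])
    call(1); code.append(('STOP', 0)); define(1); code.append(('RET', 0))
    print(code)                 # [('CALL', 2), ('STOP', 0), ('RET', 0)]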
Another extension allows constants to be specified, which are then
converted to bits and placed in available free locations; further
address updating is of course required.
Such advanced loaders are called "conversion routines", since they
convert a punched program to an internal program.
The last section on "automatic programming" considers the possibility of
inputting mathematical formulas to such a conversion routine, but the
authors express the feeling that the wish for such facilities (including
those described above) stems from inadequate standards set forth by the
computing centre.
After pointing out the inefficiencies incurred by these methods in some
systems, they write:
"Even in more favourable cases, experienced programmers will be able to
obtain greater efficiency by using more conventional methods of
programming."
A large part of the book is concerned with programs for a large
assortment of numerical functions; the difference between computer
science and numerical analysis has not yet been established.
A "computer" is still a professional who performs numerical computations
using "work sheets" (pg. 92).
Heinz Rutishauser,
Some programming techniques for the ERMETH,
J. ACM,
vol. 2,
#1,
pp. 1-4.
Jan. 1955,
Suggests several ways of programming, specifically for the ERMETH
(Elektronische Rechenmaschine der Eidgenössischen Technischen Hochschule,
= Electronic Computing Machine of the Swiss Federal Technical
University).
The first starts from a flow chart.
The contents of the boxes are input to an "assembler routine", together
with descriptions of the connections.
The assembler routine creates a program in memory.
The second considers the use of "pseudoorders", today called macros.
Suggested use is for inlining of subroutines, and for altering a program
written for single-length reals to one for double-length reals.
The third considers a rudimentary programming language.
It is suggested that appropriate numbers be assigned to words like "if",
"(", etc., to the effect that mathematical statements like
if a < 0 then 0 => a
can be expressed as number sequences, which can then be input to a
routine that constructs a program.
(In modern terms, lexical analysis has to be done by the programmer by
hand.)
The fourth describes loop unrolling, called "program stretching" here.
Speed-ups between 3 and 8 are reported for matrix inversion.
M.V. Wilkes,
The use of a `floating address' system for orders in an automatic digital computer,
Proc. Cambridge Philosophical Society,
vol. 49,
1953,
pp. 84-89.
The loader of the EDSAC is extended with a "floating address
subroutine", which accepts labels of the form $ G ~ n ~ K $.
During loading the location of label n is stored in location
$ 45 + n $ (I think).
So a directory is being built up.
Any instruction using a label n is marked by setting bit 6 (counting
from 0 from the left), which is not used (but which became the
indirection bit later on), and its address bits are set to n.
When the routine has been loaded, the floating address subroutine comes
in, scans memory and if it finds bit 6 set, updates the address, using
the old address as an index in the directory.
(How does this avoid messing up constants that happen to have bit 6 on?)
A second part of the paper examines the idea of "synthetic orders" (=
macro instructions), and points out the usefulness of labels in
conjunction with synthetic orders.
Heinz Rutishauser,
Automatische Rechenplanfertigung bei programmgesteuerten Rechenanlagen,
(in German: Automatic computation plan construction for program-controlled computation devices),
Birkhäuser Verlag,
Basel,
1952,
pp. 45.
This may be the first book on compiler construction. (Check??)
The author explores the possibilities of having the computer produce the
assembly code the programmer usually writes by hand.
Two advantages are pointed out: simplification of programming and
avoiding bugs.
The machine (presumably an ERMETH) is normally programmed by punching
binary machine code directly into paper tape, which is then read and
executed.
Most of the programs are of a numerical nature and arithmetic
expressions are important.
The author's idea is to have a program that reads an expression and
produces the binary code for that expression; in present terms that is a
compiler.
First an encoding is designed for the expression (characters do not yet
exist): variables (operands) are expressed by their machine addresses
(<1000), operators by the numerical values of their machine instructions
(>10000), the open parenthesis as the instruction
"Load cell 0 into accumulator" and the closing parenthesis as
"Store accumulator in cell 0".
So the programmer has to do the lexical analysis, memory allocation and
instruction selection.
The expression, which must be fully parenthesized and may contain binary
operators only, is now read as a block of numbers (a process for which a
program exists) and is processed in a series of scans.
The first scan determines the maximum height of the "Klammergebirge", the
parentheses mountain range.
This mountain range is established as follows: its height starts at 0,
is raised by 1 for each open parenthesis and operand, and is lowered by
1 for each closing parenthesis and operator and at the end of the
expression.
In "mathematically meaningful" (=syntactically correct) expressions, the
height never becomes negative and ends at 0.
The second scan finds the highest mountain peak, which is the first
operand of the first deepest non-nested subexpression.
Code for the subexpression is generated, cleverly combining the
addresses of the operands with the instruction codes which represent the
operators and the parentheses.
The subexpression includes all operators and operands inside the
parentheses, so constructions like (a+b-c+d) are allowed; so is (a*b-c*d)
actually, but it will be processed as (((a*b)-c)*d).
The code stores the result of the subexpression in a temporary memory cell,
a different one for each subexpression.
Then the subexpression, including its parentheses, is replaced by the
address of the cell, which now functions as just another variable; the
rest of the expression is shifted left to fill the vacated space.
Now the second step is repeated to find the next subexpression of
maximum height.
When none is found, the maximum height is lowered by 1 and the process
is repeated, until the expression has disappeared completely and its
code has been transferred completely to the output tape.
This process provides parsing of a nesting expression without using a
stack.
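A compact restatement of the scans in Python (a sketch: characters
instead of the paper's numerical encoding, and the height is computed
from parentheses only, not from the full operand/operator bookkeeping
described above):
    def heights(expr):
        h, out = 0, []
        for c in expr:
            if c == '(':
                h += 1
            out.append(h)
            if c == ')':
                h -= 1
        return out
    def compile_expr(expr):
        temp = 0
        while '(' in expr:
            h = heights(expr)                   # the parenthesis mountain range
            peak = h.index(max(h))              # first position at maximum height
            close = expr.index(')', peak)       # end of that subexpression
            temp += 1
            print('t%d := %s' % (temp, expr[peak + 1:close]))  # generate its code
            expr = expr[:peak] + 't%d' % temp + expr[close + 1:]
        return expr
    print('result in', compile_expr('((a+b)*(c-d))'))
    # t1 := a+b, t2 := c-d, t3 := t1*t2, result in t3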
Similarly, translations of (nesting) for-loops are provided, using
special and clever encodings for the indexes of the array elements.
Basically, the nesting for-loops are numbered with their depths, and
an indexed element is described as
{array address, list of (multiplication factor, level) pairs}.
For example, a description {a, ((50, 0), (1, 2))} describes the cell
with address a + 50*i0 + 1*i2, where i0 is the controlled variable
(unnamed in the program) of the outermost (level 0) for-loop, and i2
that of the level 2 for-loop.
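In present-day terms such a descriptor is a base address plus a linear
combination of loop variables; for illustration (Python, invented
names):
    def address(base, pairs, loop_vars):
        # pairs: (factor, loop level); loop_vars[level]: that loop's variable
        return base + sum(f * loop_vars[lvl] for f, lvl in pairs)
    print(address(1000, [(50, 0), (1, 2)], {0: 3, 2: 7}))   # 1000 + 150 + 7 = 1157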
Loop unrolling is considered essential(!) and time reductions to 40% are
quoted for matrix manipulations.
Since compilation of expressions is destructive, expressions in
for-loops to be unrolled must be copied before being compiled.
Flow charts for most of the algorithms are provided.
The author recognizes that the input (= the program) can be used both
for interpretation and for compilation (pg. 14) although neither term is
used, of course.
Compilation is correctly recognized as an optimization and as a form of
precomputation.
Heinz Rutishauser,
Über automatische Rechenplanfertigung bei programmgesteuerten Rechenanlagen,
(in German: On the automatic computation plan construction for program-controlled computation devices),
Z. Angew. Math. Mech.,
vol. 31,
#8/9,
1951,
pp. 255.
Eight-line summary of a talk, explaining that the conversion of formulas
to assembler instructions, hitherto done by hand, could be performed by
a program.
D.J. Wheeler,
Programme organization and initial orders for the EDSAC,
Proc. Roy. Soc. A,
vol. 202,
1950,
pp. 573-589.
For "initial orders" read "initial program loader".
Describes, and gives in full, two IPLs, version 1 and version 2, both
for the pre-B-register era.
Both are downloaded from the uniselectors at Program Startup.
For the EDSAC see Wilkes (1957).
The first IPL (30 words) does address conversion from decimal to binary
(actually base 32) only.
Terminators are S for "short" and L for "long".
The program starts loading at location 31; there is no provision for
placement control orders.
The length of the program must be specified as the first entry on the
program tape.
All this limits its use to simple, monolithic programs.
The second IPL (40 words) has almost all the "modern amenities":
relocation according to 12+2 different counters;
special relocation inside the current routine, based on terminator
$theta$ (location 42; M[42] can be set using control combination
$ G ~ K $, which must therefore precede each subroutine);
placement control combinations, to start loading code at specified
locations, both absolute (control combination $ T ~ m ~ K $ for loading
starting at m) and relative to the current routine (control
combination $ T ~ Z $ for loading at where $M[42]$ points to);
the short/long bit is set using relocation based on terminator F
(location 41, M[41] == 0) for "short" and "D" (location 43, M[16] ==
00000000000000001 = $ 2 sup { - 16 } $).
Relocation is explained explicitly as load-time parameterization, using
pre-set parameters.
Note that there still is no way to include numbers; they still have to
be converted by hand to the corresponding instructions ("pseudo-orders")
and heavy use is made of the fact that the non-existing instruction code
P has the value 0.
Note also that the mnemonic S and L indicators have been replaced by
a 1950's trick.
For a routine call, one fetches the "value" $ A ~ n ~ F $ in the
accumulator and jumps to the routine.
The routine entry then adds $M[2]$, which conventionally contains the
"value" $ U ~ 2 ~ F $, to the accumulator and stores the result
$ E ~ n + 2 ~ F $ in the last location of the routine.
Return is then automatic.
The paper includes a driver for a complicated trigonometric formula, as
an example.