Formal Languages, Automata and Computability
Literature references and annotations by Dick Grune, dick@dickgrune.com.
Last update: Thu Aug 30 19:46:25 2012.
These references and annotations were originally intended
for personal use and are presented here only in the hope
that they may be useful to others.
There is no claim to completeness or even correctness.
Each annotation represents my understanding of the text
at the moment I wrote the annotation.
No guarantees given; comments and content criticism welcome.
Peter Linz,
"An Introduction to Formal Languages and Automata",
4nd ed.,
Jones and Bartlett,
Sudbury, Mass.,
2006,
pp. 377.
Very accessible, but correspondingly limited.
Intuitive and often even colloquial explanations, supplemented by
convincing but not exhaustive proofs.
In addition to the usual fare  finite and pushdown automata, regular
and contextfree languages, Turing machines, etc. it contains a
chapter on other models of computation recursive functions, Post
systems, rewriting systems, matrix grammars, Markov algorithms, Lsystems
and a brief chapter on computational complexity.
Good, soft introduction, but, given the subject, still requires
considerable studying.
Franz Baader,
Tobias Nipkow,
"Term Rewriting and All That",
Cambridge University Press,
Cambridge,
1999,
pp. 313.
Michael Sipser,
"Introduction to the Theory of Computation",
PWS,
Boston,
1997,
pp. 396.
Makes a serious attempt to be student/readerfriendly, by giving proof
ideas rather than have the reader plod through completely workedout
proofs.
Although the attempt is successful to a certain extent, the book is
still terse and formal, and it requires a motivated reader.
Thomas A. Sudkamp,
"Languages and Machines  An Introduction to the Theory of Computer Science",
2nd ed.,
Addison Wesley,
pp. 569.
1997,
Dryasdust and probably curriculumdriven text book on the subject, low
on motivation.
For example, although the author proves that w [ w ] and
a sup i b sup j a sup i b sup j are not contextfree he nowhere
points out the significance of this: the impossibility to express
declarationapplication checking or parameter number checking in a CF
grammar.
I also miss the more juicy theorems, for example, "Every recursively
enumerable language can be obtained by applying a homomorphism to the
intersection of two contextfree grammars", or "The stacks in
contextfree production form a regular language" (although the latter
may be in there somewhere, hidden).
Also the wellknown name "uvwxy theorem" for the pumping lemma is not
mentioned.
But there are redeeming factors: the explanations are quite
understandable and still rigourous, the coverage is wide and there is a
rich set of examples and exercises.
Excellent studying material for the already motivated student.
Hanne Riis Nielson,
Flemming Nielson,
"Semantics with Applications  A Formal Introduction",
John Wiley & Sons,
Chichester,
1992,
pp. 240.
Compares the theories of operational semantics, denotational semantics
and axiomatic semantics, mainly by analyzing the while statement in each
of them.
Lots of math, as can be expected, but wellargued.
D.Kozen,
"A completeness theorem for Kleene algebras and the algebra of regular events",
in Proc. LICS 1991 ????,
pp. 214  225.
John Carroll,
Darrell Long,
"Theory of Finite Automata, with an Introduction to Formal Languages",
PrenticeHall Internat. Ed.,
Englewood Cliffs, NJ 07632,
1989,
pp. 438.
Extensive and easily readable on finite automata, limited and easily
readable on formal languages and parsing.
György E. Révész,
"Introduction to Formal Languages",
McGrawHill,
Singapore,
1985,
pp. 199.
This nifty little book contains many results and elementary proofs of
formal languages, without being "difficult".
It gives a description of the ins and outs of the Chomsky hierarchy,
automata, decidability and complexity of contextfree language
recognition, including the hardest CF language.
Parsing is discussed, with descriptions of the Earley, LL(k) and
LR(k) algorithms, each in a few pages.
V.J. RaywardSmith,
"A first course in formal languages",
Blackwell Scientific,
Oxford,
1983,
pp. 123.
Very useful intermediate between Révész and Hopcroft and Ullman.
Quite readable (the subject permitting); simple examples; broad
coverage.
No treatment of LALR, no bibliography.
Robert McNaughton,
"Elementary Computability, Formal Languages, and Automata",
PrenticeHall,
Englewood Cliffs, NJ,
1982,
pp. 400.
Actually an introduction to theoretical computer science.
Difficultly written, heavy on formalism.
Has some interesting unusual material, e.g., theorems about how adding
parentheses to a grammar affects possible ambiguity (the author's
specialty).
No serious treatment of parsing.
Christoph M. Hoffmann,
Michael J. O'Donnell,
"Pattern matching in trees",
J. ACM,
29,
#1,
pp. 6895.
Jan. 1982,
Full automata theory of (mainly bottomup) pattern matching in trees.
Algorithms are compared: bottomup tabledriven is usually best.
Many literature references.
John E. Hopcroft,
Jeffrey D. Ullman,
"Introduction to automata theory, languages, and computation",
AddisonWesley,
Reading, Massachussetts,
1979,
pp. 418.
Successor to Hopcroft & Ullman, 1969, and mandatory reading material for
anybody interested in the subject.
Claims no longer to be an exhaustive treatment of the subject, but is
still pretty exhausting.
Covers grammars, Turing machines, undecidability, LR grammars,
computational complexity, NPcompleteness.
Remarkably, the book does not mention LL grammars.
Michael A. Harrison,
"Introduction to Formal Language Theory",
AddisonWesley,
1978,
pp. 594.
Contains a wealth of interesting facts, ideas and theorems on formal
languages, with considerable attention to parsing.
Informative: puts roughly equal emphasis on ideas and formalisms.
Major drawback: several proofs depend on earlier exercises, but no solutions
to the exercises is given.
The theorem $L_0 = h(L_2 union L_2) is in Section 10.3 (hard to find from the
index).
R.D. Tennent,
"The denotational semantics of programming languages",
Commun. ACM,
19,
#8,
pp. 437453.
Aug. 1976,
A. Salomaa,
"Formal Languages",
Academic Press,
New York,
1973,
pp. 322.
Contains, among much other knowledge, the theorem that any Type 0 language can
be obtained as the intersection of two CF languages followed by an erasing
homomorphism.
F. Gécseg,
I. Peák,
"Algebraic Theory of Automata",
Akadémiai Kiadó,
Budapest,
1972,
pp. 326.
Richard Y. Kain,
"Automata Theory: Machines and Languages",
McGrawHill Book Comp.,
New York,
1972,
pp. 301.
Th author is fully aware that he is explaining a difficult subject, and
that he is explaining it to electrical engineering students.
The first serious formalism appears on page 35.
He formulates clearly some principles and bits of wisdom that since have
been lost: every actual machine is a finitestate machine; the state of
a machine is both a summary of its past and a prediction of its future.
The book covers:
finitestate machines;
Turing machines;
linearbounded automata;
pushdown automata;
other machine models (twoway pushdown; many pushdown tapes; counters;
and stack automata (which see below));
operations on languages;
solvable linguistic questions;
unsolvable linguistic questions.
The book has 17 pages on "stack automata", which are pushdown automata
that can examine the full contents of the pushdown stack.
It is shown that such automata can recognize all contextsensitive
languages and that there is at least one noncontextsensitive language
that can be recognized by a proper variant of these stack automata.
One misses a key to the exercises.
The author gives clear descriptions of Post's correspondence and normal
problems; a summary follows here.
A Post correspondence system is a set of word pairs; the set is used as
follows to produce two string simultaneously.
Both strings start empty; as often as desired, a pair in the set is
considered, its first word is concatenated after the first string and
the second word after the the second string.
The Post Correspondence Problem is: given a Post correspondence system,
is it possible to make two equal strings?
Remarkably (but fundamentally) the problem is recursively unsolvable.
This is equivalent to the Post normal system.
A Post normal system is again a set of word pairs; the set is used as
follows to transform one string into another.
If the string starts with the first word in a word pair, that word is
removed from the string and the second word of the pair is appended to
the end, until all the original text has been replaced.
If the process gets stuck the original string is rejected.
The Post Normal Problem is: given a Post normal system,
is there a string that is transformed into itself?
J. Doner,
"Tree acceptors and some of their applications",
J. Comput. Syt. Sci.,
4,
#??,
1970,
pp. 406451.
John E. Hopcroft,
Jeffrey D. Ullman,
"Formal Languages and their Relation to Automata",
AddisonWesley,
Reading, Mass.,
1969,
pp. 242.
Classical text in formal languages, full of solved and unsolved problems.
Michael A. Arbib,
"Theories of Abstract Automata",
PrenticeHall,
Englewood Cliffs, N.J.,
pp. 412.
1969,
A very classical text: excellent and minute explanations for students
with much time and patience.
Explains everything from the original problems (in engineering,
electronics, etc.).
Excellent chapter on algebra: sets, graphs, groups and semigroups.
The terminology has aged somewhat, it seems: the word
"deterministic" is not used at all and the Chomsky hierarchy is
not mentioned.
Many examples.
A. Salomaa,
"Two complete axiom systems for the algebra of regular events",
J. ACM,
13,
#1,
pp. 158169.
1966,
A. Ginzburg,
"Algebraic Theory of Automata",
ACM Monograph series,
Academic Press,
New York,
1968,
pp. 165.
Alfred V. Aho,
Jeffrey D. Ullman,
"The Theory of Languages",
Mathematical Systems Theory,
9,
#2,
pp. 97126.
1968,
