Program Creation -- Theory, Methodology and Tools
Literature references and annotations by Dick Grune, dick@dickgrune.com.
Last update: Sun Nov 20 22:04:39 2022.
These references and annotations were originally intended
for personal use and are presented here only in the hope
that they may be useful to others.
There is no claim to completeness or even correctness.
Each annotation represents my understanding of the text
at the moment I wrote the annotation.
No guarantees given; comments and content criticism welcome.
Nicolas Carriero,
David Gelernter,
A computational model of everything,
Commun. ACM,
vol. 44,
#11,
Nov. 2001,
pp. 77-81.
The authors divide computing into two orthogonal components:
computation, which includes computers, printers, etc., in short anything
that does something useful for computation; and coordination, which
includes data, files, printer commands (files to be printed), etc., in
short anything that requires or assists in computation.
(Is this still orthogonal? DG)
The authors offer two models for coordination, Tuple Space and
Lifestreams, and one for computation, symmetric programming languages.
A second model for computation, human thought, is left for the last
paragraph (and for the next 20 years).
Tuple space is like Linda tuple space, except that it is recursive
(tuples can contain tuple spaces) and ubiquitous (it is the only way to
represent data).
Lifestreams are somewhat similar to tuple spaces, but are ordered
chronologically.
Items can be anonymous, and can be accessed by contents, classification
or what not.
Both have the property that items can be added and retrieved by any
process that has access to the tuple space or lifestream.
Symmetric programming languages treat space and time symmetrically.
For example, just as a block in C can hold three variables
x, y, and z, which exist concurrently and occupy space, a
block in a symmetric language can hold three processes which exist
concurrently and occupy time.
(I think I do not understand this part.)
Cristóbal Pareja-Flores,
J. Ángel Velázquez-Iturbide,
Calculating encoding and decoding functions for prefix codes,
ACM SIGPLAN Notices,
vol. 34,
#4,
April 1999,
pp. 54-60.
Uses Partsch's method of program transformation to derive the desired
functions systematically.
Philip Wadler,
How to declare an imperative,
ACM Computing Surveys,
vol. 29,
#3,
Sept. 1997,
pp. 240-263.
Monads are a way to define the semicolon sequencer of imperative
languages as a binary operator in a functional language.
Not surprisingly, the arguments of the ; operator are quite
complicated.
The main purpose of the ; operator is to gather and pass on "global"
state, of type State.
The left operand is a function of type
(State → (α, State)),
the right operand is a function of type
(α → (State → (β, State))),
and the result is a function of type
(State → (β, State)), as far as I can decipher.
Considerable syntactic sugar is applied to hide the gory details.
Many examples are given.
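To make these types concrete: a small C sketch of the same state threading, with the state passed around explicitly (the State type and function names are illustrative, not Wadler's; Haskell is the natural notation here).

```c
#include <assert.h>

/* A toy State, and "a stateful computation of int" modeled as a
 * function from State to the pair (int, State). */
typedef struct { int counter; } State;
typedef struct { int value; State state; } IntResult;

/* State -> (int, State): read the counter, leave the state unchanged. */
static IntResult get(State s) {
    IntResult r = { s.counter, s };
    return r;
}

/* int -> (State -> (int, State)): add n to the counter. */
static IntResult add(int n, State s) {
    State t = { s.counter + n };
    IntResult r = { t.counter, t };
    return r;
}

/* The ";" operator, hand-inlined (C has no closures): each step
 * receives the state produced by the previous one, which is exactly
 * what the operand types encode. */
static int demo(void) {
    State s0 = { 10 };
    IntResult r1 = get(s0);                  /* left operand */
    IntResult r2 = add(r1.value, r1.state);  /* right operand applied to alpha */
    return r2.value;                         /* no global state anywhere */
}
```

The syntactic sugar in the paper exists precisely to hide this explicit plumbing.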
Philip Wadler,
Lazy versus strict,
ACM Computing Surveys,
vol. 28,
#2,
June 1996,
pp. 318-320.
Tutorial on the state of the art in lazy versus strict.
First the difference is explained.
Next it is stated that monads combined with continuations can be used to
introduce strict semantics in a lazy language and lazy semantics in a
strict language.
Finally, taking Church's λ calculus as a model for the lazy
languages and Plotkin's λ_v calculus as a model for the
strict languages, and extending them towards each other using the above
techniques, the difference can be isolated in one very simple axiom, all
the other axioms being the same in both augmented theories.
No proofs, just a sketch; the paper ends in a visual pun with the words
untied and united.
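The strict/lazy difference can be mimicked even in C by passing thunks (suspended computations) instead of evaluated arguments; a minimal sketch, all names mine:

```c
#include <assert.h>

/* A thunk: a suspended computation, forced only when its value is needed. */
typedef int (*Thunk)(void);

static int forty_two(void) { return 42; }
static int diverge(void)   { for (;;) ; }  /* never terminates */

/* Strict choice: the caller evaluates both arguments beforehand. */
static int pick_strict(int cond, int a, int b) { return cond ? a : b; }

/* Lazy choice: only the chosen thunk is forced, so
 * pick_lazy(1, forty_two, diverge) terminates, while the strict
 * version would have to evaluate diverge() first and loop forever. */
static int pick_lazy(int cond, Thunk a, Thunk b) {
    return cond ? a() : b();
}
```

This is only the evaluation-order half of the story; the paper's monad-plus-continuation embedding does the job inside the language rather than in the caller.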
Branko Soucek,
The Iris Group,
Dynamic, Genetic and Chaotic Programming -- The Sixth Generation,
John Wiley,
1992,
pp. 544.
[w.00418.IG]
Ole-Johan Dahl,
Verifiable Programming,
Prentice Hall,
1992,
pp. 350.
More of the same.
Peter Fritzson,
Nahid Shahmehri,
Mariam Kamkar,
Tibor Gyimothy,
Generalized algorithmic debugging and testing,
ACM Letters on Programming Languages and Systems,
vol. 1,
#4,
Dec. 1992,
pp. 303-322.
[ see sigplan june 1991 ??? ]
Ronald E. Prather,
The semantics of CAT,
ACM SIGPLAN Notices,
vol. 27,
#6,
June 1992,
pp. 54-63.
Based on category theory.
Antoni Diller,
Z: An Introduction to Formal Methods,
John Wiley,
Chichester,
1990,
pp. 309.
How to program in Z.
Very nice and realistic examples and case studies: from
a telephone number data base and a bill of materials manager to the
specification of a text editor.
Two chapters show how to transform
specifications in Z to programs in Miranda and Prolog.
Includes the Z Reference Manual.
With the possible exception of the somewhat belligerent
preface, the author's attempts at humor are reasonably successful.
It suffers from 88 special mathematical symbols; why can't these people
use keywords?
Matthew Hennessy,
The Semantics of Programming Languages -- An Elementary Introduction using Structural Operational Semantics,
John Wiley,
Chichester,
1990,
pp. 157.
Structural Operational Semantics is a combination of Evaluation semantics
and Computation semantics.
Evaluation semantics is a calculus for the results
(values) associated with instances of language constructs; Computation
semantics is a calculus for the sets of computations a construct specifies.
The book is disappointing in that it is again full of ingenious ways to
prove the obvious.
Carroll Morgan,
Programming from Specifications,
Prentice Hall,
Hemel Hempstead,
1990,
pp. 255.
Programming theory expressed with over 100 one-symbol operators,
governed by some 80 laws.
The written text is relatively easy to follow, the formalism
has a high threshold.
Many case studies.
Edsger W. Dijkstra,
Carel S. Scholten,
Predicate Calculus and Program Semantics,
Springer Verlag,
New York,
1990,
pp. 220.
Bone dry and very efficient exposition of the subject, giving heuristics
for finding proofs.
Probably worth the considerable effort of studying it.
Does not contain a single literature reference.
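For flavour, the standard weakest-precondition rules the calculus is built around (textbook formulation, not quoted from the book):

```latex
\begin{align*}
wp(\mathit{skip},\, R) &= R \\
wp(x := E,\, R)        &= R[x := E] \\
wp(S_1 ; S_2,\, R)     &= wp(S_1,\, wp(S_2,\, R)) \\
wp(\mathbf{if}\ B\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2,\, R)
  &= (B \Rightarrow wp(S_1, R)) \land (\lnot B \Rightarrow wp(S_2, R))
\end{align*}
```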
Anne Kaldewaij,
Programming: the Derivation of Algorithms,
Prentice Hall,
1990,
pp. 240.
A large number of small algorithms are derived using a very polished
form of calculational programming.
No literature references.
Helmut A. Partsch,
Specification and Transformation of Programs -- A Formal Approach to Software Development,
Springer Verlag,
Berlin,
1990,
pp. 493.
Systematic and thorough real-world treatment of the subject in great
mathematical detail.
Daniel Keller,
A guide to natural naming,
ACM SIGPLAN Notices,
vol. 25,
#5,
May 1990,
pp. 95-102.
Much good and usable advice on naming types, procedures, variables, etc.
in programs.
Giulio Iannello,
Programming abstract data types, iterators and generic modules in C,
Software -- Practice and Experience,
vol. 20,
#3,
March 1990,
pp. 243-260.
The values created for the ADT have pointer semantics.
Iterators are implemented as four routines: Start(iter_state),
Exists_Next(iter_state), Next(iter_state), and
Stop(iter_state).
Generics are implemented using (void *) pointers and macros.
C type checking is maintained.
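A minimal C sketch of that four-routine iterator protocol, for a list of ints (the list type and the client loop are mine; only the four routine names come from the paper):

```c
#include <assert.h>
#include <stddef.h>

/* An ADT value with pointer semantics: the value is a pointer to a node. */
typedef struct Node { int elem; struct Node *next; } Node;

/* The iterator state is simply the current position in the list. */
typedef struct { const Node *pos; } IterState;

static void Start(IterState *it, const Node *list) { it->pos = list; }
static int  Exists_Next(const IterState *it)       { return it->pos != NULL; }
static int  Next(IterState *it) {                    /* yield and advance */
    int e = it->pos->elem;
    it->pos = it->pos->next;
    return e;
}
static void Stop(IterState *it) { it->pos = NULL; }  /* release the state */

/* A typical client loop: sum the elements of a list. */
static int sum(const Node *list) {
    IterState it;
    int total = 0;
    Start(&it, list);
    while (Exists_Next(&it))
        total += Next(&it);
    Stop(&it);
    return total;
}
```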
H. Ehrig,
B. Mahr,
Fundamentals of Algebraic Specification 2 -- Module Specifications and Constraints,
EATCS monographs on Theoretical Computer Science, #21,
Springer-Verlag,
Berlin,
1989,
pp. 421.
Reasonably understandable set of examples of applications of the
formalisms described in part 1.
Indispensable for understanding part 1.
Mike Spivey,
The Z Notation,
Prentice Hall,
Hemel Hempstead, England,
1989,
pp. 154.
Z is a specification and programming notation with peculiar
quasi-two-dimensional syntax items, called schemas, and is based on
(naive) set theory and predicate calculus.
It explicitly includes error handling.
Remarkably, the primitive and defined notions, predicates, etc., are
described in UNIX-like manual pages.
A library of definitions is presented, called the mathematical tool-kit.
J.J. van Amstel,
J.A.A.M. Poirters,
The Design of Data Structures and Algorithms,
Prentice Hall/Academic Service,
Schoonhoven, Holland,
1989,
pp. 334.
Pre- and postconditions are used systematically to design and sometimes
derive algorithms and data structures.
Meanwhile the first principles of
programming are explained, rapidly advancing to polymorphism, the
eight-queens problem and heap sort.
Provides interesting insights.
For the sophisticated novice.
Greg Nelson,
A generalisation of Dijkstra's calculus,
ACM TOPLAS,
vol. 11,
#4,
Oct. 1989,
pp. 517-561.
An almost tutorial explanation of axiomatic semantics from first
principles, almost in normal language.
Very readable, the right material for non-naive beginners who want to be
advanced.
Thomas Pressburger,
Maximum sum subsegment derivation,
vol. ,
1988,
pp. .
Actually an introduction to Squigol, with applications to the derivation
of maximum sum subsegment algorithms.
D. Riley,
Structured programming, sixteen years later,
J. Pascal, Ada & Mod2,
vol. 7,
#1,
1988,
pp. 42-48.
Two case studies of invariant-driven programming plus some pep talk
about its use.
H. Ehrig,
B. Mahr,
Fundamentals of Algebraic Specification 1 -- Equations and initial semantics,
EATCS monographs on Theoretical Computer Science, #6,
Springer-Verlag,
Berlin,
1985,
pp. 321.
This is Theory with a capital T; requires heavy mathematical background.
Gives a definition of a line editor its own mother would not recognize.
The average computer scientist is probably better off reading part 2 first.
Jacques Loeckx,
Kurt Sieber,
The Foundations of Program Verification,
Teubner / Wiley,
Stuttgart / Chichester,
1984,
pp. 230.
Reasonably palatable.
Covers program verification using Floyd's inductive
assertions, Hoare calculus and Milner's LCF.
J.J. van Amstel,
Programmeren -- het ontwerpen van algoritmen (met Pascal),
(In Dutch: Programming -- The Design of Algorithms (with Pascal)),
Academic Service,
Den Haag,
1984,
pp. 164.
Fairly formal introduction to programming, consistently placing pre- and
postconditions first and code second, in Dijkstra style.
Most of the work is done on simple types and arrays; records are mentioned in
passing, pointers not at all.
Elaine Weyuker,
Asserting test data adequacy through program inference,
ACM TOPLAS,
vol. 5,
#4,
Oct. 1983,
pp. 641-656.
[ bibliography ]
Jean Dominique Warnier,
Logical Construction of Systems,
Van Nostrand Reinhold,
New York,
1981,
pp. 177.
The methodology of Warnier's Logical Construction of Programs [1974] applied
to systems and organizations rather than to output.
D. Gries,
The Science of Programming,
Springer-Verlag,
New York,
1981,
pp. 357.
A tale from a different world.
Uses a mixture of proposition calculus,
predicate calculus and a natural deduction system.
Sometimes trivial (?), sometimes very profound, always readable.
Part III, on the development of programs, is especially interesting: it
shows the construction of many algorithms, including a way to run (some)
programs backwards.
One small complaint: some of the cross-references are incorrect.
N. Suzuki,
Analysis of pointer rotation,
in Seventh ACM Symposium on Principles of Programming Languages,
Jan. 1980,
Las Vegas, Nevada,
pp. 1-11.
Author's abstract:
Pointer manipulation is one of the trickiest operations in programming and is
a major source of programming errors. A great deal of research effort has been
dedicated to making programs with pointers more reliable. In this paper we
will present pointer operations which reduce conceptual difficulties when
writing pointer programs, and increase reliability of programs. We will
analyze theoretically as well as empirically why these operations are more
convenient. Finally, we will define the _safety_ of rotations, and show that
it is decidable to check whether the rotation is safe and also show that it is
a good measure of the correctness of programs. (That is, if the rotation is
safe it is highly probable that the program is correct, and if it is unsafe it
is highly probable that the program is incorrect.) Probably the most
successful effort to eliminate difficulties associated with pointer
manipulation is the invention of abstract data types and the use of
implementation modules. Pointers are often used to implement complex data
structures, which in turn represent abstract data types. We can textually
localize relevant pointer manipulations using abstract data types. Thus, we
can easily check correctness of pointer programs. This approach is successful
when the structure of abstract data types is simple, or efficiency is not
seriously required. However, programs like garbage collection, list marking
and copying, and balanced tree manipulations have not been well specified by
abstract types.
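The binary case of rotation is a plain exchange of two pointer variables; since it only permutes pointer values, it can neither lose nor duplicate a cell. A loose C sketch (the insertion idiom is illustrative, not quoted from the paper):

```c
#include <assert.h>
#include <stddef.h>

typedef struct Cell { int val; struct Cell *next; } Cell;

/* Binary rotation: exchange two pointer variables. A permutation of
 * pointer values loses nothing and duplicates nothing. */
static void rotate(Cell **p, Cell **q) {
    Cell *t = *p;
    *p = *q;
    *q = t;
}

/* Insertion at the head of a list expressed with one rotation. */
static void insert_front(Cell **list, Cell *node) {
    node->next = node;           /* make the new cell point to itself */
    rotate(list, &node->next);   /* now *list == node, node->next == old list */
}
```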
Zohar Manna,
Adi Shamir,
The optimal approach to recursive programs,
Commun. ACM,
vol. 20,
#11,
Nov. 1977,
pp. 824-831.
Discusses the merits of least, maximal and optimal fixed points.
Fascinating account of five slightly different very simple recursive
programs that have completely different fixed points.
J. Darlington,
R.M. Burstall,
A system which automatically improves programs,
Acta Inf.,
vol. 6,
#1,
March 1976,
pp. 41-60.
Actually about improving functional-like programs:
recursion removal,
common subexpression elimination,
in-lining,
using assignments to overwrite list cells no longer in use (compile-time
garbage collection).
B. Baker,
An algorithm for structuring programs,
in Third ACM Symposium on Principles of Programming Languages,
Jan. 1976,
Atlanta, Georgia,
pp. 113-126.
Author's abstract:
Structured programming emphasizes programming language constructs such as
_while_ loops, _until_ loops, and _if then else_ statements. Properly used,
these constructs make occurrences of loops and branching of control
obvious. They are preferable to _goto_ statements, which tend to obscure the
flow of control [DDH, DIJ]. This paper describes an algorithm which transforms
a flowgraph into a program written using _repeat (do forever)_ and _if then
else_ statements. The goal of the algorithm is to produce readable programs,
rather than to avoid the use of _goto_ statements entirely. _goto_ statements
are generated when there is no better way to describe the flow of control. A
number of techniques for eliminating _goto_ statements from programs have been
previously published [AM, BJ, BS, COO, KF, KOS, PKT]. However, these
techniques do not necessarily produce clear flow of control [KN]. Misuse of
control constructs may mislead the reader into expecting patterns of control
flow which do not exist in the algorithm. For example, these techniques may
use a _repeat_ statement when the contained code cannot be executed more than
once or add numerous control variables to avoid _goto_ statements. They also
may avoid _goto_ statements by copying segments of code or creating
subroutines. The former method results in longer programs and bugs may be
introduced when all the identical segments must be modified. The latter method
may result in subroutines which appear unnatural.
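The kind of transformation meant can be illustrated in C: the same flowgraph first with gotos, then restructured with "repeat (do forever)" and if-then-else (this illustrates the goal, not Baker's algorithm itself):

```c
#include <assert.h>

/* With gotos: linear search, flow of control obscured. */
static int find_goto(const int *a, int n, int x) {
    int i = 0;
top:
    if (i >= n) goto fail;
    if (a[i] == x) goto found;
    i++;
    goto top;
found:
    return i;
fail:
    return -1;
}

/* The same flowgraph restructured: the loop and its two exits are
 * now visible at a glance. */
static int find_structured(const int *a, int n, int x) {
    int i = 0;
    for (;;) {                   /* "repeat (do forever)" */
        if (i >= n)
            return -1;           /* exit: not found */
        if (a[i] == x)
            return i;            /* exit: found */
        i++;
    }
}
```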
F.L. Bauer,
A philosophy of programming -- A course of three lectures,
in Proceedings of the International Summer School on Language Hierarchies and Interfaces, Marktoberdorf, FRG,
ed. by F.L. Bauer and K. Samelson,
LNCS #46,
Springer Verlag,
1975,
pp. 194-241.
Lecture 1: A philosophy of programming requires a unique conceptual
basis, founded on both scientific, economic and educational
considerations.
This basis is found in composability: the possibility to compose new
concepts from a small number of general and powerful concepts by
applying simple and well-defined composition rules.
[Orthogonality is not yet mentioned. DG]
Lecture 2: A "structured program" is a sequence of programs, each more
detailed than the previous, spanning the gap from problem to machine
code, such that each program can immediately be deduced from its
predecessor.
Various examples are given.
Lecture 3: Programming and programming languages should treat software
and hardware uniformly (especially in the context of parallel
processing).
Programming languages should have the same neutral stance as to being
implemented in software or in hardware as Petri nets have.
(The author sees the border line between software and hardware fade.)
Summary:
Scientifically:
Complexity is the problem; a unique conceptual basis, well-structured
design, and uniformity of software and hardware are the solution.
Economically:
The short-term focus on employee productivity, reliability and loyalty
should give way to an emphasis on employee independence and intellectual
activity.
Educationally:
Start young; the old generation (1973!) is lost anyway.
Note for the modern reader: the "certain nameless manufacturer" vilified
in the paper is IBM, not Microsoft.
Jean Dominique Warnier,
Logical Construction of Programs,
Stenfert Kroese B.V.,
Leiden,
1974,
pp. 221.
The proposed method is very similar to JPM.
The user is led to write a regular expression (in schematic form) for the
results to be obtained.
The notation allows concatenation, non-empty repetition of names (+),
optionality of names (?), and alternatives specifying names
(( name1 | name2 | ... )).
The error-prone empty set is avoided by decomposing * into +?;
a rhs is called "complex" if it contains more than one repetition,
option or alternative.
Code is then attached to each terminal name and to the begin and end of each
nonterminal.
To accommodate relationships between various parts of the output, attributes
are attached to the names.
These attributes can only be booleans and designate whether the data belongs
to a given class.
Efficient implementations are constructed using truth tables as conditions
to output statements.
Boolean variables are introduced as "links to truth tables".
The text spells out every detail and summarizes everything in design rules.
K.V. Nori,
V. Srinivas,
Structured programming -- An example using decision tables,
J. Comp. Soc. India,
vol. 4,
#1,
June 1973,
pp. ??-??.
Proposes to accept the "decision table" as an addition to the
structured-programming tool "selection statement".
This somewhat non-linear paper supplies an analysis, in terms of
structured programming, of a simulation program for a TDC 16 computer.
The instruction set of the machine was specified through decision
tables; the program was written in FORTRAN, augmented by a program
generator for decision tables.
The paper first describes the simulator, which consists of a driver +
code generated from the decision tables.
Next the driver is analysed for conformance to structured programming
principles (and found wanting).
Then the use of a decision table as a heavy-duty alternative to the
selection statement is argued, mainly because automatic methods of
checking for ambiguities and completeness of decision tables exist.
Finally, more or less unrelated to the main point, the authors report
their experience that structured programming is no help in how to
structure a program; it only supplies the tools, not the instructions of
use.
This problem is exacerbated by the fact that small mistakes in
high-level structuring can have (and did have) large and unfavorable
effects on the lower levels and on the final product.
D.L. Parnas,
On the criteria to be used in decomposing systems into modules,
Commun. ACM,
vol. 15,
1972,
pp. 1053-1058.
Shows an example of a mediocre and a good decomposition and compares the
criteria that led to both.
Likewise for hierarchical ordering between modules.
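Parnas's information-hiding criterion in miniature, as C would later express it (compressed into one file for the sketch; in practice the first half is the .h the user sees):

```c
#include <assert.h>

/* Interface: the user sees an opaque type and three operations,
 * and nothing of the representation -- Parnas's criterion. */
typedef struct Stack Stack;

/* Implementation: the representation (a fixed array here) can be
 * replaced by, say, a linked list without touching any client. */
struct Stack { int items[32]; int top; };

static Stack the_stack;  /* one static instance keeps the sketch allocation-free */

static Stack *stack_new(void)             { the_stack.top = 0; return &the_stack; }
static void   stack_push(Stack *s, int x) { s->items[s->top++] = x; }
static int    stack_pop(Stack *s)         { return s->items[--s->top]; }
```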
I.C. Pyle,
R.C.F. McLatchie,
B. Grandage,
A Second-Order Bug with Delayed Effect,
Softw., Pract. Exper.,
vol. 1,
#3,
1971,
pp. 231-233.
Horrible story of what occurs when you ignore zero.
A badly designed instruction (MVC, which copies 1 character more than the
length indication says, and so cannot copy 0 characters) had resulted in a
kludgey implementation of empty strings in the input routine EDIT (the empty
string was implemented as one space character, which was then further on
interpreted as the empty string).
When a macro processor was then added, it needed a real zero-length string,
but it also needed the input routine.
So a flag was added to the input routine, to tell it that it was used by the
macro processor, and had to deliver real empty strings.
Once this flag got out of sync, the input routine yielded a real empty string
where a single space was required, the MVC got a length of -1 (255 unsigned),
256 characters were copied, essential data was overwritten, and the OS crashed.
The problem was fixed by removing the "one space" idea, and checking before
using MVC.
(Packing the nasty MVC in a subroutine was probably too slow.)
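The arithmetic of the crash, modeled in C (mvc here models only the 360 instruction's length encoding, it is not real 360 code):

```c
#include <assert.h>
#include <string.h>

/* Model of MVC: the instruction copies (length_field + 1) bytes,
 * so a field of 0 copies one byte and zero bytes cannot be copied. */
static void mvc(char *dst, const char *src, unsigned char length_field) {
    memcpy(dst, src, (size_t)length_field + 1);
}

/* The bug pattern: an n-byte copy is encoded as the field n - 1.
 * For n == 0 the field wraps to 255, so 256 bytes are copied,
 * trampling whatever follows -- the crash described above. */
static unsigned char encode_length(int n) {
    return (unsigned char)(n - 1);
}
```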
D. Parnas,
Information distribution aspects of design methodology,
in Inform. Processing 71,
North-Holland,
1971,
pp. 339-344.
Parnas' Principles:
--
One must provide the intended user with all the information needed to use
the module correctly and nothing more.
--
One must provide the implementor with all the information needed to
implement the module correctly and nothing more.
Does this have any bearing on the "representation module" question?