Algorithms
Literature references and annotations by Dick Grune, dick@dickgrune.com.
Last update: Mon Sep 09 11:40:32 2024.
These references and annotations were originally intended
for personal use and are presented here only in the hope
that they may be useful to others.
There is no claim to completeness or even correctness.
Each annotation represents my understanding of the text
at the moment I wrote the annotation.
No guarantees given; comments and content criticism welcome.
Donald E. Knuth,
The Art of Computer Programming, Vol 4.0: Introduction to Combinatorial Algorithms and Boolean Functions,
AddisonWesley,
Reading, Mass.,
2008,
pp. 216.
Incredibly rich in ideas and detail; each page provide stuff for a week of
thought.
Still, the algorithms are presented in a 1950's cross between assembler and
Fortran II.
Stuart Dreyfus,
Richard Bellman on the Birth of Dynamic Programming,
Oper. Res.,
vol. 50,
#1,
Jan. 2002,
pp. 4851.
Very high level and therefore rather ungripping account of same.
Worth reading for the explanation of the term "dynamic programming": to
impress officials and attract funds.
Christian Böhm,
Stefan Berchtold,
Daniel A. Keim,
Searching in highdimensional spaces  Index structures for improving the performance of multimedia databases,
ACM Computing Surveys,
vol. 33,
#3,
Sept. 2001,
pp. 322373.
The problem is basically the same as in its companion paper, Chávez
et al., "Searching in metric spaces": given a (possibly very large) set
U of points and the distance between each pair of points (as a
function); given a point P in U; find all points within a distance
r from P.
But now the points are given a vectors, which is more information than
just their distances.
This allows more specialized and more efficient algorithms to be
designed.
Also, the algorithms allow the index to be modified.
The paper surveys existing techniques, with emphasis on recent research.
Edgar Chávez,
Gonzalo Navarro,
Ricardo BaezaYates,
José Luis Marroquín,
Searching in metric spaces,
ACM Computing Surveys,
vol. 33,
#3,
Sept. 2001,
pp. 273321.
Problem: given a (possibly very large) set U of points and the
distance between each pair of points (as a function); given a point P
in U; find all points within a distance r from P.
To do better than linear search, an index over U has to be built, to
be consulted when asked to solve this problem.
Many algorithms exist, all with initial and running costs.
The paper surveys all known algorithms and then exposes the basic
underlying algorithm, which creates equivalence classes.
Solving the problem then involves rapidly rejecting many equivalence
classes and searching the remainder.
This leads to a taxonomy of algorithms.
Remarkably readable, many examples, good pictures.
Jeffrey Scott Vitter,
External memory algorithms and data structures  Dealing with massive data,
ACM Computing Surveys,
vol. 33,
#2,
June 2001,
pp. 209271.
Internal memory (e.g. 32k words) was too small in the 1960s; it is still
too small (e.g. 256M bytes) in the year 2000; and guess is that it will
continue to be so for a very long time to come.
The paper surveys both the wellknown and the newest results for the
usual heavyduty applications: sorting, text search, indexing,
FastFourier, etc.
Sara Baase,
Allen Van Gelder,
Computer Algorithms  Introduction to Design and Analysis,
(3rd Edn.),
AddisonWesley,
Reading, Mass.,
2000,
pp. 668.
Excellent as ever!
S.S. Skiena,
The Algorithm Design Manual,
Springer,
New York,
1998,
pp. 486.
Good!
Dorit S. Hochbaum,
Approximation Algorithms for NPhard Problems,
PWS,
Boston, Mass.,
1997,
pp. 596.
Thirteen papers on various aspects of the subject.
As is usual, the authors restrict themselves to approximation algorithms
for which bounds can be proved.
No juicy unproven heuristics; more in particular, none of the mainstay
graph coloring heuristics is mentioned, although the approximation
algorithm that is given is surprising.
Rather than coloring the nodes with integers, they are first colored
with unitlength vectors of real numbers, such that the vectors of
adjacent nodes are almost orthogonal to each other.
The whole system is then distorted in Nspace by rounding, to the effect
that the vectors each reduce to contain one 1 and all 0's elsewhere.
Each vector then corresponds to an integer = color.
L.M.G. Feijs,
R.C. van Ommering,
Abstract derivation of transitive closure algorithms,
Information Processing Letters,
vol. 63,
#3,
Aug. 1997,
pp. 159164.
The authors base their analysis of Warshall's algorithm on relational
algebra, which they assume available right from page 1 (page 160
actually).
This is dangerous, since with full relational algebra available,
calculating the transitive closure T of a relation R becomes
trivial: $ T := R sup "+" $.
The basic algorithm is a fourline loop that is just not completely
trivial, uses Kleene star closure freely(!), and which has a subset Z
of the elements on which the relation is defined as a free parameter to
be chosen at the beginning of each loop.
Termination requires Z to be nonempty.
Three choices for Z are considered: a singleton, which leads to
Warshall's algorithm; a genuine subset, which leads to a generalization
of the grid algorithm; and the full set of elements, which leads to the
above trivial algorithm.
Next Algollike realizations for the first and second choice are given;
this supplies the proof that the authors have successfully avoided
trivializing the problem away:).
K. Sikkel,
Parsing Schemata,
Springer Verlag,
1996,
pp. 398.
Describes the "primordial soup algorithm": the soup initially contains all
grammar rules and all rules of the form $ A → t_i $ for all $t_i$ in the
input; both are in effect parse tree fragments.
During stewing fragments are combined according to obvious rules, until all
possible combinations have been formed.
Then the complete parse trees float to the surface.
The rules of this algorithm, which is actually a
transitive closure algorithm, are then formalized into sets of
inference rules geared to parsing, called parsing schemata.
These are then specialized to form many existing parsing methods and some new
ones, including predictive headcorner parsing,
and a parallel bottomup GLR parser.
All this is supported by great mathematical rigor, but enough diagrams and
examples are given to keep it readable.
Yedidyah Langsam,
Moshe J. Augenstein,
Aaron M. Tenenbaum,
Data structures using C and C++,
(2nd ed.),
Prentice Hall,
Upper Saddle River, N.J.,
1996,
pp. 672.
ANSIfied update of the first edition, which see.
The use of different style files gives the book a completely different
appearance, but much of the text has remained unchanged.
The C code has been ANSIfied, C++ code has been added, and the font
used for program code has been upgraded from ugly to peculiar.
Still very much worth using.
Gilles Brassard,
Paul Bratley,
Fundamentals of Algorithms,
PrenticeHall,
Englewood Cliffs,
1996,
pp. 524.
Simple and advanced algorithms (primality testing!) and their
theoretical foundations and complexity.
M.J. Nederhof,
Linguistic Parsing and Program Transformation,
Katholieke Universiteit Nijmegen,
Nijmegen,
1994,
pp. 206.
Contains in coherent chapter form versions of the following papers by the
author:
``Generalized leftcorner parsing'';
``An Optimal Tabular Parsing Algorithm'';
``Increasing the Applicability of LR Parsing''; and
``TopDown Parsing for LeftRecursive Grammars''.
They are preceded by an introduction to parsing, and
followed by a chapter on attribute propagation, and one on a grammar
workbench.
Esko Nuutila,
An efficient transitive closure algorithm for cyclic digraphs,
Information Processing Letters,
vol. 52,
#4,
Nov. 1994,
pp. 207213.
Very careful redesign of the topdown transitive closure algorithm using
strongly connected components, named COMP_TC.
Extensive experimental analysis, depicted in 3D graphs.
The speed ratio to Schmitz's algorithm varies from 1 for very sparse
graphs to about 3 for an outdegree of 10 percent.
Jayadev Misra,
Powerlists: a structure for parallel recursion,
ACM TOPLAS,
vol. 16,
#6,
Nov. 1994,
pp. 17371767.
Besides concat, the zip list constructor is introduced.
Also, a reverse operator is defined.
Powerlists are balanced (actually probably implemented as arrays) and
can be decomposed uniquely using deconcat or dezip, as desired.
Axioms of concat and zip are proved.
Several algorithms are expressed in these operators: permutations, Gray
codes, Fourier Transform, and lots, lots more.
The point is that components of powerlists can be distributed easily.
A. van Deudekom,
D. Grune,
P. Kooiman,
Initial experience with noncorrecting syntax error handling,
IR339,
Vrije Universiteit,
Amsterdam,
Nov. 1993,
pp. 11.
Shows how to implement Richter's noncorrecting error recovery using a
recombining breadthfirst topdown parser based on the LLgen input.
Provides somewhat artificial efficacy test results which show that one
program modification causes at most one error message.
Preston Briggs,
Linda Torczon,
An efficient representation for sparse sets,
ACM Letters on Programming Languages and Systems,
vol. 2,
#14,
MarchDec. 1993,
pp. 5969.
Assume sets of at most u members, numbered 1..u.
For set with n members, the following representation is kept:
1. an integer variable containing n;
2. an array dense[1..u], in which the elements 1..n contain the
member numbers;
3. an array sparse[1..u], in which only the elements are used whose
indexes correspond to member numbers: the element corresponding to
member number m contains the index of the element in dense which
contains m.
All other elements in both array are not initialized (but see below);
this avoids $ O ( u ) $ initialization costs.
This representation allows $O(1)$ implementations of set membership,
clearset, addtoset, deletefromset, and an iterator over the
elements.
Also, copy, compare, union, intersection and setdifference are $(O(n)$
rather than $O(u)$.
This comes at a cost of $ 4 u + 2 $ bytes per set (assuming short
unsigned ints for the indexes), as opposed to $ u / 8 $ bytes for a bit
vector.
Note that the set membership test for an absent element a accesses
the uninitialized element sparse[a]: the (unsigned) value x
found there can be larger than n, in which case the absence of a
is established or be between 1 and n inclusive, in which case
dense[x] is consulted; but since a is not in the set,
dense[x] cannot be equal to a and again the absence of a
is established.
The representation is compared to a bit vector implementation and found
to be 2 to 3 times faster for the sparse sets that occur in graph
colouring in register allocation.
(The speed advantage will be lost completely for nonsparse sets. DG)
Several other applications are discussed.
M.A. Weiss,
Data Structures and Algorithm Analysis,
Benjamin/Cummings,
Redwood City, Ca.,
1992,
pp. 455.
Coverage of the usual algorithms, each with a complete but not
excessively formal analysis.
Algorithm subjects include: lists, stacks, queues, trees, hashing,
priority queues including heaps, sorting including quicksort but
excluding quickersort, union/find, shortestpath, network flow and
minimum spanning tree.
Algorithmic techniques include: greedy, dynamic programming,
randomization, backtracking.
Analysis techniques include: simple analysis, amortized analysis.
Wim Hesselink,
Jan Jongejan,
Duplicate deletion derived,
Commun. ACM,
vol. 35,
#7,
July 1992,
pp. 99.
Addition to Jukka Teuhola, Lutz Wegner, ...
W.L. Hsu,
R.C.T. Lee,
ISA '91 Algorithms,
Lecture Notes in Computer Science #557,
SpringerVerlag,
Berlin,
1991,
pp. 395.
More than 40 papers on algorithms of various kinds, most of them with
explicit algorithms and proofs.
Arne Anderson,
A note on searching in a binary search tree,
SP&E,
vol. 21,
#10,
Oct. 1991,
pp. 11251128.
Replaces the threeway comparison (<, =, >) usual used in binary tree
searche algorithms with a twoway comparison (<, >=).
This is more realistic, since machines do not have a threeway
comparison.
Detailed code is given.
George H. Roberts,
Searching in discrete universes,
ACM SIGPLAN Notices,
vol. 26,
#3,
March 1991,
pp. 1924.
Very fast parsing (see, e.g. Pennello) requires fast searching in
(relatively small) discrete universes.
A hybrid binary/sequential search technique is demonstrated.
L. Davis,
Handbook of Genetic Algorithms,
Van Nostrand Reinhold,
New York,
1991,
pp. 352.
The first part of the book is an introduction to genetic algorithms.
A genetic algorithm for solving a parameterfinding problem works as
follows.
The parameters of the problem are encoded in chromosomes and an initial
generation of chromosomes is constructed (random or based on some
heuristics).
Each chromosome (solution to the problem) is evaluated and assigned a
grade.
A number of pairs of chromosomes, selected for their high grade, are
combined by crossover and mutation into new chromosomes; these
are the next generation.
Repeat until satisfied or out of patience.
Part two concerns 13 application case studies, ranging from aircraft
design to traveling salesman.
Part three describes the GENESIS and OOGA genetic algorithm software
packages.
Thomas H. Cormen,
Charles E. Leiserson,
Ronald L. Rivest,
Introduction to Algorithms,
MIT Press,
Cambridge, Mass.,
1990,
pp. 1028.
Both a tutorial and a handbook, encyclopedic in size and in contents.
Algorithms are introduced and proven correct using considerable math.
A reasonable addition the Knuth's unfinished opus.
205 literature references.
Aaron M. Tenenbaum,
Yedidyah Langsam,
Moshe J. Augenstein,
Data Structures Using C,
Prentice Hall,
Englewood Cliffs,
1990,
pp. 662.
In spite of its 650+ pages, the book is fastpaced.
It covers everything a student should know about data structures,
abstract or otherwise.
Especially enjoyable are the treatment of hash functions (normal and
perfect), that of garbage collection, and the exercises; but almost any
other subject would do.
Recommended!
Two minor gripes: the text is sometimes overly explicit, and the type
font used for the program texts was designed by a rank amateur.
Jeffrey H. Kingston,
Algorithms and Data Structures  Design, Correctness, Analysis,
Addison Wesley,
Sidney,
1990,
pp. 315.
A fairly usual text on advanced data structures, though slightly more
theoretical.
A small number of algorithms is analysed in depth.
Robert Sedgewick,
Algorithms in C,
AddisonWesley,
Reading, Mass.,
1990,
pp. 657.
Identical to Sedgewick [1988], all algorithms in C.
I don't particularly like the style of C coding.
Manoocher Azmoodeh,
Abstract Data Types and Algorithms,
(2nd Edn.),
MacMillan,
London,
1990,
pp. 377.
Many readable examples of implementations of ADT's in Pascal.
Using ADT's to implement ADT's.
Algorithm design using ADT's.
Good.
J. Ullman,
M. Yannakakis,
The input/output complexity of transitive closure,
in ACM SIGMOD 1990 International Conference on Management of Data,
ACM,
1990,
pp. 4453.
[[ grid algorithm?? ]]
John Q. Walker II,
A nodepositioning algorithm for general trees,
Software  Practice and Experience,
vol. 20,
#7,
July 1990,
pp. 685706.
The tree is conceptually laid out bottomup.
The children (nodes or subtrees) of a node are positioned a distance
sibling_distance apart if both are nodes, and such that there is at least
a distance subtree_distance between any node of one child and any node of
the other child otherwise.
Nodes that are free to move within these restrictions are equally spaced
in their space.
The parent is then positioned centered above the leftmost and
rightmost child nodes.
Full and detailed algorithms are given.
Hessam Khoshnevisan,
Efficient memotable management strategies,
Acta Inform.,
vol. 28,
#1,
1990,
pp. 4381.
Uses the term memotable in a very restricted sense: his memotables
exist only during the evaluation of a function and serve only to benefit
recursive calls.
When a function finally delivers its result, its memotables are
discarded.
So there is no longterm memory.
The author observes that the evaluation tree of many recursive functions
has a bounded width.
For example, the Fibonacci series for n only uses the elements for
$n1$ and $n2$.
By storing these in a memotable, a memotable of two entries suffices
to evaluate the Fibonacci function in linear time.
Such functions exhibit common generator redundancy.
Next the theory of functions with common generator redundancy is
developed and algorithms for the generation of table managers for them
are derived.
The latter work as sourcetosource converters.
Roger B. Dannenberg,
A structure for efficient update, incremental redisplay and undo in graphical editors,
Software  Practice and Experience,
vol. 20,
#2,
Feb. 1990,
pp. 109132.
The display is considered to contain "items", which have named
attributes, which have values.
An item is represented as a linked list of (name, value, version_number)
triplets, and modifications are prepended at the start of the list
without removing the old triplet.
Each triplet also contains a MODIFIED bit, which is on when the item in
memory has been modified more recently that the corresponding item on
screen.
The items are kept in a circular list, called the itemList.
Upon a redraw(), the itemList is scanned, and any triplet marked
MODIFIED (easily found at the head of the list) are displayed and
unmarked.
Upon a newVersion(), the global version number is increased and all
lists are scanned to remove any obsolete triplets (those with too low
version numbers).
Upon an undo(), the triplet lists are scanned for triplets with the
previous version number; these are then copied (not moved!) to the head
of the list.
This allows undoing the undoing.
An undo is performed by finding the old value and establishing it as the
most recent value.
So, Put 5, Put 8, Undo, yields three versions, 5, 8, 5.
Jeffrey D. Smith,
Design and Analysis of Algorithms,
PWSKent Publ. Comp.,
Boston,
1989,
pp. 447.
Thorough, though occasionally somewhat overemphatic treatment of a very
wide range of algorithms.
Manfred Ruschitzka,
John L. Clevenger,
Heterogeneous data translations based on environment grammars,
IEEE Soft. Eng.,
vol. 15,
#10,
Oct. 1989,
pp. 12361251.
[ use of 2level grammars in data marshalling ]
C. Bron,
E.J. Dijkstra,
A better way to combine efficient string length encoding and zerotermination,
ACM SIGPLAN Notices,
vol. 24,
#6,
June 1989,
pp. 1119.
Two methods are proposed, the first of which has been implemented in
Modular Pascal.
Both require the strings to be located in string containers (areas) of
known length and both allow the strings to be zeroterminated.
The first method is based on storing the distance between the end of the
string and the end of the container on the last byte(s) of the
container; if the distance is zero, the last byte is also zero.
In the second method, all bytes after the last string byte are kept
zero, so the end of the string can be found by binary search; this is
faster than the linear search required to find the first zero byte and
imposes no further requirements.
Alan Gibbons,
Wojciech Rytter,
Efficient Parallel Algorithms,
Cambridge Univ. Press,
Cambrigde,
1988,
pp. 259.
Has a chapter on parallel parsing, in which the authors prove that every
contextfree language can be recognized in $ O ( log sup 2 n ) $ time
using $ n sup 6 $ processors on a PRAM, and every unambiguous
contextfree language can be recognized on a PRAM in $ O ( log ~ n ) $
time using a (largeorder) polynomial number of processors.
Further results are derived for bracket languages and for inputdriven
languages; a language is "inputdriven" if there exists a PDA for
it with the property that all movements the PDA can make for a given
input symbol s stack or unstack the same number of symbols, $N(s)$.
The chapter relies heavily on the material explained in earlier chapters
and can hardly be read in isolation.
Robert Sedgewick,
Algorithms,
AddisonWesley,
Reading, Mass.,
1988,
pp. 657.
One of the best books on algorithms available;
comprehensive, understandable treatment of many algorithms, beautifully
done.
Some subjects: priority queues, hashing, parsing, convex hull, weighted
graphs, curve fitting, FFT, linear programming, you name it.
Webb Miller,
Eugene W. Myers,
A simple rowreplacement method,
SP&E,
vol. 18,
#7,
July 1988,
pp. 597612.
Optimal algorithms for video screen row updating on charactercontrolled
terminals.
P. Wayner,
Errorfree fractions,
BYTE,
vol. 13,
#6,
June 1988,
pp. 289298.
Factorialbased arithmetic for exact fractions, explained to the
intelligent layman.
I.R. Hentzel,
D.J. Pokrass,
A practical solution for a large sparse matrix,
SP&E,
vol. 18,
#3,
March 1988,
pp. 279284.
Downsizing the data needed for a specific large problem in linear
algebra.
Mostly math.
BingChao Huang,
Michael A. Langston,
Practical inplace merging,
Commun. ACM,
vol. 31,
#3,
March 1988,
pp. 348352.
The array, which is supposed to have size $ N sup 2 $ is divided in N
chunks of size N.
The N largest items are concentrated in one chunk, which is then used
as a buffer to manipulate the rest.
At the end the buffer is sorted normally.
Although all steps are detailed, the correctness of the steps is
taken for granted (though not always obvious!).
Lawrence Davis,
Genetic algorithms and simulated annealing,
Research Notes in Artificial Intelligence,
Pitman,
London,
1987,
pp. 216.
Collection of 13 papers on same.
J.H. Hester,
D.S. Hirschberg,
Selforganizing search lists using probabilistic backpointers,
Commun. ACM,
vol. 30,
#12,
Dec. 1987,
pp. 10741079.
While traversing a linked list, a second point, "backpointer" or
"finger", is maintained that drags after the search pointer.
When the required element is found, it is moved uplist to the position
pointed to by the backpointer. Never moving the backpointer yields
"movetofront", always moving it along yields "transpose".
Several update strategies, called JUMPs, are examined.
No clear winner emerges.
David Alex Lamb,
IDL  Sharing intermediate representations,
ACM TOPLAS,
vol. 9,
#3,
July 1987,
pp. 297318.
Although IDL Interface Description Language is intended to describe
the interconnections in large systems, it is mainly concerned with
getting the data from one program to another.
The actual interfaces are very simple and consist of a reader side and a
writer side; input and output along other paths does not count.
The system features type declarations that are converted into
marshalling read and write routines; this marshalling includes things
like pointers, circular sharing structures, byte exchange and data type
upgrade.
Often the read routine can handle multiple representations, and the
write routine can always produce ASCII (humanreadable) output.
Although the routines are about 8 times slower than handcoded routines,
they slow down the entire system by no more than about 15 percent.
This IDL should not be confused with Atkinson's IDL Intermediate Data
Language, q.v.
Richard M. Karp,
Michael O. Rabin,
Efficient randomized patternmatching algorithms,
IBM Journal of Research and Development,
vol. 31,
#2,
March 1987,
pp. 249260.
Author's abstract:
We present randomized algorithms to solve the following stringmatching
problem and some of its generalizations:
Given a string X of length n (the pattern) and a string Y (the text),
find the first occurrence of X as a consecutive block within Y.
The algorithms represent strings of length n by much shorter strings
called fingerprints, and achieve their efficiency by manipulating
fingerprints instead of longer strings.
The algorithms require a constant number of storage locations, and
essentially run in real time.
They are conceptually simple and easy to implement.
The method readily generalizes to higherdimensional patternmatching
problems.
Jeff Naylor,
Introduction to Programming,
Paradigm,
London,
1987,
pp. .
Very simple, using Pascal and COBOL.
Wayne Amsbury,
Data Structures, From Arrays To Priority Queues,
Wadsworth Publ. Comp.,
Belmont, Ca.,
~1986,
pp. 512.
Is exactly what the title says.
Beautiful pictures, but stops explaining as soon as the going gets
tough.
Contains too much program code.
Sometimes confusing or incomprehensible explanation.
Some literature references are not listed (e.g. to Gonnet).
Johannes J. Martin,
Data Types and Data Structures,
Prentice Hall,
Englewood Cliffs, NJ,
1986,
pp. 282.
Solid and convincing explanation of abstract data types.
Very readable.
In addition to the usual subjects it has a chapter on garbage
collection.
A novel feature of the book is a running summary in the margin, which
allows the book to be read at very high speed :).
JeanPaul Trembley,
Paul G. Sorenson,
An Introduction to Data Structures with Applications,
(2nd ed.),
McGrawHill,
Singapore,
1986,
pp. 861.
Some short notes:
Mainly PL/I based.
Much algorithm, little explanation.
Of interest: Markov algorithms, survey of nonlinear data structures,
biconnectivity algorithm, garbage collection.
Lowpriced and worth its money.
Brigitte Jaumard,
Michel Minoux,
An efficient algorithm for the transitive closure and a linear worstcase complexity result for a class of sparse graphs,
Inf. Process. Lett.,
vol. 22,
#4,
April 1986,
pp. 163169.
Algorithm of the PurdomEveKurkiSuonio line.
Improves the computation of the transitive closure of the dag by
improving the merging of the sets (actually lists) of related nodes
associated with each node.
Clear explanations with pictures.
The algorithm is faster than Purdom's on large graphs with fanin
smaller than 2.
It is linear for any graph with constant fanin.
Daniel F. Stubbs,
Neil W. Webre,
Data Structures with Abstract Data Types and Modula2,
Brooks/Cole Publ. Comp.,
Monterey, Ca. 93940,
1985,
pp. 556.
Clear, entertaining and beautifully done.
Covers arrays, records, linked lists, stack, queues, linear structures,
trees, sorting, sets, strings and graphs, all represented tastefully as
abstract data types.
John Bentley,
Programming pearls  selection,
Commun. ACM,
vol. 28,
#11,
Nov. 1985,
pp. 11211127.
How do I find the kth element in an unordered set of N elements in
time $ O ( N ) $?
Take an arbitrary element X and construct three sets:
{$ x < X $}, {x} and {$ x > X $}.
From the size of the first set, decide in which set the kth element
will be found and with what ranking.
Repeat the procedure until the element is in the middle set.
Cost: $2N$ on the average.
R.R. Oldehoeft,
S.J. Allen,
Adaptive exact fit storage management,
Commun. ACM,
vol. 28,
#5,
May 1985,
pp. 506511.
Most programs deal with blocks of a limited number of different sizes,
the working set.
Freed blocks are cached for a while.
Requests are first checked against the cache, and if a hit is found, the
block can be supplied at minimal cost.
Kurt Mehlhorn,
Data Structures and Algorithms 1: Sorting and Searching,
EATCS Monographs on Theoretical Computer Science,
SpringerVerlag,
Berlin,
1984,
pp. 336.
More narrow but much deeper than Knuth's works.
Very much worth consulting, though not (printing)errorfree.
Printed from typed pages.
Walter F. Tichy,
Stringtostring correction problem with block moves,
ACM Trans. Comput. Syst.,
vol. 2,
#4,
Nov. 1984,
pp. 309321.
We have two more or less similar large strings (actually files) S and
T, and we want to exploit the redundancy by storing only S plus a
script that can construct T from S when needed.
This script can contain two kinds of actions:
move(from, length), which copies length characters (actually
file lines) from position from in S and appends them to T (note
that it copies and appends; S and T are files, not strings);
and
add(line), which adds the line line to T.
The problem is how to construct the script when S and T are given;
once we have the script, T can be discarded.
We want the script to be as small as possible.
The add command includes a complete line and is supposed to be
larger than any move command; so we want to minimize the number of
add commands, and, once that is done, the number of move
commands.
The optimal algorithm is surprisingly simple.
We try to find the longest common sequence in S that starts at $T[0]$
in T.
If there is such a sequence of length l starting at s in S, we
record a move(s, l), discard the first l lines of T and start
again.
If there is no such sequence, we record an add(T[0)], discard the
first line of T and start again.
We continue until T has been completely "explained away".
The algorithm may be simple, but it is not trivial; several equally
"obvious" algorithms are presented and shown wanting.
The author proves the optimality of the presented algorithm, but the
proof is fairly complicated.
We can see that the algorithm minimizes the number of adds and then
the number of moves, as follows.
When there is no common sequence, $T[0]$ does not occur in S and we
cannot ever obtain it by a move command, so we have no choice.
When there is a common sequence, we have three choices: use all of it, use
part of it, and do not use it.
If we do not use it, we still have to produce $T[0]$.
Using an add is too expensive, so we must get $T[0]$ from another
place in S.
But we will never find a longer common sequence than we have already
got, so there cannot be any advantage in not using this one.
If we use part of it, we can use the first part or any other part; if we
use any other part we still have to produce $T[0]$, and the previous
argument applies.
If we use only the first lx lines of the common sequence, we will
still have to produce the other x lines.
This may require an extra move or we may be lucky and combine
obtaining them with another move required by the lines following
$T[l]$ from an appropriate sequence in S.
In either case it is not cheaper than incorporating $T[x..l]$ in our
present move.
So there is no point in using only part of the longest common
subsequence.
The paper explains and proves the algorithm in terms of strings and
characters, which is formally correct but hardly helpful, since the
algorithm is pretty useless for strings, so the motivation for the work
goes out of the window.
Two improvements of the basic algorithm are introduced, both aiming at
speeding up the localization of the longest common sequence.
The first enters all lines into a symbol table (called "index" in the
paper) (implemented by hashing);
the information kept with each line is a list of all positions it occurs
in in S.
$T[0]$ is looked up in this symbol table.
If it is not there, it is not in S and no common sequence exists;
if it is there, the symbol table gives us a list of all positions where
a common sequence might start.
Since lines do not normally repeat very often in files, this list is
probably short.
The second avoids all risks by constructing a suffix tree of S (by
McCreight's algorithm).
We just use $T[0]$, $T[1]$, $T[2]$, ..., to follow a path in the suffix
tree and see how far we get.
This brings us to the longest common sequence in a time proportional to
the length of it.
The suffix method is slower than the index method for small files (<1000
lines) but quickly gets superior for larger files (>5000 lines).
The drawing of the suffix tree in Figure 1 is misleading because unary
nodes are not shown; so there is no place, for example, for a search for
wvwa to end.
Larson,
Kajla,
File organization, a method guaranteeing retrieval in 1 access,
Commun. ACM,
vol. 27,
#7,
July 1984,
pp. 670.
See writeup.
Retrieval is cheap, insertion can be very expensive.
Kannitz,
Van Ekert,
Audit trail compaction for database recovery,
Commun. ACM,
vol. 27,
#7,
July 1984,
pp. 678.
An audit trail consists of
update_record(a),
update_record(b),
...
update_record(a),
...
There are very many records (N = ~10^9) and many updates (~10^6), among
which many duplicates.
We compress the audit trail efficiently as follows.
1. Allocate a bit map of N bits.
2. Read the audit trail backwards starting from the end, and for each
update_record(p), IF bit p is on THEN skip the update ELSE keep the
update (writing backwards) and switch on bit p.
L. Schmitz,
An improved transitive closure algorithm,
Computing,
vol. 30,
#4,
1983,
pp. 359371.
Detailed improvement of the algorithm of Eve & KurkiSuonio, 1977,
clearly explained.
Compares 4 versions of the algorithm to those of Eve & KurkiSuonio and
of Warshall, 1962.
Warshall loses big, and the others are comparable, with the author's
always equal or superior.
Jürgen Peemüller,
A correction to Brelaz's modification of Brown's coloring algorithm,
Commun. ACM,
vol. 26,
#8,
Aug. 1983,
pp. 595597.
Code for two complete algorithms for exact graph colouring, with proofs.
K. Mullin,
A second look at Bloom filters,
Communications of the ACM,
vol. 26,
#8,
Aug. 1983,
pp. 570571.
Short theoretical analysis of the performance of Bloom filters, checked
against a simulation.
The simulation is worse than the theory predicts and the difference is
explained from the imperfection of the hash functions.
T.C. Hu,
Combinatorial Algorithms,
AddisonWesley,
Reading, Mass.,
1982,
pp. 292.
Unusual book; rich in subjects, a minimum of math and formalism, of
uneven style.
Not always easy to follow, but adequate.
Threats shortest path, maximum flow, dynamic programming, backtracking,
optimal binary trees, heuristic and nearoptimal algorithms for bin
packing and job scheduling, and matrix multiplication; P versus NP.
Extensive bibliographies by subject.
Many examples.
Author quotes amply from own publications and rightly so.
Lots of advice on designing algorithms.
M. Herlihy,
B. Liskov,
A value transmission method for abstract data types,
ACM Trans. Programming Languages and Systems,
vol. 4,
#4,
Oct. 1982,
pp. 527551.
Indepth analysis of marshalling data (including graphs) in CLU, both
for distributed programming and for persistent data.
Manfred Broy,
Peter Pepper,
Combining algebraic and algorithmic reasoning: An approach to the SchorrWaite algorithm,
ACM TOPLAS,
vol. 4,
#3,
July 1982,
pp. 362381.
Formal proof based on the rotating turnstyle interpretation.
Lots of literature references.
Paolo Sipala,
Compact storage of binary trees,
ACM TOPLAS,
vol. 4,
#3,
July 1982,
pp. 345361.
The relationships between the storage requirements for binary and
kary trees are examined in closed form, using considerable amounts of
undergraduate math.
Ensembles of trees are characterized by a left branching probability p
and a right branching probability q.
This results in a map of regions showing for which k kary trees are
preferable for a given p and q.
For Lisp, the goal of the exercise, $k=3$ is best.
A.C. Kilgour,
Generalized nonrecursive traversal of binary trees,
SP&E,
vol. 11,
#12,
Dec. 1981,
pp. 12991308.
Ultimate generalization of the Schorr & Waite algorithm, derived
systematically.
The algorithm is parameterized with the visit order (six possibilities).
Jürgen Ebert,
A sensitive transitive closure algorithm,
IPL,
vol. 12,
#5,
Oct. 1981,
pp. 255258.
First does Tarjan's strongly connected components and then Warshall's
matrices; tuned to exploit the low number of edges.
R.S. Bird,
Tabulation techniques for recursive programs,
ACM Computing Surveys,
vol. 12,
#4,
December 1980,
pp. 403417.
The call of a recursive program defines a call graph the nodes of which
are labelled with the arguments of the call; this graph has to
be noncyclic for the call to terminate.
If the graph is a DAG (that is, not a tree) some values will be
calculated more than one, which we would like to avoid.
In the general case this is only possible through bruteforce memoizing,
but if the recursive function shows some special properties, the call
graphs have a systematic structure that can be exploited to simplify the
memoization.
Using graph pebbling as a tool, three types of tabulation strategies are
described.
The first is based on uniform dependency graphs; these allow an optimal
solution.
Less uniform graphs can sometimes be extended to (embedded in) uniform
graphs; the optimal solution to the uniform graph is then a suboptimal
solution to out original graph.
Nonuniform graphs can still be handled, but the solution is equivalent
to bruteforce memoizing.
Edsger W. Dijkstra,
Some beautiful arguments using mathematical induction,
Acta Inform.,
vol. 13,
1980,
pp. 18.
The first concerns a simple formula for the number of factors p (p
prime) in $n!$, based on the representation of $n!$ in base p.
The second is a proof of the theorem that the vertices of any polygon of
maximal area lie on a circle.
The third concerns an algorithm for finding the length of the longest
ascending subsequence (called `upsequence' here) in a given sequence of
numbers.
Peter J. Downey,
Ravi Sethi,
Robert Endre Tarjan,
Variations on the common subexpression problem,
J. ACM,
vol. 27,
#4,
April 1980,
pp. 758771.
[[ actually congruence closure on a directed graph ]]
Robert Endre Tarjan,
Andrew ChiChih Yao,
Storing a sparse table,
Commun. ACM,
vol. 22,
#11,
Nov. 1979,
pp. 606611.
First the wellknown row displacement technique is considered and it is
shown that its implementation using firstfit on rows of decreasing
density works well (with provable bounds) if the densities of the rows
have a "harmonic decay" property (obey Zipf's Law).
Now, if the rows do not have this property, the columns are extended
downwards with empty spaces and the actual column contents are slid
downwards over a certain distances so as to create a set of rows that
does have the desired property.
Hence the name double displacement.
Although we have increased the number of rows by this procedure, they
behave better than before, provided the sliding distances have be
determined properly.
A criterion (definitely nontrivial) is given for these sliding
distances and an analysis is sketched to justify it.
The size of the resulting table is
$ O ( n ~ log ~ log ~ n ) $ and its acces time is constant.
By packing some of the data in one machine word, the memory requirements
can be lowered to $ O ( n ) $.
Code is given for the firstfit decreasing row displacement algorithm.
Of the code for the much more difficult column displacement algorithm,
the authors say "We leave the details to the reader" (pg 610, left
column) :(.
Daniel Brélaz,
New methods to color the vertices of a graph,
Commun. ACM,
vol. 22,
#4,
April 1979,
pp. 251256.
Assumes considerable familiarity with graph colouring algorithms,
heuristic and exact.
Specifies two improvements, one to a heuristic algorithm and one to an
exact one.
The heuristic algorithm use the "saturation degree" of a node, the
number of different colours its neighbours have (which is less than or
equal to the "degree" of the node, the number of neighbours the node
has).
First a node with maximal degree is coloured.
Next, repeatedly nodes of maximal saturation degree are coloured, until
done.
The improvement: when a node cannot be coloured without introducing a
new colour, we first try to interchange some of the colours already
assigned so as to reduce the saturation degree of the node.
This is called saturation degree with interchange.
The exact algorithm (RandallBrown) seems to me to be just depth first
visit.
The improvement consists of preceding it by a cliquefinding step to
obtain an initial unavoidable colouring and a lower bound, and an
application of the saturation degree with interchange algorithm to
obtain an upper bound.
Both algorithms are compared to a couple of other algorithms from the
literature and found superior.
C.P. Schnorr,
An algorithm for transitive closure with linear expected time,
SIAM J. Comput.,
vol. 7,
#2,
May 1978,
pp. 127133.
Gives a very sophisticated version of the iterative bottomup transitive
closure algorithm and does a very careful statistical complexity
analysis.
The author even worries about the time spent in reading the graph, and
justifiedly so, since the entire algorithm has no higher time
complexity.
S. Even,
M. Rodeh,
Economical encoding of commas between strings,
Commun. ACM,
vol. 21,
#4,
April 1978,
pp. 315317.
[ recursively prepend the length; this is about single bit strings ]
Thomas L. Saaty,
Paul C. Kainen,
The FourColor Problem  Assaults and Conquest,
Dover Publ.,
New York,
1977,
pp. 211.
The first half is devoted to the fourcolour problem and the AppelHaken
solution.
The second half treats a number of variations: Hamiltonians, chromatic
numbers, etc.
Extensive bibliography.
J. Eve,
R. KurkiSuonio,
On computing the transitive closure of a relation,
Acta Inform.,
vol. 8,
1977,
pp. 303314.
[[ excerpt ]]
Extremely complicated derivation of a relatively simple way to compute a
transitive closure in topdown fashion: determine strongly connected
components on the fly and contract them; then a topdown visit does the
job.
Or am I missing something important here?
M. Sassa,
E. Goto,
A hashing method for fast set operations,
Inf. Process. Lett.,
vol. 5,
#2,
1976,
pp. 3134.
None to clear exposition of ...
Niklaus Wirth,
Algorithms + Data Structures = Programs,
PrenticeHall,
Englewood Cliffs, NJ,
1976,
pp. 366.
Solid treatment of the state of the art in computer programming,
combining breadth with depth.
The division in chapters is surprising to say the least:
1. Fundamental data structures;
2. Sorting (!);
3. Recursive algorithms;
4. Dynamic information structures;
5. Language structures and compilers.
But the lack of structure is completely offset by the wealth of good
algorithm examples in Pascal.
Many algorithms are covered; to name a few:
multiway file merging;
polyphase sort;
backtracking;
stable marriage problem;
balanced trees;
Btrees;
hashing;
tabledriven parsers.
(Note: The publisher goofed: it says "Nicklaus Wirth" on the cover!)
Still a good read after 25 years.
M.R. Garey,
D.S. Johnson,
The complexity of nearoptimal graph coloring,
J. ACM,
vol. 23,
#1,
Jan. 1976,
pp. 4349.
Extension of the result (by Stone... ??) that if there is a polynomial
algorithm that always does graph coloring with less than 4/3 of the
minimum number of colours, then P == NP.
There is so much math here that it is difficult to tell what the exact
implications are; perhaps not severe, as with the previous result.
J, Cohen,
Interpretation of NonDeterministic Algorithms in HigherLevel Languages,
Inf. Process. Lett.,
vol. 3,
#4,
1975,
pp. 104109.
Floyd ("Nondeterministic Algorithms", 1967) shows how nondeterministic
algorithms can be executed; this paper fills in the programmatic details by
supplying code segments for the operations used by Floyd.
D.S. Hirschberg,
A Linear Space Algorithm for Computing Maximal Common Subsequences,
Commun. ACM,
vol. 18,
#6,
June 1975,
pp. 341343.
The maximally possible common subsequence (not substring!) of two
strings $ A [ 1 .. m ] $ and $ B [ 1 .. n ] $ can be found in linear
space and quadratic time as follows.
Conceptually, an $ n times m $ matrix L is computed, where each
element contains the length of the maximal common subsequence (MCS, to
avoid confusion with LCS, longest common substring) that can be found in
the topleft submatrix of which it is the bottomright corner.
This matrix can be filled easily in a toptobottom lefttoright
fashion.
This yields the length of the MCS of the entire matrix in its last
element, but leaves two problems: the size of the matrix, and what is
the MCS.
At any moment, filling uses only the present row and the previous row,
so we need only two rows rather than the entire matrix; this is
Algorithm A.
Finding an MCS (there may be more than one) itself is more difficult.
If we knew the MCS, we could trace it in the matrix as a broken
diagonal, from somewhere upleft to somewhere downright.
We try to find the point where the MCS crosses the horizontal middle of
the matrix.
To this end, we compute the bottommost row $L1$ of the top half of the
matrix, using Algorithm A, and the topmost row $L2$ of the bottom half
of the matrix, using Algorithm A upside down.
Now any point on the border between these two rows at which the sum of
the lengths equals the length of MCS, is a point where an MCS crosses
the border.
This is Algorithm B.
Applying Algorithm B recursively to the top and bottom halves, we can
find succesive points through which the MCS passes.
Once the recursion has reduced the height of the matrix to 1, we know
which character is part of the MCS, and returning from the recursion we
assemble these characters into the required MCS.
This is Algorithm C.
There are several bugs in the code:
1. Algorithm B, line 2:
for j ← 1 to n do => for j ← 1 to n do.
2. Algorithm C, line 3.3:
$ A sub { n , i+1 } $ => $ A sub { m , i+1 } $.
3. Algorithm C loops when both input strings are empty (I think;
implement!).
M. Rem,
On the programming of elastic stores,
Information Processing Letters,
vol. 3,
#6,
July 1975,
pp. 184187.
An elastic store is a linear memory in which insertion and deletion
operations are about as expensive as read and write operations in "normal"
memory.
The paper considers the hypothetical situation in which such stores
could be built, and asks how such stores could be used in programming,
addressing by offset of course being out of the question.
Accessing data is based on markers, zerothickness named entities
positioned between some pairs of adjacent data cells.
The operation "find next (previous) marker" is assumed efficient.
Several questions arise:
1. If two different markers are positioned between the same pair of
adjacent data cells, do these markers have any order?
2. Do markers point to the data cell before them or after them?
3. How do we use markers to partition the store into partitions
(segments), especially when these partitions can be empty?
All these questions have a heavy "oneoff error" flavor.
Several alternatives are examined in the paper, and checked for
consistency.
For example, the sequence "insert x as a data cell at marker m"
followed immediately by "delete data cell at marker m" should be a
null operation.
The answers are:
1. adjacent markers have no order;
2. one needs two kinds of markers, one kind that points to the
predecessor and one kind that points to the successor;
3. one needs two sentinel data cells per partition, each with a marker
pointing away (!) from it.
The author shows that this is sufficient (and almost necessary) to
implement a FIFO queue.
J. Dzikiewicz,
An algorithm for finding the transitive closure of a digraph,
Computing,
vol. 15,
1975,
pp. 7579.
This is Algol 60 code only.
The algorithm looks a lot like Purdom's, but is structured much better.
Comparing them statement by statement is too much work.
M.M. Sysło,
J. Dzikiewicz,
Computational experiences with some transitive closure algorithms,
Computing,
vol. 15,
1975,
pp. 3339.
Four algorithms for transitive closure are compared:
Warshall (1962), Purdom (1970), Yen (not published) and Dzikiewicz
(1975).
Purdom's is better behaved, but Dzikiewicz's is usually faster;
Warshall's is OK for very low densities only.
David A. Fisher,
Copying cyclic list structures in linear time using bounded workspace,
Commun. ACM,
vol. 18,
#5,
May 1975,
pp. 251252.
This is what happens, as far as I understand it.
The "list structure" from the title is a binary graph.
Its nodes may be all over the place, but the nodes of its copy have to
be allocate contiguously in a known segment of memory, since pointer
increments and decrements are used to reach the nodes and pointers to
them have to be identifiable.
In phase 1, the original is copied by repeatedly scanning and copying
the spines described by the right pointers.
It turns out that in doing so, all pointers in the copy can be set
correctly except those right pointers that point to forward (= not yet
copied) nodes; these still point to their original right successors and
the right pointer of their original points to them.
In phase 2, the nodes of the copy are scanned in forward direction
(using pointer increments) and the affected right pointers are reset to
the parent nodes of the original right successors.
In phase 3, the right pointers of a copied node, its original and the
parent of the latter are rotated much as in the Schorr and Waite
algorithm, to set all pointers correctly.
Robert Endre Tarjan,
Efficiency of a good but not linear set merging algorithm,
J. ACM,
vol. 22,
#2,
April 1975,
pp. 215225.
The original paper on the famous UNION/FIND algorithm, with complexity
analysis resulting in the inverse Ackermann function.
R.A. Wagner,
Ordern correction for regular languages,
Commun. ACM,
vol. 17,
#5,
May 1974,
pp. 265268.
Presents an O(n) algorithm which, given a string and a finitestate
automaton, can correct the string to an acceptable one with a minimum
number of edit operations.
@article{DBLP:journals/spe/Plum74,
author = {Thomas W.S. Plum},
title = {Random Search on the 8Queens Problem},
journal = {Softw., Pract. Exper.},
volume = {4},
number = {3},
year = {1974},
pages = {251253},
annote = {
Subtitled "A Simple Solution for a Simple Problem".
Presents a 15line APL program, which draws random permutations of length 8
(in APL: 8?8; simple indeed), interprets them as a board position and
tests them. After 996 tries a solution was found.
}
@article{DBLP:journals/spe/Bell74,
author = {James R. Bell},
title = {Rapid Calculations of Subscripted Array Addresses},
journal = {Softw., Pract. Exper.},
volume = {4},
number = {3},
year = {1974},
pages = {275277},
annote = {
Since address calculations for a multidimensional array involve multiplying
indices by a known step size over a known range, it is proposed to precompute
the pertinent multiplication table; it is (much) smaller than the array itself.
The method is compared to precomputing tables of pointers to subrows and
submatrices, which are (much) bigger.
}
J.E. Hopcroft,
J.D. Ullman,
Set merging algorithms,
SIAM J. Comput.,
vol. 2,
#?,
1973,
pp. ????.
Very complicated algorithm, probably superseded by Tarjan,
"Efficiency of a good but not linear set merging algorithm", 1975.
J.M. Robson,
An improved algorithm for traversing binary trees without an auxiliary stack,
Inf. Process. Lett.,
vol. 2,
#1,
March 1973,
pp. 1214.
Shows a clever way to get rid of the TAG bit in Siklóssy's algorithm
(Siklóssy, "Fast and readonly algorithms for traversing trees
without an auxiliary stack", Inf. Process. Lett., Vol. 1, No. 4, June
1972, pp. 149152), but silently gives up the readonly property.
@article{DBLP:journals/spe/Baker72,
author = {John L. Baker},
title = {CDC 6000Series Register Save/Restore},
journal = {Softw., Pract. Exper.},
volume = {2},
number = {4},
year = {1972},
pages = {377387},
annote = {
Saving all 23 (X[0:7], A[0:7], B[1:7]) registers of the CDC 6000 is somewhat
of a Chinese puzzle; it is solved by devious manipulation in about 60 machine
instructions.
Same for restore.
The routine cannot handle concurrent peripheral action on A6 or A7.
}
@article{DBLP:journals/spe/Brown72,
author = {P. J. Brown},
title = {Recreation of Source Code from Reverse Polish Form},
journal = {Softw., Pract. Exper.},
volume = {2},
number = {3},
year = {1972},
pages = {275278},
annote = {
The author suggests considering an opening parenthesis as a unary operator on
the expression it enclosed, and the closing parenthesis as a postfix operator
on the basic operand it follows.
Under this regimen reverse Polish is generated which can be converted back to
the exact original expression.
}
David W. Matula,
George Marble,
Joel D. Isaacson,
Graph coloring algorithms,
in Graph Theory and Computing,
ed. by Ronald C. Read,
Academic Press,
New York,
1972,
pp. 109122.
Practical paper concerned with efficient heuristics to do the job, but
still keeps an eye on the theoretical bounds.
Restricts itself to "sequential algorithms", algorithms which first
order the nodes and then colours them in the indicated order.
Two basic sequential orderings are explained: largestfirst, which
orders the nodes by degree, highest degree first, and smallestlast,
which recursively dismantles the graph, removing a node of highest
degree in all steps, and then reverses the order found this way.
Also two ways of colouring are given: naive, in which a new colour is
introduced only if none of the colours in use would do, and
"colouringwithinterchange".
Colouringwithinterchange is invoked when a node, V, is to be
coloured and its direct neighbours together already have all the colours
that are in use.
Rather than immediately decking out a fresh colour, we try to reduce the
number of neighbouring colours by interchanging two colours in a
restricted part of the graph.
We pick a neighbour, N, of V, which has already been coloured and
which has a colour that occurs only once among the neighbours of V,
and which has a neighbour M with that same property.
N and M have different colours $ C sub N $ and $ C sub M $,
since they are neighbours.
We now swap the colours of N and M.
The result is that there is now one colour less among the neighbours of
V (since $ C sub N $ disappeared and the $ C sub M $ which took its
place was already there, since all colours were already there), but
both N and M may now be in trouble since they may have
neighbours of the colour they have just been given.
If so, the offended nodes are recoloured in turn, from $ C sub N $ to
$ C sub M $ and vice versa, to solve the problem.
This may introduce new conflicts, which are solved the same way.
If this process dies out without reaching back to a neighbour of node
V, all is well (it will die out eventually for lack of nodes to
colour).
If, however, it does reach back to a neighbour of V, we lose, since we
recolour a neighbour of V back to $ C sub N $, reintroducing the
colour we tried to dispel (we cannot return to a neighbour of V and
colour it to $ C sub M $, since then its colour had to be $ C sub N $
originally, which is impossible since there was only one occurrence of
$ C sub N $ among the neighbours of V).
The four combinations are analysed theoretically (to a certain depth)
and compared practically.
Smallestlast with interchange is the winner, which the best colouring
most of the time and one colour too many about 30 percent of the time.
Even random ordering without interchange usually wastes less than 35
percent of the colours.
The theoretical bounds are often 200 to 300 percent too high.
R. Bayer,
E. McCreight,
Organization and maintenance of large ordered indexes,
Acta Inform.,
vol. 1,
1972,
pp. 173189.
[Btrees]
Donald E. Knuth,
Ancient Babylonian algorithms,
Commun. ACM,
vol. 15,
#7,
July 1972,
pp. 671677.
The ancient Babylonians (~ 1800 B.C.) used base 60, without indicating
the position of the sexagesimal point; so 30 could mean 30 or 1800 or
½.
Addition is simple, multiplication was probably done by using tables, and
division by X was done as multiplication by $1/X$ and was only
possible for those values of X that have finite reciprocals in base
60.
Tables of reciprocals of all numbers with reciprocals of <= 3
sexagesimal digits have been found.
There is some evidence that they were also used backwards, for division
by numbers that did not have finite reciprocals in base 60.
No explicit algorithm descriptions for the simple arithmetic operations
have been found; probably everybody concerned knew them.
Algorithms, almost all of them nonrepetitive, were specified as
workedout examples, with sample numbers filled in, but formulated so
that they would also work for other numbers.
Algorithms described specific domestic or architectural problems, but
in fact solved equations.
Some of the equation problems for which algorithms have been found are
solving $ x  y = d $ and $ x * y = p $ for x and y; finding the
lengths of diagonals using Pythagoras (over 1000 years before P.!);
computing y under the constraint $ x + y = x * y $ and x given; and
compound interest (at a sample interest of 20% per year!).
Some algorithms are given in stack machine style.
Arithmetic operations conceptually destroyed their left operand, i.e.,
"double the length" destroyed the copy of "length", necessitating
instructions for making copies for further computations.
Compound interest was done by example macro expansion.
The Seleucids (~ 300 B.C.) had a stylized form of algorithm description,
a "programming language", in which no sample numbers occurred.
They also had reciprocal tables of <= 6 sexagesimal digits (sorted as
well!).
Laurent Siklóssy,
Fast and readonly algorithms for traversing trees without an auxiliary stack,
Inf. Process. Lett.,
vol. 1,
#4,
June 1972,
pp. 149152.
The tree nodes are modified thus: a single bit per node, TAG,
distinguishes between left and right children (to direct the backingup
step) and the left and right pointers are xored with the address of the
parent.
Now apply Schorr & Waite, suitably modified for the xor, and it will
never have to write a pointer in a node.
If the entire tree is fixed, right pointers can be changed to threads
(this can be done irrespective of xor, Schorr & Waite, etc. anyway DG).
V.L. Arlazarov,
E.A. Dinic,
M.A. Kronrod,
I.A. Farad\*(vzev,
On economical construction of the transitive closure of an oriented graph,
Soviet Math. Dokl.,
vol. 11,
1970,
pp. 12091210.
The "Four Russians" algorithm.
Ian Munro,
Efficient determination of the transitive closure of a directed graph,
IPL,
vol. 1,
#2,
1970,
pp. 5658.
Casts the problem in adjacency matrices and then reformulates it into
matrix multiplication.
Next applies Strassen's matrix multiplication to obtain a bound smaller
than $ O ( N sup 3 ) $.
Paul Purdom Jr.,
A transitive closure algorithm,
BIT,
vol. 10,
1970,
pp. 7694.
First a fourstep procedure is described:
1. all cycles are contracted;
2. the resulting nodes are sorted topologically;
3. the transitive closure is computed in that order;
4. the contracted nodes are expanded, to produce the entire transitive
closure.
The term strongly connected component is not used.
Next the algorithm is analysed in detail, and compared to Warshall's
algorithm (which loses, often by a factor of 5 or more).
Third, the algorithm is given in full, in Algol 60 (one procedure of
7½ printed pages, no subprocedures, no recursion).
Burton H. Bloom,
Space/time tradeoffs in hash coding with allowable errors,
Commun. ACM,
vol. 13,
#7,
July 1970,
pp. 422426.
The addressed problem is the test for nonmembership of a large given
set of (compound) values.
If the test succeeds, nonmembership is guaranteed; if it fails,
membership is very likely but not guaranteed.
Two hashing algorithms are given.
Method 1 uses open hashing and stores a hash representation of the value
rather than the value itself in the hash table.
If the lookup fails (no hash entry or no matching representation)
nonmembership is guaranteed.
Method 2 uses a hash table H of N singlebit elements and d hash
functions $ h sub 1..d $.
If the given set contains a value v,
all bits with indexes $ h sub 1..d ( v ) $ are on in H.
If for a test value t not all bits with indexes $ h sub 1..d ( t ) $
are on, nonmembership is guaranteed.
There is an optimal value for d, given N and the number of values in
the fixed set, which minimizes the chance of a false positive.
Method 2 has later become known as Bloom filters.
It is shown analytically that Method 2 is better than Method 1.
In a realistic case study concerning word hyphenation, allowing about 6%
false positives allowed the size of the hash table to be reduced by a
factor of about 7.
B.W. Kernighan,
S. Lin,
An efficient heuristic procedure for partitioning graphs,
Bell System Technical J.,
vol. 49,
#2,
Feb. 1970,
pp. 291308.
This paper concerns minimumcost graph partitions with restrictions
on the sizes of the resulting subgraphs.
Quite a number of heuristics are supplied.
The first heuristic addresses the problem in which 2 subgraphs should
result of equal size.
A random partition in two equally sized subgraphs is made, and its cost
computed; this is the initial partition.
The heuristic now works as follows.
We are going to exchange pairs of nodes, one from each of the subgraphs,
until all nodes have been swapped; so we end up with the same partition,
with left and right exchanged.
During such a swap sequence, the cost may go up and down, but the gain
starts at 0 and ends at 0.
Somewhere on each such swap sequence there is a maximum gain, and among
all possible swap sequences there is one that exhibits the optimal gain.
The problem is to find this optimumexhibiting swap sequence.
The following heuristic generates promising swap sequences.
Make a temporary copy C of the initial random partition P.
Find the pair of nodes in C, one from each subgraph, that, if
these nodes were exchanged, would give the greatest reduction in cost;
note that this greatest reduction may still be negative.
Record the pair and the reduction so far, swap the pair in C, and mark
the pair as "swapped".
Repeat this process, restricting the choice to nonswapped nodes, until
all nodes have been swapped;
note that since the subgraphs change, the greatest reduction need not
remain negative after having been negative.
Find the point in the swap sequence thus obtained at which the reduction
was the highest, replay the swap sequence on the partition P, and
clear the "swapped" marks on all nodes.
This yields our new initial but no longer rnadom partition P.
Now new gain is possible, since all nodes are back in the game.
We repeat this process until we meet a swap sequence that shows no
reduction at all.
This yields our required partition.
(The paper explains how to do these operations efficiently and avoid the
copy.)
The algorithm requires between 2 and 4 cycles, and yields the optimal
partition in about half the cases for graphs of 30 nodes.
The probability goes down exponentially with increased graph size.
The time requirements are $ O ( n sup 2 ) $.
The next heuristic builds on the previous one, as follows.
Again start by randomly splitting the graph into two subgraphs, A and
B.
Apply the above heuristic algorithm to partition
A into $ A sub 1 $ and $ A sub 2 $, and
B into $ B sub 1 $ and $ B sub 2 $.
Now apply the above heuristic algorithm with initial partitions
(A, B),
($ A sub 1 $ + $ B sub 1 $, $ A sub 2 $ + $ B sub 2 $) and
($ A sub 1 $ + $ B sub 2 $, $ A sub 2 $ + $ B sub 1 $), and keep the
best.
This raises the probability of obtaining the optimal partition for a 30
node graph to almost certainty.
The paper includes many more heuristics for related problems.
Howard L. Morgan,
Spelling correction in systems programs,
Communications of the ACM,
vol. 13,
#2,
Feb. 1970,
pp. 9094.
Applies Damerau's (1964) spelling corrector to program texts and extends
it to have different dictionaries for different contexts.
For example, in the context of the condition in an ifstatement, the
dictionary contains only words that are allowed syntactically in a
condition.
So the identifier GOR will be corrected to OR rather than to FOR.
The algorithm is given in a flowchart and in IBM 360 Assembly Language
Code; it relies heavily on the TRT (Translate and Test) instruction.
D. Michie,
"Memo" functions and machine learning,
Nature,
vol. 218,
#5136,
April 6, 1968,
pp. 1922.
(Dept. of Machine Intelligence and Perception, Univ. of Edinburgh),
Recognizes that a computer function should behave like a mathematical
function: the input determines the output, and how the calculation is
done is immaterial, or at least behind the screens.
This idea frees the way for alternative implementations of a given
function, for example by memoizing.
A function implementation consists of a rote part and a rule part.
The rote part contains the inputtooutput mappings the function has
already learned by rote, and the rule provides the answer if the input
is new.
New results are added to the rote part and the data is ordered in order
of decreasing frequency by using a selforganizing list.
This way the function "learns" answers by rote as it is being used.
The list is fixedsize and if it overflows, the least popular element is
discarded.
The first example uses memoization simply as a speedup device to
calculate the function factorial (in POP2).
The second example is considerably more sophisticated.
It concerns a program on one machine (an Elliott 4100) that must learn
to control an intentionally unstable program on a second machine (a
PDP7); both machine are connected by a link.
The PDP7 controls what the paper calls a Donaldson system.
Its main constituent is a little motor cart which runs on a linear
bounded track.
The cart balances a pole on its top; the pole is connected to the cart
by a hinge, which allows it to fall forward or backward, but not
sideways.
The goal of the Elliott 4100 is to keep the pole balanced as long as
possible by sending commands forward, stop or backward to
the PDP7.
It looses when the pole comes down or the carriage crashes into one of
the ends of the track.
What happens next is not entirely clear to me.
There are at least two issues here.
One is that the result of the action is somehow used to either or not
(or negatively?) enter the resulting value in the rote table, so
learning is conditional here.
The second is that the input to the function is mapped onto equivalence
classes, to reduce the size of the rote table; the mapping is programmed
by the user.
The third example concerns interpolation and curve fitting, but there
the relation with memoization escapes me completely.
Robert W. Floyd,
Nondeterministic Algorithms,
J. ACM,
vol. 14,
#4,
Oct. 1967,
pp. 636644.
A nondeterministic flowchart ("algorithm" is equated with "flowchart"
in this paper) is a flowchart that contains a call to a multiplevalued
function, a function that can give more than one result.
The program is then run for all the results of the function, as follows.
Each time the program reaches a point of termination labeled "success",
the program result is reported; when it reaches a termination point
labeled "failure", it backtracks to the latest call of a multivalued
function and continues from there with the next value.
Many programs, especially those for combinatorial search problems, can
be formulated more easily in a nondeterministic flowchart, but it is
impossible to execute such a flowchart on a conventional computer.
The paper describes a mechanical method for converting a
nondeterministic flowchart into a deterministic one.
Basically each path in the flowchart is replaced by two paths, one going
forward performing the computation, and the other going backward
performing the backtracking.
The backtracking is implemented using a stack.
For simple nondeterministic flowchart constituents, two replacing
deterministic constituents are given, one for each path.
For example, an assignment node $ X := f $ is replaced by a computation
node $ push(X); X := f $ and a backtrack node $ X := pop() $.
More complicated nondeterministic nodes, which can have more than one
input and output arc, are replaced by a graph of deterministic nodes,
with each nondeterministic input arc augmented by a backtrack output
arc and each nondeterministic output arc augmented by a backtrack input
arc.
This process turns out to be less complicated than it sounds, and the
transformations are relatively straightforward.
The process is illustrated using the 8queens problem and by finding
cycles in a directed graph.
The author takes pains to point out that nondeterministic algorithms are
not probabilistic, random of Monte Carlo algorithms.
He also suggests the interpretation that the value of the multivalued
function is gouverned by the desired result rather than by preceding
causes; it is as if the values that did not lead to success were never
actually yielded by the function.
S.W. Golomb,
L.D. Baumert,
Backtrack programming,
J. ACM,
vol. 12,
1965,
pp. 516524.
The idea of backtracking is introduced as a method to find one or all global
optimal of a function $φ(x sub 1, ..., x sub n)$.
The method is set off against hillclimbing, which starts with one set of
parameter values and improves the function value by small changes in some of
the parameters until no more improvement can be achieved; this method finds one
local optimum.
The method needs a threshold T such that if
$φ(x sub 1, ..., x sub k, ) < T$ it can be concluded that any further
solution with parameters $x sub 1, ..., x sub k$ will be suboptimal.
At that point all possibilities for $x sub (k+1), ..., x sub n$ can be
skipped, and backtracking can be initiated leading to other values of
$x sub k$.
The process is explained in detail, using the eight queens problem.
The next problem, commafree words, uses a different threshold.
As words are added to the solution set, a test is made if the words still obey
the commafreeness condition; if not the program backtracks.
Two optimizations are discussed: preclusion, in which certain values of some
$x sub i$ with $i > k$ are temporarily taken out of the game because they are
incompatible with $x sub 1, ..., x sub k$; the heuristic of ordering the
parameters such that $X sub 1 <= ... <= X sub n$, so smaller sets are
tested first.
A statistical version for very large problems (recovery from noisy
transmissions) is also explained.
The author emphasizes that the application of backtracking is not automatic,
and ingenuity is required.
Also, backtracking does not always take away the exponential sting of an
algorithm.
Fred Damerau,
A technique for computer detection and correction of spelling errors,
Communications of the ACM,
vol. 7,
#3,
March 1964,
pp. 171176.
Observes that 80% of all spelling errors in natural language text fall
in one of four classes:
wrong character,
missing character,
extraneous character, and
adjacent characters swapped.
The correcting spelling checker is based on this, as follows.
A word (both in the text and in the dictionary) is characterized by its
length and its character register, a machine word (bit array) the length
of the alphabet, in which bit N is on if the word contains character number
N from the alphabet.
These characterizations are used to quickly determine if a match is
possible (under the proposed error corrections) between a word W in
the text and a word D in the dictionary, to speed up the linear
dictionary search.
If the lengths of W and D differ by more than 1, or their character
registers differ in more than 1 bit, there can be no match.
Otherwise, W and D are compared character by character until a
nonmatching position is found (if none is found W is in the
dictionary).
Now,
if W  D = 1, one character in W is skipped and the rest must
match;
if W  D = 1, one character in D is skipped and the rest must
match;
and otherwise (W  D = 0) the next two characters in W are
swapped and the rest must match.
J. Weizenbaum,
Symmetric list processor,
Commun. ACM,
vol. 6,
#9,
Sept. 1963,
pp. 524544.
Algorithms for doublylinked lists.
Thomas N. Hibbard,
A simple sorting algorithm,
Journal of the ACM,
vol. 10,
1963,
pp. 142150.
Interesting stuff about shortest fixed sort sequences.
L.R. Ford, Jr.,
D.R. Fulkerson,
Flows in Networks,
Princeton Univ. Press,
Princeton, New Jersey,
1962,
pp. ???.
At p. 11: Maxflow mincut algortithm.
Douglas T. Ross,
A generalized technique for symbol manipulation and numerical calculation,
Comm. ACM,
vol. 4,
#3,
March 1961,
pp. 147150.
Invents the record by observing that many property lists in Lisp (the
only dynamic data structure known in 1960) are similar or even
equal, thus describing similar or equalstructured elements.
By putting the components of such an element consecutively in fixed
places, memory space and processor cycles can be saved.
To access a component of an element (in today's speech "a field in a
record") indexing instructions are used "reversed", as follows.
The normal index instruction specifies the (constant) address A of
the data, usually an array, and an index register R holding the
offset into the array; the memory location accessed is
mem[A+contents(R)].
In the reversed use, R holds the address of the element (=record),
and the constant A is the fixed offset of the element (=field) in
the record.
Since the instruction still accesses mem[A+contents(R)], it
accesses R→A.
The use of records is demonstrated by a very complicated flowchart
for the propositional calculus, the Wang Algorithm, which involves
extensive pointer manipulation.
Note: the null pointer is called END.
The author is deeply impressed by the fact that "the simple operation of
changing the index register to refer to the next element ... causes a
great number of changes of data to be made essentially instantaneously".
Makes you realize how much we take for granted today.
Charles R. Blair,
A program for correcting spelling errors,
Information and Control,
vol. 3,
1960,
pp. 6067.
The paper actually describes a test for the "similarity" of two words,
one from the text (W) and one from the dictionary (D), under the
assumption that W is spelling error (not typing error!); given W,
D is to be retrieved.
So "sieze" should be found similar to "seize", but "sieae" would not.
Both words are abbreviated to 4 characters, using a Soundexlike system
not based on pronunciation.
The abbreviation algorithm has two tables, one giving the probability
L that a given letter is a mistake (for example, 'A' will result from
spelling error more readily than 'Q'), the other giving the probability
P that the letter in a given position is a mistake (mistakes are less
likely at the start and end of a word than in the middle).
Now for each letter in W, compute $ L times P $, the probability that
the letter in its position is a mistake, and retain the 4 most certain
(least errorlikely) letters in their proper order.
This is the abbreviation of W.
In a test on 117 misspelled words, the algorithm corrected 89, rejected
26, and incorrectly accepted 2.
A. Newell,
J.C. Shaw,
H.A. Simon,
Empirical explorations of the logic theory machine,
in 1957 Western Joint Computer Conference,
???,
1957,
pp. 218230.
[ Found as an early reference to dags ]
