Memory Management

Literature references and annotations by Dick Grune, Last update: Wed Sep 16 22:37:24 2009.
These references and annotations were originally intended for personal use and are presented here only in the hope that they may be useful to others. There is no claim to completeness or even correctness. Each annotation represents my understanding of the text at the moment I wrote the annotation.
No guarantees given; comments and content criticism welcome.

* in Workshop on Memory Systems Performance (MSP 2002), ACM SIGPLAN Notices, 38, #2, Feb. 2003, pp. 1-107.

* in Int. Symposium on Memory Management (ISMM 2002), ACM SIGPLAN Notices, 38, #2, Feb. 2003, pp. 108-290.

* Yossi Levanoni, Erez Petrank, "An on-the-fly reference counting garbage collector for Java", in OOPSLA '01 Conference, ACM SIGPLAN Notices, 36, #11, Nov. 2001, pp. 367-380.
Memory management through reference counting is not obviously compatible with a multiprocessor environment, due to the high bandwidth needed to update the reference counts. The authors propose a number of techniques to allow reference counting on multiprocessors.
First, like Deutsch & Bobrow (CACM, Sept 1976) they defer updating non-local reference counts. This requires keeping the old reference counts as well. Second, they introduce a collection phase (not present in normal reference counting), during which the old and new reference counts of all threads are brought into agreement with each other, and during which unreferenced objects are detected and freed.
To avoid stopping the world, a sliding view of the world is used, in which internally consistent views of different treads at slightly different times are integrated. To allow this integration, all messages during the collection phase have to be recorded and made available.
All the algorithms involved are sketched in the paper, but for the full algorithms the reader is referred to a technical report.

* Timothy Harris, "Early storage reclamation in a tracing garbage collector", ACM SIGPLAN Notices, 34, #4, April 1999, pp. 46-53.

* "OOPSLA'97 Workshop on Garbage Collection and Memory Management", Peter Dickman, Paul R. Wilson, Oct 1997,

* Richard L. Hudson, Ron Morrison, Eliot B. Moss, David S. Munro, "Garbage collecting the world: One car at a time", ACM SIGPLAN Notices, 32, #10, Oct. 1997, pp. 162-175.

* Richard Jones, Rafael Lins, "Garbage Collection -- Algorithms for Automatic Dynamic Memory Management", John Wiley, Chichester, 1996, pp. 377.

* Colin Runciman, Niklas Röjemo, "Two-pass heap profiling -- A matter of life and death", in Implementation of Functional Languages, Lecture Notes in Computer Science #1268, ed. by Werner Kluge, Berlin, Springer-Verlag, 1996, pp. 222-232.

* José M. Piquer, "Indirect distributed garbage collection: handling object migration", ACM TOPLAS, 18, #5, Sept. 1996, pp. 615-647.

* Hans-J. Boehm, "Simple garbage-collection safety", in 1996 ACM SIGPLAN Conf. Programming Language Design and Implementation, ACM SIGPLAN Notices, 31, #5, May 1996, pp. 89-98.
Conservative garbage collection is threatened by some compiler optimizations which "hide" pointers by doing calculations on them. For example, suppose the array reference p[i-1000] is the last reference to the heap array pointer p; then the reference could conceivably be translated as p := p - 1000; p[i], hiding the pointer p from a conservative garbage collector. Although this is a non-problem in practice, the author shows ways (not at all so "simple" as the title promises) to guarantee full safety.

* David A. Barrett, Benjamin G. Zorn, "Garbage collection using a dynamic threatening boundary", in ACM SIGPLAN '95 Conf. Programming Language Design and Implementation, ACM SIGPLAN Notices, 30, #6, June 1995, pp. 301-314.

* William J. Schmidt, Kelvin D. Nilsen, "Performance of a hardware-assisted real-time garbage collector", in 6th Symposium on Architectural Support for Programming Languages and Operating Systems, ACM SIGPLAN Notices, 29, #11, pp. 76-85. Nov. 1994,

* Nandakumar Sankaran, "A bibliography on garbage collection and related topics", ACM SIGPLAN Notices, 29, #9, Sept. 1994, pp. 149-158.
[ 281 entries ]

* Henry G. Baker, "Minimizing reference count updating with deferred and anchored pointers for functional data structures", ACM SIGPLAN Notices, 29, #9, Sept. 1994, pp. 38-43.

* Benjamin Zorn, "The measured cost of conservative garbage collection", Software -- Practice & Experience, 23, #??, 1993, pp. 733-756.

* Hans-J. Boehm, "Space-efficient conservative garbage collection", in ACM SIGPLAN '93 Conf. PLDI, ACM SIGPLAN Notices, 28, #6, June 1993, pp. 197-206.
Several techniques to avoid false pointer identification are described. The main technique is blacklisting data values found during a scan that are identified to be prospective false pointers and to avoid allocating data on these positions. A prospective false pointer is for example a value that points into the heap at a position where nothing has been allocated. The conservative collector can determine this, but if it happens to allocate data at that position, it will have created a false pointer. A slow-down of less than 1 percent is reported.

* Henry G. Baker, "Infant mortality and generational garbage collection", ACM SIGPLAN Notices, 28, #4, April 1993, pp. 55-57.

* Reed Hastings, Bob Joyce, "Purify -- Fast detection of memory leaks and access errors", in Winter '92 USENIX Conference, pp. 125-136. 1992,
Description of the tool Purify. "Access errors are errors of commission, memory leaks are errors of omission".
Access errors are detected as follows. Purify maintains a two-bit state for each byte in memory; the states are: unallocated, uninitialized and initialized. It also instruments each memory access with code to test and / or update the state. Arrays are surrounded by "red" zones of bytes marked uninitialized, to catch out-of-bounds indexing.
Memory leaks are sniffed by calling a "garbage detector". The garbage detector reports the origin (routine of allocation) of each chunk it thinks is garbage. It is up to the user to confirm this and take action.
There is also an excellent critique of conservative garbage collection in Note 7, which points out, among other things, that the chance for having a false pointer nail down an piece of garbage is larger for larger pieces, but these are exactly the ones that we would like to remove.

* Daniel Edelson, "A mark-and-sweep collector for C++", in Nineteenth ACM SIGPLAN Symposium on Principles of Programming Languages, pp. 51-58. 1992,

* Emmanuel Chailloux, "A conservative garbage collector with ambiguous roots for static typechecking languages", in International Workshop on Memory Management, St. Malo, France, September 1992, Springer, Lecture Notes in Computer Science #637, pp. 218-229.

* Y. Bekkers, O. Ridoux, L. Ungaro, "Dynamic memory management for sequential logic programming languages", in International Workshop on Memory Management, St. Malo, France, September 1992, Springer, Lecture Notes in Computer Science #637, 1992, pp. 82-102.

* Paul R. Wilson, "Uniprocessor garbage collection techniques", International Workshop on Memory Management, St. Malo, France, September 1992, Springer, Lecture Notes in Computer Science #637, 1992, pp. 1-42.

* Hans-J. Boehm, David Chase, "A proposal for garbage-collector-safe C compilation", Journal C Language Translation, 4, #2, December 1992, pp. 126-141.

* Amer Diwan, Eliot Moss, Richard Hudson, "Compiler support for garbage collection in a statically typed language", ACM SIGPLAN Notices, in ACM SIGPLAN '92 Conf. on Prog. Lang. Design and Implementation, 27, July 1992, pp. 273-282.
An untidy pointer is a pointer that is instrumental in accessing data, but does not point properly to that data due to optimizations. The obvious example is using the base address A[0] for an array A[10:20]. Such pointers will be misinterpreted by the garbage collector. The paper points out optimizations which can give rise to untidy pointers, and suggests a number of remedies. The basic solution is to extend the templates for the garbage collector with the possibility of untidy pointers; simple expressions involving addition and/or subtraction of constants can be specified. This scheme has several problems, solutions for which are presented.

* Ralph E. Johnson, "Reducing the latency of a real-time garbage collector", ACM Letters on Programming Languages and Systems, 1, #1, March 1992, pp. 46-58.

* Henry G. Baker, "The treadmill: real-time garbage collection without motion sickness", ACM SIGPLAN Notices, 27, #3, pp. 66-70. March 1992,

* David Ungar, Frank Jackson, "An adaptive tenuring policy for generation scavengers", ACM TOPLAS, 14, #1, Jan. 1992, pp. 1-27.

* David L. Detlefs, "Concurrent garbage collection for C++", in Topics in Advanced Language Implementation, ed. by Peter Lee, Cambridge, Mass., MIT Press, 1991, pp. 101-134.
The two main questions a garbage collector has to answer are: "Given an object what pointers does it contain?" and "Given a pointer, where is the start and end of the object it points to?". Both are a problem in C++. The first can be "solved" by conservative garbage collection. The second is almost unsolvable due to the existence of interior pointers. On top of this, the author wanted a compacting concurrent garbage collector, which raise a host of new problems. Compacting means updating pointers, which cannot be done in a "conservative" way.
The paper first explains several solutions from the literature, including Bartlett's "mostly copying" algorithm, which hitherto had only been published in technical reports (but see Bartlett, "Compacting garbage collection with ambiguous roots", Lisp Pointers, Vol. 1, No. 6, 1988, pp. 3-12), and shows all of them wanting. His own garbage collector includes a new front end for the C++ code, which constructs type descriptions for all allocatable types, calls a different version of malloc for the "new" construct and makes various other adaptations. This provides complete knowledge of where the pointers are. To find the start of an allocated segment from an interior pointer, a bit map of memory is kept, one bit per allocatable position. The bit is on for the start positions of all allocated segments; by searching backwards, the start position can be found.
Compacting is done by a variant of Bartlett's mostly copying collector. Concurrency is introduced by using Baker's algorithm: copy the roots only, and then start a second process to do the rest of the copying. At the same time, watch all pointers the program is going to see, and if the point to from-space, move the corresponding object right now. Watching all pointers is implemented using the page lock primitives of the Mach operating system. Overlap percentages of 25 to 50 percent are achieved.

* Andrew W. Appel, "Garbage collection", in Topics in Advanced Language Implementation, ed. by Peter Lee, Cambridge, Mass., MIT Press, 1991, pp. 89-100.
A somewhat breathless summary of garbage collection techniques (the Schorr and Waite is algorithm explained in six lines!).

* Ravi Sharma, Mary Lou Soffa, "Parallel generational garbage collection", ACM SIGPLAN Notices, 26, #11, Nov. 1991, pp. 16-32.
Many references; Do excerpt.

* Jürgen Heymann, "A comprehensive analytical model for garbage collection algorithms", ACM SIGPLAN Notices, 26, #8, Aug. 1991, pp. 50-59.
Derives closed formulas for various properties of five garbage collection algorithms, using the entire Greek alphabet and then some. The five algorithms are: marks&scan (M&S), two-space copying (COPY), parallel M&S, parallel COPY, and reference count + parallel M&S. The justification of the various properties is argued in depth. In very broad terms, reference count + parallel M&S is "best".

* Benjamin Goldberg, "Tag-free garbage collection for strongly typed programming languages", in ACM SIGPLAN '91 Conf. Programming Language Design and Implementation, ACM SIGPLAN Notices, 26, #6, June 1991, pp. 165-176.
For every type (including the implicit types of activation records) garbage collection routines are generated. For monomorphically and polymorphically typed languages.

* Hans-J. Boehm, Alan J. Demers, Scott Shenker, "Mostly parallel garbage collection", in ACM SIGPLAN '91 Conf. Programming Language Design and Implementation, ACM SIGPLAN Notices, 26, #6, June 1991, pp. 157-164.

* Ulrich Neumerkel, "Speicherbereinigung für Prologsysteme", Thesis, TU Inst.f. Computersprachen E185/1 Argentinierstr.8/4, A-1040 Wien Austria, ~1990, pp. .
(ulrich@vip.UUCP) +431 58801/4477 fax +431 5057838

* A. Demers, M. Weiser, B. Hayes, H. Boehm, D. Bobrow, S. Shenker, "Combining generational and conservative garbage collection: Framework and implementations", in Seventeenth ACM Symposium on Principles of Programming Languages, 1990, pp. 261-269.

* E.P. Wentworth, "Pitfalls of conservative garbage collection", Software Practice & Experience, 20, #7, July 1990, pp. 719-727.
Points out that one falsely identified pointer can keep an extensive data structure from being freed, and shows that this can happen in practice, mainly in systems in which garbage is not created piecemeal but in large blobs. This happens regularly in functional languages, from which the author draws his examples.

* David R. Hanson, "Fast allocation and deallocation of memory based on object lifetimes", Softw. Pract. Exper., 20, #1, Jan. 1990, pp. 5-12.
All dynamic data is grouped into a small number of lifetimes, e.g. phase 1 to n of a compiler. For each lifetime, data is allocated from large chunks of memory which are freed simultaneously only after the lifetime is over. The algorithm is twice as fast as quick-fit and twice as slow as stack allocation. Includes code in ANSI C.

* Paul R. Wilson, Thomas G. Moher, "Design of the opportunistic garbage collector", in OOPSLA '89 Conference, ACM SIGPLAN Notices, 24, #10, Oct. 1989, pp. 23-35.

* R.P. Brent, "Efficient implementation of the first-fit strategy for dynamic storage allocation", ACM Trans. Prog. Lang. Syst., 11, #3, July 1989, pp. 388-403.
Naive implementation of first-fit for malloc() uses linear search to find the position of the first fit. There exist several techniques to improve upon this by imposing a tree structure over the free list, thus obtaining logaritmic cost; this is another one, with a lower constant factor. (Keeping separate free lists for exponentially increasing block sizes gives (almost) constant time, though.)
Memory is filled by free and occupied chuncks; a chunk consists of a one-word header (size + free/occupied bit) followed by the user block. On top of this a conceptual structure of a constant number equally sized "segments" is imposed. Segments have no representation in memory and the only important information about a segement is the position of the first chunk in it and the size of the largest free chunk that starts in it; since segements do not exist in memory, chunks can extend right through them. Segement addresses are kept in a priority heap with the one with the smallest largest free chunk on top. Now finding a chunk that is large enough is easy: search the priority heap for the first segment to contain a large enough chunk, and find that chunk by linear search from the first chunk in the segment.
The paper is not too reader-friendly: 1. The reason for having the segment structure is not explained. I suppose it serves to keep the memory requirements for the administration constant; keeping the free chunks themselves in a priority heap would make that heap proportional in size to the number of free chunks. 2. Both blocks and blocks + administration word are called blocks, which is mildly confusing. 3. A number of algorithms and technical terms are introduced by referring the reader to literature references, often exercises in Knuth's book.

* Benjamin Goldberg, "Generational reference counting: A reduced-communication distributed storage reclamation scheme", in ACM SIGPLAN '89 Conf. Programming Language Design and Implementation, ACM SIGPLAN Notices, 24, #7, July 1989, pp. 313-321.

* Andrew W. Appel, "Simple generational garbage collection and fast allocation", Softw. Pract. Exp., 19, #2, Feb. 1989, pp. 171-183.

* Michael Caplinger, "A memory allocator with garbage collection for C", in Winter 1988 USENIX Conf., pp. 325-330. Feb. 1988,
A conservative garbage collector for C. Uses a 32-bit magic number to (heuristically) mark the start of a chunk (called a segment here), to catch pointers to non-segments. Coexists with free().
All malloc calls done by the system before the program starts set the alien bit in the segment header. Such segments are not garbage-collected, due to possible uncouth operations during program start-up. The performance of the package is "not extremely good, but acceptable".

* D.J. Challab, "Elastic memory -- Buddy systems interface", SP&E, 18, #12, Dec. 1988, pp. 1139-1156.

* Charles B. Weinstock, William A. Wulf, "Quick Fit: an efficient algorithm for heap storage allocation", ACM SIGPLAN Notices, 23, #10, Oct. 1988, pp. 141-148.
A set of quick-lists is kept for blocks of popular sizes.

* Hans-Juergen Boehm, Mark Weiser, "Garbage collection in an uncooperative environment", Softw. Pract. Exper., 18, #9, pp. 807-820. Sept. 1988,
Introduces "conservative garbage collection": conservative garbage collection does not guarantee that it will collect all garbage, only that it will not collect all non-garbage. The authors point out that in principle all garbage collectors are conservative garbage collectors. Conservative garbage collection needs virtually no cooperation from the code. This has several advantages: 1. it can work with virtually any code; 2. it costs nothing when not used; 3. it requires no extra memory; 4. the cooperation from the code cannot be wrong. The only requirements are: 1. the code does not conceal pointers; 2. it must be easy to determine if a purported pointer points to an allocated chunk. The present version cannot handle pointers to inside a chunk.
Basically, conservative garbage collection is like normal non-compacting mark-and-scan garbage collection, with the following additions: 1. the allocator keeps a list of allocated chunks; 2. anything that looks like a pointer (is in range and points to an allocated chunk) is considered a pointer.

* K. Nilsen, "Garbage collection of strings and linked data structures in real time", SP&E, 18, #7, July 1988, pp. 613-640.

* A.W. Appel, J.R. Ellis, K. Li, "Real-time concurrent collection on stock multiprocessors", ACM SIGPLAN Notices, 23, #7, July 1988, pp. 11-20.
Using virtual memory traps to avoid tests in Baker's version of two-space copying garbage collection.

* Joel F. Bartlett, "Compacting garbage collection with ambiguous roots", Lisp Pointers, 1, #6, April-June 1988, pp. 3-12.

* A.W. Appel, "Garbage collection can be faster than stack allocation", Inf. Proc. Letters, 25, pp. 275-279. 1987,

* J. Dana Eckart, Richard J. LeBlanc, "Distributed garbage collection", in ACM SIGPLAN '87 Symposium on Interpreters and Interpretive Techniques, ACM SIGPLAN Notices, 22, #7, July 1987, pp. 264-273.

* Bernard Lang, Francis Dupont, "Incremental incrementally compacting garbage collection", in ACM SIGPLAN '87 Symposium on Interpreters and Interpretive Techniques, ACM SIGPLAN Notices, 22, #7, July 1987, pp. 253-263.
Starts by constructing an abstract garbage collection algorithm and then shows that mark & scan + compaction and two-space copying are actually different actualizations of the same algorithm.
The authors then construct a "mixed" algorithm, with a from-space, a to-space and an ms-space (for mark & scan). Each space may consist of several non-adjacent segments. During each garbage collection cycle one set of from-space, to-space and ms-space segments is treated. In addition the boundaries between the segments are movable. All this should result in very adaptive garbage collection. The algorithms are only described loosely (but probably adequately).

* Y. Bekkers, B. Canet, O. Ridoux, L. Ungaro, "MALI: A memory with a real-time garbage collector for implementing logic programming languages", in 3rd IEEE Symposium on Logic Programming, pp. ???-???. Sept. 1986,

* M. Rudalics, "Distributed copying garbage collection", in 1986 ACM Conf. on Lisp and Functional Programming, pp. 364-372. August 1986,

* M. Niwa, M. Yuhara, K. Hayashi, A. Hattori, "Garbage collection with area optimization for Facom ALPHA", in 31st IEEE Comp. Soc. Int Conf., Spring Compcon 86, ed. by A G. Bell, pp. 235-240. March 1986,

* D. Yun Yeh, Toshinori Munakata, "Dynamic initial allocation and local reallocation procedures for multiple stacks", Commun. ACM, 29, #2, pp. 134-141. Feb. 1986,
The authors propose two improvements to the bidirectional multiple-stack allocation algorithm of Korsh & Laison, Commun. ACM, 1983. The first improvement concerns the initial stack allocation. Traditionally, a static placement is chosen, before any element is stacked. This paper proposes to delay allocation for a stack until the first element is stacked. When the time arrives for a stack to be allocated, it is allocated to the back of an unpaired stack if there is one, or in the largest gap. An advantage is that the number of stacks need not be known in advance, in addition to a more flexible allocation.
The second improvement concerns reallocation after a collision. Traditionally, all stacks are reallocated. This paper proposes to try to restrict the number of stacks pushed around. Note that in the bidirectional structure four stacks are involved in a collision: the colliding stacks with their back-to-back partners. First an attempt is made to disengage the colliding stacks by shifting the pair that has the largest gap next to it. If this does not yield enough elbow room (according to some criterion), more stack pairs and gaps are involved in the shift, etc., until in the worst case all stacks have to be moved.
Simulation results are given, suggesting that both measures together 1. achieve a 15 to 20% reduction of the number of elements moved, 2. get more improvement as the stack behavior is more erratic, and 3. are more stable; all with respect to the Korsh-Laison algorithm.

* David G. Korn, Kiem-Phong Vo, "In search of a better Malloc", in 1985 Summer USENIX Conference, pp. 489-506. June 1985,

* Ashwin Ram, Janak H. Patel, "Parallel garbage collection without synchronization overhead", in 12th Annual Symposium on Computer Architecture, pp. 84-90. June 1985,

* M. Bruynooghe, "A note on garbage collection in Prolog interpreters", in Implementation of Prolog, Ellis Horwood Series in Artificial Intelligence, ed. by J.A. Campbell, John Wiley, New York, 1984, pp. 258-267.

* G. Bozman, W. Buco, T.P. Daly, W.H. Tetzlaff, "Analysis of free-storage algorithms -- Revisited", IBM Syst. J., 23, #1, 1984, pp. 44-64.

* D. Julian M. Davies, "Technical Correspondence on "A multiple-stack manipulation procedure"", Commun. ACM, 27, #11, Nov. 1984, pp. 1158.
Points the reader to Bobrow & Wegbreit, CACM, 1973.

* T. Hickey, J. Cohen, "Performance analysis of on-the-fly garbage collection", Commun. ACM, 27, #11, pp. 1143-1154. Nov. 1984,

* R.A. Brooks, "Trading data space to reduce time and code space in real-time garbage collection on stock hardware", in 1984 ACM Symposium on Lisp and Functional Programming, pp. 256-262. Aug. 1984,

* D.A. Moon, "Garbage collection in a large Lisp system", in 1984 ACM Symposium on Lisp and Functional Programming, pp. 235-246. Aug. 1984,

* Mordechai Ben-Ari, "Algorithms for on-the-fly garbage collection", ACM Trans. Prog. Lang. Syst., 6, #3, July 1984, pp. 333-344.
After having critisized Dijkstra's algorithm for being too difficult to prove correct, the author presents a simpler but less efficient algorithm using two colours, and proves its correctness (informally). From this, a more efficient algorithm using three colours is derived. Full algorithms are given in Ada; efficient forms of bit manipulation in hardware are also included.

* Thomas W. Christopher, "Reference count garbage collection", SP&E, 14, #6, June 1984, pp. 503-507.
The title of the paper is misleading: the reference count algorithm is not doctored to do full garbage collection, it is just complemented by a full mark and scan garbage collector in the traditional way. The novelty of the paper lies in the way the two are integrated such that the presence of the reference counts are used to obviate the need to know the stack. "How to do full garbage collection without knowing the stack" would be a more appropriate title.
Basically, the garbage collector works in ??? scans (aside from doing the normal frees on chunks that get a reference count of 0). First it goes through all pointers in all chunks not in the free list and decrements the reference count of the chunks they point to by one. Now all non-zero reference counts stem from references from outside the heap.
Next a scan finds all chunks with a non-zero reference count, and for each such chunk C performs a depth-first visit on all chunks reachable from it and sets their reference counts to epsilon (called RCNIL in the paper), a number that is not zero, but acts as zero when something is added. Now all non-garbage chunks have a reference count larger than zero.
The last scan reverses the effect of the first: it goes through all pointers in all chunks with a non-zero reference count and increments the reference count of the chunks they point to by one. Now all non-referenced circular structures have got reference counts of zero.
Some details of the scans and the implementation of the counters and the free list are given in narrative form.

* David M. Ungar, "Generation scavenging: A non-disruptive high performance storage reclamation algorithm", in ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, ACM SIGPLAN Notices, 19, #5, May 1984, pp. 157-167.
Mature, give them everything but the kitchen sink and then the kitchen sink on top approach to memory management, as implemented for Berkeley Smalltalk.
After rejecting four existing garbage collection techniques (reference counting for being too expensive and for not releasing cycles, deferred reference counting for not releasing cycles, mark & sweep for causing frequent pauses, and two-space copying), the authors combine scavenging for new object and off-line reclamation of old objects.
There are three areas for new objects, NewSpace where the new are allocated, PastSurvivorSpace where surviving new objects remain and a scratch area FutureSurvivorSpace. The scavenger moves live objects from NewSpace + PastSurvivorSpace to FutureSurvivorSpace and then swap the meaning of PastSurvivorSpace and FutureSurvivorSpace. Objects that survive several scavenges get tenure and are moved to the old objects area, where they are subject to off-line collecting (using good old mark & scan?). All this no doubt requires a lot of pointer updating.
Measuements are given and comparisons with the rejected algorithms are made. A four-bit pointer extension in the hardware is suggested to assist in garbage collecting.

* D. Ungar, "Generation scavenging", ACM SIGPLAN Notices, 19, #4, pp. 157-167. April 1984,

* A. Kaufman, "Tailored-list and recombination-delaying buddy systems", ACM Trans. Program. Lang. Syst., 6, #1, Jan. 1984, pp. l18-125.

* James F. Korsh, Gary Laison, "A multiple-stack manipulation procedure", Commun. ACM, 26, #11, Nov. 1983, pp. 921-923.
The authors propose a modification to Garwick's algorithm (BIT 4, 1964) for keeping a number of stacks in a contiguous chunk of memory. In Garwick's algorithm all stacks grow in the same direction, but this algorithm implements half the stacks as growing from low to high and the other half as growing from high to low. The end stacks are with their backs to the ends of memory, the left one growing right and the right one growing left. The other stacks are in between, in back-to-back pairs. This arrangement leaves larger gaps between the stack, and fewer hard boundaries, which is a good thing. So note that "bidirectional" in this case does not mean that the stacks can grow from both ends, as one might assume. It is probably up to the interface to hide this set-up, although this is not mentioned in the paper.
Simulation results are presented which show that the bidirectional technique roughly reduces the number of shifted bytes by half.

* C.J. Stephenson, "Fast fits -- New methods for dynamic storage allocation", in Ninth Symposium on Operating System Principles, ACM SIGOPS Review, 17, #5, pp. 30-32. May 1983,

* Jaques Cohen, Alexandru Nicolau, "Comparison of compaction algorithms for garbage collection", ACM TOPLAS, 5, #4, pp. 532-553. Oct. 1983,
The time efficiency of four garbage compacting algorithms is determined by deriving closed formulas for it. The algorithms were first coded in a special language and a processor was used to determine the a closed form for the time requirement; the program Macsyma was then used to simplify the formula. Memory requirements were derived directly from the algorithm.
The four garbage compacting algorithms are: Knuth's standard Lisp three-pass garbage collector, a "rolling table" compacter, Morris' compacter and Jonkers' compacter. Each is explained briefly but sufficiently.
The formulas are then applied to specific hardware, the PDP-10. The verdict is that Knuth's standard algorithm is fastest but requires space for one pointer, unlike the other three, and among those that need almost no additional memory (usually 1 or 2 bits) Jonkers' is fastest. Morris's is 3 to 8 times slower due to pointer manipulation, and table rolling can become non-linear.

* H. Liebermann, C. Hewitt, "A real-time garbage collector based on the lifetimes of objects", Commun. ACM, 26, #6, pp. 419-429. June 1983,

* Paul Hudak, Robert M. Keller, "Garbage collection and task deletion in distributed applicative processing systems", in 1982 ACM Symposium on Lisp and Functional Programming, pp. 168-178. 1982,

* Robert K. Dewar, Micha Sharir, Elia Weixelbaum, "Transformational derivation of a garbage collection algorithm", ACM Trans. Prog. Lang. Syst., 4, #4, Oct 1982, pp. 650-667.

* Johannes J. Martin, "An efficient garbage compaction algorithm", Commun. ACM, 25, #8, pp. 571-581. Aug. 1982,
Considerable pointer shuffling to speed up Morris' algorithm.

* J.L. Dawson, "Improved effectiveness from a real time Lisp garbage collector", in 1982 ACM Symposium on Lisp and Functional Programming, pp. 159-167. Aug. 1982,

* David R. Barach, David H. Taenzer, Robert E. Wells, "A technique for finding storage allocation errors in C-language programs", ACM SIGPLAN Notices, 17, #5, May 1982, pp. 16-23.
malloc() and free() are instrumented so as to write a small (6-byte) record of their actions to a file. A postprocessing program analyses this file to produce a report of leaked blocks and duplicate frees.
The report includes call stack listings for the offending actions, which necessitates compiler-dependent code in the instrumentation, to scan the stack. Since the authors use fopen() to open the record file, they have to switch off their recording mechanism during that open.

* David Spector, "Minimal overhead garbage collection of complex list structure", ACM SIGPLAN Notices, 17, #3, pp. 80-82. March 1982,
The pointer obtained when an object is created is marked the "defining reference" by setting one bit in it; whenever the pointer is copied, the defining-reference bit in the copy is cleared. Whenever a pointer is discarded, its defining-reference bit is checked and if it is set, the object pointed at is discarded after recursively discarding all the pointers in it.
This scheme only works if the defining reference is the last to be discarded; it is the programmer's job to see to this, so it's not really garbage collection.

* Jacques Cohen, "Garbage collection of linked data structures", Computing Surveys, 13, #3, 1981, pp. 341-367.
Survey of many algorithms known at that time, with short descriptions.

* B.J. Gerovac, "An implementation of new and dispose using boundary tags", Pascal News, 19, Sept. 1980, pp. 49-59.

* D.G. Bobrow, "Managing reentrant structures using reference counts", ACM TOPLAS, 2, #3, July 1980, pp. 269-273.
Although in principle reentrant structures cannot be handled with reference counts, large subclasses of them can, with some assistence of the programmer. The basic idea is to define special-purpose groups of nodees and to keep one reference count for the entire group, which counts the number of references pointing into the group from outside of it. The programmer defines what constitutes a group, and the system maintains the counter. When the counter reaches zero, the entire group can be deallocated.
Four examples of such groups are discussed, covering doubly-liked lists and trees, circular lists, etc.

* Y. Hibino, "A practical parallel garbage collection algorithm and its implementation", in ?th Ann. Symposium on Computer Architecture, Rennes (France), SIGARCH Newsletter, 8, #3, pp. 113-120. May 1980,

* J.B. Hext, "A storage management laboratory", Aust. Comput. Sci. Commun., 2, #1, Jan. 1980, pp. 185-193.

* D.P. Friedman, D.S. Wise, "Reference counting can manage the circular environments of mutual recursion", IPL, 8, #1, 1979, pp. 41-45.

* F.L. Morris, "On a comparison of garbage collection techniques", Commun. ACM, 22, #10, Oct. 1979, pp. 571.

* David S. Wise, "Morris' garbage compaction algorithm restores reference counts", ACM TOPLAS, 1, #1, July 1979, pp. 115-120.
[[ So concise as to be incomprehensible. ]]

* H.B.M. Jonkers, "A fast garbage compaction algorithm", Inform. Process. Lett., 9, #1, pp. 26-30. July 1979,
The algorithm consists of two scans, both from left to right.
The first scan does not move blocks. When arriving at a block B, the invariant holds that pointers to B from blocks left of B have been threaded into a thread starting at B; these can now be updated to point to the final location of B. Next, pointers from B are threaded to their destinations. Note that all pointer locations in B still contain genuine pointers, not thread pointers.
The second scan moves the blocks. When arriving at a block B, the invariant holds that all blocks to the left of B have been moved to their final positions and contain updated pointers only, and pointers to B from B and blocks to the right of it have been threaded into a thread starting at B; these can now be updated to point to the final location of B.
The very clear paper contains an informal description of the algorithm, informal code, formal code in Algol 68 and two page-size pictures, one showing the threading symbolically (abstracting from the exact mechanism) and one showing the actual threading.

* Edsger W. Dijkstra, Leslie Lamport, "On-the-fly garbage collection: an exercise in cooperation", Commun. ACM, 21, #11, Nov. 1978, pp. 966-975.

* F. Lockwood Morris, "A time- and space-efficient garbage compaction algorithm", Commun. ACM, 21, #8, pp. 662-665. Aug. 1978,
Vivid description of a compaction algorithm that requires only one bit per pointer. To simplify updating the pointer after (or during) the move phase, for each chunck X the pointers to it are chained in a chain starting at X and terminated by the contents of the first word in X which was pushed out by the new pointer to the start of the chain; one bit is needed to signal the end of the chain. Then the chunks can be moved and for each chunk its address, now known, can be patched back into the pointer locations accessible through the chain. Chaining the pointers in the first place is the hard part, and the author shows that it can be done in two sweeps, one in the direction of the compaction and a second against it. The latter can then be combined with the move phase and the update phase, to result in an efficient algorithm.
Full code is supplied, with an informal proof of correctness.

* Henry G. Baker Jr., "List processing in real time on a serial computer", Commun. ACM, 21, #4, pp. 280-294. April 1978,
[ concurent garbage collection for a LISP system ]

* J.P. Fitch, A.C. Norman, "A note on compacting garbage collection", Computer J., 21, #1, pp. 31-34. Feb. 1978,

* J.E. Shore, "Anomalous behavior of the fifty-percent rule in dynamic memory allocation", Commun. ACM, 20, #11, Nov. 1977, pp. 812-820.

* H.T. Kung, S.W. Song, "An efficient garbage collection system and its correctness proof", in IEEE 18th Symposium on Foundations of Computer Science, pp. 120-131. Oct.-Nov. 1977,

* D.S. Wise, D.P. Friedman, "The one-bit reference count", BIT, 17, #3, Sept. 1977, pp. 351-359.
The one-bit reference count of a chunk has two values: 0 = unknown / overflow, 1 = there is exactly one reference to this chunk. Chunks start with a reference count of 1. Adding a reference sets the reference count to overflow, deleting the reference frees the chunk. Unreachable chunks with a reference count = overflow have to wait until a real garbage collector comes along. (Seems extendible to reference counts of any size - DG.) The rest of the paper (the larger part) describes a caching mechanism for nodes whose reference count has only temporarily overflowed.

* J.M. Robson, "Worst case fragmentation of first fit and best fit storage allocation strategies", Comput. J., 20, #3, Aug. 1977, pp. 242-244.

* Jeffrey M. Barth, "Shifting garbage collection overhead to compile time", Commun. ACM, 20, #7, July 1977, pp. 513-518.
Extensive data flow analysis to catch situations in which allocation is more or less immediately followed by becoming inaccessible ("cancelling transactions"). Also detects situations in which bunch allocations can be done.

* J.L. Peterson, T.A. Norman, "Buddy systems", Commun. ACM, 20, #6, June 1977, pp. 421-431.

* D.R. Hanson, "Storage management for an implementation of SNOBOL4", Software-Practice and Experience, 7, #2, pp. 179-192. March 1977,

* C. Bays, "A comparison of next-fit, first-fit and best-fit", Commun. ACM, 20, #3, March 1977, pp. 191-192.

* L.E. Thorelli, "A fast compactifying garbage collector", BIT, 16, #4, pp. 426-441. 1976,

* David S. Wise, Dan C. Watson, "Tuning Garwick's algorithm for repacking sequential storage", BIT, 16, 1976, pp. 442-450.
Garwick's algorithm (BIT 4, 1964) concerns the maintenance of a sequence of n stacks or extensible arrays in a consecutive memory segment: when one stack bumps into the next as a result of a PUSH or EXTEND operation, some (or in Garwick's algorithm all) stacks have to be reshifted to "well-spaced" positions. To compute the new positions, Garwick's algorithm remembers the old sizes of the stacks, so it can compute their respective growths. Now a fraction a of the available memory is distributed as growth space equally over the n stacks, and a fraction 1-a is distributed in proportion to the stacks' latest growths. Garwick recommends a to be set to about 0.1.
The authors propose to adjust a dynamically, as follows. When reallocation takes place, they compute the a that would have avoided this reallocation, had that been the value of a the last time, and set a to that value. They test this heuristic by simulation under three scripts: 1. Stacks are pushed on at random. Result: the value of a soon reaches 0.8 and then fluctuates around that value; prediction has no value here. 2. One stack is chosen and extended until reallocation takes place, and then a new stack is chosen at random, etc. Result: the value of a slowly climbs to 1.0; again prediction has no value. 3. Random stacks are pushed on in proportion to their sequence number, i.e. stack number i grows about i times as fast as stack 1. Result: the value of a stabilizes at about 0.15, even if it is started at 0.9; prediction is effective here.
So it works.

* L.Peter Deutsch, Daniel G. Bobrow, "An efficient, incremental, automatic gargbage collector", Commun. ACM, 19, #9, Sept. 1976, pp. 522-526.
The authors consider a two-level memory: a cheaply accessible region called the "variables" and a more expensive region called the heap. Whether these regions are indeed on different hardware (main memory and backing store) or just the working set versus pages out on disk is immaterial.
Reference counts are kept for all garbage-collectible data, but with four considerable differences: 1. they count references from the heap only; 2. they are kept up to date only for those in the variables; 3. they are not kept in the data itself but rather in hash tables; 4. reference counts of exactly one (which are the most frequent) are not recorded. This greatly increases efficiency (no more updating of counters in secondary memory), but no longer allows freeing data as soon as its reference count drops to zero (it may still be referenced from the variables).
To remedy this, two (hash) tables are kept: the ZCT (zero count) table, which holds the addresses of data not pointed to by data on the heap, and an MRT (multi-reference table) which holds the addresses and reference counts of data pointed to by more than one reference from the heap; since most (90 tot 98%) of the accessible data carry a reference count of 1, these tables can be small.
When creating a data item, its address is entered into the ZCT table. Storing pointers to it in variables carries no cost. When storing a pointer into heap data, 1. if the pointer is in the ZCT table, delete it from that table (reference count goes from 0 to 1); 2. if it is in the MRT table, increment its reference count there; 3. if it is in neither, enter it into the MRT with a count of 2. Similar obvious rules apply to the deletion of a pointer situated in the heap. Again, pointers situated in the variables are not tracked.
When a memory allocation request fails or the ZCT table overflows, reclaiming takes place, as follows. The variables are scanned, and any ZCT entry holding a pointer accessible from the variables is marked "used by variables but not by the heap"; now all data pointed to by unmarked entries in the ZCT table can be freed, and the corresponding entries in the ZCT table can be cleared; finally the marks are removed. Freeing the heap data can again affect the ZCT and the MRT, so, if needed or desired, reclaiming may be repeated.
Reference-counting garbage collectors require a full garbage collector to reclaim cyclic structures. The paper describes a two-space copying garbage collector based on the ZCT and MRT tables, and a generational one. Furthermore it suggests the use of a separate processor to concurrently handle the ZCT and MRT table handling and memory reclamation. Most descriptions in the paper are rather sketchy, and require considerable interpretation.

* P.L. Wadler, "Analysis of an algorithm for real time garbage collection", Commun. ACM, 19, #9, pp. 491-500. Sept. 1976,

* Guy L. Steele Jr., "Multiprocessing compactifying garbage collection", Commun. ACM, 18, #9, pp. 495-508. Sept. 1975,

* J.E. Shore, "On the external storage fragmentation produced by first-fit and best-fit allocation strategies", Commun. ACM, 18, #8, pp. 433-440. Aug. 1975,

* D.G. Bobrow, "A note on hash linking", Commun. ACM, 18, #7, pp. 413-415. July 1975,
Normally a pointer leads to a memory location and gives access to the contents of that location and of possibly surrounding memory locations. The technique described here allows additional data to be associated with arbitrary pointers and memory locations, at a price. The basic idea is very simple: put the additional information in a hash table with the pointer as the key, and look it up.
The problem is of course to know that additional information is associated with a pointer and should be looked up. The answer is: mark the location under the pointer as "hash-linked". If that cannot be done, it gets more expensive, but the scheme can still be useful (see below); but for the moment we assume it can.
This scheme has funny applications. First of all, it can be applied to itself, as follows. When a new pointer is entered into the hash table and there is a collision, the hash table entry found is marked hash-linked, and the new entry is entered as the additional information into a second hash table with the pointer as the key. This means that there will never be a collision during look-up in the first hash table, which in turn means that this table does not have to store the keys; this is a considerable memory saver.
The paper continues by giving a number of applications. Some examples: 1. Large arrays of mostly small numbers can be allocated with 1 or 2 bytes to a number; the occasional larger value is stored in the hash table, with its location marked by maxint. 2. Occasional text insertions or deletions in very long strings can be represented by storing the modifications in the hash table and replacing their positions by marker bytes. 3. A recursive visit to all nodes of a possibly cyclic graph (called a "reentrant structure" here) in which there is no room in the nodes to store marking bits can be performed by storing the pointer to each node in a hash table as soon as it is met, and check the hash table before following a pointer. (This seems to me to be a different algorithm: here we do not know in advance that the address is there -- its presence in or absence from the hash table is the marking bit. DG.)
All these techniques are useful when memory is more important than CPU cycles.

* D.A. Fisher, "Bounded workspace garbage collection in an address order preserving list processing envirronment", Inf. Process. Lett., 3, #1, July 1974, pp. 29-32.

* S. Arnborg, "Storage administration in virtual memory Simula system", BIT, 12, 1972, pp. 125-141.

* B. Wegbreit, "A generalized compactifying garbage collector", Computer J., 15, #3, pp. 204-208. August 1972,

* D.G. Bobrow, "Requirements for advanced programming systems for list processing", Commun. ACM, 15, #7, Nov. 1972, pp. 618-627.

* P. Branquart, J. Lewi, "A scheme of storage allocation and garbage collection for Algol 68", in Algol 68 Implementation, ed. by J.E.L Peck, Amsterdam, North Holland, 1971, pp. 199-232.

* C.J. Cheney, "A non-recursive list compacting algorithm", Commun. ACM, 13, #11, Nov. 1970, pp. 677-678.
[ this is about two-space copying garbage collection ]

* R.R. Fenichel, J.C. Yochelson, "A LISP garbage collector for virtual-memory computer systems", Commun. ACM, 12, #11, pp. 611-612. November 1969,

* W.J. Hansen, "Compact list representation: definition, garbage collection and system implementation", Commun. ACM, 12, #9, pp. 499-507. September 1969,

* B.K. Haddon, W.M. Waite, "A compaction procedure for variable-length storage elements", Computer J., 10, pp. 162-165. August 1967,

* H. Schorr, W.M. Waite, "An efficient machine-independent procedure for garbage collection in various list structures", Commun. ACM, 10, #8, pp. 501-506. August 1967,

* D.G. Bobrow, D.L. Murphy, "Structure of a LISP system using two-level storage", Commun. ACM, 10, #3, pp. 155-159. March 1967,

* D.G. Bobrow, "Storage management in LISP", in IFIP Working Conf. on Symbol Manipulation Languages and Techniques, ed. by D.G. Bobrow, pp. 291-301. September 1966,

* L.A. Belady, "A study of replacement algorithms for a virtual storage computer", IBM Systems Journal, 5, #2, 1966, pp. 78-101.

* K.C. Knowlton, "A fast storage allocator", Commun. ACM, 8, #10, Oct. 1965, pp. 623-625.

* Jan V. Garwick, "Data storage in compilers", BIT, 4, 1964, pp. 137-140.
Describes a procedure in Algol 60 to emulate various dynamically sized arrays in one fixed array. The arrays are named by numbers, so the store procedure is defined as store(val): at (i): in array (a); int val, i, a;.

* George E. Collins, "A method for overlapping and erasure of lists", Commun. ACM, 3, #12, pp. 655-657. Dec. 1960,
Newell, Shaw and Simon [1957] have shown how to make lists share common data (creating dags in our terminology), and the author points out that this causes difficulties when a list is freed ("erased"). The problem is solved by introducing reference counts, unknown in 1960, but with a number of built-in optimizations.
An implementation is described for the IBM 704, 709 and 7090. Thes machines have 36 bits words and 15 bits addresses. Each word accessible by a pointer (a "location word") contains a 21 bits descriptor and a 15 bits address of the next link. The descriptor contains 2 type bits and a 15 bits segment (plus 4 unused bits). For shared data, these 15 bits contain a pointer to reference-counted object; for non-shared data that will not fit in 15 bits, the 15 bits contain a pointer to the data, and for non-shared data that fits in 15 bits, the 15 bits contain the data. When non-shared data gerts shared, it is endowed with a reference count; when the reference count drops to 1, the data format reverts to non-shared. So the reference count is always greater than 1.