Mathematics, Physics, etc.

> Literature references and annotations by Dick Grune,
Last update: Thu Apr 27 15:47:33 2023.

These references and annotations were originally intended for personal use and are presented here only in the hope that they may be useful to others. There is no claim to completeness or even correctness. Each annotation represents my understanding of the text at the moment I wrote the annotation.
> No guarantees given; comments and content criticism welcome.

* Colignatus, Thomas, A Measure of Association (Correlation) in Nominal Data (Contingency Tables), Using Determinants, 2007, pp. 27.

Exploratory paper on the possibility of obtaining a methodologically sound correlation coefficient for contingency tables indexed by nominal values.
[DG: My last look at statistics was in 1966, hence this small refresher.]
A contingency table is a matrix M the columns of which are labeled with attributes A1..Ak from one category A and the rows with attributes B1..Bk from another category B. Each element Mi,j of the matrix contains the number of samples (the "frequency") that have been observed that have both attribute Ai and attribute Bj. A standard example has "gender" as category A, with attributes "male" and "female", and "political affiliation" as category B, with attributes "Democrat" and "Republican". The correlation coefficient should then express the degree of correlation between gender and political affiliation. (In the original design the correlation coefficient was supposed to express to what extent political affiliation was "contingent" upon gender, but this interpretation implies causation, which the table cannot provide.)
     Correlation coefficients are traditionally defined for sets of pairs of numeric values (xi, yi); x could be soil acidity and y crop yield per square meter. A correlation coefficient of +1 means that xi and yi grow and shrink in lock-step, 0 means that they are independent, −1 means that if one is larger, the other is smaller, etc. No values outside [−1..+1] occur.
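     As an illustration of the numeric case, here is a minimal Python sketch of the usual (Pearson) correlation coefficient. The annotation does not say which formula is meant, so the standard textbook one is assumed; the function name `pearson` and the sample numbers are made up for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of the paired samples (xs[i], ys[i]).

    Assumes both lists have the same length and neither is constant
    (a constant list would give a zero denominator).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of the deviations from the two averages ...
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ... normalized by the sums of the squared deviations.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

acidity = [1.0, 2.0, 3.0, 4.0]                  # made-up x values
print(pearson(acidity, [2.0, 4.0, 6.0, 8.0]))   # y grows in lock-step with x: prints 1.0
print(pearson(acidity, [8.0, 6.0, 4.0, 2.0]))   # y shrinks as x grows: prints -1.0
```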
     This does not apply directly to contingency tables: there is no set of pairs of numerical values but rather a frequency table indexed by nominal values. Still, intuitively both types of data seem to contain similar information: the degree of interdependence of two sets of data, numeric for vectors (xi and yi), and nominal for contingency tables (the labels on the rows and columns).
     A major obstacle to grafting the techniques for vectors to the tables is that the numeric values in the vectors can be manipulated algebraically, allowing ordering, taking the average, etc., whereas all that is impossible for nominal values. An inroad can be made by considering the basic idea on which correlation coefficient calculation for vectors is based: comparing the observed situation with what would be expected if no correlation existed. This comparison leads to the xi − x-bar forms in statistics (x-bar is the expected value of the stochastic variable x). For lack of better information the expected value is estimated as the average of all xi. Summing the squares of these deviations from the average yields, after some adjustments, the χ² measure. Unlike the correlation coefficient above, a χ² value is always >= 0; if 0, there is a complete correlation between the data, and the higher the χ² value the more independent the data are.
     Along the same lines the expected frequencies in a contingency table can be computed from the averages of the rows and columns in which each frequency resides, and a χ² measure can be constructed. Its normalized form (scaled to a value between 0 and 1) is known as "Cramér's V". It is, however, methodologically unsound because it uses the numerical average of sets of frequencies, but these sets (rows and columns of the matrix) have no probability distribution. It is this problem that this paper is trying to solve/evade.
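     A minimal Python sketch of this computation follows. It assumes the standard textbook definitions, which the annotation does not spell out: the expected frequency of a cell is (row total × column total) / grand total, χ² sums (observed − expected)² / expected over all cells, and Cramér's V is sqrt(χ² / (N · (min(rows, columns) − 1))). The function name `cramers_v` and the tables are made up for the example.

```python
import math

def cramers_v(table):
    """Cramér's V of a contingency table given as a list of rows.

    Assumes at least two rows and two columns, and that no row or
    column sums to zero (otherwise an expected frequency would be 0).
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected from the row and column totals alone.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (total * (k - 1)))

# Made-up 2x2 table: columns = two genders, rows = two parties.
print(cramers_v([[30, 20], [20, 30]]))   # about 0.2
print(cramers_v([[10, 0], [0, 10]]))     # one non-zero entry per row and column
print(cramers_v([[5, 5], [5, 5]]))       # all frequencies equal
```

The last two tables give 1 and 0 respectively, the two ends of the normalized scale.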
     It is a small step to view a contingency table as an m×n matrix. If the matrix is square (n×n), it has a determinant. Each row in the matrix defines a point in an n-dimensional space, plus a vector from the origin of the n-dimensional space to that point. These n vectors span a geometric solid in the shape of an n-dimensional rhomboid (parallelepiped). This rhomboid lives in an n-dimensional box the sides of which have lengths equal to the sums of the n columns of the matrix, and its volume is equal to the absolute value of the determinant of the matrix. So far the vector algebra; now for the statistics.
     If the rows in the matrix are closely related the vectors they define are all pointing roughly in the same direction, and the volume of their rhomboid is small, but when they are strongly independent, their vectors are more perpendicular to each other, and the volume they span fills almost the entire n-dimensional box. So the fraction of the box's volume that is occupied by the rhomboid spanned by the vectors is indicative of the independence of the data in the contingency matrix!
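     The geometric description above can be sketched in Python as follows. This follows the wording of this annotation only (absolute determinant divided by the product of the column sums); the paper's exact normalization may differ, so treat it as an assumption. The helper names `determinant` and `volume_ratio` and the tables are made up for the example; exact rational arithmetic is used to avoid rounding in the elimination.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant of a square matrix by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # singular matrix: zero volume
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def volume_ratio(table):
    """|det| of a square table divided by the volume of its enclosing box.

    Assumption (from the annotation, not checked against the paper):
    the box sides are the column sums, all of which must be non-zero.
    """
    box = Fraction(1)
    for col in zip(*table):
        box *= sum(col)
    return float(abs(determinant(table)) / box)

print(volume_ratio([[3, 0], [0, 7]]))        # one non-zero entry per row and column
print(volume_ratio([[5, 5], [5, 5]]))        # all frequencies equal
print(volume_ratio([[30, 20], [20, 30]]))    # made-up table
print(volume_ratio([[300, 200], [200, 300]]))  # same table, scaled by 10
```

The first two tables give 1 and 0; the last two give the same value, since scaling all entries by a factor c multiplies both the determinant and the box volume by c^n.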
     Although this Volume Ratio as a measure of association in contingency tables is not based on any statistical model, it has several properties that suggest that it is not a completely crazy choice:
• The Volume Ratio is automatically normalized to lie between 0 (full association) and 1 (full independence). Cramér's V requires a somewhat heuristic normalization.
• The Volume Ratio for (2×2) tables is identical to Cramér's V.
• If there is only one non-zero frequency in each row and column, each row is specifically associated with one column, and vice versa. Such a matrix is orthogonal, so the Volume Ratio is 1 (as with Cramér's V): total non-association, which means total independence.
• If all frequencies are equal, each row is equally associated with each column. The Volume Ratio method and Cramér's V both yield 0, confirming total association.
• The Volume Ratio does not change when two rows or columns are swapped (as with Cramér's V).
• The Volume Ratio is independent of scale: it does not change when all table entries are multiplied by the same factor (as with Cramér's V).
But other phenomena are harder to explain:
• For a table filled with random numbers from the range [1..999], the Volume Ratio approaches 0 (0.00018... for 6×6, 0.000000... for 12×12), implying total association, whereas Cramér's V still shows considerable independence (0.26... for 6×6, 0.16... for 12×12). [DG: own observation.]
     Problems arise when the contingency table is not square; in that case it does not have a determinant, and the above method cannot be applied directly. This is solved by considering square sub-matrices and their relationships, as explained in the rest of the text of the paper. The text proper is followed by 8 appendices with notes, detailed examples, explanations, and considerations.

[DG: I applied both methods to the gender-class correlation tables from two Papuan languages, Burmeso (non-TNG), described by M. Donohue, and Mian (TNG), described by S. Fedden and G.G. Corbett. Background: When natural languages classify nouns into groups they usually do this on the basis of gender: French le soleil, la lune. Other languages classify according to size, shape, etc.: Swahili mtoto child, kitoto little baby. Very few languages use both systems simultaneously, with each noun having both a gender and a class. Using gender to label the columns and class for the rows, the number of nouns having gender i and class j can be put in element [i, j] of the gender-class correlation table. The correlation coefficient would then indicate how independent the notions of gender and class are in the language.
For the languages Burmeso and Mian the obtained values are

    Burmeso (6×6): Cramér's V = 0.488549; value_free = 0.600000; Volume Ratio = 0.000040
    Mian (4×6):    Cramér's V = 0.726651; value_free = 0.833333; Volume Ratio = ---
Indeed the Mian table looks more orthogonal than the one for Burmeso, in accordance with the above values. The measure used in those papers, however, is the ratio of the number of empty entries to the maximum number of empty entries a non-trivial table of the given dimensions can have. This is a value-free measure, because it does not depend on the actual values in the table, only on their presence. The values from the value-free method are also given above and are seen to agree reasonably well with Cramér's V, thereby validating the value-free method somewhat. Also, its computation does not require a computer.]
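The value-free measure described in the note above can be sketched in a few lines of Python. The note does not define "non-trivial"; the sketch assumes it means that every row and every column holds at least one non-zero entry, so that at most rows × columns − max(rows, columns) entries can be empty. Under that assumption the denominators would be 30 for a 6×6 table and 18 for a 4×6 table, which is at least compatible with the values 0.600000 and 0.833333 quoted above. The function name `value_free` and the tables are made up for the example; they are not the Burmeso or Mian data.

```python
def value_free(table):
    """Ratio of empty entries to the maximum possible number of empty entries.

    Assumption: a non-trivial table has at least one non-zero entry in
    every row and every column, so at least max(rows, cols) entries are
    filled and at most rows * cols - max(rows, cols) can be empty.
    """
    rows, cols = len(table), len(table[0])
    empty = sum(1 for row in table for cell in row if cell == 0)
    max_empty = rows * cols - max(rows, cols)
    return empty / max_empty

# Made-up tables, for illustration only.
print(value_free([[4, 0, 0], [0, 7, 0], [0, 0, 2]]))   # as empty as possible: prints 1.0
print(value_free([[4, 1, 3], [2, 7, 5], [6, 8, 2]]))   # no empty entries: prints 0.0
print(value_free([[4, 0, 1], [0, 7, 0]]))              # 2x3 table, 3 of at most 3 empty: prints 1.0
```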

* Mark Ronan, Symmetry and the Monster, Oxford University Press, Oxford, 2006, pp. 255.
This is a book about the history of research into group theory and the discovery of the "Monster", not a book about the Monster itself. The math has been simplified beyond recognition, and even after reading up on the subject on Wikipedia and with a PhD in computer science, I could not make head or tail of it.
     The first problem is that the author does not make clear what he means by "a symmetry". We learn that the "zillions of symmetries" of Rubik's cube are "generated by 90 degree turns", which in the lines above are compared to "symmetry operators". This suggests that the 24 turns (4 on each of the 6 sides) are the operators and that the positions that can be achieved are the symmetries. But operators in a (mathematical) group have the property that the combination of two operators is again an operator in that group, so any configuration can be achieved with a single (compound) operator. So are all these operators "symmetries"? I find it confusing. Symmetries are also explained as permutations, but the relationship remains vague.
     A second problem is that the level of explanation is very uneven: the root sign is explained, but the j-function is written out without any explanation.
     We learn a lot about the people around the Monster but next to nothing about the Monster itself, except that it is 196,884-dimensional, but that's already on the cover. Does it have a geometric representation, like a cube? Or is it just a network of symbols? (Does a network of symbols have symmetries?) If it can be geometric, it must have sides. Are all sides the same length like in a cube or a dodecahedron? How big is it if the length of the shortest side is 1 unit? Answers to such questions would have made the Monster much more accessible.
     Perhaps the subject is too complicated to allow a popularized treatment, in which case sticking to just the history is OK. But it would have been nice to see an example or two of representatives of the simpler symmetry groups. Some examples are given, but they are not assigned to groups. And it would have been nice to be told what position in the periodic table of symmetries is occupied by Rubik's cube, probably the most complicated symmetric object any of us can relate to.

* Marcus Du Sautoy, The Music of the Primes: Why an Unsolved Problem in Mathematics Matters, HarperCollins, 2003, pp. 335.
Mostly about the people involved in attacks on the Riemann hypothesis, and indeed supplying interesting biographies of them. The application of primes in cryptography is emphasized, justifying the second half of the title. The math is exceptionally shallow; modular arithmetic is called "clock arithmetic".

* Julian Havil, Gamma -- Exploring Euler's Constant, Princeton Science Library, Princeton, 2003, pp. 266.
"Fun with Series" would probably be a better title, but within that realm the book indeed focuses on γ, the Gamma function, the harmonic series, etc., in 14 chapters. The book closes with two chapters on the distribution of primes and the Riemann zeta function. Two appendices, about Taylor expansions and Complex Function Theory, provide handy refresher courses on the subjects.
     Most chapters start in low gear but soon speed up; not all explanations are as clear as I'd hoped. The material is covered in quite reasonable depth, the most difficult results sketched only.

* Irving M. Copi, Carl Cohen, Introduction to Logic, Prentice Hall, Upper Saddle River, NJ, 1998, pp. 714.
Thorough, interesting, readable, good.

* Samuel D. Guttenplan, The Language of Logic, Basil Blackwell, Oxford, UK, 1987, pp. 336.
Pleasant introduction.

* William H. Press, Brian P. Flannery, Saul A. Teukolsky, William T. Vetterling, Numerical Recipes -- The Art of Scientific Computing, Cambridge Univ. Press, Cambridge, England, 1986, pp. 818.
A much more amusing and easy-going account than one would expect, given the subject. Chapters on: linear algebraic equations, interpolation and extrapolation, integration of functions, evaluation of functions, special functions (Gamma, Bessel, Jacobi, etc.), random numbers, sorting(!), root finding and non-linear sets of equations, minimization or maximization of functions, eigensystems, Fourier transform spectral methods, statistical description of data, modeling of data, integration of ordinary differential equations, two-point boundary-value problems, and partial differential equations.
With programs and program diskettes in Fortran and Pascal.

* H. M. Edwards, Riemann's Zeta Function, Dover, Mineola, NY., 1974, pp. 315.
Of considerable depth. The first chapter explains Riemann's famous 1859 paper "On the Number of Primes Below a Given Size", and the subsequent 11 chapters cover many famous papers and theorems based on Riemann's paper. Requires serious study.