0 = he(ℒ2 ∩ ℒ2): A Most Wonderful Theorem

Dick Grune

What's this about? (very loosely explained...)

Text in a language is a restricted sequence of words. It has to obey syntax rules, telling which kind of word may go where: "into the garden" is OK, "the into garden" is not. Once a text obeys all syntax rules of a language, it still needs to obey context conditions, restricting the text further: "William saw the old house at the end of the driveway. It brought back fond memories of his childhood." is OK, but "William saw the old house at the end of the driveway. It brought back fond memories of her childhood." is not. Syntax is relatively easy to describe and check. Context conditions can be very complicated and the "end point" of a context condition (the point where "William" and "her" clash) can be an indeterminate distance away.

To formalize these levels of restriction, Chomsky defined a hierarchy of grammar types. Type 2 grammars (also called "context-free" grammars) define syntax only. Type 1 grammars (also called "context-sensitive" grammars) define context conditions that can be checked. Type 0 grammars (also called "phrase-structure" grammars) define context conditions that can only be confirmed if true; if the Type 0 context condition is violated, checking may never give an answer. Type 0 grammars describe the most general type of texts that can still be checked for internal consistency. Since all text in any language can be checked by people, Type 0 grammars must in some sense be very important.

Type 2 grammars are relatively easy to understand, and are used extensively in the definition of programming languages, but Type 1 and Type 0 grammars, in spite of their importance, are so difficult to understand that they are not used. Given the enormous gap in difficulty between Type 2 and Type 0 grammars it is surprising to say the least that the effect of a given Type 0 grammar G can be achieved by using two Type 2 grammars and some simple tricks, as follows.

From Type 0 grammar G, make two Type 2 grammars, S1 and S2 (for Syntax 1 and Syntax 2, resp.). Now make a text that obeys both S1 and S2 (Trick 1) and erase in the text a special set of unusual letters that do not occur in G (but do occur in S1 and S2) (Trick 2). The resulting text will now obey the original Type 0 grammar G! The Theorem given below tells us exactly how to do all this.

Granted, this only tells us how to make a context-correct text, not how to check an arbitrary text for context correctness; but it is a beginning. Also, the grammars S1 and S2 constructed according to the recipe from the Theorem turn out to be pretty horrible. Fortunately, in practice it turns out that, for almost all kinds of texts one wants to describe, there are much nicer grammars S1 and S2 that will also do the job.

A less loose explanation, a proof of the Theorem, and all the details and considerations are given here.

About the title

0 A ℒanguage obeying a Type 0 grammar
= is equal to
he the application of an erasing homomorphism (the erasing of the special unusual letters) to
(ℒ2 one ℒanguage obeying a Type 2 grammar
intersected with
2)     another ℒanguage obeying a Type 2 grammar.

[Home Page]
A Most Wonderful Theorem / Dick Grune / dick@dickgrune.com

... and my name is not Richard ...