## Thursday, April 26, 2012

### I left Push Down Automata out of my class and learned some things!

This semester in the Ugrad Course titled Elementary Theory of Computation
(Syllabus: Reg Languages, CFLs, Computability Theory, P and NP) I decided to NOT
teach PDA's. I mentioned them and told the students they were equivalent
to CFG's but nothing more.

This led to a NEW (to me at least) piece of mathematics!

Proving
X={a^nb^nc^n | n ∈ N}
is NOT a CFL is easy using the pumping theorem.
Consider
Y={w | the number of a's, b's, and c's in w is the same}.
How do you prove Y is not a CFL?
Why don't you just intersect Y with a*b*c* to get X, which
you know is not a CFL?
I hear you say.
AH- for that you need to know that CFL intersect REG is CFL.
If we had the PDA/CFG equivalence then this would be an
easy cross product construction. But now?
SO, we need a proof that CFL intersect REG is CFL
that ONLY uses CFL's. That is, DOES NOT use PDA's.
Here is a proof.
We do not believe it is new but have not been able to find a reference.
If you know a reference please comment.
(NOTE ADDED LATER- The commenters politely provided a reference
and I have put it into the paper.)
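For completeness, here is the standard pumping-theorem sketch that X is not a CFL (my wording, not part of the original post):

```latex
Suppose $X=\{a^n b^n c^n \mid n\in\mathbb{N}\}$ were context-free with
pumping length $p$, and write $w = a^p b^p c^p = uvxyz$ with
$|vxy|\le p$ and $|vy|\ge 1$. Since $|vxy|\le p$, the window $vxy$
meets at most two of the three blocks $a^p$, $b^p$, $c^p$, so pumping
to $uv^2 x y^2 z$ changes the count of at most two of the three
letters. The result has unequal numbers of $a$'s, $b$'s, and $c$'s,
hence is not in $X$: contradiction.
```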

Another point of interest: How do you show that REG is a subset of CFL?

1. Normally you would note that DFA's are just PDA's without a stack,
hence every lang recognized by a DFA can be recognized by a PDA,
and then use PDA/CFG equivalence. I could not use this.

2. You could use the proof that any DFA language is generated by a right-linear grammar
(a right-linear grammar only has productions of the form X-->aY or X-->ε),
and this is nice since, in fact, the two formalisms are equivalent.
Here is a proof.
I ended up not doing this but I will next year
when I do the course.
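The grammar-from-DFA construction in item 2 can be sketched in a few lines. The toy DFA (even number of a's over {a,b}) and all names here are my own illustration; I allow ε-productions at accepting states, one of the standard variants:

```python
# Sketch: one nonterminal per DFA state, with N_q -> c N_{delta(q,c)}
# and N_q -> epsilon for each accepting state q.
# Toy DFA (my example, not the post's): even number of a's over {a, b}.
states = {"even", "odd"}
start = "even"
accept = {"even"}
delta = {("even", "a"): "odd", ("odd", "a"): "even",
         ("even", "b"): "even", ("odd", "b"): "odd"}

rules = {}
for (q, c), r in delta.items():
    rules.setdefault(q, []).append((c, r))      # N_q -> c N_r
for q in accept:
    rules.setdefault(q, []).append(("", None))  # N_q -> epsilon

def generates(w):
    """Follow the unique right-linear derivation; it mirrors the DFA's run."""
    q = start
    for c in w:
        nxt = [r for (ch, r) in rules.get(q, []) if ch == c]
        if not nxt:
            return False
        q = nxt[0]                  # this DFA-derived grammar is deterministic
    return ("", None) in rules.get(q, [])

# Sanity check: grammar and DFA agree on a few strings.
for w in ("", "a", "ab", "abab", "baab"):
    q = start
    for c in w:
        q = delta[(q, c)]
    assert generates(w) == (q in accept)
```

The point of the construction is visible in `generates`: a right-linear derivation and a DFA run are the same object read two ways.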

3. On the midterm I had the following question:

Recall that if α is a regular expression then L(α) is the
set of strings that α generates.
Recall that if G is a CFG then L(G) is the set of strings generated by G.

Prove the following by induction on the formation of a regular expression:
For all regular expressions α there exists a Context Free Grammar G
such that L(α)=L(G).
You cannot use PDA's (Push Down Automata). (If you do not know what these are,
do not worry.)

I could only ask this BECAUSE they had not seen PDA's or the grammar characterization of regular languages.
Of the 40 students about 20 got it right.
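The structural induction in the midterm question translates directly into code: each regular-expression constructor maps to a small CFG fragment. The AST encoding, nonterminal names, and the little membership parser below are all mine, invented for illustration:

```python
# Sketch of the midterm induction: regex -> CFG, one case per constructor.
import itertools

counter = itertools.count()
rules = {}                                  # nonterminal -> list of RHS tuples

def to_cfg(rx):
    """Return a nonterminal S with L(S) = L(rx), adding rules as we go."""
    S = f"N{next(counter)}"
    kind = rx[0]
    if kind == "eps":                       # base: S -> epsilon
        rules[S] = [()]
    elif kind == "sym":                     # base: S -> a
        rules[S] = [(rx[1],)]
    elif kind == "union":                   # S -> A | B
        rules[S] = [(to_cfg(rx[1]),), (to_cfg(rx[2]),)]
    elif kind == "cat":                     # S -> A B
        rules[S] = [(to_cfg(rx[1]), to_cfg(rx[2]))]
    elif kind == "star":                    # S -> epsilon | A S
        A = to_cfg(rx[1])
        rules[S] = [(), (A, S)]
    return S

def ends(sym, s, i, memo):
    """Positions j such that sym derives s[i:j]. Naive top-down parse;
    fine here because this construction never produces left recursion."""
    if sym not in rules:                    # terminal character
        return {i + 1} if s[i:i + 1] == sym else set()
    if (sym, i) not in memo:
        memo[(sym, i)] = set()
        out = set()
        for rhs in rules[sym]:
            pos = {i}
            for x in rhs:
                pos = {j for p in pos for j in ends(x, s, p, memo)}
            out |= pos
        memo[(sym, i)] = out
    return memo[(sym, i)]

# (a|b)* a : strings over {a, b} ending in a.
S = to_cfg(("cat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a")))
assert [w for w in ("a", "ba", "ab", "") if len(w) in ends(S, w, 0, {})] == ["a", "ba"]
```

Each branch of `to_cfg` is exactly one case of the induction, which is why the exercise works without PDA's.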

4. One of my students, Justin Kruskal, wondered how to go from a DFA directly to a CFG
(recall that he had not seen the grammar characterization of regular languages).
It is interesting to see what untainted students come up with on their own.
He came up with a proof
which is
here.
It is a weaker result than the linear-grammar equivalence, but it's his!

1. What do you think of leaving out PDA's? (I may heed your advice
next spring when I teach it again.)

2. Is the proof I point to that CFL intersect REG is CFL, using only CFG's, new?
If not please give a reference. A comment like Bill you idiot, this is a well known proof,
offered with neither a reference nor a website to point to,
is not helpful. If you DO have a reference or website to point to you can call me whatever you like.

3. Have you ever learned some new math by leaving something OUT of a course?

1. A proof that context-free languages are closed under intersection with regular sets, using only context-free grammars (not PDAs), can be found in

A. Salomaa. Formal Languages. ACM Monograph Series, 1973.

on page 59; there it is Theorem 6.7.

Best regards.

1. This proof is often called the "Bar-Hillel construction" in computational linguistics and is quite well known. (It is the basis of approaches to parsing known as "parsing as intersection".) I believe it first appeared in

Bar-Hillel, Y., M. Perles, and E. Shamir. 1961. On formal properties of simple phrase structure grammars. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 14(2): 143-172.

I don't have the paper handy, but if I'm not mistaken, this is the paper that first proved the pumping lemma.

2. The Bar-Hillel et al. paper is reprinted in two books:

Yehoshua Bar-Hillel. 1964. Language and information: selected essays on their theory and application. Addison-Wesley.

Robert Duncan Luce, editor. 1963. Readings in mathematical psychology, vol. 2. Wiley.

2. The title of your note is not meaningful. It just says that regular languages are context free, which is what you asked your students. It should say that
{X ∩ Y | X ∈ CFL, Y ∈ REG} = CFL.

3. Anon 1: Thanks. Is his proof essentially the same as the one I present?

Anon 2: Thanks. I have changed the title.

4. This is a classical proof: Essentially the same proof was already in Bar-Hillel, Perles, and Shamir, _On formal properties of simple phrase-structure grammars_, Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung vol. 14, 1961.

In particular, if you also show that emptiness of the language of a CFG is decidable in linear time, this proof yields an O(|G|·|w|^3)-time parsing algorithm: view the input string w as an automaton with |w|+1 states. For efficiency you need G in quadratic form (all productions have at most two symbols in their right-hand side) rather than Chomsky normal form, which can incur a blow-up in the size of the grammar.

5. Nice proof! Small typo: in Lemma 2.3, L should be declared regular.

6. I like having students develop (through homework problems) the notion of a counter automaton. They are a useful abstraction for problems involving # of a's, b's, c's (which can come in any order). It's easy to simulate a counter using a stack, but I imagine it would be very tedious to do directly in a CFG.

7. In response to question #1, we've been repeatedly toying with leaving out PDAs in the UIUC ToC course (which is usually done in the style of the Sipser text). Proofs, grading, PDA-->CFG equivalence are all painful. There's an argument that the machine hierarchy is nonintuitive and thus valuable to teach, but I think you miss the early semester student attentiveness (ESSA..y?) if you dwell on DFAs or PDAs for too long.

8. Comp Sci Student, 11:20 AM, April 26, 2012:

As a student I'd prefer if PDAs were left in the course.
CFGs are often confusing and it isn't obvious why some languages can be derived and others can't be. Without seeing them as a machine, I fear I would never have understood them.

9. I think it's important to define PDA's and do the CFG-to-PDA direction of the equivalence, because the whole point of doing CFL's at all is they can be designed to be processed with a stack. (I usually tell my students my opinion that in 500 years, if we have a civilization at all, Chomsky will be completely forgotten as a political figure, possibly remembered as a linguist if he turns out to have been mostly right, but definitely remembered as a mathematician for finding the CFG-PDA equivalence.)

I think the PDA-to-CFG proof is of only technical interest, though I always do it. Your new argument is a good contribution -- useful if you are committed to proving what you use and want to skip PDA-to-CFG.

It's interesting that you appear to have done the Warshall-like construction for DFA-to-RegExp as Justin Kruskal (any relation?) used it in his argument. I always skip this because state elimination works so much better on any actual examples.

If I were cutting back on the formal language section of this course I would consider dumping CFL's entirely.

1. > because the whole point of doing CFL's at all is they can be designed to be processed with a stack.

Hmm, I see what you're saying in terms of stressing the parsing connection, but I think there's other educational value in CFLs. If you are interested in introducing formal grammars (or any generative system outside of regular languages), CFGs seem like the ideal flavor. They're simple enough to understand derivations, complicated enough to provide some tricky problems, and used in practice.

10. I am largely self-taught in language theory: I am doing a Master's degree in regulated rewriting, but on context-free and regular languages, I have learned mostly from the internet (and what I skimmed from Sipser while researching an undergraduate project on P vs. NP).

Until I tutored a course in formal languages, I had only the vaguest idea of PDAs. I find the concept of a context-free grammar exceedingly elegant, while PDAs are relatively meaningless for me (essentially, I feel the opposite of Comp Sci Student above). While tutoring this course, I also found that many struggled with non-determinism (the course didn't cover NFAs -- but NFAs are anyway more easily understood as "there exists a path labelled blah" than as matching machines, IMO). They also aren't particularly useful for practical parsing. So I say skip 'em.

Your proof of CFL being closed under intersection with REG is the one I came up with myself (of course it wasn't original with me -- in fact until now I had assumed it was the standard proof); it also generalises nicely to larger grammar formalisms (regulated rewriting, etc.).

1. You guys teach complexity theory, sometimes advanced complexity theory, theory of computation, and many other very important, fundamental subjects in theoretical computer science, but students like us, self-learners from remote countries, cannot avail ourselves of your great teaching. You should videotape (record) your lectures and make them publicly available. Look at MIT OCW: it has changed the way people all over the world learn, and it is doing a great job, but alas, MIT OCW doesn't have videos for advanced courses like complexity theory or approximation algorithms. Being great researchers in theoretical computer science and good teachers, you should make your class videos available. Lance, you are teaching Topics in Computational Complexity this semester; how nice it would be if you made this class available to the whole world.

11. You do not need Chomsky normal form. Just change all rules

S -> A1 A2 ... An

to

[p1, S, p(n+1)] -> [p1, A1, p2] [p2, A2, p3] ... [pn, An, p(n+1)]

for all appropriate sequences of states pi (the terminals are left unchanged). Like another Anonymous, I came up with it myself when I was an undergraduate.
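This tripling construction fits in a few lines of code. The toy instance below is mine, not the commenter's: G with S -> a S b | ε (so L(G) = {a^n b^n}) intersected with a DFA accepting strings with an even number of a's, which should give {a^n b^n : n even}:

```python
# Sketch of the Bar-Hillel-style construction: triple each nonterminal
# with a pair of DFA states; no normal form needed.
from itertools import product

cfg = {"S": [("a", "S", "b"), ()]}            # RHS tuples; plain chars are terminals
states, q0, accepts = [0, 1], 0, {0}          # DFA state = parity of the number of a's
delta = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}

inter = {}                                    # the product grammar
for A, rhss in cfg.items():
    for rhs in rhss:
        for ps in product(states, repeat=len(rhs) + 1):   # state sequences p0..pn
            new_rhs = []
            for i, X in enumerate(rhs):
                if X in cfg:                              # nonterminal: triple it
                    new_rhs.append((ps[i], X, ps[i + 1]))
                elif delta[(ps[i], X)] == ps[i + 1]:      # terminal: DFA must agree
                    new_rhs.append(X)
                else:
                    break
            else:
                inter.setdefault((ps[0], A, ps[-1]), []).append(tuple(new_rhs))

starts = [(q0, "S", f) for f in accepts if (q0, "S", f) in inter]

def ends(sym, s, i, memo):
    """Positions j with sym =>* s[i:j]; naive top-down parse (safe here:
    every tripled rule is empty or starts with a terminal)."""
    if sym not in inter:
        return {i + 1} if s[i:i + 1] == sym else set()
    if (sym, i) not in memo:
        memo[(sym, i)] = set()
        out = set()
        for rhs in inter[sym]:
            pos = {i}
            for X in rhs:
                pos = {j for p in pos for j in ends(X, s, p, memo)}
            out |= pos
        memo[(sym, i)] = out
    return memo[(sym, i)]

def in_intersection(s):
    return any(len(s) in ends(S0, s, 0, {}) for S0 in starts)

assert in_intersection("") and in_intersection("aabb") and not in_intersection("ab")
```

Note that nothing here needed a PDA: the DFA's transition function is consulted only when a terminal is placed between two states.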

12. One way of leaving pdas out -- but not really -- is to do semirings. More precisely:

1. When doing finite automata, do also probabilistic finite automata. (This is a good idea anyhow, since they are quite useful, under the name of Hidden Markov Models, to our Machine Learning friends.)

2. The algorithm to get regular expressions from finite automata is also an algorithm for shortest paths, and an algorithm to compute transition probabilities for probabilistic automata.
A version of the process associates nonterminals with each state, that generate the languages accepted from that state. The whole thing can be expressed as a set of (recursive) equations in languages. These systems can be solved algebraically. (Essentially, instead of inverses, compute fixpoints.)

3. These recursive equations turn out to be left recursive for regular languages, and left recursion can be eliminated.

4. The same process works for CFLs. The recursion cannot be eliminated in general, but students have already learned that you implement recursion with stacks. You may then comment that there is a kind of automaton that does this, called a PDA.

The advantage of this approach is that you show that Theory can do fun and interesting things. Many students leave a standard Formal Languages course convinced that Theory consists in long uninteresting inductive proofs on the length of something, and they always prove some trivial fact. The approach above throws in a bunch of interesting, nontrivial (but not very hard) facts and constructions.
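Point 2 of this comment can be made concrete: one Warshall/Kleene-style elimination loop, parameterized by a semiring, computes shortest paths or regular expressions. The minimal semiring encoding below is my own sketch (I skip the probabilistic instance):

```python
# One elimination loop, two semirings: shortest paths and regular expressions.

def kleene(M, plus, times, star):
    """Closure of an n x n matrix over a *-semiring (Floyd-Warshall shape)."""
    n = len(M)
    M = [row[:] for row in M]
    for k in range(n):
        sk = star(M[k][k])                  # eliminate state/vertex k
        for i in range(n):
            for j in range(n):
                M[i][j] = plus(M[i][j], times(M[i][k], times(sk, M[k][j])))
    return M

# Instance 1: the (min, +) semiring gives all-pairs shortest paths.
INF = float("inf")
W = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
D = kleene(W, min, lambda a, b: a + b, lambda a: 0)
assert D[0][2] == 4                         # path 0 -> 1 -> 2

# Instance 2: regular expressions built symbolically as strings
# (None plays the role of the empty language, "" of epsilon).
def rplus(a, b):  return b if a is None else a if b is None else f"({a}|{b})"
def rtimes(a, b): return None if a is None or b is None else a + b
def rstar(a):     return "" if a in (None, "") else f"({a})*"

A = [["", "a"],
     [None, ""]]                            # 2-state automaton: one a-edge 0 -> 1
R = kleene(A, rplus, rtimes, rstar)
assert "a" in R[0][1]                       # some (unsimplified) regex denoting {a}
```

The symbolic regexes come out unsimplified, which is itself a nice classroom point about why state elimination needs cleanup.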

13. I would leave CFGs out also. They are not needed for the rest of the course; a modern introduction to computability and complexity theory doesn't need them. Just as we no longer teach general recursive models, we don't need to and should not teach CFL/CFG/PDA anymore.

It is true that they might be useful for some other areas (Computational Linguistics, Compilers, ...), but why should they be taught in an introduction to computability and complexity, considering the amount of pressure at universities to reduce the number of required theory courses? Those areas can teach CFLs in their own courses if they need them. There are so many interesting topics in computability and complexity that I don't think teaching CFLs in such a course makes sense anymore. They are nice results, but they are not the only ones left out of an introductory computability and complexity course. I would prefer to teach students topics that would get them interested in learning about modern complexity theory.
It is true that they might be useful for some other areas (Computational Linguistics, Compilers, ...) but why should they be taught in an introduction to computability and complexity course considering the amount of pressure at universities to reduce the number of required theory courses? They can teach CFL in their courses if they need them. There are too many interesting topics in computability and complexity that I don't think teaching CFL in such a course make sense anymore. They are nice results but they are not the only ones that are left out of an introduction course to computability and complexity. I would prefer to teach students topics that would get them interested in learning about modern complexity theory.