## Tuesday, February 23, 2010

### What is in MY automata theory course/What should be

In the last post I pondered what was more important: Automata Theory or Crypto. This raises the question of what should be in a course in automata theory. Rather than discuss that I will tell you what is in mine and see what you think. (ADDED LATER: YOUR COMMENTS HAVE MADE ME RETHING THINGS. I WILL ADD MINMIZING DFA'S TO THE COURSE THIS SEMESTER. I AM JUST FINISHING UP REG STUFF SO I CAN DO IT NOW.)

The standard topics are:
1. Regular Languages: DFA's, NFA's, Reg Expressions.
2. PDA's, CFL's.
3. Turing Machines, computable and c.e. sets.
4. NPC
The following make this course a bit different than others, though not much. All the thing listed below that I claim I WON"T do are definite- I WON"T do them. All the things that I claim I WILL do are less definite I can't do all of them. I'll see how it goes and which ones I will do.
1. Decidability of Weak Second Order with S and ≤. The language has quantifiers that range over finite sets, quantifiers that range over natural numbers, and symbols for Successor and ≤. The proof uses Reg Languages. We do it in the Reg Language section and then revisit it when we do decidability. At that point I will also tell them (but not prove) about some theories that are undecidable. We also do decidability of Presburger arithmetic (quantify over naturals, have + and ≤) which follows from decidability of WS1S easily. Will also talk about decidability of S1S and omega-automta, but not prove anything. This did not take up too much time because I presented alot of it as more examples of regular languages. This material is not in any textbook that I know of, however see pages 8-28 of this PhD thesis.
2. I am NOT going to do the algorithm for MINIMIZING a DFA.
3. I am NOT going to do Context-Sensitive Languages.
4. I am NOT going to have them prove things that are obvious, like that S-->aSb, S-->emptystring generates {anbn}. Generally I am against having students prove things that are obvious.
5. I am NOT going to have them ever program a Turing Machine. I will tell them they can do everything and rarely refer to them ever again. I DO need the definition so that I can prove Cook's theorem.
6. I am NOT going to to Primitive Recursive functions.
7. In the NPC section I WILL DO the protocol for, in our language, NGI (non-Graph-Isom) is in AM.
8. In the NPC section I WILL DO the protocol for, in our language, given bit-commit, 3-COL is in ZK. (I may to other ZK protocols as well.)
9. In the NPC section will do that Vertex Cover with FIXED k is in O(n2) (I know that better is known.) Why? Because this is a very good example of an obvious thing (can't do better than roughly O(nk)) being WRONG. Shows the NEED to prove things.
10. Might do NSPACE(n) closed under Complementation. Might not- this may be conventionally difficult for this audience. And we really wont' be talking about space anyway.
11. Let SUBSEQ(L) be the set of a subsequence of L. I will, throughout the year, do the following: Show that if L is regular than SUBSEQ(L) is regular, Show that if L is context free than SUBSEQ(L) is context free, Show that if L is c.e. than SUBSEQ(L) is c.e. All of this leads to material I discussed in this old post. I WILL NOT prove that if L is ANY language then SUBSEQ(L) is regular, but I may talk about it.

#### 21 comments:

1. I really like the theory of primitive recursive function, and I do think that the proof, for example, that Ackermann-Péter function is not primitive recursive is a nice proof that is worth knowing.

But as you said, you cannot teach everything!

2. I think some of the logic stuff you mention as topic 1) is presented in Dexter Kozen's book "Theory of Computation". Or maybe I'm misremembering things.

3. Having been a teaching assistant for similar courses, I think it's worth having students "program" at least a couple simple TMs.

This is especially true if you're presenting non-determinism as "an accepting computation path existing" (as opposed to there existing some string that satisfies a deterministic verifier, which I think is Sipser's presentation.)

4. I took a computation course in undergrad from Sipser who used his book. It excluded this
http://en.wikipedia.org/wiki/Myhill-Nerode_theorem
but I later wished it were covered, it's pretty cool. Even if you don't like the minimizing algorithm, it is enough to state this theorem and then handwave that it yields a minimizing algorithm. The reason I like it is that it gives an alternate characterization of regularity... alternate characterizations are good :)

5. I'm scheduled to teach our undergraduate automata theory course next year so this is of some interest to me. But I think algorithms for minimizing DFAs (and for testing nonemptiness of regular languages, etc) are actually somewhat important: for instance, they came up yesterday for me in a research meeting with some social scientists as something they should probably be using.

That doesn't mean I think we should all switch to doing new research on these problems, but what we want to tell the undergrads about and what we want to research can be two different things.

6. When you say "a course" you mean a 10-week course (a quarter) or a 16-week course (a semester)? There's a lot you can include in 6 weeks difference -like an algorithm for minimizing DFA, or, say, ambiguity in CFGs- without over-stressing the students. I studied both in my semester-long automata theory course, though I think the teacher went too far near the end, trying to make us multiply integers using lambda-calculus in the final exam.

7. The course is usually 15 weeks (twice a week for
75 minutes each meeting).
This year it is 14 weeks because of the snow.

I may rethink mininizing DFA's the next time I do the course. THANKS to all for making me rethink this.

8. What I think such a class should not be: spending lots of time with lots of pretty animations learning how to program Turing machines to perform simple string-matching and arithmetic tasks. The ability to program a Turing machine is almost completely useless. For similar reasons, if pushdown automata are mentioned at all they should be covered only very briefly — in the guise of LALR and LL parsers, they are useful, but not as something anyone needs to learn to actually program by hand.

9. In fact I once "programmed" a pushdown automaton for actual research - I needed to show that a regular expression with squaring has a small context free grammar for the language (it also has a regular expression without squaring, but not a small one). Not a hard exercise actually.

10. About FPT-VC:
Papadimitrious-Yannakis takes \$n * 3^k\$, using maximal (not necessarily maximum) matchings.
What algorithm takes \$n^2 * f(k)\$?

11. Is this another exercise in hypocrisy, like Lipton's complexity course at Georgia Tech which covers almost nothing from his high flying list of topics:

http://rjlipton.wordpress.com/2010/01/12/a-course-on-complexity-theorem/

12. This is somewhat off topic, but I would recommend that if you could create a video lecture of your course and put it on the web like MIT does, that that would be a great way of getting more people interested in theory. Right now, there are not many *introductory* video lectures in CS Theory that young people can look at, and get excited about.

I work in a corporate research lab, and I recently pointed a new colleague (he has a Ph.D. in Psychology and will work in the area of HCI) to MIT's algorithms course and I have never seen someone so excited about Algorithms before :-)

13. Rani: Fix k. Form the following tree:
take an edge. ONE of the two vertices
that forms it must be in the VC.
So branch both ways, taking either
choice. Delete that vertex. Then repeat
this on each child. Keep doing this until you form a tree of depth k and size of course about 2^k. If any of the leaves is the empty graph then you have a VC of size k.

Anon who wonders if I am being a hypocrite. what are you talking about? I listed what is in my course in my post. You may agree or disagree with my choice of topics but I don't see what there is here to be hypocritical about.

Anon who wants my course on video.
Good idea that an EXCELLENT lecturer should do that. Not clear that that person should be me
(some would say its clear that person should NOT be me.)

bill gasarch

14. ... but we could have videos from many lecturers and then let the crowd decide.

Plus, it may be the case that one lecturer is quite strong on a particular topic but weaker on another. By having a larger collection of videos, there would be strong lectures on everything.

15. Hi Bill --

This term I'm teaching both our undergraduate and graduate automata-computability-complexity courses, except that the grad course is now all complexity, using Arora-Barak chapters 1-10 and 14. The undergrad course , using Sipser, is a third each finite-automata-and-CFL's, computability, and complexity. In both courses I go easy on practical NP-completeness, as that is covered in our undergrad and grad algorithms courses.

I do DFA minimization in the undergrad course because it's cool, the students like it, and it illustrates Myhill-Nerode. (I do a whole lecture on MN whereas Sipser relegates it to a solved exercise.) I roughly agree with you about proving obvious things and programming TM's apart from one or two exercises. I do talk about space complexity a fair bit and (given that I am at UMass especially) do Immerman-Szelepcseyni in both courses. Not much logic, though -- in the grad course I mention descriptive complexity characterizations and use logic to explain nondeterminism and alternation. I agree about NON-ISO in AM in both courses -- I don't think I have time for any real ZK. I think I'll do the VC result in the undergrad course, but not the grad course because they are getting most of their NPC stuff elsewhere.

Right now the undergrad course is optional for the degree though it's taken by most math-CS double majors. The grad course is required for Ph.D candidacy along with a grad algorithms course. Undergrad algorithms comes before this undergrad course, and is required for the degree. (For the BS, that is, we're also starting a BA which allows any five upper-levels.)

More than you wanted to know, perhaps...

16. Hi Bill,

I think that adding one or two lectures along the lines of "automata/grammars in real life" could increase the appeal of the course. Some examples (some of which might require extensions to probabilistic grammars or automata, which might be interesting on their own):

- comp bio: RNA folding
- verification: reachability of "bad" states
- graphics: generating fractal-like images

Etc

17. A problem with this course (and a possible reasons students prefer crypto to this course) is that it seems to be trying to be many things at once:
(1) Introduction to basic theory of computer science (computability, halting, Turing machines)
(2) Theoretical basics that also have applications elsewhere (automata, regular expressions, grammars, NPC, etc.)
(3) Random "gems"(?) of theoretical computer science (all the other stuff you mentioned)

It sounds like, at Maryland, (2) is sufficiently covered in other courses -- at least to the extent that doing it formally would not require a whole additional semester. Topic (1) is very important, but also not enough to fill a semester. So you pad the semester with (3), and I can imagine that some students do not like that.

18. Last Anon: Actually this is the first time I've taught it for a while, so your comments may be PREDICTIVE rather than
analysing the past. Also, I will modify the course as it goes, note what works and doesn't work for next time.

Also- as my next post will show- the
course was far more popular this semester.
Why was that? Read it and find out.
(The reasons DO NOT contradict your obeservations.)

19. Bill,

I often teach formal language theory, but have just 27 lectures since Chicago is a quarter school. I have the students buy Kozen, but I supplement from Durbin, Eddy, Krogh, and Mitchison "Biological Sequence Analysis," Lindenmayer's "The Algorithmic Beauty of Plants," and from the 1st edition of Hopcroft and Ullman.

I teach DFA minimization, because it enables me to talk about homomorphisms of DFAs, and my students find this helpful. I also talk about probabilistic automata as language generators, and do the standard biological sequence alignment algorithm as an application, and touch on Viterbi's algorithm.

In terms of context free languages, I talk about weighted grammars (which makes it possible to adapt the CYK algorithm to RNA secondary structure prediction) and about Lindenmayer systems (which can be thought of as "parallel" CFLs, and have applications to computer graphics). I also teach LR(1) parsing. I think this is helpful for students who are going to take a compilers course later.

I don't teach NP completeness, which it seems to me is better left to the algorithms course (for Cook's Theorem and the standard Karp 6), and to Razborov's complexity theory course for anything beyond that.

I can usually squeeze in a lecture or two on general computability: time enough to introduce Turing Machines, the Kleene mu-calculus, the Lambda calculus, and a quick handwave of a proof that all three systems define the same set of computable number theoretic functions. We haven't had a general computability theory course in years :-(. If students want to learn the recursion theorem, they have to learn it from Soare.

20. I think a big interest in automata these days are in the area of group theory -- namely, automatic groups. Your course description seems not touch this topic. On the other hand any modern course on automata should spend substantial time on this topic.

21. I've had the opportunity to TA several different flavors of this course. Three of these courses were undergraduate level classes using Savage, Sudkamp, and Ullman-Hopcroft respectively. The last was a mixed undergrad/grad course using Sipser. It's given me a lot to chew on in terms of what I think is important, and I got to play around with some of this as the instructor for Automata Theory this summer.

I think this subject is too deep to be confined to a one-semester course. It should be a two-semester sequence. I would use Savage the first semester to cover circuits, FSMs, a minimal exposition on PDAs, and TMs and Computability. Savage does a really good job in providing a perspective, and I like his proof of Cook-Levin in showing TMs and Circuits are logspace equivalent. CIRCUIT-SAT is a better first NP-Complete language than the standard SAT problem. Savage also provides exposition into the RAM model, which other standard texts omit; and he does a good job mentioning some algebra. It's a healthy perspective for someone going onto grad school to have.

I would emphasize group theory and group actions, algebraic combinatorics, and complexity in the second semester (with Arora-Barak and Dummit & Foote as the texts). Algebra comes up enough in TCS (Babai's recent result on graph isomorphism, rings and fields in crypto) where it should be a staple in an undergraduate curriculum, rather than an obscurity for Math-CS double majors and grad school. Additionally, it provides a lot of solid relations to the discrete math prerequisites. Regular languages have a similar algebraic structure to the integers and are closely related to ordinary generating functions. I also like to teach the Brzozowski Algebraic Method and Brzozowski Derivatives instead of the State Reduction procedure, which provides a nice relation to solving recurrences with OGFs and walks on graphs.