Thursday, January 14, 2016

A limit on the DFA-CFG divide for finite sets

It is easy to see that for Ln =  {a,b}*a{a,b}2n

There IS a CFG of size O(n)

ANY DFA is of size double-exp-in-n

I was wondering if we can increase this gap. That is, a statment like: For all but a finite number of n there exists a lang Ln such that

There IS a CFG of size O(n)

ANY DFA is of size triple-exp-in-n.

I also looked at finite sets. For COMPLIMENT of { ww : |w|=2n }

There IS a CFG of size O(n)

ANY DFA (in fact any DPDA) is of size double-exp-in-n.

This is stated in my paper here though the real interesting math needed to get it are in here where they get lower bounds on the size of CFG's, generalizing techniques from here.

SO my question still stands- can we get a triple exp separation? I have shown that this CANNOT be achieved with finite sets:

THEOREM: If L is a finite lang then there exists n such that (1) Any CFG in Chomsky Normal Form for L has to be of size \ge log n, AND (2) there IS a DFA of size 2n.

Proof:  Let n be such that the longest string in L if of length n.  In order for a Chomsky Normal Form grammar to generate this string it needs \ge log n nonterminals. Since the lang has at most 2n
strings in it, there is a DFA of size 2n for it.
End of proof.

So I have shown that one approach won't work. I am hoping that YOU know an approach to get
a trip-exp-sep and leave a comment about it.


  1. Isn't the (CFG,DFA) separation non-recursive (Meyer and Fischer 1971)? I think this is mentioned in the paper of yours that you linked to.

  2. That result gives you the following:
    For all computable f,
    For an INFINITE number of n there is a lang L_n such that
    there is a CFG of size n but ANY DFA takes size f(n).

    In fact, this is even true for all f such that f\le_T HALT.
    In fact, this is even true for all f such that INF \NOT\le_T HALT (thats a new result of mine, see

    BUT what I am asking about is a FOR ALL result. I want
    FOR ALL n there is a lang L_n such that the diff is large.
    For that all I have manages is double-exp.

  3. Isn't the diff lower bound ruled by the max "growing" that can be achieved by a CFG (i.e. k^n for some k; k=2 for grammars in Chomsky normal form) ?

  4. Oh, I get it now. So what happens if you apply a simple regular operation to the language L_n that you found, like L_n^* or L_n concatenated with itself? That could possibly bump you up one exponential.

  5. (in the previous comment replace lower-bound with upper-bound :-)