Thursday, March 28, 2013

Counting Descriptions- a ``new'' complexity class

Let nσ(w) is the number of σ's in w.

We often ask our students about languages like { w | na(w) = 2nb(w) } (CFL but not REG). Lets formally define languages that are like that.

A COUNTING DESCRIPTION is a boolean combination of linear equations and inequalities involving nσ(w). For example

(na(w) ≤ 2nb(w)+3) AND NOT( nc(w) = nb(w) ).

We denote a counting description by E. Let L(E) = { w : E(w) is true } A lang L is CD if there exists an E such that L=L(E). What to make of the class CD?

The following items came out of emails between myself and Eric Allender, Dave Barrington, and Neil Immerman.

  1. CD is contained in 1-way log space and in uniform TC0.
  2. CD is incomparable to REG since (1) from above we see there are langs in CD that are not REG, and (2) ba* is not in CD.
  3. CD intersect REG is in uniform AC0
  4. Parikh's Theorem yields a large class of CD's that produce context free languages.
  5. If E is a Boolean combination of threshold and mod statement, each about a single nσ(w), then L(E) is regular
  6. The following papers may help answer some of the questions one could ask: here and here.
Are the following decidable:
  1. Given a counting description E, is L(E) regular?
  2. Given a counting description E, is L(E) context free?
Richard Beigel has shown these problems, with an unbounded alphabet size, are NP-hard. see here. I am very curious about the case where the alphabet size is bounded. Are there other (NATURAL!) classes one could ask about? NOT context sensitive since CSL contain log space. NOT the class DSPACE((log n)1/2) since that's not natural. Maybe some sublinear class would be interesting. We can also ask variants of these questions involving any combinations of the following variants:
  1. Do not allow negation.
  2. Do not allow intersection.
  3. Do not allow inequalities.
  4. Do not allow additive constants.
  5. Do not allow multiplicative constants.
  6. Only allow a bounded alphabet size.
  7. Allow other types of equations, for example n_a(w) = n_b(w)2.
  8. Allow other primitives such ast n_a(w) == n_b(w) mod 9.
  9. Have a non-uniform version of Counting Descriptions and bound the size of the formula as a function of n.

In problem 7 if you allow any poly then the problem is undecidable by the solution to Hilbert's tenth problem.


  1. A trivial remark after item 2.(2): Your languages are closed under permutation. The class CD may then be compared to other classes closed under permutation. I am not at all a specialist of these things but I suspect such classes have been studied before!

  2. sounds somewhat similar to timed automata. maybe some natural bridge thms possible?

  3. The concept of semilinearity is very relevant here -- so all MCFGs/LCFRSs are all semilinear too (permutation equivalent to a regular language.)
    You might also look at the use of string kernels to generalise beyond n_a(w) to n_{abc}(w) -- i.e. look st linear constraints on continuous or discontinuous substrings.

  4. Look at "Parikh automata".