(Guest Post by Ken Regan)
pdf file available

here
Computational complexity theory is the study of information flow
and the

*effort* required for it to reach desired conclusions.
Computational models like cellular automata, Boolean or algebraic circuits,
and other kinds of fixed networks exemplify this well, since they do not
have "moving parts" like Turing machine tape heads, so the flow's locations
are fixed. Measures of effort include the time for the flow, the amount
of space or hardware needed, and subtler considerations such as time/space
to prepare the network, or energy to overcome possible dissipation during
its operation. These models and measures have fairly tight relations
to Turing machines and their familiar complexity measures.

For an example and open problem, consider the general task of moving all
"garbage bits" to the end of a string, leaving the "good bits"
in their original sequence. We can model this as computing the function
f: {0,1,2}

^{*} ® {0,1,2}

^{*} exemplified by
f(1020212) = 1001222, f(2200) = 0022, f(101) = 101, etc.,
with 0,1 as "good bits" and 2 as "garbage."
A rigorous inductive definition, using e for the empty string, is
f(e) = e, f(0x) = 0f(x), f(1x) = 1f(x), and f(2x) = f(x)2.
This is the "topological sort" of the partial order
B = {0 < 2, 1 < 2} that is

*stable*,
meaning that subsequences of incomparable elements are preserved.
The problem is, can we design circuits C

_{n}, each computing
f(x) on strings x of length n, that have size O(n)?

The circuits C

_{n} have

*input gates* labeled
x

_{1},...,x

_{n} which receive the corresponding "trits"
(0, 1, or 2) of the input string x, and

*output gates* y

_{1},...,y

_{n} giving y = f(x).
The first question is,
what interior computational gates can C

_{n} have?
A

*comparator gate* g for a partial order (P, < )
has two input and two output wires, maps (a,b) either to (a,b) or (b,a),
and never maps to (d,c) when c < d. The unique

*stable comparator* g

_{P} maps (a,b) to (a,b) unless b < a.
The following slightly extends the famous 0-1 law for comparator networks:

**Theorem 1.** If a circuit C

_{n} of comparator gates computes f(x)
correctly for all x

ÃŽ
{0,2}

^{n} (not even including any 1s),
then for every partial order (P, < ), the circuit C

_{P} with each
comparator replaced by g

_{P} computes the stable topological sort of P.

**Proof.** First suppose C

_{P} errs for a total order (P, < ).
Then there are x,y

ÃŽ P

^{n} such that C

_{P}(x) = y, but for some j,
y

_{j}+1 < y

_{j}. Take the permutation p such that x

_{i} = y

_{p(i)}
for all indices i. Define a binary string y

¢ ÃŽ {0,2}

^{*}
by y

¢_{i} = 0 if y

_{i} < y

_{j}, y

¢_{i} = 2 otherwise, and x

¢
by x

¢_{i} = y

¢_{p(i)} for all i. Then C

_{n}(x

¢) = y

¢
(exercise: prove this by induction taking gates one at a time),
contradicting that the original C

_{n} was correct on {0,2}*.

For (P, < ) not a total order, an error C

_{P}(x) = y
(which might violate only stability) is also an error in the total
order (P

_{x}, <

¢) with P

_{x} = {(a,i): x

_{i} = a} and
(a,i) <

¢(b,j) if a < b or a is not comparable to b and i < j.

[__¯__]
**Corollary 2.** Circuits C

_{n} of comparator gates computing
f require size n*log

_{2}(n)

- O(n).

[__¯__]
This follows by applying the standard sorting lower bound to C

_{P}.
It's interesting that we did not need 1s in x to argue stability,
and the lower bound allows gates g in C

_{n} to be arbitrary when
either input is 1.

For general circuits, however, the argument doesn't hold, and all bets are
off! To see why, consider sorting the total order {0 < 1 < 2}.
Clever O(n)-size circuits can

*count* the numbers a,b,c
of 0s, 1s, and 2s in the input string x, respectively,
and then assemble the correct output y = 0

^{a} 1

^{b} 2

^{c}.
For the basic idea see

Muller-Preparata,
1975,
and various sources
on the "Dutch National Flag Problem." Applying this counting idea
to our poset B reduces our task to "nice" strings z of length
N = 2

^{k} with exactly N/2 2s.

**Theorem 3.** If s(N)-size circuits D

_{N} can compute f(z)
for "nice" z, then f has circuits of size at most s(4n) + O(n).

**Proof.** We can build O(n)-size circuits E

_{n} that on inputs x
of length n count b,c as above and find k such that
m = 2

^{k-1} is the least power of 2 above n. Make E

_{n}(x)
output z = x1

^{m+c-n}2

^{m-c}, which gives

|z

| = N < 4n.
Then compute y

¢ = D

_{N}(z) and re-use the computed b,c,m
to pluck off the n bits of f(x).

[__¯__]
This reduction to nice z enhances the "flow" metaphor.
The m-many 2s in z can be advance-routed to the last m places of y

¢,
so the whole issue is how the m-many 0s and 1s in z
flow together into the first m places of y

¢.

*Must*
this flow progress (without loss of circuit-size generality) by
"squeezing out 2s" in an intuitively plane-filling fashion,
allowing "mileposts" whose forced spacing might mandate having
n*log

_{2}(n)

- O(n) gates? Or can linear-size networks rise above
the planar view? No one I've asked has known, and lack of them
frustrates a desired general linear-size circuit simulation of my

"Block Move" model.
Issues

here
may be involved.
Nor do I know nicer descriptions of
O(nlogn)-sized circuits than "use ancillas to tag bits of x
and work in P

_{x} as in the proof of Theorem 1,
employing ideas of Theorem 3 and/or mapping into the O(nlogn)-sized

Ajtai-Komlos-Szemeredi
networks."
Those seeking an o(nlogn) upper bound may be my guest,
but those believing a super-linear circuit lower bound must
reflect that no such bounds are known for string functions
whose graphs belong to NP or to E.

The above inductive definition of f yields a linear-time algorithm
on any model that simulates each operation of a double-ended queue
in O(1) time. But is booting a 2 to the rear in f(2x) = f(x)2
really in constant time, even amortized? True, our technical issues
shrink away on passing from linear to polynomial time,
so all this may seem to have nothing to do with P versus NP.
But

*au-contraire * the Baker-Gill-Solovay "oracle"
obstacle may mean nothing more than that standard "diag-sim"
and timing techniques are insensitive to internal information flow.
The "Natural Proofs" obstacle

*may* ultimately say only
that network-preparation/"nonuniformity" is a subtly powerful
consideration. Honing tools for information-flow analysis on
incrementally more-general cases that yield super-linear lower
bounds may be the walk to walk before trying to run.

File translated from
T_{E}X
by
T_{T}H,
version 3.77.

On 21 Jun 2007, 23:36.