Thursday, March 01, 2007

Inductive Turing Machines

On last week's Numb3rs episode One Hour, Charlie, the mathematician, and Amita, his colleague/girlfriend, had the following conversation:
Charlie (to Amita): I haven't seen an inductive Turing machine used like that before.
Amita: I'm trying to find the finite state machine for these. (points to screen)
So what is an inductive Turing machine? I put the question to Bill Gasarch.
If you IGNORE the TV show, it could mean the following, taking a cue from the field of Inductive Inference:

A set of computable functions S is in EX if there is a TURING MACHINE M such that for all f in S if you feed f(0), f(1), f(2), … into M, it outputs e1, e2, e3, … and in the limit the sequence converges to e, a Turing Machine index for a machine that computes f. Such a machine M is called an INDUCTIVE TURING MACHINE.
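As a toy illustration of the EX definition (my own sketch, not anything from the show or from Gasarch): the class of polynomials taking integer values on the naturals is in EX, via a learner that, after each new value, conjectures the lowest-degree polynomial through the data using Newton forward differences. Early conjectures may be wrong, but once the learner has seen more points than the degree, the conjecture never changes again.

```python
from math import comb

def hypothesize(values):
    """Given f(0), ..., f(i), conjecture the lowest-degree polynomial through
    the points via Newton forward differences.  The returned list [c_0, c_1, ...]
    encodes p(x) = sum_k c_k * C(x, k)."""
    coeffs, row = [], list(values)
    while any(row):
        coeffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]  # next difference row
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the conjectured polynomial at x."""
    return sum(c * comb(x, k) for k, c in enumerate(coeffs))

# For f(x) = x^2 + 1: after two points the guess is the (wrong) line x + 1,
# but from three points on the guess is correct and stable -- convergence
# in the limit, exactly as in the EX definition.
```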

Does this definition make sense in the context of the show? The set S of regular languages (those computed by finite state machines) is in EX, where ei is the lexicographically least FSM whose output is consistent with f(0), f(1), …, f(i-1).
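That learning-by-enumeration strategy can be sketched as follows — a toy learner with function names of my own, using an arbitrary canonical order of DFAs in place of "lexicographically least":

```python
from itertools import product

def lex_strings():
    """Yield all binary strings in length-lex order: '', '0', '1', '00', ..."""
    yield ""
    length = 1
    while True:
        for bits in product("01", repeat=length):
            yield "".join(bits)
        length += 1

def enumerate_dfas(max_states):
    """Yield DFAs as (n, table, accepting) in a fixed canonical order.
    State 0 is the start state; table[q] = (state on '0', state on '1')."""
    for n in range(1, max_states + 1):
        for flat in product(range(n), repeat=2 * n):
            table = [(flat[2 * q], flat[2 * q + 1]) for q in range(n)]
            for accepting in product((False, True), repeat=n):
                yield n, table, accepting

def accepts(dfa, s):
    n, table, accepting = dfa
    q = 0
    for c in s:
        q = table[q][int(c)]
    return accepting[q]

def learn_by_enumeration(labels, max_states=3):
    """Return the first DFA in the canonical order consistent with the sample,
    where labels[i] is f applied to the i-th string in length-lex order."""
    sample = list(zip(lex_strings(), labels))  # zip stops when labels run out
    for dfa in enumerate_dfas(max_states):
        if all(accepts(dfa, s) == want for s, want in sample):
            return dfa
    return None
```

Fed the labels for "strings ending in 1" on all strings of length up to 2, the learner recovers the obvious two-state machine. The brute-force search over transition tables is exactly why this is "an incredibly inefficient way to learn regular languages."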

Of course this is an incredibly inefficient way to learn regular languages, but then again Amita wasn't having much success. Perhaps she should have used one of the efficient finite-automata learning algorithms, like that of Rivest and Schapire.

Gasarch has a different take.

The show DID NOT mean this. So what did they mean, and what could they have said? They should have either used a generic term for learning or just used a fictional term; then they couldn't really be wrong. Here are some possibilities:
  1. Charlie (To Amita): I haven't seen that learning algorithm used that way before.
  2. Charlie (To Amita): I haven't seen Carl Smith's Technique used that way before.
  3. Charlie (To Amita): I haven't seen cross-convergence used that way before.
Or they could have made Charlie's comment and Amita's answer match better:

Charlie (To Amita): I haven't seen the Generalized Polynomial Hales-Jewett Theorem used that way before.
Amita: I'm trying to prove the polynomial van der Waerden's theorem over the reals.

The conversation they DID have is connected to later in the show, when they are trying to learn the maze. I can make a vague connection; the show did not do so.

Having said all that, it was a good episode.

By the way, you can watch Numb3rs online for free.


  1. So let's play Lance's game -- the (Numb3rs)^-1 game of taking the science out of Numb3rs, while presumably Numb3rs tries to put science into entertainment.

    The question they were posing in the episode is

    given a sample of a maze, can you learn it?

    Of course the next question is: what do you mean by "learn"?

    A possible formalization is

    In an unknown maze [of size at most n rooms] [, planar] [, bounded degree d], a master criminal follows a trajectory T [with some possible restrictions -- max velocity, constant velocity, constant acceleration].
    [Moreover, the maze is a submaze of a known maze of size N.]

    (Items enclosed in square brackets are possible additional assumptions)

    The criminal is observed at times 1, 2, ..., i. The task is to predict the criminal's location at time i+1.

    [in a way that as i grows the predictions become more accurate]

    Give bounds on the number of observations needed, the accuracy of the prediction, and the time needed for your algorithm, for each possible case.

    You only need to solve the cases where the solution is interesting.
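Under the weakest of those assumptions, one baseline predictor can be sketched in a few lines — a deliberately naive first-order (Markov-style) guesser of my own devising, which learns only the observed successor frequencies of each room and ignores the maze's structure entirely:

```python
from collections import Counter, defaultdict

def predict_next_room(trajectory):
    """Predict the room at time i+1 from the rooms observed at times 1..i:
    tally, for every room, which rooms the criminal was next seen in, then
    guess the most frequent successor of the current room."""
    successors = defaultdict(Counter)
    for here, there in zip(trajectory, trajectory[1:]):
        successors[here][there] += 1
    current = trajectory[-1]
    if not successors[current]:
        return None  # never seen the criminal leave this room: no basis for a guess
    return successors[current].most_common(1)[0][0]
```

For a criminal pacing "1, 2, 3, 2, 1, 2, 3" the guess from room 3 is room 2; the interesting cases above are exactly the ones where such frequency counting fails and the extra assumptions (planarity, bounded degree, a known supermaze) start to pay off.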

  2. You might as well ask what the mathematical significance of 4, 8, 15, 16, 23 and 42 are.

  3. What would Sherlock Holmes do here? He'd reason that the writers are in LA, so they'd go to the UCLA math department and ask the resident expert on Turing machines, Mark Burgin, for some authentic background terminology, and he'd help them out with

    M. Burgin, Inductive Turing machines, Notices Acad. Sci. USSR 270 (1983) 1289–1293 (translated from Russian, v. 27, no. 3).

    How Holmes ever got his man with his cockamamie guesses was always beyond me. I'm sure this guess is wrong.

  4. Sorry, replace "wrong" by "right" in the above. (Burgin's inductive Turing machines, which apparently he came up with on his own and later realized were already known, are related to Gold's notion of identification in the limit (JSL 1965, I&C 1967) and to omega-automata. What Janos said.)

  5. Dear Lance,
    Ignorance has never been the best feature of a person. That's why I decided to explain the situation with inductive Turing machines to you and to those who visit your weblog and might be interested to have correct knowledge. If you, or somebody else, read publications on machine learning, you will see that this community has never used the term "inductive Turing machine". What they use is "inductive inference machine". It is a different kind of automaton. Thus, your attempt to define inductive Turing machines is based on a wrong assumption. Moreover, what you defined is neither an inductive Turing machine nor an inductive inference machine.
    However, your intuition that inductive Turing machines and inductive inference machines are related in some way was correct. Indeed, inductive inference machines are equivalent to the simplest kind of inductive Turing machines, which are at the lowest level of the hierarchy of all inductive Turing machines. This is demonstrated in the book "Super-recursive algorithms".
    As I don't expect that you, or anybody else writing to this weblog, is ever going to read this book or any mathematical paper where inductive Turing machines are rigorously defined and studied, I give here a simplified description of an inductive Turing machine.
    The simplest inductive Turing machine M has the same structure as a Turing machine with three tapes and three heads. In the machine M, one tape is the read-only input tape, the second tape is the write-only output tape, and the third tape is the working tape of M. The machine M performs operations as a Turing machine. The only difference is in how M gets its results. Naturally, when M stops in a final state, then the word written in the output tape is the final result of M. This is the same as what a Turing machine is doing. That is why inductive Turing machines can simulate any Turing machine.
    However, M can also obtain a result in a different way. Namely, when at some step of computation, the word written in the output tape stops changing, then this word is also the final result of the simplest inductive Turing machine M.
    In spite of some incompetent comments, inductive Turing machines have much higher computing power than inductive inference machines and limit partial recursive functions.
    More powerful inductive Turing machines have a more efficient memory, a more powerful control device and/or processor (head). In a similar way to computers, these components add power to inductive Turing machines.
    For those who read an article for a general audience and then complain that they don't find rigor there, I would like to note (although it is evident) that this is not an exact definition. To find an exact definition, it is necessary to read the book "Super-recursive algorithms" or the mathematical papers where inductive Turing machines are rigorously defined and studied.
    To understand why and how you invented your definition of inductive Turing machine, it is not necessary to be either Sherlock Holmes or even Sherlock Pratt. The first cause for giving that definition was your good mathematical education: you studied calculus and you know the concept of limit. Thus, it is possible to assume that a professional at your level of expertise can understand the definitions of inductive Turing machines and inductive inference machines if he reads these definitions. Consequently, it is possible to conclude that you did not read either of these definitions. Unfortunately, society loses the culture of reading. That was the second cause for giving that definition. The third cause was the confusing choice of name for the term "limit recursive function", introduced by such a prominent and influential computer mathematician as Gold and persistently used in the area of inductive inference. Looking at the definition of a limit recursive function or limit partial recursive function, it is possible to have the impression that their result is obtained as the limit of an infinite process.
    However, this is not so with respect to limits as used in calculus. In addition to its other advantages, the construction of the simplest inductive Turing machine made it evident that this machine, as well as any limit recursive function or inductive inference machine, always gives its result (if any) after a finite number of steps. Unfortunately, the name "limit recursive function" implies connections to the concept of limit in calculus, and these connections are wrong.
    To conclude, it is possible to remark that the definition you gave is meaningful although it is not rigorous. One possible rigorous definition of such a construction is the definition of a simple limit Turing machine. There are also other topological algorithms based on the concept of limit.

    Mark Burgin

    P.S. It is much easier to discuss a TV show than to read a mathematical paper (not speaking about a book) and at least, to understand the main constructions (not speaking about results).