Wednesday, October 07, 2015

Randomness by Complexity

Let n be the following number
135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563 (RSA 1024)

What is the probability that the fifth least significant bit of the smallest prime factor of n is a one? This bit is fully defined--the probability is either one or zero. But if gave me better than 50/50 odds one way or the other I would take the bet, unless you worked for RSA Labs.

How does this jibe with a Frequentist or Bayesian view of probability? No matter how often you factor the number you will always get the same answer. No matter what randomized process was used to generate the original primes, conditioned on n being the number above, the bit is determined.

Whether we flip a coin, shuffle cards, choose lottery balls or use a PRG, we are not creating truly random bits, just a complex process who unpredictability is,we hope, indistinguishable from true randomness. We know from Impagliazzo-Wigderson, and its many predecessors and successors, that any sufficiently complex process can be converted to a PRG indistinguishable from true randomness. Kolmogorov complexity tells us we can treat a single string with no short description as a "random string".

That's how we ground randomness in complexity: Randomness is not some distribution over something not determined, just something we cannot determine.


  1. Ah, the foundations of probability, a topic close to my heart.

    "Whether we flip a coin, shuffle cards, choose lottery balls or use a PRG, we are not creating truly random bits, just a complex process who unpredictability is,we hope, indistinguishable from true randomness."

    That is, until quantum mechanics enters the picture.

    "Kolmogorov complexity tells us we can treat a single string with no short description as a "random string"."

    But Kolmogorov complexity is uncomputable in general so it is not likely that the probabilities that are computed all over the place in our statistical analyses are based on Kolmogorov complexity.

    Now, let's get onto the main point. When you read about the subjective Bayesian interpretation of probability you will often hear things like "if T is a tautology then the Dutch book argument (or insert your favourite other decision theoretic argument) implies that p(T) = 1). This requires logical omniscience, i.e. the ability to know instantly whether any given statement is a tautology. If this argument were valid then it would mean that we could not account for the fact that mathematicians seem to have partial degrees of belief for things like whether or not the Reimann hypothesis is true. It also doesn't take account of the fact that T may be undecidable in the axiom system you are using, or that T may be decidable but one of a family of propositions that define a language with high computational complexity, i.e. not decidable by an efficient algorithm. Your example is of the latter type.

    Fortunately, the fix is quite simple. The argument for p(T) = 1 is, on my view, just wrong in a trivial way. Here is the argument as it is usually given. First, in the Dutch book argument $p(E) is the amount of money you would be willing to pay for a lottery ticket that pays $1 if E is true and $0 if E is false and also the amount you would be willing to sell the ticket for. Coherence states that you should not enter into a bet that guarantees a sure loss, whatever the outcome.

    If we are betting on a tautology T, then T is always certain to occur, so we only have to consider one possible outcome. Suppose that p(T) is greater than 1. Then, you would be willing to pay more than $1 for a ticket that pays out at most $1, so that violates coherence. On the other hand if p(T) is less than 1 then you would be willing to sell a ticket for less than $1 that guarantees that you will always have to pay out $1 later, so this is also incoherent.

    What is wrong with this argument is that it is run from the point of view of the "objective world" rather than from the point of view of the agent. Coherence should not be defined as avoiding bets that guarantee a sure loss from an objective point of view, but rather not entering into bets that the agent *knows* will guarantee a sure loss. Otherwise it is a useless and impractical principle that an agent could never possibly apply in practice.

    If we accept this then what happens if T is a tautology, but you do not know that T is a tautology. Then, as far as you are concerned both T and NOT T are still possible. This means that you can no longer derive p(T) = 1 because you are not certain that you will always have to pay out $1 if you sell a ticket. Therefore, you can quite rationally assign probabilities less than 1 to tautologies, and, by similar arguments, probabilities grater than 0 to contradictions.

    The converse of this is that if you believe that E is certain, but it is not a tautology, then you must assign p(E) = 1. This shows that the usual subjective Bayesian claim that there are no rationality constraints other than the axioms of probability theory is also wrong. For example, if you are a person who is convinced that the laws of Newtonian mechanics are true with certainty then you must assign p(E) = 1 to any consequence derived from them.

    1. This doesn't work because what you get can no longer really be called a probability function.

      An essential property of probability functions is that they assign probabilities to outcomes, not descriptions. Without this probability wouldn't be a useful concept. The reason we can say that the probability of a full house is such and such is that once someone agrees that the probability of drawing any particular card is 1/# of cards remaining etc.. then it doesn't matter if they KNOW the argument letting them compute the probability of a full house or not.

      People have tried long and hard to come up with a concept that is like probability but allows for limited rationality but to no avail. There just doesn't seem to be the same kind of useful calculus for this kind of confidence reasoning.


      To be slightly more formal. You never assign probabilities to tautologies. Bets are about states of the world and they can have multiple descriptions.

    2. Also nothing about the fact that mathematicians have degrees of belief or even a willingness to bet requires that those degrees of belief be probabilities.

  2. Hi Lance,

    I am not sure if you saw my blog post or this is an example of "great minds think alike" :)

    I am very interested in this question, and I do think that there are different types of pseudorandomness that arise when we take a computational view, and we haven't really explored all of them yet.

  3. The digits of π are not really random. The digits are only *unknown until you compute them*. An intelligent compressor might recognize the digits of π and encode it as a description meaning "the first million digits of pi", or as a program that reconstructs the data when run. -- Matt Mahoney

    An encryption algorithm is not considered secure unless it can withstand chosen plaintext attacks by *adversaries with full knowledge* of the algorithm. -- Matt Mahoney

    ... as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say *we know there are some things we do not know*. But there are also unknown unknowns – the ones we don't know we don't know. -- Donald Rumsfeld.

    "the probability is *either one or zero* ... (yet) you will *always* get the same answer"
    *Plurality* should not be posited without *necessity* -- William of Ockham

  4. "Randomness is not some distribution over something not determined, just something we cannot determine."

    That's fine and good if your only concern is how a sequence interacts with computable processes. If what you want to know is something about how secure your code is against someone with a computer that's good enough (modulo working out some philosophical issues about how we want to interpret our assertions).

    But that's not necessarily the only kind of thing we want to be concerned about. Nothing in principle says that the universe can't have non-computable phenomena. Even if a sequence is 1-random (or maybe you want it to be \Pi^1_1-random or as strong a notion as you want) if it is nevertheless perfectly predicted by some physical process I don't think we would call it random.

    The point is that the notion of randomness seems deeply tied up in our notions of causality and prediction and can't be conceptually reduced to some fact about definability.