## Monday, April 07, 2014

### How free is Randomness?

Matthew Green had a great post on the topic how do you know a random number generator is working.  Gee, I just look at the sequence and see if it LOOKS random. Probably not the best idea. Actually people often found pattersn where there aren't any.

Reminds me of a the following apocryphal story which would have to have taken place before everyone had a computer: A teacher assigns the class a HW assignment to flip a coin 600 times and record H's and T's. He warns them that if they just try to come up with a random sequence without flipping he will know. Sure enough, a few of them had sequences with NO runs of 6 or more H's or T's. He knows they are faked. One of the students fessed up that yes, he just wrote down H's and T's in a way that looked random. But another student claims that he DID flip the coins, but when he saw a sequence of 6 H's in a row he changed it since he THOUGHT the teacher would spot it and THINK it wasn't random.

Is randomness a free resource? Papers in Complexity Theory strive to find good RNG's while papers in Crypto claim they already exist. Who is right?

Faulty RNG's are surely a problem for crypto protocols. But are they a problem for randomized algorithms? People really do use randomized algorithms to test primes. What other algorithms do people in the real world routinely use randomness for? (Not counting crypto protocols).Is it ever a problem?

I believe there are stories where a faulty RNG led to a crypto system being broken.

Are there stories where a faulty RNG led to a randomized algorithm being wrong?

1. A bad RNG once cost a gameshow $110k: http://www.thisamericanlife.org/radio-archives/episode/412/million-dollar-idea?act=4#play 2. I thought he was only allowed to win$50k due to limits in place at the time :)

Also, this wasn't a RNG. It was known to the system creators that there were a fixed number of patterns and a fixed set of values for each spot.

3. Sure - PRNGs are used all the time in monte carlo simulations of physical (and increasingly, social) systems. There are many examples of situations where, by bad luck, problems with the PRNG happened to correlate with something of interest in the modelled system, and the results being incorrect calculations. Sometimes this is due to a researcher writing their own bad random number generator, but sometimes it has occurred using generators that were thought to be perfectly fine. Most famously, in the early '90s, a relatively respectable random number generator, r250 was used in calculations of the Wolff cluster algorithm of a 2-d Ising spin system, giving results (as energy-per-spin at the critical temperature) which were 42 standard deviations off from the expected value - http://arxiv.org/pdf/cond-mat/9512135.pdf.

4. I was wondering about a problem lately about randomness and I am not sure if the answer to this is already known (or obvious). Let say we have a black box that can return a random number, say in the range [0,1]. If somehow it is possible for us to open the box and see how it works (the algorithm maybe) , does the box loses its "random" nature? I mean if we find out how some object X works, does that reduces the state of X being random? Is randomness equiv to being unknown?

5. In my experience, a small amount of care with generating random bits seems to suffice for randomized algorithms in practice. That said, there are some reported cases of Monte-Carlo simulations giving quite different results under different random-number generators \cite{flw}; it has also been noted that direct implementations of certain RNC algorithms for list ranking and graph connectivity \cite{hsu:thesis,hrd} sometimes take longer time than expected.

\bibitem{flw}
A.~M. Ferrenberg, D.~P. Landau, and Y.~J. Wong.
\newblock Monte {C}arlo simulations: {H}idden errors from "good" random number
generators.
\newblock {\em Physical {R}eview {L}etters}, 69(23):3382--3384, 1992.

\bibitem{hsu:thesis}
T.{-s}. Hsu.
\newblock {\em Graph augmentation and related problems: theory and practice}.
\newblock PhD thesis, Department of {C}omputer {S}ciences, {U}niversity of
{T}exas at {A}ustin, October 1993.

\bibitem{hrd}
T.{-s}. Hsu, V.~Ramachandran, and N.~Dean.
\newblock Parallel implementation of algorithms for finding connected
components.
\newblock In {\em {DIMACS} {I}nternational {A}lgorithm {I}mplementation
{C}hallenge}, pages 1--14, 1994.

6. Mark Manasse, Frank McSherry and I were working at some point on weighted consistent sampling (weighted version of minhashing) and found our implementation using a weak (C random, iirc) prng to be extremely inaccurate, leading us to question the correctness of our theorems. Fortunately for us, it was the prng's fault, and replacing it with something like SHA, while very likely overkill, did the trick. The algorithm we were implementing is in this paper:http://research.microsoft.com/apps/mobile/Publication.aspx?id=132309
--Kunal