Teaching crypto for the first time next semester, I am looking into lots of stuff I always meant to look into but now have to. NOT a complaint- it's good to be forced to expand my horizons (within TCS).
I"m also finding out that the web is great for easy and hard stuff but not so good for medium stuff.
Here is what I want to know so I am reaching out to my readers.
KNOWN: you want to generate a uniformly random sequence of 0's and 1's. The bits you can generate are independent (yeah) but biased (boo). So here is what you do (see the sketch after this list): generate 2 bits at a time and
if see 00 then DO NOT USE
if see 11 then DO NOT USE
if see 01 then generate 0
if see 10 then generate 1
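Here is a minimal Python sketch of this pairing trick (von Neumann's trick); the function name and the 0.7 bias below are mine, just for illustration:

import random

def von_neumann(bits):
    # Examine disjoint pairs: 01 -> output 0, 10 -> output 1, 00/11 -> discard.
    out = []
    it = iter(bits)
    for a, b in zip(it, it):
        if a != b:
            out.append(a)
    return out

biased = [1 if random.random() < 0.7 else 0 for _ in range(1000)]
print(von_neumann(biased)[:20])   # unbiased bits; about 2*0.7*0.3*500 of them in total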
KNOWN: You can do similar things if you have 00, 01, 10, 11 independent, and also if you have 000, 001, ..., 111 independent.
I tried looking up whether there is a better way and I came across some complicated papers. I need material for a senior course. So, are there INTERMEDIATE results, suitable for a classroom, on better ways to use an imperfect source to get uniform randomness?
There is a simple approach that is asymptotically optimal, though convergence is somewhat slow.
We will construct an object called bitstream, which accepts iid random bits. The bitstream will have two daughter bitstreams: daughter1 and daughter2.
For every pair of received inputs a,b, the bitstream acts as follows:
1. If a != b, output a, and send 1 to daughter2.
2. If a = b, send a to daughter1 and 0 to daughter2.
The resulting data structure might seem infinite, but it stays finite if you create the daughter bitstreams lazily, only when they are first accessed.
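Here is a rough, self-contained Python sketch of that recursive construction (my own reading of the description, not the commenter's actual Gist); collect() just gathers the bits produced anywhere in the tree so you can count them:

import random

class BitStream:
    def __init__(self):
        self.pending = None    # first bit of the current input pair, if any
        self._d1 = None        # daughter1, created only when first accessed
        self._d2 = None        # daughter2, created only when first accessed
        self.output = []       # bits this stream has output directly

    def daughter1(self):
        if self._d1 is None:
            self._d1 = BitStream()
        return self._d1

    def daughter2(self):
        if self._d2 is None:
            self._d2 = BitStream()
        return self._d2

    def feed(self, bit):
        if self.pending is None:
            self.pending = bit
            return
        a, b = self.pending, bit
        self.pending = None
        if a != b:
            self.output.append(a)        # rule 1: output a, send 1 to daughter2
            self.daughter2().feed(1)
        else:
            self.daughter1().feed(a)     # rule 2: send a to daughter1, 0 to daughter2
            self.daughter2().feed(0)

    def collect(self):
        bits = list(self.output)
        for d in (self._d1, self._d2):
            if d is not None:
                bits += d.collect()
        return bits

root = BitStream()
for _ in range(10**5):
    root.feed(1 if random.random() < 0.3 else 0)
print(len(root.collect()) / 10**5)       # should approach h(0.3) ~ 0.881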
Let us denote by f(q) the average number of bits output per input bit when the input consists of iid Ber(q) bits. Then
2f(q) = 2q(1-q) + (1-2q(1-q))f(q^2/(q^2+(1-q)^2)) + f(2q(1-q)).
You can check that the entropy function h(q) satisfies the same recurrence (using the chain rule), and so f(q) = h(q). In other words, the asymptotic output rate of this method is optimal. Analyzing the speed of convergence would require more work.
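A quick numerical check of that recurrence (my own sanity check, not part of the original comment):

from math import log2

def h(q):
    return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

def residual(q):
    # 2*h(q) minus the right-hand side of the recurrence, with f replaced by h.
    s = 2 * q * (1 - q)                        # probability that a != b
    q1 = q * q / (q * q + (1 - q) * (1 - q))   # bias of the bits sent to daughter1
    return 2 * h(q) - (s + (1 - s) * h(q1) + h(s))

print([round(residual(q), 12) for q in (0.1, 0.25, 0.4, 0.5)])   # all essentially zero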
The algorithm is quite easy to implement in python, so here are some empirical results:
1. When p = 1/2, feeding the bitstream 10^6 random bits resulted in 982635 output bits, or an average of 0.982635 output bits per input bit (compared to the entropy, which is 1).
2. When p = 0.4, the same experiment yielded an average of 0.955164 ob/ib, compared to h(p) ~ 0.971.
3. When p = 0.3, we get 0.865629, compared to 0.881.
4. When p = 0.2, we get 0.708546, compared to 0.722.
5. When p = 0.1, we get 0.461367, compared to 0.469.
In all of these results, we get upwards of 98% of the optimum.
Clarify please: I understand how you get the three streams (Output, Daughter1, Daughter2), but how do you put them together to get a uniform random sequence?
I coded my algorithm in python. It is available as a Gist.
(Unfortunately I was unable to figure out how to include it as part of a comment.)
As an example, the following code runs the generator with 20 iid bits distributed Bernoulli(0.4), generated from seed 1:
>>> test(.4, 20, 1)
[1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
The output is the 12 bits shown above.
Thanks- will look at your code and may email you later.
This all sounds nice in theory, but how does it apply to real physical random sources, like the Dice-O-Matic? Maybe some of the papers looked complicated, because they just tackled the real physical situation head-on, instead of hiding behind convenient idealizations?
How would you post process the output of the Dice-O-Matic? How would you describe what you can achieve (assuming that perfectly unbiased output is not possible)? Can you prove that achieving perfectly unbiased output is not possible, no matter how you post process the output of the Dice-O-Matic?
I don't know exactly what you're looking for, but here's one approach. Break your bitstream into blocks of (say) 7. Now, suppose you get 4 0's and 3 1's. There are binomial(7,4)=35 ways of doing this. Choose 3 of these ways to throw out; the other 32 give you 5 bits. Now, suppose you get 5 0's and 2 1's. There are binomial(7,2)=21 ways of doing this. Choose 5 of these to throw out and use the others to get 4 bits. If the probability of a head is roughly 1/2, you roughly get .517 bits/coin flip. This is only slightly better than your first method, but the reason is that you're using 35 possibilities to get 32 random bits, and throwing away the other 3 possibilities. You can use them.
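If it helps for the classroom, here is one way this block scheme might look in Python (the function names and the choice to keep the lexicographically first 2^a patterns are mine; any fixed choice of which patterns to throw out works):

from itertools import combinations
import random

def block_tables(n=7):
    # For each number of ones k, keep the first 2^floor(log2(C(n,k))) patterns
    # and map them to all bit strings of that length; the rest are discarded,
    # as in the comment above.
    tables = {}
    for k in range(n + 1):
        patterns = list(combinations(range(n), k))   # positions of the ones
        a = len(patterns).bit_length() - 1           # floor(log2(C(n,k)))
        mapping = {}
        for i in range(2 ** a):
            mapping[patterns[i]] = format(i, f'0{a}b') if a > 0 else ''
        tables[k] = mapping
    return tables

def extract(bits, n=7):
    tables = block_tables(n)
    out = ''
    for i in range(0, len(bits) - n + 1, n):
        block = bits[i:i + n]
        ones = tuple(j for j, b in enumerate(block) if b == 1)
        out += tables[len(ones)].get(ones, '')       # discarded patterns add nothing
    return out

flips = [random.randint(0, 1) for _ in range(7 * 10**5)]
print(len(extract(flips)) / len(flips))              # close to the .517 figure above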
THANKS!
We want to get truly random bits even if the original source is very biased (but independent). Does yours do that, or can it be modified to do that? I'm thinking no and yes.
Hi Bill,
Mine should give truly random bits even if the original source is very biased (but independent). Whether it gives more bits than your easy protocol is another question. It can certainly be modified to do that using ideas from arithmetic coding, but I don't know whether there's an intermediate-level way of doing this without introducing a large amount of information theory for background.
Anyway, in my protocol, if you get 4 0's and 3 1's, you add 5 bits. You add all possible sequences of 5 bits with equal probability, so each of these added bits has an equal probability of being a 0 or a 1. Similarly for any other numbers of 0's and 1's.
How do you use the other 3 possibilities? Let's say these are 1110000, 1101000, 1100100. If you get the first one, add the bit 0 to your string. If you get the second, add the bit 1 to your string. And if you get the third, do nothing. This still gives each bit an equal probability of being 0 or 1. This technique gives you .544 bits/coin flip. You can use a modification of this similar to arithmetic coding to get .628 bits/coin flip. And making the length of the blocks go to infinity should make the efficiency go to 1 bit/coin flip.
More generally, if m = 2^a_1 + ... + 2^a_k is the binary expansion of m, then you can partition {1,...,m} into parts of sizes 2^a_1, ..., 2^a_k, and extract a_i bits from the i'th part.
The expected number of bits you extract in this way (for a random point in {1,...,m}) is log m - O(1) (in fact, the best constant is log(7)-10/7).
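A small sketch of that partition step (my own illustration): given a uniformly random index in {0, ..., m-1}, walk down the binary expansion of m and output the offset within whichever part the index lands in. Applied per block, the index would be the rank of the observed pattern among all patterns with the same number of ones.

def extract_from_uniform(index, m):
    # m = 2^a_1 + ... + 2^a_k; the part of size 2^a yields a unbiased bits.
    start = 0
    for a in range(m.bit_length() - 1, -1, -1):
        if m & (1 << a):
            if index < start + (1 << a):
                offset = index - start
                return format(offset, f'0{a}b') if a > 0 else ''
            start += 1 << a
    raise ValueError("index out of range")

# m = 7 = 4 + 2 + 1: indices 0-3 give 2 bits, 4-5 give 1 bit, 6 gives nothing,
# so the expected output is 10/7 bits, i.e. log2(7) minus the constant mentioned above.
for i in range(7):
    print(i, repr(extract_from_uniform(i, 7)))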
If you toss a p-biased coin n times, then you see roughly np ones. The corresponding binomial coefficient m = binomial(n,np) satisfies log m = h(p) n - O(log n). More generally, if k ~ Bin(n,p), then E[log m] = h(p) n - o(n). Therefore this strategy reaches the entropy in expectation.
To get an asymptotic rate of h(p) with high probability, it suffices to partition the input into omega(1) parts of size omega(1).