Monday, November 22, 2021

Finding an element with nonadaptive questions

Suppose you have a non-empty subset S of {1,...,N} and want to find an element of S. You can ask arbitrary questions of the form "Does S contain an element of A?" for any subset A of {1,...,N}. How many questions do you need?

Of course you can use binary search, asking questions of the form "Is there a number greater than m in S?". This takes log N questions, and it's easy to show that's tight.
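As a minimal sketch of the adaptive approach (the set S and the oracle function are illustrative stand-ins), binary search with "greater than m" queries homes in on the maximum element of S:

```python
# Sketch: adaptive binary search for an element of a non-empty S ⊆ {1,...,N},
# using queries of the form "does S contain an element greater than m?".

def find_element(N, contains_greater):
    """contains_greater(m) answers: does S contain an element greater than m?"""
    lo, hi = 1, N  # invariant: S contains an element in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if contains_greater(mid):
            lo = mid + 1   # some element of S lies in (mid, hi]
        else:
            hi = mid       # every element of S in [lo, hi] is <= mid
    return lo              # after ~log N queries, lo is an element of S

S = {5, 12, 27}
N = 32
oracle = lambda m: any(x > m for x in S)
print(find_element(N, oracle))  # finds the maximum element of S, here 27
```

Each query depends on the previous answers, which is exactly what the nonadaptive setting below forbids.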

What if you have to ask all the questions ahead of time, before you get any of the answers? Now binary search won't work. If |S|=1 you can ask, for each i, "Is there a number in S whose ith bit is one?" That also takes log N questions.
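A minimal sketch of this singleton case (function names are illustrative): the log N queries are fixed up front, and the answers spell out the unique element in binary.

```python
# Sketch: when |S| = 1, log N nonadaptive queries suffice.
# Query i is the set of numbers in {1,...,N} whose ith bit is 1; the
# answer to query i is exactly the ith bit of the unique element of S.

import math

def nonadaptive_bit_queries(N):
    k = max(1, math.ceil(math.log2(N + 1)))  # bits needed for {1,...,N}
    return [{m for m in range(1, N + 1) if (m >> i) & 1} for i in range(k)]

def decode(answers):
    # answers[i] is True iff bit i of the unique element is 1
    return sum(1 << i for i, yes in enumerate(answers) if yes)

N = 100
S = {42}
queries = nonadaptive_bit_queries(N)           # all fixed in advance
answers = [bool(S & A) for A in queries]       # then answered in one round
print(decode(answers))  # 42
```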

For arbitrary S the situation is trickier. With randomness you still don't need too many questions. Mulmuley, Vazirani and Vazirani's isolating lemma works as follows: for each i <= log N, pick a random weight wi between 1 and 2 log N. For each element m in S, let the weight of m be the sum of the weights of the bits of m that are 1. With probability at least 1/2 there will be a unique m of minimum weight. There's a cool proof of an isolating lemma by Noam Ta-Shma.

Once you have this lemma, you can ask questions of the form "Given a list of wi's and a value v, is there an m in S of weight v whose jth bit is 1?" Since weights sum to at most O(log² N), choosing the wi's and v at random gives a 1/O(log² N) chance that there is a single m in S whose weight is v, and trying all j will then recover that m bit by bit.
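One round of this scheme might be sketched as follows (names are illustrative; the final membership check is a simulation shortcut for the verification that a real nonadaptive procedure would fold into its parallel repetitions):

```python
# Sketch: fix random weights w and a random target value v in advance, and
# ask, for every bit j, "is there an m in S of weight v whose jth bit is 1?",
# plus one query "is there any m in S of weight v?". If exactly one element
# of S has weight v, the bit answers reconstruct it.

import math, random

def weight(m, w):
    return sum(wi for i, wi in enumerate(w) if (m >> i) & 1)

def one_attempt(N, S):
    k = max(1, math.ceil(math.log2(N + 1)))
    w = [random.randint(1, 2 * k) for _ in range(k)]
    v = random.randint(1, 2 * k * k)  # weights lie in {1, ..., 2 log^2 N}
    # all queries below are fixed before any answer is seen
    if not any(weight(m, w) == v for m in S):
        return None
    answers = [any(weight(m, w) == v and (m >> j) & 1 for m in S)
               for j in range(k)]
    candidate = sum(1 << j for j, yes in enumerate(answers) if yes)
    # if several elements share weight v, the bit answers may OR together;
    # here we simply check membership directly to discard such attempts
    return candidate if candidate in S else None

random.seed(1)
N, S = 1024, {3, 77, 500, 501, 900}
attempts = 20000
hits = [r for r in (one_attempt(N, S) for _ in range(attempts)) if r is not None]
print(len(hits) / attempts)  # per-attempt success rate, roughly 1/O(log^2 N)
print(set(hits) <= S)        # every reconstructed element really is in S
```

Repeating O(log² N) independent attempts in parallel keeps the whole procedure nonadaptive while finding an element with constant probability.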

Randomness is required: the X-search problem described by Karp, Upfal and Wigderson shows that any deterministic nonadaptive procedure requires essentially N queries.

This all came up because Bill had some colleagues looking at similar problems in testing machines for errors.

I've been interested in the related question of finding satisfying assignments using nonadaptive NP queries. The results are similar to the above: in particular, you can find a satisfying assignment with high probability using a polynomial number of nonadaptive NP queries. It follows from the techniques above, and even from earlier papers, but I haven't been able to track down a reference for the first paper to do so.

1. The interplay resulting from what the questioner is allowed to ask can also be interesting. For example, if the goal is to guess a password, or the hash of a password, to enter a system, then powerful questions targeting a single bit or a small group of bits are not available. Interestingly, in that cryptographic setting the *expected* number of questions required by even the optimal sequence of atomic queries "is X = x?" to discover an unknown random quantity X is related to the Renyi entropy of order 1/2, instead of the Shannon entropy that applies when arbitrary queries can be made. See

https://crypto.stackexchange.com/questions/40286/entropy-requirements-for-encryption/40292#40292
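This connection can be illustrated numerically. A minimal sketch (the distribution is made up; the lower bound used is Arikan's guessing inequality E[G] >= 2^{H_{1/2}(X)} / (1 + ln M), not a claim from the linked answer):

```python
# Sketch: with only atomic guesses "is X = x?", the optimal strategy tries
# values in decreasing probability order. Arikan's inequality ties the
# expected number of guesses to the Renyi entropy of order 1/2,
# H_{1/2}(X) = 2 * log2(sum_x sqrt(p(x))).

import math

def expected_guesses(probs):
    # optimal atomic guessing: try the most likely values first
    ordered = sorted(probs, reverse=True)
    return sum((i + 1) * p for i, p in enumerate(ordered))

def renyi_half(probs):
    return 2 * math.log2(sum(math.sqrt(p) for p in probs))

probs = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125]  # made-up distribution
M = len(probs)
G = expected_guesses(probs)
lower = 2 ** renyi_half(probs) / (1 + math.log(M))  # Arikan's lower bound
print(G, lower)  # G respects the Renyi-entropy lower bound
```

With arbitrary subset queries the same X would fall to about H(X) Shannon-entropy many questions, which can be far smaller.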

2. For this specific question I think there is another way to isolate: consider {1,...,N} as the set of vectors in {0,1}^k (where k = log N), take uniformly random linear equations l1, l2, ... and ask "Does S contain a solution of l1,...,li?" for each i. With constant probability, if i is the smallest index for which you get "No", there will be a single member of S that solves l1,...,l(i-1).
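This Valiant-Vazirani-style idea can also be checked empirically; a minimal sketch (an illustration, not a proof) of the isolation step over GF(2):

```python
# Sketch of comment 2's isolation idea: view elements as vectors in {0,1}^k,
# draw random linear constraints <a_i, x> = b_i over GF(2), and look at the
# first i for which no element of S satisfies l_1, ..., l_i. With constant
# probability exactly one element of S survives l_1, ..., l_{i-1}.

import math, random

def satisfies(m, eq):
    a, b = eq
    return bin(m & a).count("1") % 2 == b  # <a, m> = b over GF(2)

def isolation_attempt(N, S):
    k = max(1, math.ceil(math.log2(N + 1)))
    survivors = set(S)
    for _ in range(k + 1):
        eq = (random.randrange(1 << k), random.randrange(2))
        next_survivors = {m for m in survivors if satisfies(m, eq)}
        if not next_survivors:           # the first "No" answer
            return len(survivors) == 1   # did we isolate a single element?
        survivors = next_survivors
    return len(survivors) == 1

random.seed(2)
N, S = 1024, {3, 77, 500, 501, 900}
trials = 4000
rate = sum(isolation_attempt(N, S) for _ in range(trials)) / trials
print(rate)  # success probability is a constant, empirically
```

The queries "Does S contain a solution of l1,...,li?" for all i can be asked in one nonadaptive round; only the interpretation of the answers picks out the critical index i.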