## Tuesday, December 13, 2011

### Solution to the reciprocals problem

In my last blog I asked you to look at this problem, try to solve it, and tell me if you think it is too hard for a HS competition. (I meant to post this on WED so I posted it at 12:06AM East Coast Time. I didn't know that the blog is on Chicago Time. So it got posted on Tuesday. Oh well.) Here is the problem:
Prove or disprove: there exist natural numbers x1,...,x10 such that
1. 2011=x1+... +x10 and
2. 1=1/x1+... +1/x10.
Several solutions and some other points of interest about this problem are here. The answer is YES and here are the solutions that I know of -- both my solution and the ones emailed to me. (The explanation of how they were obtained are at the paper pointed to above.)
1. My Solution: 2,4,5,80,80,80,160,320,640,640. This used known theorems.
2. Sam Solution: 2,4,5,40,120,160,300,300,480,600. Sam is a HS senior who is very good at these contests. It took him 30 minutes,
3. David Eppstein Solution: 3,4,7,16,16,16,20,43,80,1806. This used a bit of advanced knowledge and some hand computations that were probably above what you want for a UMCP Math Competition.
4. Matt Howell Solution: 2,4,5,50,100,100,250,500,500,500 This solution could have been found by a HS student (it is similar to Sam's solution.) Matt has a BS in Math and Engineering and did the problem in 20 minutes.
5. An anonymous commenter send me 6,6,10,10,12,15,15,15,62,1860. This solution could have been found by a HS student (it is similar to Sam's solution.)
6. Another one from same anon: 6,8,8,8,12,15,16,16,62,1860 This solution could have been found by a HS student (it is similar to Sam's solution.)
7. Mike Roman Solution: 5,5,6,8,8,12,15,32,960,960
So what to make of this?
1. Much to my surprise, enough people got it right and in ways that a HS student could have gotten it. And one did- Sam took the real exam.
2. The systematic solution that I got required very little hand calculation. All of the others, which includes the one I think a HS student could have or did get, required quite a bit of hand calculation. That may be a reason to not ask it.
3. I wonder how many solutions there are. This could be figured out by a Dynamic Program but there may be issues with large ints. (See later in this blog.)
4. I wonder if there are any that have distinct numbers. This can also be figured out by a Dynamic Program, though there may be an easier way.
5. I wonder about the computational complexity of the following problems:
1. Problem 1 Given a,b in N, does there exist x1,...,xa such that x1+...+xa=b and 1/x1+...+1/xa=1. (Can also ask with the stipulation that the x's are distinct.)
2. Problem 2 Given a,b in N and r in Q, does there exist x1,...,xa such that x1+...+xa=b and 1/x1+...+1/xa=r. (Can also ask with the stipulation that the x's are distinct.)

Possible Dynamic Program (more likely you would do it as a recurrence but save all answers found and check to see if you have already computed it, to avoid recomputing.) Let f(a,b,c,r) (where a,b,c are naturals and r is rational) be
num of sols to x1+...+xa=b and 1/x1+...+1/xa=r. where c ≤ x1 ≤ ... ≤ xa.
Note that
1. f(1,b,c,r) = 1 if r=1/b and r ≥ c
2. f(a,b,c,r) = sum as x in {c,c+1,..., min(b,floor(a/r)) } of f(a-1,b-x,x,r-1/x)
(NOTE- Use f(a-1,b-x,x+1,r-1/x) if you want to enforce distinct solutions.)

A math point: Ronald Graham showed that, for all n ≥ 78, n can be written as a sum of natural numbers whose reciprocals sum to 1. The lower bound of 78 is tight: 77 cannot be. A sketch of a proof of this is in the file pointed to. (The proof I give is inspired by one of the comments on the blog. YEAH BLOG COMMENTERS!) (ADDED LATER- Ronald Graham actually proved that for all n &ge 78 n can be written as the sum of DISTINCT natural numbers such that... . The proof I presented in the pointed to document just gives nat numbers, not necc distinct ones.)

1. There are a lot of solutions with all terms distinct. Here are just the ones a quick program I hacked together (no hand calculation this time!) found with lcm(solution)=960:

1/2 + 1/3 + 1/16 + 1/24 + 1/30 + 1/80 + 1/96 + 1/320 + 1/480 + 1/960
1/2 + 1/4 + 1/8 + 1/15 + 1/30 + 1/80 + 1/192 + 1/240 + 1/480 + 1/960
1/3 + 1/4 + 1/5 + 1/8 + 1/15 + 1/96 + 1/120 + 1/320 + 1/480 + 1/960

2. 2, 3, 16, 24, 40, 72, 90, 180, 288, 432, 864
2, 4, 9, 16, 24, 72, 120, 180, 288, 432, 864
2, 3, 16, 24, 50, 60, 72, 200, 288, 432, 864

3. As an anon who send you solution, I expected my full comment would be unblocked on wednesday; though half of my comment is in the solution, the other half, that shows a method that can be used to quickly solve such problem for other numbers and gives solution for years 2008-2013, is missing (as the end of the comment is present, I suppose the whole comment did get to you; btw I am the same anon who suggested 2x+8 2x+9 etc reductions, O(log(n)) terms are enough to represent n for n sufficiently large, 2297 and 2009 examples, 7 representation of 2011 - which you somewhat dismiss as "different problem" as it were not clear that it wasn't the solution to what was asked.)

I fail to see how your method is systematic - it uses reductions like 2x+8, but since you posed the problem, you had number of terms at your choice, so it is easier to do it whan you can choose the number of terms - for instance, when I tried to do reductions, I obtained 21 (and then said we were stuck), which can be 5 represented (21=3+6+6+6+6) and gives 13 or so representation of 2011. It's almost like posing an instance of NP complete problem starting from the solution, and than claiming that it was found systematically.

Also, Eppstein's use of "advanced math" consists of using Sylvester sequence, but one can try to start with any of the many similar sequences and tinker with it (like the one that sums to 1957 or some other I used - there is nothing special in Sylvester sequence except that is fastest growing and that was pointed out by another commenter; its hardly advanced math plus it won't help for some other nearby "years"). Eppstein pretty much tinkers with it in the similar way I did, and this tinkering has some method in it (sadly, part in which I illustrate how this can be quickly done is missing/witheld).

The HS solution is ingenious and uses a somewhat different idea, so I disagree with your assessment of the similarity/difference in the methods used.

As for my "method" (which is closest to the one used by Eppstein, but more general, as it does not need to start with Sylvester sequence), it consists in the following:
* Find a short "base" sequence of reciprocials summing to one, that adds to something less than 2000. This does not need to be Sylvester sequence, but can be constructed in similar, albeit more general way - one gets it by extending a known sequence of reciprocials summing to one, say a1..ak, by replacing ak by (c+a_k), (c+ak)a_k/c, where c is divisor of a_k (this was used in my method to get what I call "base" sequences, several of which I use, though it was not explained in my blocked comment)
*starting from the "base" sequence, we then see how much we miss to get to our target number, and how many extra numbers have to be used. Then we combine two of the smaller terms in the "base sequence" and replace them with two short sequences of reciprocials adding to 1 - say if we have 3 and 7 as members of "base" sequence, and 4 extra numbers to spend, we look for 3 representations of numbers a and b, which give sum of 3a+7b instead of 3+7 (or 4 and 2 representations) - we know that 3 representable numbers are 9, 10, 11, only 2 representable is 4, and 4 representable include 17, 18, 29 etc - and tinker with it, using congruences. If this can't be done, use different "base" sequence. To show that this method is systematic, I produced solutions for 2008-2013 in the time I was typing i.e relatively quickly (though it took some time to think up the method, as I had fun with this problem for a few hours total during which I also commented).

Perhaps I am wrong and you have a better systematic way, using theorems as you say, to get solutions. How would your method then produce a 10 representation for 2013 or 2018, for instance? Can you do it quickly, as you say your method "required very little hand calculation"?

4. I have two questions to your solutions paper.

1. Could you elaborate how you found parts (a), (b), and (c) in your "coolness-lemma"?

2. How does "If all are divisible by 20 ..." in Sam Zbarsky's solution come about? (Does it follow from the preceding assumption or is it just a guess?)

5. A brute force solution, using only numbers with a maximum prime factor of 11, found plenty of solutions with distinct numbers. The first one was 2,3,7,70,324,360,405,840.

6. Anon who debates if my method was more systematic'
GOOD POINT. What I really meant was that my way
(and ironically even more so with YOUR 2x+8, 2x+9 method)
can be used to show that EVERY sufficiently large number
can be written as a sum whose recip-sum is 1, while
the method you used I did not think could be used to prove that general theorem.

Can your method be used to prove the general theorem?

Anon who wants to know about how I got (a), (b), (c).
I had the notion of coolness and looked for a way
to get large numbers as quick as possible.
The observation that 1/2 + 1/4 + 1/5 = 19/20
was key for the 20x+11 one.

Anon who wants to know about Sam Z's divide by 20'
trick- realize that it did not have to work.
He GUESSED (common on math competitions) that if
the remaining 7 numbers were div by 20 then there
would be a solution, and it reduced the search
space so much that he was able to find a solution.
On a diff problem he would have been wrong- there might
not have been such a solution.

7. Bill,

I've written some code, and while I haven't tackled 2011 with it yet, I have tackled 77. Here are my solutions for this "unsolvable" problem:

[2,9,12,12,12,12,18]
[2,10,10,10,15,15,15]
[3,4,4,11,22,33]
[3,4,4,12,18,36]
[3,4,5,5,60]
[3,4,6,16,16,16,16]
[3,4,8,8,18,18,18]
[3,5,5,10,18,18,18]
[3,5,5,12,12,20,20]
[3,6,6,6,14,21,21]
[3,6,6,6,16,16,24]
[3,10,10,10,10,10,12,12]
[4,4,4,15,15,15,20]
[4,4,5,8,12,20,24]
[4,4,5,10,12,12,30]
[4,5,5,6,12,15,30]
[4,6,6,6,6,21,28]
[4,6,6,10,12,12,12,15]
[4,6,8,8,9,12,12,18]
[4,6,8,9,9,9,16,16]
[4,6,9,9,9,10,10,20]
[5,5,5,5,9,18,30]
[5,5,6,6,6,14,35]
[5,5,6,9,10,12,12,18]
[5,6,6,6,12,12,15,15]
[5,6,6,8,8,12,12,20]
[6,6,6,7,7,12,12,21]
[6,6,6,8,9,9,9,24]

8. Stuart- After seeing your post I re-looked at Ronald Graham's paper- he says all n \ge 78 BUT he insists
on DISTINCT numbers.

Sorry for the confusion, and good luck with 2011!
(thats 2011 EXCITEMENT, not 2011 factorial.

)

9. Looking at my solutions, I'd guessed that :-).

10. 23 is the smallest when allowing repeated numbers it seems.

I did not see solutions posted yet with very many numbers. Here's one with 19 numbers: 4,6,8,9,12,18,24,27,32,36,48,81,96,128,162,216,288,384,432

11. I dont know if my method, used for representing 2008-2013 etc, can be used to generate the general theorem - the reduction method is more suited for that, however the reduction method is not of much use when one has to find solutions with fixed number of terms - the method I suggested has more flexibility here.

Using reductions 20x+11, 2x+8 and 4x+6 can be used for 2011, but they do not work for 2013, 2015, 2017 etc, nor they seem to be enough to get the general theorem - none of the reductions can be applied. The HS student method also essentially uses reduction to 20x+11 but represents 100 then by hand, but for 2013 it would not work.

How would you use your method to get 2013 in 10 steps? With the reductions you mention, you can get rid of power 2^k quickly, but to change from odd to even number you need more reductions (also note that to get different numbers, reduction 2x+9 is useless - so one cannot prove the Graham theorem with using reductions mentioned in the Claim - but some other will work). Also, once we get to the small numbers, we might get stuck, but with more reductions (those presumably used in proof of Graham thm) one might get this to always work, albeit with number of elements used left undetermined.

On a more constructive note, the question how to find all solutions perhaps can be semi-efficiently answered on a computer (at least for representing numbers of order several thousands), by generating all sequences of given length that have sum of reciprocials equal to 1.

So my question is the following: given k, how can one generate all k-typles of (not necessarily different) natural numbers whose reciprocials sum to 1? What is asymptotic of f(k) - number of such k-typles, also what is asymptotic of number of such k-typles that have distinct numbers? Let f(n,k) be number of k representations of n, what is asymptotic of f(n,k) - with non distinct numbers, and also with distinct numbers (it appears that distinct k-typles are far fewer, but how much so)? Are these things known in the literature? Seems to be a good introductory research project in say analytic number theory.

12. http://oeis.org/A051882

13. The number of solutions seems to be exponential in n, and based on http://oeis.org/A051908, there are probably around 10^40 solutions for 2011.

14. There cant be 10^40 solutions for 2011, obviously, 10s natural numbers smaller than 2011 is 2011^10 which is certainly smaller than 10^40.

15. My program reports 215067 solutions. It does math on fractions, and solves the n=2 base case exactly.