Thursday, July 29, 2010

What is the complexity of these problems and metaproblems?

The following problem is from Doctor Eco's Cyberpuzzles. I have shortened and generalized it.
We are going to put numbers into boxes. If x,y,z are in a box then it CANNOT be the case that x+y=z. If you put the numbers 1,2,...,n into boxes, what is the smallest number of boxes you will need?

(CLARIFICATION ADDED LATER: I mean that x,y,z are ANY three numbers in a box. For example, 10,11, and 21 CANNOT all be in the same box. One of the comments thought they could by letting z=10, x=11, y=21. This is NOT what I intended.)

You can use a simple greedy algorithm to get some partition, but it might not be optimal. The following set is not NPC (by Mahaney's theorem) but also seems to not be in P: I suspect that the following set is not NPC and not in P:

{ (n,k) : there exists a way to partition {1,...,n} into at most k boxes so that no box has x,y,z with x+y=z }

There are many variants.

{ (n,k) : there exists a way to partition {1,...,n} into at most k boxes so that no box has x,y,z with x+y=2z }

This one I know is thought to be hard- it is asking for the min number of colors so that you can color {1,...,n} and not have any Monochromatic arithmetic sequences of length 3. This is an inverse van der Warden numbers; hence I am sure that it is not in P, and that there is no proof of this, and its not NPC.

Let E be a linear equation like x+y-z=0. We represent E by its coefficients., so E would be (1,1,-1). The statement E(a,b,c)=0 means a+b-c=0. Fix E on a fixed number of variables, say a. How hard is

{ (n,k) : there exists a way to partition {1,...,n} into at most k boxes so that no box has x1, ..., xa with E(x1,...,xa)=0 }

If we do not insist the set being partitioned was {1,..,n} we get more problems:

{ (a1,...,an,k,E) : there exists a way to partition {a1,...,an} into at most k boxes so that no box has x1,...,xm with E(x1,...,xm)=0 } I suspect this is NPC.

One can always use the Greedy Algorithm on the above problem, though it may not give the optimal answer. Consider the following meta problems: (``the problem'' can either refer to the version where we are partitioning {1,...,n} or a given set. Hence below there are eight problems, not four.)
  1. { E : For the problem with E, Greedy gives optimal }. This is in co-NP.
  2. { (E,a) : For the problem with E, Greedy gives within a*optimal }. This is in co-NP.
  3. { E : the problem with E is in P }
  4. { E : the problem with E is NPC }
For the last two I don't even know if they are decidable.

Tuesday, July 27, 2010

This Post is Quite Different from any you've ever read!!

I recently a letter from WETA (public TV) which I quote from:
This letter is quite different from any we've ever sent to you. For years we wrote to you about WETA's great programs and the need they filled in your life. Today, I must write to you about WETA's needs. And if friends like you don't respond to them, there will be far less programs to enjoy.
There is something wrong with this letter: I have gotten the exact same letter from them for about 4 years now. Hence the statement This letter is quite different from any we've send to you is not just false but verifiably false.

While I expect letters to exaggerate I do not expect to have easily verifiable lies that do not even help their cause. So why did they do this? I do not know. But whatever the reason, it is sheer incompetency. Hence I will not give to them. This raises the following question:

If a charity (or whatever Public TV is) asks for money they can exaggerate how much they need it. Should they?
  1. Some readers will say GOSH, they really need the money! I better give!
  2. Some readers will say They always need money. I am not going to bother.
We also have here a societal problem. Since many (legitimate) charities exaggerate about how dire their situation is or how serious their problem is, after a while it all gets tuned out. We have here a  Prisoners Dilemma problem- each one thinks (correctly?) that if they exaggerate their problems they will get more money. But if they all do it the public gets cynical and gives less money overall. (NOTE- I do not know this to be true, but I am curious. If anyone does know then let me know.)

How can they get out of this trap? I do not know. However, the least they can do is to not say things that are obviously false. There may be a well defined Game Theory or EC problem here. Or it may be a public policy problem. We won't know until its solved.

UPDATE IN May 2016: I keep getting these letters and got one today.

UPDATE IN July 2017: Got another letter today

UPDATE IN Sept 2017: Got another letter today

UPDATE IN March 2018: Got two in the month.

(I got some in between March 2018 and now but now I report it)

UPDATE IN JUNE 2022: Got one this month. 

Monday, July 26, 2010

A Seventh Mil. Problem

Richard Lipton had a wonderful post asking for a seventh Millennium Prize now that Poincare's conjecture has been solved. I posted a suggestion on his blog but got no comments. I'll expand on it and see if I get any comments here.

HISTORY: The original proof of VDW's theorem , in 1927, yields INSANE (not primitive recursive ) bounds on the VDW numbers. (Shelah (1988) later got primitive recursive bounds and Gowers (2001) got bounds you can actually write down!) Inspired by VDW's proof Erdos and Turan (1936), made two conjectures:
  1. If A is a subset of N of positive upper density then A has arbitrarily long arithmetic sequences. Proven by Szemeredi in 1975 (see here for more.)
  2. If &Sigmax ∈ A 1/x diverges then A has arbitrarily long arithmetic sequences. (This conjecture implies the first one.)
A proof of either of these yields a proof of VDW theorem. The hope was that it would lead to a proof with better bounds. Szemeredi's proof of the first conjecture did not yield better bounds; however, Gowers proved the first conjecture a different way that did yield better bounds on the VDW numbers.

The second conjecture is still worthwhile since it may yield even better bounds and because it is interesting in its own right. So, I propose the second conjecture of Erdos-Turan as the 7th Millennium problem. (It might need a snazzier name. The Divergence Conjecture? The k-AP Conjecture? Suggestions are welcome!)
  1. Greene and Tao have already shown that the primes have arbitrarily large arithmetic progressions.
  2. The work that has gone into Szemeredi's theorem and the Greene-Tao theorem spanned many areas of mathematics. Hence this is not just an isolated problem.
  3. The problem has been open since 1936. Hence it is a hard problem.
  4. Will more connections to other parts of math be made? Is the problem too hard? A NO answer to both of these would make it not that good a problem.
  5. The converse to the conjecture is not true. Note the following set:

    A = &cupk&isin N {2^k + i : 0 &le i < k }

    The set A has arbitrarily long arithmetic sequences but If &Sigmax ∈ A 1/x converges.
  6. Is there a plausible condition that characterizes the sets that have arbitrarily long arithmetic sequences?
  7. There is already (I think) a 3000 dollar bounty on the second conjecture. So the Clay Math Institute will have to just give 997,000 dollars.

Friday, July 23, 2010

CRA Snowbird Part II

Considerable discussion about funding at CRA Snowbird. Ken Gabriel, Deputy Director of DARPA, talked about how DARPA is restructuring its programs to become more university-friendly. They've made great progress though there are still some sticky issues of project-oriented proposals and security clearances. On a related note DARPA recently announced a new Crypto Program that may be of interest to the theory community.

Peter Harsha, the CRA director of public affairs, talked about NSF funding and how the renewal of the COMPETES act almost got derailed over pornography. NSF and CISE in particular did well in the administration's budget request but there is some uncertainty as we head into the fall elections.

The best part of the snowbird meeting is networking, talking to a number of CS leaders especially at the meals and breaks. The last session was small group meetings with current deans on how to deal with our own deans. Even though I'm not a chair I do find myself dealing with my dean and his staff quite often and we were able to get some good advice on quite a range of specific issues. Our group got lucky in matching up with  Dan Huttenlocher, Dean of Computing and Information Science at Cornell, and Martha Pollack, former Dean of the School of Information at Michigan. The best general advice: have a good working relationship with your dean and don't just ask or complain but really make the case on how the particular resource you need will benefit your school.

I'll be mostly on vacation and off the net for the next couple of weeks. Be nice to Bill while I'm gone.

Thursday, July 22, 2010

CRA Snowbird Part I

I just returned home from my first trip to the CRA Snowbird Conference, the biennial meeting of CS chairs and other leaders in the CS community. I really enjoyed the short meeting and saw many old friends who are now chairs, some people I've only known by email and many I've talked to for the first time. There are a few theorists who have become chairs and deans, and, in the case of Jeff Vitter, provost at Kansas. Unlike theory conferences where I am usually one of the old people, most CS chairs are just about my age.

I, as I had to remind most people I met, am not a chair. I attended as a speaker on the Peer Review in Computing Research panel giving my usual spiel on how the current publication culture hurts the community-building aspects of conferences. Jeannette Wing made a great argument of how our deadline-driven research and conservative program committees may lead our field to lose its "vibrancy, excitement and relevance". 

A number of people talked about the big projects they work on which make me almost rethink having gone into theory. Almost. The need for better algorithms shows up in many of these talks. Yoky Matsuoka from U. Washington talked about the artificial hand her group developed that has the full range of motions of a human hand but they lack the algorithms to have the hand do even some simple natural tasks. Illah Nourbakhsh from CMU talked about building electric cars and his ideas of using a supercapacitor as an energy cache for batteries so the batteries become smaller and cheaper but hits challenging cache optimizing issues. The group is running a contest, best algorithm wins an electric car. 

Sally Fincher from the University of Kent gave a surprisingly strong talk Why Can't Teaching Be More Like Research? We get judged by our research based on how we compare to the global community but teaching is much more local and Sally talked about the implications of this distinction.

Most disappointing was the discussion on the NRC Rankings. Charlotte Kuh, who served on the NRC committee putting together the "soon" to be released report, said it will not give a specific ranking of each department but rather a range, like University of Southern North Dakota is ranked between 8th and 36th. And not just one range but five ranges based on different weights of the various criteria. And you can create other rankings based on your own choice of weights. All based on 2005-6 data. And they used citation data from the ISI which doesn't include most CS conferences. The CRA board talked them out of that but now the CS data and rankings will use no citation information at all. But even outside of CS, with multiple ranking ranges and old data, the NRC report will be of little value.

Wednesday, July 21, 2010

Do you want to review a book

I will be sending my next book review column for SIGACT NEWS off on July 28, 2010. It has LOTS of books on its BOOKS I WANT REVIEWED list. YOU get a chance to look at the book list and request one to review before the list goes out (if you do this then I will modify the list).

Here is a link to the list of books I want reviewed HERE. If you see a book you want then email me at the name of the book and your postal address. We will then work out details over email- when the review is due, and whether me or the publisher sends it to you. I would like you to email me before July 28 so I can take those books off the list, but if you email after that date and the book you want is still available, that will be fine.

Before volunteering you should read my advice for reviewers.

Here is a LaTeX Template.

Monday, July 19, 2010

Factors for getting a job- Arbitrary, random, and complex

The Job Market in Theory (likely in all of academia) has more randomness and arbitrariness (are those the same?) then people may realize. Especially young PhD's who have never been to a faculty meeting where these things are discussed. The process is quite complex (is that the same as random and arbitrary?). I am NOT saying that merit plays no role. I am saying that its very hard to make a clean statement like Person X got a job an school Y because of Z.

With that in mind, I list out some criteria I have heard played a role in a hiring decision. My point is NOT to help you game the system (note that some of the factors I've heard cut both ways) nor to argue that the system is good, bad, or ugly. My point is only that the process is more arbitrary-random-complex then you might think. While its not all merit, its also not all who-you-know or politics.

I invite you to give leave comments on factors you have heard of, but to keep it civil please do not mention the people or schools involved.
  1. Merit: This is itself ambiguous. More papers? Longer papers? Co-authors? How about take a sum where each summand is (number of citations)*(importance of paper)*(number of pages)/(number of coauthors). And then there is grant potential.
  2. Does a postdoc have a better chance then a fresh PhD? A postdoc has had more time to increase his weighted sum mentioned under Merit, but the school KNOWS he has had more time.
  3. Two-body-problems can be GOOD or BAD. Often a school does not have two positions.
  4. Being a women can be GOOD or BAD (for getting a job).
  5. I would like to say Being black can be GOOD or BAD (for getting a job) however since there are so few black PhD's in computer scientists applying for academic jobs, I have not heard any stories about this. (NOTE- I use the term black instead of African American since they need not be American.)
  6. Being socially inept can be GOOD or BAD. How could it be good? It plays to the stereotype. He's so socially inept, he must be a genius. I DO NOT recommend pretending you are socially inept.
  7. Speaking your mind can be GOOD or BAD. He'll be a leader in the community or He'll be a pain in the ass
  8. Having a Blog- the jury is still out on this one. I have heard of a case where having a blog helped someone get an interview. (It was a serious blog about the field he was in. I can't imagine that complexity blog would help me get a job if I was looking.)
  9. Spending too much time deciding what to have for lunch can be a negative. If he can't figure out what to have for lunch then how can he formulate a coherent research plan.
  10. Area can be a factor- does a school need or want someone doing area X? This is sometimes formalized, for example the school has money targeted to hire someone who does, a particular area. (Side topic- Targeted positions- Good or Bad? Good in that you are not going to have to compare people in diff areas. Bad in that there may be someone really good in another area who you want to hire but can't.)
  11. Subarea is a factor- for example, we want the kind of theorist who can talk to our people in systems.
  12. Having a champion in the dept who is helping push your case HELPS unless the person doing the pushing is a jerk or not good at pushing a case.
  13. Being an ex-convict probably hurts. If there was a theory genius who served 10 years in jail, that might be a negative. Might be mitigated if he could pull in some serious grant money. However, if the jail term was for embezzling grant money, then maybe not. Actually, the whole question might depend on what he was in jail for and is he now reformed.
  14. A very big factor is how good is the job market when you get out. This may be a much bigger factor than anything else on this list.

Friday, July 16, 2010

How I Find Homework Problems

What do you get out of this paragraph (from Charlie Stross' The Atrocity Archives via Daniel Lemire)
The [Turing] theorem is a hack on discrete number theory that simultaneously disproves the Church-Turing hypothesis (wave if you understood that) and worse, permits NP-complete problems to be converted into P-complete ones. This has several consequences, starting with screwing over most cryptography algorithms—translation: all your bank account are belong to us—and ending with the ability to computationally generate a Dho-Nha geometry curve in real time.
I get a new homework question.
Show that NP-complete = P-complete if and only if NP = L.
Wave if you solve it.

Thursday, July 15, 2010

Sparse problems in NP thought to not be in P

(This post is similar to this old post. I am posting this anyway since when I first posted I made fundamental mistake. I fixed it the point I was trying to make get lost.)

About once a semester I get asked
What are the natural problems that are in NP, not known to be in P, and not known to be NPC? What is known about them?
And I give the standard answers:
  1. Graph Isomorphism. If its NPC then PH collapses so it is though to NOT be NPC. There is no real consensus about its status with respect to P.
  2. Factoring. If its NPC then NP=coNP so it is thought to NOT be NPC. Most people think that it is NOT in P since people really want to solve this one and have not been able to. This is not a rigorous argument. Factoring is in QP. If quantum computers are ever practical then we may need to rethink how we think about these things.
  3. Discrete Log. Actually, are there reasons to think this is NOT NPC?
The answer above are standard and I suspect most of my readers know them. However, there is a large source of problems that are in NP, likely not NPC, likely not in P, that people don't seem to talk about much: SPARSE PROBLEMS that are thought to be hard. The ones I know about are from Ramsey Theory. I give one example but there are many like it:

{(1n,1m,c) : the n×m grid can be c-colored and not have any mono squares}

  1. This is clearly in NP since the coloring itself is the witness.
  2. Since this is a sparse set it is not NPC unless P=NP. (Actually it can be coded as a tally set.)
  3. I want to say People think this is hard. You may ask Name Three of Them! Indeed, the number of people who work on Computational Ramsey Theory is fairly small. However, Ramsey Theory itself has many people working on it and nobody has anything close to a result indicating that this problem is in P. I personally think that it is hard.
  4. For a fixed c there is a finite number of grids G1,...,Gp such that the n×m grid is c-colorable iff none of G1,...,Gp can fit inside the n×m grid. Hence, for fixed c, the problem is in P. (So the problem is Fixed Parameter Tractable.)
  5. Everything said above holds if you replace squares with rectangles or other configurations in the above.
  1. Are there other sparse problems in NP that we think are NOT in P? (that is, ones that DO NOT come from Ramsey Theory).
  2. Should we be teaching these in our complexity classes as examples of possible intermediary problems? The proof that they are likely not NPC is easy. However, to argue that these problems are likely not in P is hard. Also, these problems may appear unnatural to some students. (They may appear unnatural to some of my readers.)
  3. It is easy to argue that Factoring is probably not in P since you can point to the fact that people REALLY want to solve it quickly and so far have not. This is not a rigorous argument but I do count it as evidence. Is there a similar argument for GI? Is there a similar argument for my Ramsey Problems?
  4. Is there a mathematical reason to think that Factoring, GI, or my Ramsey Problems are not in P?

Wednesday, July 14, 2010

GCT Workshop (Guest Post by Josh Grochow)

Last week, the Center for Computational Intractability hosted a Geometric Complexity Theory Workshop. Geometric complexity theory (GCT) is an approach to P vs NP and related problems proposed by Ketan Mulmuley and Milind Sohoni, in which they were naturally led to using techniques in algebraic geometry and representation theory. This workshop was the most recent attempt to explicate GCT to both mathematicians and computer scientists, and discuss recent related results. Without looking at the attendance list, my impression was that the workshop was approximately 3/8 complexity theorists and 3/4 classical mathematicians (meaning that about 1/8 fell into both camps), and was about 1/3 students and 2/3 postdocs/professors. This turned out to be a pretty good mix.

The first two days were scheduled to be all Ketan all the time. Despite the fact that Ketan is an excellent presenter, talking for two days straight and keeping the audience interested would be a tough task for anyone. Not everyone made it to the end, but a significant fraction of the audience did. Ketan's lectures were interspersed with a lot of discussion and debate, including a couple short impromptu guest lectures from audience members on some background topics. Pointed questions came from both classical mathematicians and complexity theorists, and as often as not were answered by audience members from the other field. It was refreshing to see the humility of the giants of multiple fields: great classical mathematicians asked complexity theory questions that a complexity undergrad could answer, and vice versa. I think this is one of the few times I have ever heard of serious classical mathematicians mingling with complexity theorists, and I think it worked quite well. This type of interaction and more seems necessary for the GCT program, and can only be a Good Thing for both fields.

And now for two technical items from the workshop.

Avi Wigderson and others (sorry to the others, but Avi sticks out most in my memory, and I didn't learn everyone's names!) repeatedly suggested applying the techniques of GCT to easier problems than P vs NP and see if they can be pushed through to fruition for easier lower bounds. At the moment, GCT reduces P vs NP to certain hard conjectures in algebraic geometry and representation theory. The corresponding conjectures arising in easier lower bound problems would hopefully be easier, maybe even easy enough to solve in our lifetimes! Peter Burgisser has been looking at matrix multiplication using these techniques (an idea originally suggested by his adviser, Volker Strassen, long before GCT came along), and his student Christian Ikenmeyer gave a presentation on their results as one of the research talks on the third day.

Christian's talk was probably the most salient for complexity theorists not intimately familiar with GCT, so I'll mention a bit more about it to finish off the post. GCT introduces the idea of a "geometric obstruction" (=some type of representation-theoretic object) to "there are circuits of size nc for SAT up to length n" (NP vs P/poly). If a geometric obstruction exists for all (or infinitely many) input lengths n, then P is not equal to NP. The conjectures arising in GCT imply that such obstructions exist. But even in smaller problems, such as matrix multiplication, no one had ever seen such geometric obstructions! Peter Burgisser and Christian Ikenmeyer did computer calculations and found geometric obstructions to the border-rank of 2x2 matrix multiplication. (The border-rank of 2x2 matrix multiplication roughly captures how many multiplications are needed to approximate 2x2 matrix multiplication, in a specific sense.) The geometric obstructions they found show that the border rank is at least 6 (it is known to be 7, by a significant algebro-geometric result of Joseph Landsberg). Although this only proved a weaker result than what was already known, for a comparatively easy lower bound (compared to P vs NP), this is an important proof-of-concept for the GCT approach of using geometric obstructions to prove complexity-theoretic lower bounds. This work also raises the intriguing possibility (also suggested in the GCT series of papers) that one might be able to "verify P neq NP up to length n" for larger and larger n, the same way people verify the Goldbach Conjecture or Riemann Hypothesis up to a certain numbers of integers/zeroes. Unfortunately, doing this in the GCT setting even up to n=10 seems to take too much time (whereas Goldbach and Riemann have been verified for millions of integers/zeroes).

Finally, I should mention even in the case of 2x2 matrix multiplication, the conjectures that arise seem almost as difficult as the ones arising in the P vs NP problem. Avi and others suggested we look at even easier problems.

UPDATE 7/20: Slides of the research talks are now available here, and videos will be added soon.

Monday, July 12, 2010

Can you ever be denied Full Prof? Can you ever really fail a PhD defense.?

  1. Is it possible for someone to be denied Full Prof? Yes, but it is rare.
  2. Is it possible for someone to fail a PhD defense? Yes, but it is rare.
These questions are similar. Why might either happen?
  1. I suspect that the most common reason for someone to be denied Full Prof is that they went up without the Chairman's or the Departments approval. That is, they insisted on going up. I really cannot see a reason for a professor to do this except some sort of bad logic which I will discuss later.
  2. I suspect that the most common reason for someone to fail a PhD defense is that they insisted on defending even though the advisor didn't think they were ready. While the advisor should have stopped it there may be other reasons (e.g., a job offer to the student has been made so he really needs the PhD NOW, or some deadline is coming up) that came into play. Could also be bad logic which I will discuss later.
  3. Could also be because of politics. I've heard of this happening but never really saw a case I could verify myself. One problem: Everyone who is ever denied Tenure or Full Prof says that it was political. Hence its hard to tell when it really is.
  1. A prof might think Everyone who I've ever seen go up for Full Prof has gotten it, so all I need to do is go up for it.
  2. A student may think Everyone who I've ever seen defend their thesis has passed so all I need to do is get a defense scheduled. I have seen the following: nobody fails a defense for several years because the adviser don't put you up until you defend. Then the mentality sets in that all you need to do is defend and you'll pass. Then someone pushes this and ends up failing. THEN people are scared for the next few years, until its forgotten. That is why someone fails a PhD defense every 6 years or so.

WHY DENY SOMEONE FULL PROF? If you deny someone TENURE they LEAVE. Hence something is accomplished. But if you deny someone Full Prof they are still there, just annoyed. Hence it seems like a really bad idea to deny someone Full Prof.

WHY DO YOU WANT TO BE A FULL PROF? Salary increase? My school is now not giving raises. You get to write letters for more people. Is this really a good thing? You get to be on more committees. Is this really a good thing? You get more respect? This is questionable. People stop asking So, are you a full prof yet? This is a good thing.

HOW BAD IS IT: In both cases you can re-do. That is, you can come up for full prof later and you can fix your thesis and defend later. Even so, it can be a trauma.

FYI: I passed my PhD defense first try in May of 1985. I got Assistant prof in 1985, Associate Prof (Tenure) in 1991, and Full Prof in 1998.

Thursday, July 08, 2010

Conflicts of Interest

a conflict-of-interest? Some thoughts.

Thought One

PROF: I can't vote on Professor X's Full Prof case since I have a conflict.

CHAIRMAN: (There are not that many Full Profs around so he is concerned about having a quorum.) Really? Whats your conflict?

PROF: My wife works as an F.R.A (Faculty Research Assistant) for Prof X.

CHAIRMAN: Is that really a conflict?

PROF: (Surprised) Uh--- I really think it is.

CHAIRMAN: Which way would it bias you?

PROF: (Even more surprised) I don't think that matters.

The odd thing is that it really is hard to say which direction it would bias PROF in. Does the wife like her job? Does she know that the prof is really good at what he does? Really bad at what he does? It could go in any direction.

Thought Two

I have reviewed two books that my advisor wrote in my SIGACT NEWS column. For his excellent books BLOWN TO BITS I have at the beginning of the review:
Disclaimer: Harry Lewis, one of the authors, was my PhD advisor.
That seems fair. However, the reader may wonder which direction the bias goes. In my case I thought Harry was an excellent advisor (Disclaimer: I also know that he reads this blog). But what if I thought he was a terrible advisor? I wonder if the reader has a right to know which direction the conflict goes in. Perhaps disclaimers should be of the following form.
Disclaimer: Harry Lewis, one of the authors, was my PhD advisor; however, I believe this review is unbiased.
Disclaimer: Harry Lewis, one of the authors, was my PhD advisor; He was an AWESOME advisor. Hence this review may be positively biased.
Disclaimer: Harry Lewis, one of the authors, was my PhD advisor; He was a TERRIBLE advisor. Hence this review may be negatively biased.
(The advantage of the last one is that if it is a positive review then the book must be really really good.)

Thought Three

Lets say I was having the NSF theory director over to my house for dinner on his birthday. Since he may give me a grant someday, should I charge him for it? How about a compromise- he pays for the dinner but I pay for desert. I think the rule may be that I can spend less than ten dollars over the course of the year. (Richard- I hope you enjoyed your birthday desert, since that's it for the year.)

Wednesday, July 07, 2010

Drowning in Data

At a CCC Council meeting last week, a common theme emerged. The Army is "swimming with sensors and drowning in data". Biologists are considering trashing some of the DNA sequences they have already acquired because it will likely be cheaper to resequence in the future maintain the current data. We've been collecting tons of information, what should we do with it and if we don't have the tools yet to deal with all this data, should we bother keep it?

This reminds me of one of my favorite homework problems in complexity: Given two log-space computable functions f and g (the read-only input tape and write-only output tape don't count for space), show that the composition f(g(x)) is also log-space computable. The direct approach doesn't work for you don't have the space to store g(x).

The solution is to recompute the bits of g(x) as you need them. The lesson: When storage is expensive, it is cheaper to recompute what you've already computed. And that's the world we now live in: Storage is pretty cheap but data acquisition and computation are even cheaper.

So who says complexity has nothing to say about the real world?

Friday, July 02, 2010

Theory Happenings

FOCS accepts, abstracts and PDFs. The conference itself will be held October 23-26 in Las Vegas.

There is a proposed TCS version of Math Overflow, a Q&A site to let the crowd help your research. The site needs your commitments or it won't happen. More on the Geomblog.

I plan a post about NSF goings on after some expected announcements of new programs and personnel but just a reminder that the CAREER deadline is fast approaching, July 20 for CS.

Thursday, July 01, 2010

Is this solution cheating?

Consider the following problem:

A hole is drilled through the center of a sphere. The cylinder-with-caps is removed. The length of the removed cylinder (it also has caps on it which do not count for the length) is 6 inches. What is the volume of the remaining solid?

There are two ways to do this problem.
  1. Here is the solution using calculus:
  2. Here is a solution which you may consider cheating. The very asking of the question implies that the answer can be determined from the data given. Hence we can CHOOSE an instance of the problem and KNOW that our solution for this instance is always the solution. We choose to have a cylinder of radius 0 (so its just is a line of length 6). Hence the answer is the Volume of a Sphere that is 6 inches in diameter: (4/3)(π)33=36π.
There are two ways this may be considered is cheating.
  1. Minor one: Deriving the volume of a sphere itself requires calculus so I didn't really get around that issue. However, the Volume of a sphere is well known so I think this is a quibble. (Does anyone know a non-calculus proof for the formula for the the volume of a sphere?)
  2. Major one: We used the fact that the answer can be determined from the data to find the answer. Is this appropriate?
How would you grade this if given as an answer on an exam? Here are some thoughts:
  1. If you put this on an exam what would you do if a student had this solution? Reward them for thinking outside the box or penalize them for not showing they know calculus?
  2. What if it was on a mathematics competition?
  3. Best solution might be to make it a multiple choice question so they do not need to show how they did it. Those that think of the clever solution are rewarded by spending less time on it--- unless it took them a long time to think of the clever solution. Those that do it via calculus also get it right. You might want to make one of the choices Cannot be determined from the data given.