Wednesday, April 24, 2024

Is Persistence an Anachronism?

Guest post by Martin Bullinger

Very recently, Vijay Vazirani's paper "A Theory of Alternating Paths and Blossoms, from the Perspective of Minimum Length" was accepted to Mathematics of Operations Research. For the first time, it gives a complete and correct proof that the Micali-Vazirani algorithm finds a maximum cardinality matching in time \(\mathcal O\left(m\sqrt{n}\right)\). I would like to give an account of the extraordinary story of this proof and how Vazirani's contribution inspires persistence.

My fascination with matching began during my undergraduate studies, when I gave a talk on Edmonds' blossom algorithm. It was at this time that I first heard about the Micali-Vazirani (MV) algorithm. Naturally, I was quite excited when, years later, I got to know Vazirani personally. When I talked to him about the MV algorithm, however, I was shocked: Vazirani admitted that, even then, no complete proof of its correctness existed. How can a theoretical result be accepted to FOCS without a proof?

Now, 44 years after the publication of the algorithm, a proof exists and has been peer-reviewed in great depth. But why did it take so long? Apparently, some results just need time. Sometimes a lot of time. Think of Fermat's Last Theorem, whose proof took 358 years! So what is the story behind the MV algorithm? It can without a doubt be seen as a life's work. Together with his fellow PhD student Silvio Micali, Vazirani discovered it in 1979-80, during the first year of his PhD. It was published in the proceedings of FOCS 1980 without even an attempted proof. Vazirani's first proof attempt was published in 1994 in Combinatorica. Unfortunately, this proof turned out to be flawed. It took another 30 years until his current paper.

What kept Vazirani going for so long? In the acknowledgements of his paper, he thanks matching theory for its gloriously elegant structure. Vazirani was driven by his passion for the subject matter, but passion by itself can only go so far. Even more important was his belief in the correctness of the algorithm and the theory, which he had broadly outlined in his 1994 paper. As in Andrew Wiles' story, it was perseverance that led him to the idea which clinched the proof. In Vazirani's case, this was the new algorithmic idea of double depth-first search, which forms the core of the MV algorithm and now of its proof as well. But Vazirani's result is also the story of an excellent research environment. Finding deep results requires colleagues or friends to discuss ideas with. Vazirani had these in the form of strong postdocs and PhD students. About ten years ago, he discussed ideas towards his proof with his former postdoc Ruta Mehta, and in the last three years, he discussed the final touches of the proof with his current PhD student Rohith Gangam. Needless to say, both of them gained a lot from these discussions.
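
To give a flavor of the augmenting-path paradigm that all of this builds on, here is a minimal illustrative sketch in Python of the textbook bipartite special case, where a single ordinary depth-first search per free vertex suffices. It is emphatically not the MV algorithm, which works on general graphs, has to handle blossoms, and uses double depth-first search to find a maximal set of disjoint minimum-length augmenting paths in each phase; the names below are purely for illustration.

    # Illustrative sketch only: the textbook augmenting-path algorithm for the
    # bipartite special case (often credited to Kuhn), running in O(n*m) overall.
    # The MV algorithm is far more involved: general graphs, blossoms, and
    # double DFS to find minimum-length augmenting paths in phases.

    def max_bipartite_matching(adj, n_left, n_right):
        """adj[u] lists the right-side neighbors of left vertex u."""
        match_right = [-1] * n_right  # partner of each right vertex, or -1 if free

        def try_augment(u, visited):
            # DFS for an augmenting path starting at the free left vertex u.
            for v in adj[u]:
                if v in visited:
                    continue
                visited.add(v)
                # Either v is free, or its current partner can be re-matched
                # elsewhere; flipping the alternating path gains one edge.
                if match_right[v] == -1 or try_augment(match_right[v], visited):
                    match_right[v] = u
                    return True
            return False

        matching_size = 0
        for u in range(n_left):
            if try_augment(u, set()):
                matching_size += 1
        return matching_size, match_right

    # Tiny example: a 3-by-3 bipartite graph with a perfect matching.
    adj = [[0, 1], [0], [1, 2]]
    size, match_right = max_bipartite_matching(adj, 3, 3)
    print(size, match_right)  # 3 [1, 0, 2]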

So why should we care about the MV algorithm? I have several reasons. First, without doubt, it is a historic result within combinatorial optimization. Matching is one of the most fundamental objects in discrete mathematics, and we keep finding new applications for it, for example in healthcare, labor markets, and modern-day matching markets on the Internet; it shows up in basically every part of our lives. But there is more. Once again, one can look at Vazirani's paper, where he describes the impact of matching on the development of the theory of algorithms: matching theory has led to foundational concepts like the definition of the complexity classes \(\mathcal P\) (Edmonds, 1965a) and \(\# \mathcal P\) (Valiant, 1979), the primal-dual paradigm (Kuhn, 1955), and polyhedral combinatorics (Edmonds, 1965b). The impact of matching on complexity theory was an earlier topic of this blog.

Despite being around for decades, the MV algorithm is still the fastest known algorithm for computing a maximum cardinality matching. This is surprising, to put it mildly. As with many other fundamental problems in combinatorial optimization, I would have expected better algorithms to have been discovered in the last four decades. Why has this not happened? Vazirani appears to have gotten to the essence of the problem: a profound theory that interleaves algorithmic invariants and graph-theoretic concepts. It seems to be the kind of theory that will play an active role in the field of combinatorial optimization.

However, Vazirani's result proves something else, possibly even more important: the massive gains to be made by single-minded persistence. In a world in which departments and promotion procedures focus on publishing large numbers of papers, it seems impossible to work on one result for more than a year, let alone for decades. Vazirani managed to do both: pursue his passion and get the unfinished job done, without letting it get in the way of the rest of his otherwise active research career. As a young researcher, I find this inspiring! In the end, it is through such persistence that science will take big steps forward.

This blog post evolved from many enjoyable discussions I had with Vijay Vazirani during a research stay at UC Irvine in spring 2024. I am grateful to Ruta Mehta for feedback on the initial version of this post. Vazirani recently presented his paper in a mini-series of two talks, available online.

17 comments:

  1. "the MV algorithm is still the fastest known algorithm for computing a maximum cardinality matching"

    The statement above is wrong as written, since matrix-multiplication-based algorithms are faster than the Micali-Vazirani algorithm for maximum matching.

    Replies
    1. That result is mentioned on page 6 of the paper.

    2. I’m not saying the paper is wrong — I’m quoting the blog post, and pointing out that a statement it makes is wrong.

    3. I don't think the two algorithms are comparable in terms of asymptotic complexity: MV's $O(m\sqrt{n})$ bound is faster for sparse graphs, while the $O(n^{\omega})$ matrix-multiplication bound is faster for dense graphs.

    4. That's a good point---it still means that the statement "the MV algorithm is still the fastest known algorithm for computing a maximum cardinality matching" is wrong as written.

  2. Seems to be a good candidate for Coq or similar proof assistants.

  3. The story of the Micali-Vazirani algorithm really bothers me, and I'm sure that's true for many others too. Indeed it is well-known that there were gaps in earlier proofs. Fine, that happens. But I cannot find any published statement about this, except in other papers in the interim that all purport to fill in the gaps. Is this how we want to set the norms of our field? Is it ok to publish yet another proof of correctness without retracting and explaining the issues in all previous attempts? I'm disappointed that this blog, which is already well past its prime, doesn't discuss these kinds of issues, which are definitely relevant to our field.

    Replies
    1. Agree that a result known to be incorrect/incomplete should be retracted (if unfixable). However, the new paper does a nice job examining/explaining the mistakes/gaps in the original attempt.

    2. I distinctly remember a talk by Vazirani roughly ten years ago where he started with, "Everybody knows my FOCS 1980 paper is wrong!" There are many things disturbing and deeply wrong about this statement and the context in which he made it that turned me off of TCS research. Mainly:
      - The statement is false: not everybody could know, as there was no full version of that paper available (to even try to check), and there was no erratum or withdrawal of the paper.
      - The speaker was unashamed of naming the prestigious conference in which the extended abstract was published, which surely boosted his career, despite the errors, to the detriment of others.

    3. "Everybody" is you and your friends.

      In the age of the Internet, you need to publish and publicize the incorrectness of your claim as soon as you notice it. That is what academic honesty requires.

    4. When people say FOCS/STOC are biased in what they accept, this is a good example. If you are well-known, or connected to someone well-known, you will get a pass that others won't get.

      And this in turn leads to bias in credentials that are used to decide who gets what position and what resources.

      It might have been ok when the field was young and small, essentially a small clique of people who all knew each other, but we are way past that time.

    5. The editors of Combinatorica should retract Professor Vazirani’s 1994 proof of correctness of his algorithm, citing the many errors discussed in the new MOR paper.

  4. There was a proof or an idea-for-a-proof-in-the-minds-of-MV when it was first published. How does the final proof relate to the initial attempt? Entirely new proof? Based on it but needed to fill in a few details (I doubt that, since it would not have taken over 40 years)? I suspect somewhere in between, but I would be curious if Vazirani or Bullinger can tell us.

    Rather than complain about the situation, I ask some questions:
    a) How common is it for a result in a prestige conference to be WRONG? (Note that the MV result was not WRONG.)

    b) How common is it for a result in a prestige conference to have a proof or proof sketch that is NOT EVEN CLOSE to the final proof and needs new ideas, though the original result is still true?

    c) How common is it for the proof sketch to be missing some detail that experts could fill in but non-experts could not? I would guess this is common.

    For important papers the problems will probably be discovered. But for unimportant papers that nobody reads, an error can linger for decades. That can be a problem if some unimportant paper becomes important as a lemma to a later result.

    Replies
    1. Is it okay to say a result is "not WRONG" even though its proof as written is wrong, just because later correct proofs are found?

      For example, many incorrect P not equal to NP proofs are posted online nowadays. In the future if someone presents a correct proof that P does not equal NP, is it now justified to say that all those initial posts were "not WRONG"?

  5. Regarding alternatives to the MV algorithm, please see these two papers:

    "The weighted matching approach to maximum cardinality matching"
    H.N. Gabow, Fundamenta Informaticae (Elegant Structures in Computation:
    To Andrzej Ehrenfeucht on His 85th Birthday), 154, 1-4, 2017, pp 109-130. This algorithm
    finds a maximum cardinality matching in time $O((\sqrt n) m)$.

    "Maximum cardinality $f$-matching in Time $O(n^{2/3}m)$",
    H.N. Gabow, arXiv:2311.14236, 24 pages. Currently submitted for publication.
    The cited bound is for simple graphs with an arbitrary degree constraint function $f$. The algorithm also runs in time $O(\min(\sqrt{f(V)}, n)\, m)$ for multigraphs with arbitrary $f$, where $f(V) = \sum_v f(v)$.
    When $f$ is identically 1, this is again the MV bound.

    These algorithms don’t use double depth-first search, just ordinary depth-first search.

    Replies
    1. Please also note that the above 2017 paper is a short and self-contained presentation of an $O(\sqrt{n}\, m)$ matching algorithm.

  6. The paper cites Wikipedia!
