Thursday, January 08, 2009

Sorting a Partial Order

I liked the SODA 2009 paper Sorting and Selection in Posets since it makes you look at ordinary sorting in a new way.

What does it mean to sort a list? We usually think (correctly) that we take a list of elements from an ordered set and put them into an ordered list. How do we extend this to partial orders? First we need to take a fresh look at total orders.

Say that making a comparison between elements of the ordered set is HARD. Then you want to make as few comparisons as possible. But when you are done, you want a data structure that makes comparisons easy. Hence we view sorting as follows:
Given a set A of n elements from a totally ordered set, come up with a data structure of size O(n) such that the operation Given x,y ∈ A, which one is bigger? can be answered in O(1) time. While setting up the data structure you would like to do this with as few comparisons as possible. We will assume that comparing two elements of {1,...,n} is easy.
Given a sorted array you can easily obtain such a data structure: pair each element with its index in the array. To compare x,y quickly, just compare index(x) and index(y). Note that we think of comparing numbers in {1,...,n} as easy but comparing elements of the ordered set as hard.
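
A minimal sketch of this idea in Python (the element names are illustrative):

    # Build the O(n)-size structure once; the hard comparisons are all
    # spent up front producing the sorted order (e.g., by mergesort).
    def build_index(sorted_elements):
        """Map each element to its rank in the total order; O(n) space."""
        return {x: i for i, x in enumerate(sorted_elements)}

    def is_less(index, x, y):
        """Answer 'is x < y?' with one cheap integer comparison; O(1) time."""
        return index[x] < index[y]

    # Hypothetical usage: suppose a < c < d < q under the (hard) order.
    index = build_index(['a', 'c', 'd', 'q'])
    print(is_less(index, 'a', 'd'))   # True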

The papers Sorting and Recognition Problems for Ordered Sets (Faigle and Turan, SICOMP 1988) and Sorting and Selection in Posets (Daskalakis, Karp, Mossel, Riesenfeld, Verbin, SODA 2009) study this issue.
Definition: Let P=(X,<) be a partial order.
  1. A chain decomposition of P is a partition of X into sets (called chains), each of which is totally ordered under <.
  2. A chain merge data structure for P is a chain decomposition C_1,...,C_q together with, for each x ∈ X: (1) the index i such that x ∈ C_i, and (2) for each j, the largest element of C_j that is < x (if any). It is easy to see that from such a data structure you can do comparisons in O(1); see the sketch after this list.
  3. The width of P is the minimum number of chains in a chain decomposition. (By Dilworth's theorem, this equals the maximum size of an antichain.)
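
To make the O(1) claim concrete, here is a minimal sketch of the query, assuming the structure has already been built (the dictionary names below are mine, not the paper's):

    # chain_of[x] : index i of the chain C_i containing x
    # pos_of[x]   : position of x within its own chain
    # below[x][j] : position of the largest element of C_j that is < x,
    #               or -1 if no element of C_j lies below x

    def less_than(chain_of, pos_of, below, x, y):
        """True iff x < y in the poset: x < y exactly when x sits at or
        below the largest element of x's own chain that lies below y."""
        return pos_of[x] <= below[y][chain_of[x]]

    def incomparable(chain_of, pos_of, below, x, y):
        """True iff x and y are distinct and unordered."""
        return (x != y
                and not less_than(chain_of, pos_of, below, x, y)
                and not less_than(chain_of, pos_of, below, y, x))
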
RESULTS:
  1. Faigle and Turan showed that if P has width w then it can be sorted in O(wn log n) comparisons. (See also the Daskalakis et al. paper for this; a sketch of the binary-search primitive behind this bound follows the list.)
  2. Daskalakis et al. showed that if P has width w then it can be sorted in O(n(w + log n)) comparisons. They use the chain merge data structure. This bound matches the information-theoretic minimum.
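
The primitive behind these bounds: within a single chain, the elements below a given x form a prefix of the chain, so the largest chain element below x can be found by binary search with O(log n) hard comparisons. Here is a hedged sketch (the oracle less stands for the hard comparison and returns False on incomparable pairs); the papers' actual algorithms organize these searches more carefully:

    def largest_below(chain, x, less):
        """Largest index k with chain[k] < x, or -1 if there is none.
        Correct because {c in chain : c < x} is a prefix of the chain,
        so the predicate less(chain[k], x) is monotone in k.
        Costs O(log n) calls to the hard comparison less."""
        lo, hi = -1, len(chain) - 1
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if less(chain[mid], x):
                lo = mid
            else:
                hi = mid - 1
        return lo

Running such a search against each of at most w chains, for each of the n elements, accounts for the O(wn log n) bound; the O(n(w + log n)) bound needs a more careful strategy for choosing comparisons (see the paper).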

5 comments:

  1. Interestingly, in the Daskalakis et al. paper, the optimal algorithm that takes O(n(w + log n)) comparisons needs a long (but still polynomial) _running time_ (with a potentially large exponent) to decide which comparisons to make. It's still open to get an optimal algorithm that also takes O(n(w + log n)) _time_. I don't even have any idea how to do it if you assume correctness of the 1/3-2/3 conjecture.

    I wonder if an algorithm that works like quicksort works. I.e., choose a random element, compare all elements to it, get all the information you can and put it in a "knowledge base" (i.e., a directed graph), and keep doing this. Each time, if you picked x and the knowledge that x < y or x > y or that they're incomparable is already in your database, then you don't compare x to y. Furthermore, it might be helpful to perform these comparisons in a random order, and some of them may be spared because you might get information in the process.

    Note that the rules for updating the database should include the following two rules (and maybe only them and their symmetric counterparts). First of all, if you already knew that x < y and you now find out that y < z, then you add the knowledge that x < z. Second of all, if you already know that z < x < y and that both z and y are incomparable to w, then you conclude that x is incomparable to w. I wonder how well this does...
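
    A sketch of those two update rules, assuming known relations are stored as a set lt of pairs (a, b) with a < b and a set inc of unordered incomparable pairs (this representation is mine, chosen only for illustration):

        def add_less(lt, a, b):
            """Rule 1: record a < b, then close lt under transitivity."""
            lt.add((a, b))
            changed = True
            while changed:
                changed = False
                for (x, y) in list(lt):
                    for (u, v) in list(lt):
                        if y == u and (x, v) not in lt:
                            lt.add((x, v))
                            changed = True

        def close_incomparable(lt, inc, elements):
            """Rule 2: if z < x < y and both z and y are incomparable
            to w, then x must be incomparable to w as well."""
            added = True
            while added:
                added = False
                for (z, x) in list(lt):
                    for (x2, y) in list(lt):
                        if x2 != x:
                            continue
                        for w in elements:
                            if (w != x
                                    and frozenset({z, w}) in inc
                                    and frozenset({y, w}) in inc
                                    and frozenset({x, w}) not in inc):
                                inc.add(frozenset({x, w}))
                                added = True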

  2. &lt stands for the less-than sign. Oops.

  3. So a quick newbie question. I understand that you can obtain a minimum(?) chain decomposition of a partial order in linear time. Can you please point out some references to such algorithms? Thank you very much.

  4. Hi, I have 1 comment and 1 question.

    Comment to Anonymous' question: computing a minimal chain decomposition of a poset takes more than quadratic time, as asserted here:

    http://www.inf.unibz.it/~franconi/papers/appl-int-94.ps.gz

    The same paper provides a heuristic algorithm.

    Question: it is closely related to (but different from) the problem of sorting a poset:

    If you have a poset (P,<) and an element x not in P, what is the complexity of determining the cover of x, i.e., the set of minimal elements in P that dominate x (a.k.a. the parents of x)?

    If P is decomposed into w chains (the ChainMerge data structure in the SODA 2009 paper), then O(w log n + w^2) clearly suffices (sketched below):
    - O(w log n) to perform a binary search on each of the w chains to find the parent of x (if any) in each chain; this yields a superset of the cover of x.
    - O(w^2) to test that such parents are indeed incomparable (if p1 and p2 are both returned from step 1 and p1 < p2, then p2 is not a parent of x).
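
    A sketch of this two-step procedure (the helper names and the comparison oracle less, which returns False on incomparable pairs, are illustrative):

        def smallest_above(chain, x, less):
            """Index of the smallest chain element > x, or len(chain)
            if none. Correct because {c in chain : x < c} is a suffix
            of the chain."""
            lo, hi = 0, len(chain)
            while lo < hi:
                mid = (lo + hi) // 2
                if less(x, chain[mid]):
                    hi = mid
                else:
                    lo = mid + 1
            return lo

        def cover_of(chains, x, less):
            # Step 1: one binary search per chain, O(w log n) hard
            # comparisons; yields a superset of the cover of x.
            candidates = []
            for chain in chains:
                k = smallest_above(chain, x, less)
                if k < len(chain):
                    candidates.append(chain[k])
            # Step 2: O(w^2) comparisons; keep only minimal candidates.
            return [p for p in candidates
                    if not any(q is not p and less(q, p) for q in candidates)]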

    But the above is quadratic in w :-(
    Any reference?

  5. I found this post after reading the "Poset Sorting Paper". So I hope somebody is reading this comment after almost 3 years :)

    pciaccia's statement that "computing a minimal chain decomposition of a poset takes more than quadratic time" is based on a sentence found in the paper he links to.

    Anyway, we have to notice carefully that what Daskalakis et al. do is "keep" a chain decomposition of q <= w chains. In other words, for each element considered in the loop at step 4, maintaining the decomposition C is O(w log n), as it requires at most w binary searches to find the right chain.

    Am I wrong?
