Monday, August 23, 2010

Is Scheduling a Solved Problem? (Guest Post)

(Guest Post by Ben Fulton.)

"At first glance, scheduling does not seem like a topic that requires much attention from computer scientists".

This was how I wanted to start my review of a book on scheduling, but Bill Gasarch called me out on it.  "Really?" he said.  "I don't know any computer scientists who think scheduling isn't important."  It's true but computer scientists aren't the ones taking first glances at scheduling problems.

They're curriculum designers trying to determine which instructors are needed to teach all of the classes for the fall semester.  Shop managers wondering how the widgets could be sent through the assembly line more efficiently.  Kids on the playground choosing up sides for a soccer game (a problem identical to partitioning a set of jobs with known running times between two processors).   These are the people with scheduling problems.  They'll think about their problem for a few minutes, and come up with a good solution in each case.

The curriculum designer - possibly Michael Mitzenmacher  - might decide that the most important goal is to keep all instructors at around the same number of teaching hours.   In that case, each next available assignment would be given to the least busy instructor.   In doing so, he'll choose the List scheduling algorithm first proposed by Graham in 1966 and known to be no worse than twice as slow as an optimal schedule.

The kids will choose Greedy.  They'll choose a couple of captains and alternate picking the "best remaining" player, in the same way that a scheduler would choose the "largest remaining" job.   Greedy is a heuristic that runs in polynomial time.  It's not guaranteed to find the best solution it's The Easiest Hard Problem but the kids are interested in having competitive teams, not making sure that all possible sets of children can be perfectly divided in polynomial time.

The shop manager is likely to have a lot of different constraints to take into account, but she might notice that a station can stop working on one widget and start on another one, if the second is likely to be finished more quickly.  She probably won't realize it, but the Shortest Remaining Processing Time algorithm is known to optimize the average time to completion of the widgets.

In all three cases, the algorithms they choose are simple, easy to describe, and should work fairly well in their situations.  The schedulers aren't computer scientists - just people with problems to solve.   They'll take a first glance, and they'll solve them with a minimal amount of effort.  Even if you showed them a way to solve their problems that was twice as efficient, but also much more difficult to understand, they'd probably reject it.

So what's the point of studying scheduling?  The practical problems are solved. 

That's the first glance.  You've got to dig a little deeper to find the interesting problems in scheduling.  For example, the complexity of simply describing scheduling problems is a subject that hasn't fully been explored yet.  Problems are typically broken down three ways: the number of processors available; the constraints on when jobs can be run; and the criteria for determining whether one schedule is better than another.  Even if the first and third items are fairly simple, setting up a job precedence graph, lag times, and perhaps a few rules involving a specific processor needing to run a specific job, will likely generate a description so complex that an engineer trying to solve the problem might not even recognize it.

Scheduling gurus Anne Benoit, Loris Marchal, and Yves Robert also ask this question.  In response, they outline some areas of study involving distributed-memory parallel computing that could give rise to some interesting practical improvements. 

That's where you go when you're past the first glance.   And that's why computer scientists need to pay attention to scheduling.



  1. One characteristic of all the problems you describe is that they are relatively small: in all your examples a single person can handle all the input. But what about larger scheduling problems: think FedEx, airlines, major online retailers, Google. I think they all have scheduling problems where, if someone could show them an algorithm that gives a 2% better solution they'd be very happy, because that 2% translates into X millions of dollars per year.

  2. I'd say that when you introduce elements to it like uncertainty, the problems become more difficult to solve. Look no further than the Washington Redskins and their attempt to fill a roster spot. The greedy algorithm says to choose the 53 most talented players, but suppose a few of those top 53 are injury prone (ala Donovan Mcnabb), then how does that injury affect you choosing that player or not?

  3. Josh: No doubt, unless they're using an algorithm that's already been shown to be optimal. In which case they'd have to do some more work to make their model more precise.

    Thought: Scheduling doesn't really involve itself with the evaluation routine. It's a bit like calculating possible moves in chess - you can find all the possible moves, but whether one board is better than another is still a bit subjective.