## Tuesday, December 22, 2009

### How to tell how good a TV show is

(This is my last blog of the year. Lance will interrupt his blog sabbatical to do an END OF THE YEAR blog later.)

The TV show MONK recently finished its 8th and final season. My wife and I are big fans and have seasons 1-7 on DVD (and we will get 8). But this post is not about Monk. It's about the question: How do you determine how good a TV show is? Whatever I say here may well apply to other problems.

First assign to each episode a number between 1 and 10 depending on how much you liked it. (This could be the hardest part of the method.) Let t be a parameter to be picked later; t stands for threshold. If your criterion is "How likely is it that an episode is OUTSTANDING?" then you would pick t large, perhaps 9. If your criterion is "How likely is it that an episode DOES NOT SUCK?" then you would pick t small, perhaps 2. Some of the methods use t, some do not.

There are many different ways to do this. We give a few of them:
1. The mean or median of all of the episodes.
2. The probability that a randomly chosen episode is rated above t. (Could also get into the probability that it is within one standard deviation of t.)
3. The probability that a randomly chosen disc has an episode rated above t.
4. The probability that a randomly chosen disc has fraction f of its episodes rated above t.
5. Rate each disc in each season's DVD set as a whole. Take the mean or median of all of these ratings.
6. The mean or median of the best season.
7. The mean or median of the worst season.
There are others as well. But the question really is, given a set of numbers grouped in a natural way (in this case roughly 8 sets of 16 numbers, and each set of 16 in groups of 4) how do you judge the quality?
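A minimal sketch of methods 1, 2, 3, 4, 6 and 7 in Python, under stated assumptions: the `ratings` data is made up for illustration (two seasons of two discs each, not real Monk scores), all function names are hypothetical, "above t" is read as strictly greater than t, and method 4 is read as "at least a fraction f".

```python
from statistics import mean, median

# Hypothetical ratings (not from the post): ratings[season][disc] is a list
# of per-episode scores on the 1-10 scale.
ratings = [
    [[8, 7, 9, 6], [5, 9, 7, 8]],    # season 1
    [[6, 4, 7, 10], [9, 9, 3, 8]],   # season 2
]

def all_episodes(ratings):
    """Flatten the season/disc grouping into one list of episode scores."""
    return [score for season in ratings for disc in season for score in disc]

def all_discs(ratings):
    """Flatten the season grouping into one list of discs."""
    return [disc for season in ratings for disc in season]

def prob_episode_above(ratings, t):
    """Method 2: fraction of episodes rated strictly above t."""
    eps = all_episodes(ratings)
    return sum(score > t for score in eps) / len(eps)

def prob_disc_has_hit(ratings, t):
    """Method 3: fraction of discs with at least one episode above t."""
    discs = all_discs(ratings)
    return sum(any(score > t for score in disc) for disc in discs) / len(discs)

def prob_disc_fraction_above(ratings, t, f):
    """Method 4 (assumed reading): fraction of discs on which at least
    a fraction f of the episodes is rated above t."""
    discs = all_discs(ratings)
    hits = sum(sum(s > t for s in disc) / len(disc) >= f for disc in discs)
    return hits / len(discs)

def season_means(ratings):
    """Per-season means; max() and min() of these give methods 6 and 7."""
    return [mean([s for disc in season for s in disc]) for season in ratings]

print(mean(all_episodes(ratings)), median(all_episodes(ratings)))  # method 1
print(prob_episode_above(ratings, 8))
print(prob_disc_has_hit(ratings, 9))
print(prob_disc_fraction_above(ratings, 8, 0.5))
print(max(season_means(ratings)), min(season_means(ratings)))
```

Keeping the nesting season → disc → episode in the data is what lets the same ratings answer the per-episode, per-disc and per-season questions without re-entering anything.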
For those who are fans of the show MONK here are my choices for OUTSTANDING and UNWATCHABLE episodes: here

1. as for the post, fair enough. but everything aside, interupt lacks an "r".
the question now turns into, how do we conclude how good a post is ? Do we take typos into account ?

2. I know we do it all the time in theory PCs, but is there any actual meaning to taking averages when given input in the form of human rankings on a scale of 1 to 10? Is there any sense in which this scale is linear? Or should we interpret it purely as an ordering, in which case the only thing that makes sense is the median?
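A small illustration of the concern in this comment, with made-up ratings and a made-up rescaling (both are assumptions for the example): if the 1-10 scale is only an ordering, then any strictly increasing relabeling of the scores is equally valid. Such a relabeling can flip which show has the higher mean, while the comparison of medians is unchanged.

```python
from statistics import mean, median

# Hypothetical ratings for two shows on the 1-10 scale.
show_a = [3, 3, 10]
show_b = [5, 5, 5]

def rescale(x):
    """A strictly increasing relabeling (an assumption for illustration):
    scores above 5 are compressed, so 10 maps to 6."""
    return x if x <= 5 else 5 + (x - 5) / 5

a2 = [rescale(x) for x in show_a]
b2 = [rescale(x) for x in show_b]

# On the original scale show A has the higher mean; after relabeling, show B does.
print(mean(show_a) > mean(show_b), mean(a2) > mean(b2))
# The median comparison is the same before and after.
print(median(show_a) < median(show_b), median(a2) < median(b2))
```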

3. These methods wouldn't work well for shows like The Wire where individual episodes are subsumed by the overarching narrative. I propose instead that you pick threads that run through all the episodes and evaluate them. e.g.,

What is the probability that a randomly chosen character is compelling? You may want to assume you sample according to screen time or importance to the story.

What is the probability that a randomly chosen actor will deliver a good performance? Again, you probably want to sample according to screen time.

What is the probability that a randomly chosen writer will deliver a good script?

What is the probability that story elements brought up in earlier episodes will pay off in later episodes?

What is the probability that the story will have a satisfying conclusion?

etc. Then you can define value preferences: how much do I value good acting versus a good ending; how much do I value good characters versus a good story; and so on.
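As a rough sketch of the "sample according to screen time" idea in this comment: the probability that a character drawn in proportion to screen time is compelling is the compelling characters' share of total screen time. The character list, the minutes, and the compelling/not-compelling judgments below are all invented for illustration.

```python
# Hypothetical data: (name, screen-time minutes, judged compelling?)
characters = [
    ("A", 40, True),
    ("B", 30, False),
    ("C", 20, True),
    ("D", 10, False),
]

def prob_compelling_uniform(chars):
    """Each character is equally likely to be sampled."""
    return sum(ok for _, _, ok in chars) / len(chars)

def prob_compelling_weighted(chars):
    """Characters are sampled in proportion to their screen time."""
    total = sum(minutes for _, minutes, _ in chars)
    return sum(minutes for _, minutes, ok in chars if ok) / total

print(prob_compelling_uniform(characters))
print(prob_compelling_weighted(characters))
```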

4. Off topic -- I'd like to say thanks to Bill and Lance, and to all the commenters, for an informative, entertaining year on the blog. Happy holidays to everyone.

5. lance, don't let us down (12:19 PM, December 23, 2009)

Lance, I am disappointed by you making hype about wolframalpha. Please stick to open source alternatives. For someone in your position, advertising for a company with a notorious reputation is somewhat of a letdown ...

6. Talking of wolframalpha, wouldn't the bing iphone app actually suffice?

7. WolframAlpha isn't worth even mentioning. It's a black box that is flawed, as it is based on NKS work. If I remember correctly I once saw a slashdotted story on how flawed the work in NKS is. A mathematician disproved like 1000 claims. Now, assume that the output you obtain via Wolfram Alpha is as reliable as the claims made by NKS.

no wonder how uncreative Microsoft's BING is, it relies on something as unreliable as Wolfram.

8. > If I remember correctly I once saw a slashdotted story on how flawed work in NKS is. A mathematician disproved like 1000 claims.

I read it on the internet. It must be true.

Seriously dude. Even some near future version of Wolfram Alpha may be smart enough to know that "some of NKS is broken" and "alpha may have some connection to NKS" does not imply much about the reliability of alpha.

In fact, I am not sure Wolfram Alpha has anything to do with NKS. It contains an online version of Mathematica, and has reasonably good NLP skills. It has access to several databases that it can look up in a sensible way. It can be useful for certain things (e.g. type in "1,1,2,3,5,..." or "1+1/4+1/9+..." or "integrate .1 to .2 sin x/ x "). At least these parts have exactly zero connection to NKS, as far as I can tell.

I look forward to your examples of the unreliability of Alpha, which you would no doubt put on slashdot soon.

9. Looking forward to Lance's end of year wrap up! And his progress on his book? Does anyone know why Lance decided to write the book?

10. I think the story referenced is
http://science.slashdot.org/article.pl?sid=07/12/23/1817233
in which 44 claims (not conjectures!) in Wolfram's book are shown to be wrong.

11. > I look forward to your examples of the unreliability of Alpha, which you would no doubt put on slashdot soon.

I'm not the same anon, but
http://www.wolframalpha.com/input/?i={{0%2C1}%2C{1%2C-7}}+*+{{7%2C1}%2C{1%2C0}}

shows that W|A thinks that [0,1;1,-7] times its inverse [7,1;1,0] is *not* the identity matrix.

12. > I'm not the same anon, but
> http://www.wolframalpha.com/input/?i={{0%2C1}%2C{1%2C-7}}+*+{{7%2C1}%2C{1%2C0}}
>
> shows that W|A thinks that [0,1;1,-7] times its inverse [7,1;1,0] is *not* the identity matrix.

This is indeed bad UI design... it comes from Mathematica, where the matrix product is '.' and '*' is the Times operator, a strange operator that multiplies entry by entry, so [a,b;c,d] * [x,y;z,w] is [ax,by;cz,dw]. Not sure why they made such a stupid design choice in Mathematica, and why it continues in Alpha. Nevertheless, one can't blame NKS for this design choice that goes back to Mathematica :)
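To make the distinction concrete, here is a minimal sketch in plain Python (not Mathematica or W|A code; the helper names `elementwise` and `matmul` are made up for this example) applying both operations to the two matrices from the query above.

```python
def elementwise(a, b):
    """Entry-by-entry product, the behaviour described for '*'."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def matmul(a, b):
    """Ordinary matrix product, the behaviour described for '.'."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[0, 1], [1, -7]]
B = [[7, 1], [1, 0]]

print(elementwise(A, B))  # [[0, 1], [1, 0]]
print(matmul(A, B))       # [[1, 0], [0, 1]]
```

So the two matrices really are inverses of each other; the entry-by-entry product is simply answering a different question.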

13. i don't trust wolframalpha's output at all. particularly after seeing the publication.

14. (Coming back to choosing a TV show) To add to the analysis, one non-quantitative factor I think of is how badly I want to watch an episode even if someone gave me the gist of what's going to happen in it.

So, conditional probability could be used to make the following a good measure of how good the TV show is:
Pr[Watch the show|Story] (Pr. that I will watch the show given that I know the story).

Pr(Story) can be decided by the narrator, according to how much of the story he tells me, while Pr(Watch the show) can be chosen using any of the ways you mentioned.

Vineet
India