Monday, July 08, 2013

AltaVista versus Google

Today Yahoo is closing AltaVista, the best search engine before Google. The news caught me by surprise: AltaVista still existed? A number of commentators blame bad management for AltaVista losing its dominance to Google. But it was an algorithm that killed the search engine.

AltaVista made its claim to fame in the mid-90s by indexing a large number of web pages. AltaVista did very well for obscure search terms like "fortnow" but didn't do so well for more common searches. I used to run a test on search engines by looking for "Holiday Inn", a popular hotel chain in the US. When you searched AltaVista for Holiday Inn, the first result listed was a Holiday Inn in Buffalo, New York. The Holiday Inn home page was nowhere to be found in the search results.

For searches like Holiday Inn, one had to use Yahoo, which back then was not a search engine but a directory tree of web sites. We needed our own directories as well. Ian Parberry maintained the TCS Virtual Rolodex, a list of home pages of theoretical computer scientists, most of whom had names common enough that AltaVista wouldn't find them.

A Stanford professor (I can't remember which one) came to give a talk at the University of Chicago around 1997, and he mentioned a research project at Stanford developing a new search engine known as Google. I tried my Holiday Inn test on Google and was shocked when the Holiday Inn home page showed up as the first result. Google passed every other test I could throw at it and I've rarely used any other search engine since. Google made AltaVista, the Yahoo directory and the TCS Rolodex irrelevant. Google's PageRank algorithm simply took search to a new level, much as Steve Jobs didn't create the first smartphone but completely changed the game with the iPhone. AltaVista managed to survive for another 15+ years but never recovered market share.
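
Why would a link-based ranking put the Holiday Inn home page first? Here is a minimal sketch of the published PageRank idea: a page scores highly when many pages link to it, and the scores are recomputed until they settle. This is not Google's actual implementation. The toy link graph, the page names, the damping factor of 0.85 and the iteration count are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its score (scores sum to 1).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with equal rank everywhere.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a baseline share from the "random jump".
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: individual hotel pages all link to the chain's home page.
toy_web = {
    "holidayinn-home": [],
    "holidayinn-buffalo": ["holidayinn-home"],
    "holidayinn-chicago": ["holidayinn-home"],
    "travel-blog": ["holidayinn-home", "holidayinn-buffalo"],
}

ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
print(best, ranks)
```

In this toy graph the home page comes out on top because three pages link to it, while the Buffalo page has only one inbound link. A pure keyword match has no way to tell the two apart.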

The AltaVista story leads to a lesson we still tackle today. Collecting and storing big data is a huge technical challenge but data by itself is of limited value without the algorithms to find the important parts among the muck.

9 comments:

  1. But maybe the era of algorithms has already passed its peak?

  2. But maybe the era of algorithms has already passed its peak?

    Yes, just like the era of electricity before it.

  3. I recall a couple of years where I still used Altavista for cases where Google didn't work, usually when what I wanted was "overshadowed" by a more popular interpretation of the search term. The available Boolean connectives in Altavista were handy then.

    Later Google improved and Altavista got a horrid homepage, though the people who know how to use Boolean connectives probably would not have been enough to keep Altavista afloat anyway.

  4. Stanford Professor Lance can't remember == Jeff Ullman.

  5. Today may mark the day Google began its downfall. Not because of
    Lance's blog praising it, but because it has a doodle which seems
    to slow down my browser (this could be an optical illusion, but it
    still seems slower), and the doodle didn't point to anything to tell me
    what it was. I finally had to look on Wikipedia to find out.

    One of Google's strengths was a simplistic display. The Doodles may be
    fun to look at, and to make, but when they begin impacting functionality
    it could be the beginning of the end. Also, there may be controversies
    like `why didn't Google do a doodle on X-day' or `why did they do a doodle
    about Y'.

    Replies
    1. Oh good, it wasn't just me who thought it was painfully slow to play that Google doodle.

    2. My 10-year-old laptop started to get superhot and nearly froze. Sure, I am not using their 'recommended' browser :-). The HTML5 or whatever cleverness many websites are using these days needs much better client-side management. Unlike Flash or Java, there seems to be no easy way for the user to turn on only those widgets that they really wish to see. Workarounds will be found in due time for blocking this nonsense.

  6. Why didn't AltaVista adopt PageRank themselves? (Or did they, and it was too late?)
