August 17, 1999

A Computer Program That Plays a Hunch


    Anyone who has been to the racetrack knows that picking a horse is part art and part science: a gambler can pick the horse with the best odds, pick a horse that won its last race, pick the horse that has the best winning streak or choose on the basis of a hunch. It was intuition, after all, not statistics, that rewarded those who picked Charismatic to win the Kentucky Derby in May at 32-to-1 odds.

    Now, a small group of researchers say that such intuition can, at least in part, be duplicated by computers. Rules of thumb that are only partially accurate can be combined into precise prediction tools using an artificial intelligence technique called boosting.

    Boosting has two qualities that distinguish it from existing computer prediction methods. While highly accurate, it is simpler to execute than many other programs. And it is able to make predictions even when the programmer provides very general -- or even conflicting -- guidelines. In other words, it simulates, albeit in a more sophisticated and accurate manner, human intuition.

    "Philosophically it's an interesting result," said Robert Schapire, who has done seminal work on boosting with fellow AT&T researchers Yoav Freund and Yoram Singer. "You can take a set of prediction rules that is better than random guessing and combine them into something that is very accurate. It's kind of counterintuitive."

    Boosting is just beginning to emerge from academia into commercial use, but researchers believe it may have a wide range of potential applications.

    Companies, for example, might use it to identify patterns in consumer behavior, a practice called data mining. And the technique, which can handle not only simple yes-or-no questions but also multicategory classification and ranking, might come in handy in filtering junk E-mail, making medical diagnoses, helping banks or other institutions identify a person's handwriting, or allowing speculators to pick winners in the stock market or at the racetrack.

    In the past, machine learning focused on developing large, complex sets of analytical rules that the computer could use to make predictions. But boosting shifts the emphasis to simply finding rules that do better than chance.

    A number of studies have shown that boosting is often as accurate as, or even more accurate than, older predictive techniques. In addition, it can be applied to already complex methods, like neural networks and decision trees, to further "boost" their accuracy. Boosting acts as a big computational blender that can combine a wide variety of predictive techniques, from rough rules of thumb (which computer scientists consider "weak") to already highly accurate algorithms (which are considered "strong").

    In essence, boosting works by assigning different priorities to rules of thumb, based on an analysis of historical data fed in by the programmer. The technique has five steps, which it repeats until its accuracy cannot be improved upon.

    Given a rule of thumb, the program tries it out, analyzes the errors it makes and assigns it a priority ranking. The examples that the first rule got wrong are given higher priority when the next rule of thumb is tried.

    Eventually, the results from all trials are combined and the program "votes" on the final answer.
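    The loop described above can be sketched in a few dozen lines. What follows is a minimal illustration of an AdaBoost-style procedure, not the researchers' actual code: the function names, the stopping rule and the weight formulas are standard textbook choices assumed for the sketch.

```python
import math

def boost(examples, labels, rules, rounds):
    """Train a weighted ensemble of weak rules (AdaBoost-style sketch).

    `rules` are rules of thumb: functions mapping an example to +1 or -1.
    `labels` are the true +1/-1 outcomes for `examples`.
    """
    n = len(examples)
    weights = [1.0 / n] * n           # every example starts equally important
    ensemble = []                     # (priority, rule) pairs

    for _ in range(rounds):
        # Try each rule of thumb and measure its weighted error rate.
        def weighted_error(rule):
            return sum(w for w, x, y in zip(weights, examples, labels)
                       if rule(x) != y)

        rule = min(rules, key=weighted_error)
        err = weighted_error(rule)
        if err >= 0.5:                # no rule beats random guessing: stop
            break

        # A more accurate rule earns a higher priority in the final vote.
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, rule))

        # Raise the weight of examples this rule got wrong, lower the
        # weight of those it got right, then renormalize.
        weights = [w * math.exp(-alpha * y * rule(x))
                   for w, x, y in zip(weights, examples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]

    return ensemble

def predict(ensemble, x):
    # The final answer is a "vote" of all rules, scaled by priority.
    vote = sum(alpha * rule(x) for alpha, rule in ensemble)
    return 1 if vote >= 0 else -1
```

    The key design choice is the reweighting step: each round, the data the previous rules handled badly looms larger, so the next rule is forced to concentrate on the hard cases.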

    In racetrack prediction, boosting might work as follows: A programmer would feed the boosting algorithm historical data on the race, the horses and their records, in addition to a number of rules of thumb, to be executed in order.

    For example, the algorithm might first ask how often the horse with the best odds won. Then the computer might go on to the next rule of thumb to see how often a horse that won its previous race won again. A third analysis might examine how often the horse with the longest winning streak continued winning, and so on. In the end, the horse that satisfied the most conditions with the greatest accuracy would be selected.
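    Once boosting has assigned each rule of thumb a weight reflecting its historical accuracy, picking the horse reduces to a weighted vote. The toy below assumes an invented field of horses and made-up rule weights purely for illustration; only the general mechanism reflects the article.

```python
# Hypothetical horses with the attributes the rules of thumb inspect.
horses = [
    {"name": "Charismatic", "best_odds": False, "won_last": True,  "streak": 2},
    {"name": "Front Runner", "best_odds": True,  "won_last": False, "streak": 0},
    {"name": "Long Shot",    "best_odds": False, "won_last": False, "streak": 1},
]

# (weight, rule) pairs; in practice the weights would come from the
# boosting rounds, not be set by hand as they are here.
weighted_rules = [
    (0.9, lambda h: h["best_odds"]),     # the favorite tends to win
    (0.6, lambda h: h["won_last"]),      # momentum from the last race
    (0.4, lambda h: h["streak"] >= 2),   # a long winning streak
]

def score(horse):
    # Each rule the horse satisfies casts a vote scaled by its weight.
    return sum(w for w, rule in weighted_rules if rule(horse))

pick = max(horses, key=score)
```

    Here no single rule favors the eventual pick, but two moderately reliable rules together outvote the one strong rule backing a rival, which is exactly the "weak rules combined into something accurate" effect the researchers describe.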

    Boosting has earned fans among computational learning researchers, in part because it provides a happy marriage of theory and experiment. "It's very easy to implement and understand. It's very efficient. It yields good results. And for scientists, it has a firm theoretical standing," Dr. Singer said.

    The algorithmic simplicity of boosting also provides financial advantages, which could make it attractive to companies.

    "Someone who finished a first-year programming class and understands algorithms can implement it," he said. In contrast, many other predictive analysis techniques require sophisticated knowledge of programming and statistics.

    Boosting is not without its flaws. One problem facing all predictive techniques based on historical data is that past is not always prologue. Also, because boosting concentrates on examples that produce repeated errors, it can be unduly influenced by extreme examples or incorrectly entered data. At the same time, the technique makes it easier to identify such unrepresentative cases, which can then be thrown out.

    The possibility of boosting was first suggested by Leslie Valiant and Michael Kearns in a 1988 paper presented at Harvard University.

    The first algorithms were developed within a year or two, and experiments began in 1992. Much work has come from AT&T's laboratories.

    At first, many were skeptical that boosting was anything more than a theoretical concept. But since 1995, mounting experimental evidence has fueled an interest in boosting among researchers in machine learning and statistics.

    Boosting research is still in its early stages. "By analogy with aircraft, Rob et al have demonstrated that it's possible to fly," noted Dr. Ross Quinlan, an Australian researcher, referring to Dr. Schapire's early work. "Now we're at the stage of building bigger, faster planes with different engines," Dr. Quinlan said.

    Boosting, like other aspects of computational learning, raises fundamental questions about the nature of artificial intelligence. Despite the accuracy of boosting, the algorithm does not really "know" anything: It does not search for certainty, but merely makes mathematically calibrated predictions based on precedent.

    "There can be surprising outcomes when you don't look for the truth and just go for the prediction," said Dr. Freund, who worked with Dr. Schapire to create one of the first practical boosting algorithms.

    One experiment called for a computer to make deductions based on statistical analysis of newspaper text.

    "I asked the computer, 'Where is the Taj Mahal?' " Dr. Freund said. "It answered, 'Atlantic City,' because according to the text that was where the Taj Mahal was mentioned most."


    Copyright 1999 The New York Times Company