
 

CIS Home divider Penn Engineering divider PENN   spacer
 

 
Craig Nevill-Manning: Finding needles in a 20 TB haystack, 200 million times per day

Google faces two large technical challenges: ensuring that our search results are as relevant as possible, and serving hundreds of millions of queries in a fraction of a second at a reasonable cost. To solve the first problem, we perform an offline matrix computation to produce PageRank, a query-independent measure of page reputation, and combine it with more traditional query-specific scoring. To solve the distributed computing problem, we use tens of thousands of commodity PCs and highly fault-tolerant software. I will discuss some details of these solutions, and also share some interesting statistical tidbits about search and the web.
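
As a rough illustration of the kind of offline computation the abstract refers to, the sketch below runs the textbook power-iteration form of PageRank on a toy link graph. It is not Google's production method: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the four-page graph are all assumptions made for this example, and only the query-independent part is covered.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank (illustrative sketch only).

    links maps each page to the list of pages it links to.
    damping and iterations are assumed values, not taken from the talk.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly
        # over all pages so that the total rank stays equal to 1.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its damped rank to its targets.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web: "c" is linked to by every other page.
    toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(toy_links).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 4))
```

In this toy graph the page with the most incoming links, "c", ends up with the highest score, and the page nobody links to, "d", ends up with the lowest.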

For more information about Craig please visit his web site:
http://craig.nevill-manning.com/



 
 