From: Alex Kulesza [kulesza@cis.upenn.edu]
Sent: Monday, October 16, 2006 1:01 AM
To: Michael Kearns
Subject: Re: Important SSS announcment
Attachments: alex-sss-cars.pdf

I decided to try to analyze the behavior of sponsored search markets for related keywords when they are modified with terms that more closely specify the interests of the user. I settled on car brands as a set of base keywords since cars are expensive, have unique brand names, and are sold by a wide array of dealers. (Items sold exclusively by a manufacturer tend to have a limited number of advertising bidders.) As modifiers, I chose 'new', 'lease', and 'used'. Thus, I collected bid information for the searches 'ford', 'new ford', 'lease ford', 'used ford', etc. My goal was to see whether the sub-markets defined by the modifiers behaved differently from those of the brand names alone.

The results are attached as a PDF; for each brand/modifier combination (including no modifier), the second advertiser's bid is recorded. I also tried first bids, bid averages, numbers of bids, etc.; second bids seemed the least noisy and have the advantage of actually reflecting what the winner is paying per click. The brands have been sorted by the bid for the base search with no modifier.

The results are rather interesting. While it is difficult to guess why Chevrolet gets the highest bids and a similar (popular, American) brand like Dodge is near the bottom, the general trends for the different sub-markets do seem to be revealing. Notably, the 'new' and 'lease' results roughly track the base results (correlation coefficients of 0.62 and 0.37, respectively), while the 'used' results show a significant negative trend (correlation of -0.75). It seems plausible to conclude, therefore, that the majority of users who search for brand names of cars and then click on ads are in fact looking to lease or buy a new car -- or, at least, that the market believes that to be so.
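For reference, correlation figures like the ones above could be computed roughly as follows. This is only a sketch in Python: the brand labels and bid values are made-up placeholders (not the data in the attached PDF), the helper names `second_bid` and `pearson` are hypothetical, and I am assuming the ordinary Pearson coefficient, taken across brands between the base second bids and the modified-keyword second bids.

```python
from math import sqrt


def second_bid(bids):
    """Return the second-highest bid, or None if there are fewer than two bids."""
    ordered = sorted(bids, reverse=True)
    return ordered[1] if len(ordered) >= 2 else None


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder bid lists (dollars per click) for three hypothetical brands.
# These numbers are invented purely to show the shape of the computation.
raw_bids = {
    "brand_a": {"base": [1.50, 1.20, 0.90], "used": [0.40, 0.35, 0.10]},
    "brand_b": {"base": [1.10, 0.95, 0.60], "used": [0.55, 0.50, 0.20]},
    "brand_c": {"base": [0.80, 0.70, 0.30], "used": [0.75, 0.60, 0.25]},
}

# One second bid per brand for each keyword variant, then correlate across brands.
base = [second_bid(v["base"]) for v in raw_bids.values()]
used = [second_bid(v["used"]) for v in raw_bids.values()]
print("base vs. used correlation:", pearson(base, used))
```

The same `pearson` call would be repeated with the 'new' and 'lease' second bids in place of `used`.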
The bids are not only correlated but also of similar scale, suggesting that adding 'new' or 'lease' to one's search does not increase the expected amount of money that a user will spend after clicking on an ad.

On the other hand, a user who modifies a brand name with the term 'used' is very different. Perhaps unsurprisingly, the new and used car markets appear complementary on a per-brand basis -- maybe according to the brand's expected depreciation, though I have no data to back this up. It is clear that 'used' searches in general have much lower bids than the others, which seems reasonable given that used cars are usually cheaper. It is also reasonable to imagine that users searching for used cars are more likely to include the term 'used' in their search than users searching for new cars are to include 'new'. Thus, the apparent assumption that a brand-only search primarily implies interest in new vehicles is probably valid.

One might argue that the base keyword bids are in fact merely *maximums* over the modified-keyword bids, and do not necessarily reflect the proportion of users intending, say, the more specific 'new' context. The argument would be: an advertiser selling new cars can afford to pay more per click than an advertiser selling used cars, so even if the users searching brand names alone were split 10/90 into new and used car searchers, the new car ads would dominate despite appealing only to that 10% of users. This is possible; however, I suspect that users are primed by their own search (and, maybe, faith in Google/Yahoo) into believing that ads are relevant to them. If there were really a large fraction of users searching for used cars, many of them would end up clicking on the new car ads, spending no money, and costing advertisers in the process. In fact, if the argument presented above were stretched to its limit, every search would be dominated by mesothelioma ads.
This doesn't happen because your average user is not worth much to ambulance-chasers and cannot be trusted to click only relevant ads. Thus, keywords remain an important indicator of user intent, and, in this case, a brand-name-only search appears to indicate interest in new cars.

Alex

Michael Kearns wrote:
> All --- In order to get the creative juices flowing for empirical projects,
> I am giving a little assignment required of all students taking the seminar
> for credit. I am attaching an update of Kuzman's Script for obtaining Overture
> price data, which he has kindly modified to give nicely formatted output.
> You need to have access to a unix or linux box in order to run it, which
> I assume you all do.
>
> You should all use this script to obtain Overture prices for some moderately
> large (say, at least in the dozens) set of phrases of your own choosing.
> Presumably these phrases will be "related" in some interesting way (e.g. the
> names of all 50 states). Please try to present an analysis of the prices,
> what explains their similarities, differences, rankings, etc. I will give an
> example in class today to help clarify. I'd like everyone to send me their
> brief analysis by this coming Sunday night, Oct 16, for discussion in next
> Monday's session.
>
> Best
> Prof Kearns