Strategic Classification from Revealed Preferences

Jinshuo Dong, Aaron Roth, Zachary Schutzman, Bo Waggoner, Steven Wu

We study an online linear classification problem, in which the data is generated by strategic agents who manipulate their features in an effort to change the classification outcome. In rounds, the learner deploys a classifier, and an adversarially chosen agent arrives, possibly manipulating her features to optimally respond to the learner. The learner has no knowledge of the agents' utility functions or ``real'' features, which may vary widely across agents. Instead, the learner is only able to observe their ``revealed preferences'' --- i.e. the actual manipulated feature vectors they provide. For a broad family of agent cost functions, we give a computationally efficient learning algorithm that is able to obtain diminishing ``Stackelberg regret'' --- a form of policy regret that guarantees that the learner is obtaining loss nearly as small as that of the best classifier in hindsight, even allowing for the fact that agents will best-respond differently to the optimal classifier.