Accuracy for Sale: Aggregating Data with a Variance Constraint

Rachel Cummings, Katrina Ligett, Aaron Roth, Steven Wu, and Juba Ziani.
[PDF]

We consider the problem of a data analyst who may purchase an unbiased estimate of some statistic from multiple data providers. From each provider i, the analyst has a choice: she may purchase an estimate from that provider that has variance chosen from a finite menu of options. Each level of variance has a cost associated with it, reported (possibly strategically) by the data provider. The analyst wants to choose the minimum cost set of variance levels, one from each provider, that will let him combine his purchased estimators into an aggregate estimator that has variance at most some fixed desired level. Moreover, he wants to do so in such a way that incentivizes the data providers to truthfully report their costs to the mechanism. We give a dominant strategy truthful solution to this problem that yields an estimator that has optimal expected cost, and violates the variance constraint by at most an additive term that tends to zero as the number of data providers grows large.