Gerome Miklau is visiting Webdam on Wednesday 2 June at 2pm. He is an Assistant Professor in the Computer Science Department of the University of Massachusetts, Amherst. He will present his work on differential privacy in the meeting room of LSV at ENS Cachan.
Title: Optimizing Linear Counting Queries Under Differential Privacy
Abstract: Differential privacy is a rigorous privacy standard that protects against powerful adversaries, offers precise accuracy guarantees, and has been successfully applied to a range of data analysis tasks. When differential privacy is satisfied, participants in a dataset enjoy the compelling assurance that information released about the dataset is virtually indistinguishable whether or not their personal data is included.
Differential privacy is achieved by introducing randomness into query answers. The original algorithm for achieving differential privacy, commonly called the Laplace mechanism, returns the true answer after the addition of random noise drawn from a Laplace distribution. If an analyst requires only the answer to a single query about the database, then the Laplace mechanism is known to be optimal. But the Laplace mechanism can be highly suboptimal when a set of correlated queries is submitted, and despite much recent work, optimal strategies for answering a collection of correlated queries are not known in general.
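As an illustration of the Laplace mechanism described above, here is a minimal Python sketch for a single counting query. It is not taken from the talk: the names laplace_noise and laplace_count are hypothetical, and it assumes that a counting query has sensitivity 1 (adding or removing one person changes the count by at most 1), so the noise scale is 1/epsilon.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponential variables with mean
    # `scale` follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_count(records, predicate, epsilon, rng):
    # True answer to the counting query "how many records satisfy predicate?"
    true_count = sum(1 for r in records if predicate(r))
    # Assumption: sensitivity 1, hence Laplace noise of scale 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed only to make the demo repeatable
    ages = [23, 35, 41, 52, 29, 67, 38]
    print(laplace_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng))
```

A smaller epsilon gives stronger privacy but a larger noise scale, so each individual answer is less accurate.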
In this talk I will review the basic principles of differential privacy and then describe the “matrix mechanism”, a new algorithm for answering a workload of predicate counting queries. Given a workload, the mechanism first requests answers to a different set of queries, called a query strategy, which are answered using the standard Laplace mechanism. Noisy answers to the workload queries are then derived from the noisy answers to the strategy queries.
When the strategy queries are chosen appropriately, this two-stage process increases accuracy (with no cost in privacy) by answering the workload queries using a more complex, correlated noise distribution. I will show that two recently proposed algorithms, which provide accurate answers for the set of all range queries, can be seen as instances of the matrix mechanism. I will then present results on optimally choosing the query strategy to minimize the error for any given workload.
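The two-stage process can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm as presented in the talk: the abstract does not say how workload answers are derived from strategy answers, so the sketch assumes ordinary least squares; the function names, the small hierarchical strategy A, and the example histogram x are all hypothetical. Queries are rows of 0/1 matrices applied to a vector of cell counts, and the strategy is assumed to have full column rank.

```python
import random

def laplace_noise(scale, rng):
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def solve(M, b):
    # Gauss-Jordan elimination with partial pivoting (M square, invertible).
    n = len(M)
    aug = [[float(v) for v in row] + [float(bi)] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def matrix_mechanism(W, A, x, epsilon, rng):
    cols = list(zip(*A))
    # L1 sensitivity of the strategy: the largest column L1 norm.
    sens = max(sum(abs(v) for v in col) for col in cols)
    # Stage 1: answer the strategy queries with the Laplace mechanism.
    y = [a + laplace_noise(sens / epsilon, rng) for a in matvec(A, x)]
    # Stage 2 (assumed): least-squares estimate of x via the normal
    # equations (A^T A) x_hat = A^T y, then answer the workload from x_hat.
    AtA = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Aty = [sum(a * yi for a, yi in zip(ci, y)) for ci in cols]
    x_hat = solve(AtA, Aty)
    return matvec(W, x_hat)

if __name__ == "__main__":
    x = [3, 1, 4, 1]                       # hypothetical cell counts
    W = [[1, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]]   # a few range queries
    A = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1],   # total and two halves
         [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # cells
    print(matrix_mechanism(W, A, x, epsilon=1.0, rng=random.Random(0)))
```

Because only the strategy answers touch the data, the privacy cost is that of the Laplace step alone; deriving the workload answers afterwards is post-processing of already noisy values.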
This talk is based on forthcoming work that will appear in PODS 2010 and VLDB 2010, and is joint with Chao Li, Michael Hay, Vibhor Rastogi, Andrew McGregor, and Dan Suciu.