CABDRIVER: Concentration to Accurate Boundaries while Distorting Randomly Input Variables to Elude Recognition

Conference: WSA 2021 - 25th International ITG Workshop on Smart Antennas
10-12 November 2021, French Riviera, France

Proceedings: ITG-Fb. 300: WSA 2021

Pages: 6
Language: English
Type: PDF

Thesmar, Raphael (ECE, Cornell University, USA)
Thesmar, Joseph (CS, Worcester Polytechnic Institute, USA)
D'Oliveira, Rafael G. L.; Medard, Muriel (RLE, Massachusetts Institute of Technology, USA)

We consider the problem of privately aggregating mobile users’ locations on a city map. The setting involves a single (potentially untrusted) server and multiple users, each holding a list of visited locations. The scheme we propose, named Cabdriver, randomly distorts a user’s input so that the output is (approximately) statistically indistinguishable from a uniformly random one. This guarantees, in particular, that Cabdriver is locally differentially private. Although the outputs appear uniformly distributed, after aggregating the data from all users the server can obtain an estimate of the number of times each location has been visited. These estimates are used to build a heat map of visited locations. Moreover, we show that a threshold operation removes, with high probability, all false positives (essentially the noise) from the heat map while retaining locations that have been visited by more than a square root of the number of users. We also showcase Cabdriver by running it on location data from the NYC Taxi and Limousine Commission (TLC) trip records dataset. More details about these experiments, together with the option of rerunning them with different parameters, can be found in a Jupyter notebook we have uploaded to GitHub.
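
The pipeline described above (local distortion, server-side aggregation, thresholding) can be illustrated with a small sketch. The abstract does not spell out Cabdriver's distortion mechanism, so the code below substitutes a generic per-cell randomized response as a stand-in; it is not the authors' scheme. The function names, the truth probability `p`, the cutoff multiplier `k`, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def randomize(bits, p, rng):
    """Stand-in distortion (NOT Cabdriver itself): report each bit
    truthfully with probability p, otherwise report a fair coin flip."""
    return [b if rng.random() < p else rng.randrange(2) for b in bits]


def estimate_counts(reports, p):
    """Server side: debias the per-cell sums of the noisy reports.
    E[sum] = p * true_count + n * (1 - p) / 2, so invert that relation."""
    n = len(reports)
    sums = [sum(col) for col in zip(*reports)]
    return [(s - n * (1 - p) / 2) / p for s in sums]


def threshold(estimates, n, p, k=5.0):
    """Zero out cells whose estimate is within k standard deviations of
    pure noise. The noise std of an estimate is at most sqrt(n) / (2 p)."""
    cutoff = k * math.sqrt(n) / (2 * p)
    return [e if e > cutoff else 0.0 for e in estimates]


def demo(n_users=20000, n_cells=16, p=0.25, seed=0):
    """Synthetic example: cells 3 and 7 are visited by about half the users,
    all other cells are never visited (hypothetical data)."""
    rng = random.Random(seed)
    hot = {3, 7}
    users = [
        [1 if (c in hot and rng.random() < 0.5) else 0 for c in range(n_cells)]
        for _ in range(n_users)
    ]
    reports = [randomize(u, p, rng) for u in users]
    true_counts = [sum(col) for col in zip(*users)]
    estimates = estimate_counts(reports, p)
    heat_map = threshold(estimates, n_users, p)
    return true_counts, estimates, heat_map


if __name__ == "__main__":
    true_counts, estimates, heat_map = demo()
    for cell, (t, e, h) in enumerate(zip(true_counts, estimates, heat_map)):
        print(f"cell {cell:2d}: true={t:6d} estimate={e:9.1f} kept={h:9.1f}")
```

In this stand-in, the cutoff grows like the square root of the number of users, which mirrors the flavor of the threshold statement in the abstract: unvisited cells produce estimates that look like zero-mean noise and are removed, while heavily visited cells survive. A smaller `p` makes each report closer to uniform but inflates the noise in the estimates.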