
Abstract: The turnstile continual release model of differential privacy captures scenarios where a privacy-preserving real-time analysis is sought for a dataset evolving through additions and deletions. In typical applications of real-time data analysis, both the length of the stream T and the size of the universe |U| from which data come can be extremely large. This motivates the study of private algorithms in the turnstile setting using space sublinear in both T and |U|. In this paper, we give the first sublinear space differentially private algorithms for the fundamental problem of counting distinct elements in the turnstile streaming model. Our algorithm achieves, on arbitrary streams, O(T^{1/3}) space and additive error, and a (1+\eta)-relative approximation for all \eta \in (0,1). Our result significantly improves upon the space requirements of the state-of-the-art algorithms for this problem, which is linear, approaching the known Omega(T^{1/4}) additive error lower bound for arbitrary streams. Moreover, when a bound W on the number of times an item appears in the stream is known, our algorithm provides O(\sqrt{W}) additive error, using O(\sqrt{W}) space. This additive error asymptotically matches that of prior work which required linear space. Our results address an open question about designing low-memory mechanisms for this problem. We complement these results with a space lower bound, which shows that any algorithm that uses similar techniques must use space Omega(T^{1/3}) on arbitrary streams.
Joint work with Alessandro Epasto, Jieming Mao, Tamalika Mukherjee, Tingting Ou, and Peilin Zhong.
Bio: Dr. Rachel Cummings is an Associate Professor of Industrial Engineering and Operations Research and (by courtesy) Computer Science at Columbia University, where she is also a member of the Data Science Institute and co-chairs the Cybersecurity Research Center. Before joining Columbia, she was an Assistant Professor of Industrial and Systems Engineering and (by courtesy) Computer Science at the Georgia Institute of Technology, and she received her Ph.D. in Computing and Mathematical Sciences from the California Institute of Technology. Her research interests lie primarily in data privacy, with connections to machine learning, algorithmic economics, optimization, statistics, and public policy. Dr. Cummings is the recipient of numerous awards including an NSF CAREER award, a DARPA Young Faculty Award, a DARPA Director’s Fellowship, an Early Career Impact Award, multiple industry research awards, a Provost’s Teaching Award, two doctoral dissertation awards, and Best Paper Awards at DISC 2014, CCS 2021, and SaTML 2023. Dr. Cummings also serves or has served on the U.S. Census Scientific Advisory Committee, the ACM U.S. Technology Policy Committee, the IEEE Standards Association, and the Future of Privacy Forum’s Advisory Board, and was a Fellow at the Center for Democracy & Technology.