Events

DMS Statistics and Data Science Seminar

Time: Sep 25, 2024 (02:00 PM)
Location: ZOOM

Details:

Speaker: Dr. Lan Luo (Assistant Professor, Department of Biostatistics and Epidemiology at Rutgers University)

Title: Online statistical inference with streaming data: renewability, dependence, and dynamics

Abstract: New data collection and storage technologies have given rise to a new field of streaming data analytics, including real-time statistical methodology for online data analyses. Streaming data refers to high-throughput recordings with large volumes of observations gathered sequentially and perpetually over time. Such a data collection scheme is pervasive not only in biomedical sciences such as mobile health, but also in other fields such as IT, finance, services, and operations. Despite a large amount of work in the field of online learning, most existing methods are established under the strong assumption of independent and identically distributed data, and very few target statistical inference. This talk will center around three key components in streaming data analyses: (i) renewable updating, (ii) cross-batch dependency, and (iii) time-varying effects. I will first introduce how to conduct a renewable updating procedure, in the case of independent data batches, with a particular aim of achieving statistical properties similar to those of offline oracle methods while enjoying great computational efficiency. Then I will discuss how we handle the dependency structure that spans a sequence of data batches to maintain statistical efficiency in the process of renewable updating. Lastly, a dynamic weighting scheme will be integrated into the online inference framework to account for time-varying effects. I will provide both conceptual understanding and theoretical guarantees for the proposed method and illustrate its performance via numerical examples.