Research Directions

Representation Learning in RL

Representation learning transforms high-dimensional observations into low-dimensional embeddings to enable efficient learning; it is essentially a fancier name for dimensionality reduction. Many believe that deep nets perform representation learning implicitly, which is how they generalize well from relatively little training data. In contrast, we design methods that construct representations explicitly, enabling efficient exploration in RL and tractable learning in challenging settings such as multi-player games and partially observable environments.
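
To make the notion concrete, here is a minimal PyTorch sketch of the central object: an encoder mapping a high-dimensional observation to a low-dimensional embedding. The dimensions and the toy reconstruction objective are illustrative assumptions only, not the model-based or model-free objectives studied in the papers below.

```python
import torch
import torch.nn as nn

# Illustrative placeholder dimensions, not values from any of the papers.
OBS_DIM, EMBED_DIM = 4096, 32

encoder = nn.Sequential(                 # phi: high-dim observation -> low-dim embedding
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, EMBED_DIM),
)
decoder = nn.Linear(EMBED_DIM, OBS_DIM)  # stand-in head so the sketch is trainable

opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

obs = torch.randn(64, OBS_DIM)           # a batch of synthetic observations
loss = ((decoder(encoder(obs)) - obs) ** 2).mean()  # toy reconstruction loss
opt.zero_grad()
loss.backward()
opt.step()

embedding = encoder(obs)                 # (64, EMBED_DIM): the learned representation
```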

  • [ICLR 2021] Model-based representation learning.

  • [ICML 2022] Model-free representation learning, with a PyTorch implementation.

  • [COLT 2023] Representation learning enables efficient transfer across different environments.

  • [ICLR 2023] Representation learning escapes the curse of multiagency in multi-player Markov games.

  • [ICML 2023] Representation learning enables computationally tractable learning in POMDPs for the first time.

Robust Machine Learning against Data Corruption

Real-world data is noisy. If we wish to deploy an intelligent system into the wild without human babysitting, it must hold its ground and learn reliably in the face of noisy, biased, and sometimes adversarially corrupted observations. Our goal is to design machine learning algorithms that are scalable and provably robust against noisy data.

  • [AAAI 2018] Even a small set of clean data can help combat noisy data in supervised learning.

  • [NeurIPS 2019, ICML 2020] Vanilla RL algorithms are highly vulnerable to data corruption, even more so than supervised learning, due to the interactive nature of the learning process.

  • [ICML 2021, AISTATS 2022] One of the first results on corruption-robust online and offline RL.

  • [AISTATS 2023] New challenges arise in the distributed learning setting due to data splitting. We designed a state-of-the-art robust mean estimation algorithm that optimally learns from batches of data that vary widely in size; see the sketch below.
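
For intuition about robust mean estimation from heterogeneous batches, here is a generic baseline sketch: a coordinate-wise trimmed mean over per-batch means, which discards batches whose statistics sit far from the bulk. The function name, trimming fraction, and synthetic data are all hypothetical, and this simple estimator is not the algorithm from the paper, which attains stronger guarantees.

```python
import numpy as np

def trimmed_mean_of_batches(batches, trim=0.1):
    """Coordinate-wise trimmed mean over per-batch means.

    batches: list of arrays of shape (n_i, d), with possibly very different n_i.
    trim: fraction of extreme batch means discarded on each side, per coordinate.
    A generic robust baseline for illustration, not the paper's estimator.
    """
    means = np.stack([b.mean(axis=0) for b in batches])  # (num_batches, d)
    k = int(trim * len(batches))
    sorted_means = np.sort(means, axis=0)                # sort each coordinate separately
    kept = sorted_means[k:len(batches) - k] if k > 0 else sorted_means
    return kept.mean(axis=0)

# Example: 20 clean batches of varying sizes plus a few corrupted ones.
rng = np.random.default_rng(0)
clean = [rng.normal(0.0, 1.0, size=(rng.integers(5, 50), 3)) for _ in range(20)]
corrupt = [rng.normal(10.0, 1.0, size=(8, 3)) for _ in range(3)]
print(trimmed_mean_of_batches(clean + corrupt, trim=0.15))  # stays near the true mean 0
```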

Machine Learning for (Natural and Social) Science and Beyond

One of my goals at CDS is to collaborate with researchers in the social and natural sciences by providing data-science techniques and expertise. Together, we can solve impactful problems more efficiently and accurately by making better use of data. See below for some of our recent work.

  • [CogSci 2020] Using machine teaching theory to explain human teaching behavior and to design better learning agents that learn from human instruction.

  • [NeurIPS 2022] A new bandit-style approach for accelerated protein design.

  • [Preprint 2023] An RL approach to adaptive influence maximization on social media.