
Ph.D. Candidate

Department of Statistics
University of California, Los Angeles
Advisors: Ying Nian Wu and Tao Gao

Email: minglu.zhao@ucla.edu

Google Scholar · LinkedIn · GitHub

Bio

I am a fifth-year Ph.D. student at UCLA, advised by Ying Nian Wu and Tao Gao. My research explores the intersection of language modeling, decision-making, representation learning, and human cognition, with a focus on developing generative models and latent variable approaches that enhance language understanding and improve decision-making in complex environments.

I obtained my B.S. degrees in Statistics and Cognitive Science, also from UCLA. Go Bruins!! 🐻

 

News

 

Selected Publications

Place Cells as Position Embeddings of Multi-Time Random Walk Transition Kernels for Path Planning
Under Review
We propose a novel framework for modeling hippocampal place cells as proximity-preserving neural embeddings that encode multi-scale random walk transitions. Our experiments demonstrate localized place fields, multi-scale tuning, and adaptability to environmental changes, offering a biologically plausible model that unifies spatial and temporal coding, with potential extensions to theta-phase precession and grid cell integration.
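
For intuition, here is a minimal sketch of the core construction (an illustration under stated assumptions, not the paper's implementation: the grid size, time scales, and the eigendecomposition shortcut are all placeholders, and the paper learns its embeddings rather than computing them in closed form). It builds position embeddings on a small 2D grid whose inner products approximate multi-time random-walk transition kernels.

```python
# Sketch: position embeddings whose inner products approximate multi-time
# random-walk transition kernels on an n-by-n grid (assumed environment).
import numpy as np

n = 20
N = n * n
A = np.zeros((N, N))
for i in range(n):
    for j in range(n):
        s = i * n + j
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            ii, jj = i + di, j + dj
            if 0 <= ii < n and 0 <= jj < n:
                A[s, ii * n + jj] = 1.0  # 4-neighbour adjacency

P = A / A.sum(axis=1, keepdims=True)  # one-step transition matrix

def embedding(P, t, d=32):
    """Rank-d code h(s) with <h(s), h(s')> ~ symmetrized P^t(s, s')."""
    Pt = np.linalg.matrix_power(P, t)
    K = 0.5 * (Pt + Pt.T)  # symmetrize so eigh gives a real factorization
    vals, vecs = np.linalg.eigh(K)
    idx = np.argsort(vals)[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# One embedding per time scale, concatenated into a multi-scale code.
H = np.concatenate([embedding(P, t) for t in (1, 4, 16, 64)], axis=1)
print(H.shape)  # (400, 128): a 128-dim "place cell" code per position
```

Concatenating codes across several time scales gives each position a multi-scale representation, echoing the multi-scale tuning described above.
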
Latent Adaptive Planner for Dynamic Manipulation
CoRL 2025
We present the Latent Adaptive Planner (LAP), a trajectory-level latent-variable policy for dynamic nonprehensile manipulation (e.g., box catching) that formulates planning as inference in a low-dimensional latent space and is learned effectively from human demonstration videos. Through challenging box catching experiments with varying object properties, LAP demonstrates superior success rates, trajectory smoothness, and energy efficiency by learning human-like compliant motions and adaptive behaviors.

Latent Thought Models with Variational Bayes Inference-Time Computation
ICML 2025
We introduce Latent-Thought Language Models (LTMs), a novel language model family that incorporates explicit latent thought vectors. LTMs leverage dual-rate optimization, rapidly updating local latent vectors while gradually refining global decoder parameters. This approach unlocks new scaling dimensions, achieving superior efficiency, perplexity, and zero-shot performance over traditional models. LTMs also exhibit emergent few-shot reasoning, highlighting their potential for advanced language tasks.
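
As a rough illustration of the dual-rate scheme (a toy sketch under assumptions, not the paper's code: the decoder here is a small GRU rather than a Transformer, and all sizes, learning rates, and step counts are placeholders), per-batch latent vectors receive many fast updates at inference time while the shared decoder receives one slow update:

```python
# Toy sketch of dual-rate optimization: fast inference-time updates of
# per-sequence latent vectors z, slow updates of shared decoder parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, d_latent, d_model, seq_len = 100, 16, 64, 12

class TinyLatentDecoder(nn.Module):
    """Decoder p(x | z): next-token prediction conditioned on a latent vector."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.cond = nn.Linear(d_latent, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, x, z):
        h = self.embed(x) + self.cond(z).unsqueeze(1)  # inject z at every step
        h, _ = self.rnn(h)
        return self.out(h)

def nll(logits, x):  # next-token negative log-likelihood
    return F.cross_entropy(logits[:, :-1].reshape(-1, vocab),
                           x[:, 1:].reshape(-1))

decoder = TinyLatentDecoder()
slow_opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)  # slow global rate

for step in range(3):                          # outer loop: gradual decoder refinement
    x = torch.randint(0, vocab, (8, seq_len))  # stand-in minibatch of token ids
    z = torch.zeros(8, d_latent, requires_grad=True)
    fast_opt = torch.optim.Adam([z], lr=1e-1)  # fast local rate, fresh per batch
    for _ in range(20):                        # inner loop: inference-time computation
        loss = nll(decoder(x, z), x) + 0.5 * (z ** 2).mean()  # Gaussian prior on z
        fast_opt.zero_grad()
        loss.backward()
        fast_opt.step()
    slow_opt.zero_grad()                       # clear grads accumulated on decoder
    nll(decoder(x, z.detach()), x).backward()
    slow_opt.step()
```

In this sketch the inner loop plays the role of per-batch posterior inference, so allocating it more steps is one way to spend additional computation at inference time.
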
Inverse Attention Agent in Multi-Agent System
ICLR 2025
We introduce Inverse Attention Agents, leveraging Theory of Mind concepts through an attention mechanism to enable adaptability in dynamic multi-agent environments. These agents infer the goals and attentional states of other agents, refining their attention weights for improved decision-making. Tested across cooperative, competitive, and mixed tasks, our approach enhances performance and human-like cooperation compared to conventional models.
A Minimalistic Representation Model for Head Direction System
CogSci 2025
We propose a minimalistic representational model for the Head Direction (HD) system, a crucial component of spatial navigation in mammals. Our model leverages the symmetry of the rotation group U(1) and the inherent circular geometry of the head direction. We develop fully connected and convolutional versions of the model, both aiming to learn a high-dimensional representation of head direction that captures essential properties of HD cells.
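
The U(1) structure can be made concrete with a small sketch (an illustration only; the paper learns its representation, whereas the frequencies and blocks here are assumed): a head-direction code built from 2D rotation blocks, where a head turn acts on the code as a block rotation.

```python
# Sketch: a U(1)-equivariant head-direction code. Each 2D block rotates at
# its own frequency, so a head turn acts on the code as a block rotation.
import numpy as np

freqs = [1, 2, 3, 5]  # assumed frequencies; the paper learns its representation

def rot(a):
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

def code(theta):
    """Concatenate unit 2D vectors rotating at multiple frequencies."""
    return np.concatenate([rot(f * theta) @ np.array([1.0, 0.0]) for f in freqs])

def turn(u, dtheta):
    """Apply a head turn directly in code space via block rotations."""
    out = u.copy()
    for k, f in enumerate(freqs):
        out[2*k:2*k+2] = rot(f * dtheta) @ u[2*k:2*k+2]
    return out

theta, d = 0.7, 0.3
print(np.allclose(code(theta + d), turn(code(theta), d)))  # True: equivariance
```

The printed check confirms the defining equivariance: rotating the angle and rotating the code give the same vector.
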
Latent Plan Transformer: Planning as Latent Variable Inference
NeurIPS 2024
We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent space to connect a Transformer-based trajectory generator and the final return. This architecture enables planning without step-wise rewards, addressing temporal consistency challenges in long-term tasks. LPT uses maximum likelihood estimation on trajectory-return pairs, with posterior sampling of latent variables for consistent sub-trajectory abstraction. During inference, LPT deduces the latent variable based on expected returns, realizing a planning-as-inference approach.
Intention beyond desire: Spontaneous intentional commitment regulates conflicting desires
Cognition, 2023
We explore how coherent actions emerge from conflicting desires, contrasting classical desire-driven behavior with intention-driven action. Through 2D navigation games, we identify three unique markers of human intentional commitment—goal perseverance, self-binding, and temporal leap—that distinguish human actions from purely desire-driven agents. Our findings suggest that humans form committed intentions to manage conflicting desires, enhancing predictability and reducing computational load in action planning.
Sharing rewards undermines coordinated hunting
Journal of Computational Biology, 2022
We investigate the impact of reward sharing on coordinated hunting using Multi-agent Reinforcement Learning (MARL) and reveal surprising findings: rather than facilitating coordination, sharing rewards undermines it due to issues like the free-rider problem and coordination limits at larger group sizes. Individually rewarded agents outperform those sharing rewards, particularly in challenging scenarios. Our results suggest that reward sharing may not be crucial for animal coordination, challenging assumptions in AI models that rely on shared rewards to motivate group coordination.
Exploring an imagined “we” in human collective hunting: Joint commitment within shared intentionality
CogSci 2022
We examine human collaboration in goal selection, demonstrating that shared intentionality allows humans to form robust commitments to collective goals without communication. In a real-time cooperative hunting game, humans maintained high-quality cooperation even with many targets. We develop a Bayesian "Imagined We" (IW) model that mirrors this behavior, outperforming a Reward Sharing (RS) model that struggles with coordination as target numbers rise. These findings highlight shared intentionality as central to human cooperation, offering insights into its computational basis.
Modeling communication to coordinate perspectives in cooperation
CogSci 2021
We introduce the Imagined We for Communication framework, a model where agents leverage shared agency to interpret overloaded signals in ambiguous contexts. By simulating rational cooperators, our model demonstrates strong performance in high-ambiguity settings, even with minimal reasoning depth, underscoring how shared knowledge and cooperative logic support effective communication.
Bootstrapping an Imagined We for Cooperation
CogSci 2020
We develop a framework based on Bayesian Theory of Mind, named the Imagined We (IW), showing how agents can reliably converge on a joint intention in uncertain, multi-choice settings through bootstrapping. In a real-time cooperative hunting task, our model proves resilient to challenges like numerous choices, approximate partner models, and noisy perception, highlighting its robustness in maintaining joint commitment under imperfect conditions.

 

Teaching