From perception to action
We connect visual evidence, language, and control for reliable manipulation and embodied decision-making in physical workspaces.
The Multimodal Intelligence Lab (MILab) is a University of Washington research lab based at UW Tacoma that studies how AI systems learn, reason, and act from multimodal evidence in open-world environments. We develop multimodal foundation models, embodied AI, and world models that connect perception, reasoning, planning, and action for robust real-world decision-making.
We study how agents use memory, goals, and prediction to navigate and interact in human environments over longer horizons.
We model social context and spatial evidence so AI agents can act safely around people in shared spaces.
We build predictive models that connect instructions, observations, and possible future states for embodied planning.
We use memory and simulation to help agents compare possible futures before acting in dynamic scenes.
We design AI systems for mobility, public safety, and secure decision support under real-world constraints and operational risks.
Our research spans multimodal learning, embodied intelligence, robotics, mobility intelligence, and responsible AI, with applications to public safety and human-centered systems.
Multimodal models for perception, reasoning, learning, and generation across language, vision, video, audio, and structured signals.
Embodied agents for navigation, interaction, planning, memory, and control in human-aware physical and simulated environments.
Real-world systems for mobility, safety, sensing, monitoring, and decision-making with secure and responsible deployment.
Fengyi Wu received GSFEI recognition for research excellence.
Sign-language survey and GoVIG navigation papers accepted to ACL 2026.
Lossless hierarchical decoding work accepted as an ICLR 2026 oral paper.
Congratulations to former visiting student Yuxuan Zhou. Our paper MaxSup: Overcoming Representation Collapse in Label Smoothing was accepted for an oral presentation at NeurIPS 2025.
Yifei Dong recognized for world-model navigation research.
Securing the Skies received workshop Best Paper recognition.
arXiv preprint, 2026
arXiv preprint, 2026
NeurIPS 2024 Spotlight
NeurIPS 2024
ICLR 2026 Oral
NeurIPS 2025 Oral
MILab welcomes prospective Ph.D. students, postdoctoral researchers, UW students from the Seattle, Tacoma, and Bothell campuses, and collaborators with focused research interests.