Ph.D. candidate in Computer Science at the Institute for Advanced Study, Tsinghua University. Research interests lie at the intersection of robotics and multimodal intelligence, with a focus on embodied AI, spatial reasoning, and vision-language models for robotic understanding.