Research Direction

LLM / VLM / VLA

My primary research focus is on multi-agent interaction, self-improving AI, continual learning, and efficient model memory across language, vision-language, and vision-language-action models. I study how multiple agents collaborate, generate synthetic experience, maintain knowledge over time, and manage memory efficiently—creating systems that get smarter through interaction and can operate within real-world hardware constraints.

Key Research Topics

Multi-Agent Interaction

How multiple LLM/VLM agents collaborate, debate, verify, and refine each other's outputs. Research on agent orchestration, role specialization, emergent communication protocols, and multi-agent self-play for improving reasoning and task decomposition.
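The propose-critique-refine pattern described above can be sketched with stub agents standing in for LLM calls; the agent functions, critique text, and acceptance rule here are all illustrative placeholders, not a real system.

```python
from typing import Optional

def proposer(task: str) -> str:
    """Drafts an initial answer (stub for an LLM call)."""
    return f"draft answer to: {task}"

def critic(answer: str) -> Optional[str]:
    """Returns a critique, or None if the answer passes verification (stub)."""
    return "missing justification" if "justified" not in answer else None

def refiner(answer: str, critique: str) -> str:
    """Revises the answer to address the critique (stub)."""
    return f"{answer} [justified: addressed '{critique}']"

def debate(task: str, max_rounds: int = 3) -> str:
    """Runs proposer -> critic -> refiner until the critic accepts."""
    answer = proposer(task)
    for _ in range(max_rounds):
        critique = critic(answer)
        if critique is None:  # verifier accepts the answer
            break
        answer = refiner(answer, critique)
    return answer
```

Role specialization here is just a function boundary; in a real pipeline each role would be a separately prompted (or separately fine-tuned) model.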

Self-Improving AI

Systems that generate their own training signal through synthetic data, self-reflection, and iterative refinement. Investigating feedback loops where agents evaluate their own outputs, generate preference pairs, and continuously improve without human annotation.
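One concrete form of this feedback loop is turning self-evaluation into DPO-style preference pairs. The sketch below is a hypothetical minimal version: `generate` and `score` are stubs for an LLM sampler and a self-evaluator, and the ranking-by-score policy is an assumption, not a fixed recipe.

```python
import random

def generate(prompt: str, n: int = 4) -> list:
    """Stub sampler: a real system would draw n responses from the model."""
    return [f"{prompt} :: candidate {i}" for i in range(n)]

def score(response: str) -> float:
    """Self-evaluation stub; a real system would prompt the model to judge."""
    return random.random()

def make_preference_pair(prompt: str) -> dict:
    """Best-scored candidate becomes 'chosen', worst becomes 'rejected'."""
    ranked = sorted(generate(prompt), key=score, reverse=True)
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}

pair = make_preference_pair("explain KV caching")
```

Collecting many such pairs yields a preference dataset with no human annotation, which is exactly the self-improvement signal the paragraph describes.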

Vision-Language-Action (VLA)

Unified models that perceive (vision), reason (language), and act (control). Research on grounding language in embodied environments, action prediction from multimodal inputs, and bridging the sim-to-real gap for robotic and interactive agents.

Pretraining & Post-Training

Full model lifecycle from large-scale pretraining, through supervised fine-tuning (SFT), to post-training alignment with RLHF/DPO. Focus on how each stage contributes to multi-agent capability and self-improvement potential.

Agent Orchestration Frameworks

Building scalable frameworks for multi-agent pipelines — task routing, tool use, memory systems, and self-correction loops. How to design agent architectures that are reliable, composable, and can scale from single tasks to complex workflows.
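A minimal sketch of two of these pieces, task routing and a self-correction loop, assuming stub agents and a placeholder validator (the agent names, routing rule, and acceptance check are illustrative, not a real framework API):

```python
from typing import Callable, Dict

# Registry of specialist agents (stubs for routed LLM calls).
AGENTS: Dict[str, Callable[[str], str]] = {
    "math":   lambda t: f"[math agent] solving: {t}",
    "search": lambda t: f"[search agent] results for: {t}",
}

def route(task: str) -> Callable[[str], str]:
    """Toy router: numeric tasks go to the math agent, the rest to search."""
    key = "math" if any(c.isdigit() for c in task) else "search"
    return AGENTS[key]

def validate(output: str) -> bool:
    """Stub acceptance check; a real validator might be another agent."""
    return output.startswith("[")

def run(task: str, max_retries: int = 2) -> str:
    """Route, execute, and retry until the validator passes."""
    agent = route(task)
    for _ in range(max_retries + 1):
        output = agent(task)
        if validate(output):
            return output
    raise RuntimeError("all retries failed")
```

Keeping routing, execution, and validation as separate composable functions is what lets a pipeline like this scale from one task to a larger workflow graph.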

Continual Learning

Enabling LLM/VLM/VLA agents to learn new knowledge, skills, and domains over time without catastrophic forgetting. Developing replay-free and parameter-efficient continual learning methods so deployed agents can adapt and grow rather than remain frozen.
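The parameter-efficient idea can be illustrated with a toy low-rank adapter in the spirit of LoRA-style methods: the base weights stay frozen (so old knowledge is untouched) and only a small delta is trained per task. Pure-Python lists keep the sketch dependency-free; the shapes and rank are illustrative.

```python
def matmul(X, Y):
    """Naive matrix multiply over nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def add(X, Y):
    """Elementwise matrix addition."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Frozen 2x2 base weights (never updated, so prior knowledge is preserved).
W = [[1.0, 0.0],
     [0.0, 1.0]]

# Rank-1 adapter: only these 4 numbers would be trained for the new task.
A = [[0.5], [0.5]]   # 2x1
B = [[1.0, -1.0]]    # 1x2

# Effective weights for the new task: base plus low-rank delta.
W_adapted = add(W, matmul(A, B))
```

Because the delta is a separate small set of parameters, it can be swapped per task or domain, which is one common replay-free route around catastrophic forgetting.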

Efficient Model Memory

How models store, compress, retrieve, and forget information efficiently. Research on KV-cache optimization, memory-augmented architectures, retrieval-augmented generation, episodic memory for agents, and parameter-efficient representations that maximize knowledge per byte of VRAM.
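The store/retrieve/forget cycle can be sketched as a byte-budgeted episodic memory: write episodes, retrieve by a crude token-overlap score, and evict least-recently-used entries when over budget. The overlap scoring and LRU policy are simple stand-ins for learned embeddings and KV-cache eviction heuristics.

```python
from collections import OrderedDict

class EpisodicMemory:
    """Toy episodic store with a byte budget and LRU eviction."""

    def __init__(self, budget_bytes: int = 200):
        self.budget = budget_bytes
        self.store = OrderedDict()  # key -> episode text

    def write(self, key: str, text: str) -> None:
        self.store[key] = text
        self.store.move_to_end(key)  # mark as most recently used
        # Forget oldest entries until the memory fits its byte budget.
        while sum(len(v.encode()) for v in self.store.values()) > self.budget:
            self.store.popitem(last=False)

    def read(self, query: str, k: int = 1) -> list:
        """Return the k episodes with the highest token overlap."""
        q = set(query.lower().split())
        scored = sorted(self.store.values(),
                        key=lambda v: len(q & set(v.lower().split())),
                        reverse=True)
        return scored[:k]
```

Swapping the overlap score for real embeddings turns this into retrieval-augmented memory; swapping the byte budget for a token budget makes it a rough analogue of KV-cache eviction.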

Related Publications

Featured

Multimodal Synthetic Data Finetuning and Model Collapse

Zizhao Hu et al.

2025, ACM International Conference on Multimodal Interaction (ICMI)