Accepted
Bias-Restrained Prefix Representation Finetuning for Mathematical Reasoning
Sirui Liang, Pengfei Cao, Jian Zhao, Cong Huang, Jun Zhao, Kang Liu
AAAI 2026
[code]
LLM-SMAC: Solving Multi-Agent Decision-Making Tasks via LLM Decision Tree Code Generation
Yue Deng, Weiyu Ma, Yuxin Fan, Ruyi Song, Yin Zhang, Haifeng Zhang, Jian Zhao
AAMAS 2026
Sim2Sea: Sim-to-Real Policy Transfer for Maritime Vessel Navigation in Congested Waters
Xinyu Cui, Xuanfa Jin, Xue Yan, Yongcheng Zeng, Luoyang Sun, Wei Siying, Ruizhi Zhang, Jian Zhao, Haifeng Zhang, Jun Wang
AAMAS 2026
Under Review
STEMVerse: A Dual-Axis Diagnostic Framework for STEM Reasoning in Large Language Models
Xuzhao Li, Xuchen Li, Jian Zhao, Shiyu Hu
Preprint
Beyond Accuracy: Evaluating Grounded Visual Evidence in Thinking with Images
Xuchen Li, Xuzhao Li, Renjie Pi, Shiyu Hu, Jian Zhao, Jiahui Gao
Preprint
Synergizing Multi-Turn Chain-of-Thought Reasoning and Reinforcement Fine-Tuning for Detecting and Grounding Multi-Modal Manipulation
Saijie Hou, Yuan Liu, Zikang Li, Saihui Hou, Jian Zhao, Zhaofeng He
Preprint
MemeMind: A Large-Scale Multimodal Dataset with Chain-of-Thought Reasoning for Harmful Meme Detection
Hexiang Gu, Qifan Yu, Yuan Liu, Zikang Li, Saihui Hou, Jian Zhao, Zhaofeng He
Preprint
EvoCurr: Self-evolving Curriculum with Behavior Code Generation for Complex Decision-making
Yang Cheng, Zilai Wang, Weiyu Ma, Yue Deng, Wenhui Zhu, Yujing Hu, Tangjie Lv, Changjie Fan, Jian Zhao
Preprint
Learning How to Remember: A Meta-Cognitive Management Method for Structured and Transferable Agent Memory
Sirui Liang, Pengfei Cao, Jian Zhao, Wenhao Teng, Xiangwen Liao, Jun Zhao, Kang Liu
Preprint
Training LLMs to Self-Refine via Iterative Preference Optimization
Yongcheng Zeng, Xinyu Cui, Xuanfa Jin, Qirui Mi, Guoqing Liu, Zexu Sun, Mengyue Yang, Dong Li, Weiyu Ma, Ning Yang, Jian Zhao, Jianye Hao, Haifeng Zhang, Jun Wang
Preprint
LLM-SMAC: Generating Interpretable Multi-Agent Policies through Programmatic Synthesis
Yue Deng, Weiyu Ma, Xiaoxia Cheng, ZiRui Wang, Yujing Hu, Tangjie Lv, Changjie Fan, Yin Zhang, Jian Zhao
Preprint
MAMBO-G: Magnitude-Aware Mitigation for Boosted Guidance
Shangwen Zhu, Qianyu Peng, Zhilei Shu, Yuting Hu, Han Zhang, Andy Zheng, Xinyu Cui, Jian Zhao, Ruili Feng, Fan Cheng
Preprint
Pisces: Video Generation over Events via Time Interval Encoding
Zhilei Shu, Shangwen Zhu, Bo Ye, Andy Zheng, Qianyu Peng, Xinyu Cui, Xiangrui Ke, Tingting Liao, Zipeng Ji, Shucheng Huang, Yiming Li, Xiang Li, Fan Cheng, Jian Zhao, Zheng-Jun Zha
Preprint
PromptManual: Beyond Black-Box Refinement with a Taxonomy-Driven Framework for Interpretable T2I Prompt Optimization
Xingxi Yin, Yan Gao, Yicheng Li, Feng Tao, Shuxin Zheng, Jian Zhao, Cong Huang, Yue Deng, Yin Zhang
Preprint
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
Siting Wang, Xiaofeng Wang, Minnan Pei, Xinyu Cui, Cheng Deng, Zheng Zhu, Jian Zhao, Guan Huang, Haifeng Zhang, Jun Wang
Preprint
ToMAgent: Enhancing ToM-based Decision-Making in Multi-Agent Interactions via Bi-Level Self-Play
Xuanfa Jin, Xinyu Cui, Zhijian Ma, Xue Yan, Jian Zhao, Haifeng Zhang, Jun Wang
Preprint
Information-Manifold Proximal Policy Optimization
Yongcheng Zeng, Xinyu Cui, Yan Song, Guoqing Liu, Hengtong Lu, Kaike Zhang, Chen Wei, Jian Zhao, Haifeng Zhang, Jun Wang
Preprint