|
Peng-Tao Jiang
My name is Peng-Tao Jiang (姜鹏涛). Currently, I am a lead researcher at
vivo BlueImage Lab.
Before that, I was a postdoctoral researcher at Zhejiang University, working with
Prof. Chunhua Shen. I received my PhD from
Nankai University, advised by Prof. Ming-Ming Cheng.
I have also completed internships at SenseTime
and Tencent YouTu.
My recent research interests mainly focus on the following topics:
Universal scene parsing and understanding:
SDMatte (ICCV 2025), DepthMaster (arXiv 2024), MLoRE (CVPR 2024), TaskDiffusion (ICLR 2025)
Mobile image enhancement:
RRW (CVPR 2024), LAL (ICML 2025), ConsisSR (arXiv 2024)
Generative models and applications:
B-TTDM (ECCV 2024), DDAEBM (ICML 2024), ADM (ACM MM 2024)
Photography editing:
Any2Bokeh (arXiv 2025), MagicTryOn (arXiv 2025), PPC (NeurIPS 2025)
Email / CV / Google Scholar / GitHub
|
|
News
[21.04,2026]: We have released our new work, SmartPhotoCrafter, which removes the need for explicit human instructions in photo editing. Instead of relying on users to describe the desired adjustments, it autonomously identifies image quality deficiencies, reasons about improvement strategies, and generates photo-realistic results in a single tightly coupled process.
[05.04,2026]: I am serving as an external member and advisor for the AI4X team, which is led by Xiaoqi Zhao.
[01.04,2026]: Our SD-based DepthMaster has been accepted by TCSVT. Congrats to Ziyang.
[20.02,2026]: [Call for papers] We are holding the 2nd international workshop on vision intelligence for real-world challenges at CVPR 2026.
[20.02,2026]: Four papers have been accepted by CVPR 2026.
[21.11,2025]: Four papers have been accepted by ICLR 2026.
[21.11,2025]: Two papers have been accepted as oral presentations at AAAI 2026.
[21.11,2025]: One paper has been accepted by TCSVT.
[08.11,2025]: Two oral papers have been accepted by AAAI 2026.
[08.24,2025]: Three papers have been accepted by NeurIPS 2025.
[08.24,2025]: One paper has been accepted by TPAMI.
[06.26,2025]: Three papers have been accepted by ICCV 2025.
|
Hiring
We are looking for self-motivated interns to work on the following topics:
- instruction-based image editing
- unified model for image understanding and generation
- MLLM for aesthetic understanding
If you are interested in joining our group, please send your resume to pt.jiang at vivo.com or pt.jiang at mail.nankai.edu.cn.
|
Research
* denotes equal contributions, # denotes corresponding authors.
|
|
|
MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration
Guangyuan Li, Bo Li, Jinwei Chen, Xiaobin Hu, Lei Zhao#, Peng-Tao Jiang#
arXiv, 2026
paper
/code
/project
|
|
|
CameraMaster: Unified Camera Semantic-Parameter Control for Photography Retouching
Qirui Yang, Yang Yang, Ying Zeng, Xiaobin Hu, Bo Li, Huanjing Yue, Jingyu Yang#, Peng-Tao Jiang#
arXiv, 2025
paper
/code
/project
|
|
|
Q-Ponder: A Unified Training Pipeline for Reasoning-based Visual Quality Assessment
Zhuoxuan Cai, Jian Zhang, Xinbin Yuan, Peng-Tao Jiang, Wenxiang Chen, Bowen Tang, Lujian Yao, Qiyuan Wang, Jinwei Chen, Bo Li#
arXiv, 2025
paper
/code
/project
|
|
|
MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on
Guangyuan Li*, Siming Zheng*, Hao Zhang, Jinwei Chen, Junsheng Luan, Binkai Ou, Lei Zhao#, Bo Li, Peng-Tao Jiang#
arXiv, 2025
paper
/code
/project
|
|
|
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
Ziyang Song*, Zerong Wang*, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang#, Tianzhu Zhang#
TCSVT, 2026
paper
/code
/project
|
|
|
C2FG: Control Classifier-Free Guidance via Score Discrepancy Analysis
Jiayang Gao, Tianyi Zheng, Jiayang Zou, Fengxiang Yang, Shice Liu, Luyao Fan, Zheyu Zhang, Hao Zhang, Jinwei Chen, Peng-Tao Jiang, Bo Li, Jia Wang
ICLR, 2026
paper
|
|
|
Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling
Yuan Wang, Yuhao Wan, Siming Zheng, Bo Li, Qibin Hou, Peng-Tao Jiang#
ICLR, 2026
paper
/code
|
|
|
Any-to-Bokeh: Arbitrary-Subject Video Refocusing with Video Diffusion Model
Yang Yang*, Siming Zheng*, Jinwei Chen, Boxi Wu#, Xiaofei He, Deng Cai, Bo Li, Peng-Tao Jiang#
ICLR, 2026
paper
/code
/project
|
|
|
TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation
Hongyu Zhang, Yufan Deng, Zilin Pan, Peng-Tao Jiang, Bo Li, Qibin Hou, Zhen Dong, Zhiyang Dou, Daquan Zhou
ICLR, 2026
paper
/code
|
|
|
I-DRUID: Layout to image generation via instance-disentangled representation and unpaired data
Fengxiang Yang, Tianyi Zheng, Bangjie Yin, Shice Liu, Peng-Tao Jiang, Jinwei Chen, Bo Li#
ICLR, 2026
paper
|
|
|
Realism Control One-step Diffusion for Real-World Image Super-Resolution
Zongliang Wu*, Siming Zheng*, Peng-Tao Jiang#, Xin Yuan#
AAAI, 2026, ORAL
paper
/code
/project
|
|
|
Bidirectional Noise Injection: Enhancing Diffusion Models via Coordinated Input-Output Perturbation
Tianyi Zheng, Jiayang Gao, Peng-Tao Jiang, Fengxiang Yang, Ben Wan, Hao Zhang, Jinwei Chen, Jia Wang, Bo Li
AAAI, 2026, ORAL
paper
|
|
|
Towards Natural Image Matting in the Wild via Real-Scenario Prior
Ruihao Xia, Yu Liang, Peng-Tao Jiang#, Hao Zhang, Qianru Sun, Yang Tang#, Bo Li, Pan Zhou
TCSVT, 2025
paper
/code
|
|
|
Photography Perspective Composition: Towards Aesthetic Perspective Recommendation
Lujian Yao*, Siming Zheng*, Xinbin Yuan, Zhuoxuan Cai, Pu Wu, Jinwei Chen, Bo Li, Peng-Tao Jiang#
NeurIPS, 2025
paper
/code
/project
|
|
|
Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
Xinbin Yuan, Jian Zhang, Kaixin Li, Zhuoxuan Cai, Lujian Yao, Jie Chen, Enguang Wang, Qibin Hou, Jinwei Chen, Peng-Tao Jiang, Bo Li
NeurIPS, 2025
paper
|
|
|
Learning Differential Pyramid Representation for Tone Mapping
Qirui Yang, Yinbo Li, Peng-Tao Jiang, Qihua Cheng, Biting Yu, Yihao Liu, Huanjing Yue, Jingyu Yang
NeurIPS, 2025
paper
/demo
|
|
|
Bidirectional Beta-Tuned Diffusion Model
Tianyi Zheng, Jiayang Zou, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Jia Wang, Bo Li
TPAMI, 2025
paper
|
|
|
DSDNet: Raw Domain Demoireing via Dual Color-Space Synergy
Qirui Yang, Fangpu Zhang, Yeying Jin, Qihua Cheng, Peng-Tao Jiang, Huanjing Yue, Jingyu Yang
ACM MM, 2025
paper /
demo
|
|
|
SDMatte: Grafting Diffusion Models for Interactive Matting
Longfei Huang, Yu Liang, Hao Zhang, Jinwei Chen, Wei Dong, Lunde Chen, Wanyu Liu, Bo Li, Peng-Tao Jiang#
ICCV, 2025
paper /
code
|
|
|
PGformer: Proxy-Bridged Game Transformer for Multi-Person Highly Interactive Extreme Motion Prediction
Yanwen Fang, Jintai Chen, Peng-Tao Jiang, Chao Li, Yifeng Geng, Eddy K. F. Lam, Guodong Li
ICCV, 2025
paper /
code
|
|
|
MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration
Tao Wang, Peiwen Xia, Bo Li, Peng-Tao Jiang, Zhe Kong, Kaihao Zhang, Tong Lu, Wenhan Luo
ICCV, 2025
paper /
code
|
|
|
Learning Adaptive Lighting via Channel-Aware Guidance
Qirui Yang*, Peng-Tao Jiang*#, Hao Zhang, Jinwei Chen, Bo Li, Huanjing Yue, Jingyu Yang#
ICML, 2025
paper /
demo
|
|
|
Multi-Task Dense Predictions via Unleashing the Power of Diffusion
Yuqi Yang*, Peng-Tao Jiang*, Qibin Hou#, Hao Zhang, Jinwei Chen, Bo Li
ICLR, 2025
paper /
code
|
|
|
High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
Qian Yu*, Peng-Tao Jiang*#, Hao Zhang, Jinwei Chen, Bo Li, Lihe Zhang#, Huchuan Lu
ICLR, 2025
paper /
code
|
|
|
Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning
Yuti Liu, Shice Liu, Junyuan Gao, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li#
AAAI, 2025
paper
|
|
|
Boosting Vision State Space Model with Fractal Scanning
Haoke Xiao*, Lv Tang*, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li#
AAAI, 2025, ORAL
paper
|
|
|
Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation
Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Bo Li#, Yang Tang#, Pan Zhou
NeurIPS, 2024
paper /
code
|
|
|
Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection
Lv Tang, Peng-Tao Jiang, Zhihao Shen, Hao Zhang, Jinwei Chen, Bo Li
ACM MM, 2024
paper /
code
|
|
|
Non-uniform Timestep Sampling: Towards Faster Diffusion Model Training
Tianyi Zheng, Cong Geng, Peng-Tao Jiang, Ben Wan, Hao Zhang, Jinwei Chen, Jia Wang#, Bo Li#
ACM MM, 2024
paper
|
|
|
Beta-Tuned Timestep Diffusion Model
Tianyi Zheng, Peng-Tao Jiang, Ben Wan, Hao Zhang, Jinwei Chen, Jia Wang#, Bo Li#
ACM MM, 2024
paper
|
|
|
Towards Training-free Open-world Segmentation via Image Prompt Foundation Models
Lv Tang*, Peng-Tao Jiang*, Haoke Xiao*, Bo Li#
IJCV, 2024
paper
|
|
|
Improving Adversarial Energy-Based Model via Diffusion Process
Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li
ICML, 2024
paper
|
|
|
Revisiting Single Image Reflection Removal In the Wild
Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li
CVPR, 2024
paper
/code
|
|
|
Multi-Task Dense Prediction via Mixture of Low-Rank Experts
Yuqi Yang*, Peng-Tao Jiang*, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li
CVPR, 2024
paper
/code
|
|
|
Traffic Scene Parsing through the TSP6K Dataset
Peng-Tao Jiang*, Yuqi Yang*, Yang Cao, Qibin Hou, Ming-Ming Cheng, Chunhua Shen
CVPR, 2024
paper
/code
/dataset [password: Wi9qFT]
|
|
|
Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections
Jiaxiong Qiu, Peng-Tao Jiang, Yifan Zhu, Ze-Xin Yin, Ming-Ming Cheng, Bo Ren
CVPR, 2023
paper
/code
|
|
|
RDNeRF: Relative Depth Guided NeRF for Dense Free View Synthesis
Jiaxiong Qiu*, Yifan Zhu*, Peng-Tao Jiang, Ming-Ming Cheng, Bo Ren
TVC, 2023
paper
/code
|
|
|
Deeply Explain CNN via Hierarchical Decomposition
Ming-Ming Cheng*, Peng-Tao Jiang*, Ling-Hao Han, Liang Wang, Philip Torr
IJCV, 2023
paper
/demo
|
|
|
L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation
Peng-Tao Jiang, Yuqi Yang, Qibin Hou, Yunchao Wei
CVPR, 2022
paper
/code
|
|
|
Attention mechanisms in computer vision: A survey
Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R. Martin, Ming-Ming Cheng, and Shi-Min Hu
CVMJ, 2022
paper
/code
/Best Paper Award
|
|
|
Personalized Image Semantic Segmentation
Yu Zhang, Chang-bin Zhang, Peng-Tao Jiang, Feng Mao, Ming-Ming Cheng
ICCV, 2021
paper
/code
|
|
|
Online Attention Accumulation for Weakly Supervised Semantic Segmentation
Peng-Tao Jiang*, Ling-Hao Han*, Qibin Hou, Ming-Ming Cheng, Yunchao Wei
TPAMI, 2021
paper
/code
|
|
|
Delving Deep into Label Smoothing
Chang-bin Zhang*, Peng-Tao Jiang*, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng
TIP, 2021
paper
/code
|
|
|
LayerCAM: Exploring Hierarchical Class Activation Maps for Localization
Peng-Tao Jiang*, Chang-bin Zhang*, Qibin Hou, Ming-Ming Cheng, Yunchao Wei
TIP, 2021
paper
/code
|
|
|
Integral Object Mining via Online Attention Accumulation
Peng-Tao Jiang, Qibin Hou, Yang Cao, Ming-Ming Cheng, Yunchao Wei, Hongkai Xiong
ICCV, 2019
paper
/code
/project
|
|
|
Self-Erasing Network for Integral Object Attention
Qibin Hou, Peng-Tao Jiang, Yunchao Wei, Ming-Ming Cheng
NeurIPS, 2018
paper
/code
|
|
|
DEL: Deep Embedding Learning for Efficient Image Segmentation
Yun Liu, Peng-Tao Jiang, Vahan Petrosyan, Shi-Jie Li, Jiawang Bian, Le Zhang, and Ming-Ming Cheng
IJCAI, 2018
paper
/code
|
- vivo 2023.05-now, Lead Researcher
Develop new AI algorithms for mobile photography.
- Tencent YouTu 2022.02-2022.05, Research Intern
Research: palm keypoint detection and object detection.
- SenseTime 2018.11-2019.04, Research Intern
Research: change detection.
|
- Reviewer for TPAMI, TIP, CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, MM, etc.
|
|