ProEdit : Inversion-based Editing From Prompts Done Right

Dec 2025
(* equal contributions, † corresponding authors)
1 Sun Yat-sen University    2 CUHK MMLab   
3 College of Computing and Data Science, Nanyang Technological University   
4 The University of Hong Kong   
5 Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China
Teaser.

Overview of ProEdit. We propose a highly accurate, plug-and-play editing method for flow inversion that addresses the problem of excessive source image information injection, which prevents proper modification of attributes such as pose, number, and color. Our method demonstrates strong performance on both image editing and video editing tasks.

Video

Abstract

Inversion-based visual editing provides an effective and training-free way to edit an image or a video according to user instructions. Existing methods typically inject source image information during the sampling process to maintain editing consistency. However, this sampling strategy relies too heavily on the source information, which degrades the edits in the target image (e.g., failing to change the subject's attributes such as pose, number, or color as instructed). In this work, we propose ProEdit to address this issue at both the attention and the latent levels. At the attention level, we introduce KV-mix, which mixes the KV features of the source and the target within the edited region, mitigating the influence of the source image on the edited region while maintaining background consistency. At the latent level, we propose Latents-Shift, which perturbs the edited region of the source latent, eliminating the influence of the inverted latent on sampling. Extensive experiments on several image and video editing benchmarks demonstrate that our method achieves SOTA performance. In addition, our design is plug-and-play and can be seamlessly integrated into existing inversion and editing methods such as RF-Solver, FireFlow, and UniEdit.
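To make the KV-mix idea concrete, here is a minimal PyTorch sketch, assuming keys/values laid out as (batch, tokens, dim), a per-token boolean edit mask, and a hypothetical mixing weight `alpha`; it illustrates the mechanism described above, not the paper's exact implementation.

```python
import torch

def kv_mix(k_src, v_src, k_tgt, v_tgt, edit_mask, alpha=0.5):
    """Illustrative KV-mix (assumed semantics): inside the edited region,
    blend source and target key/value features; outside it, inject the
    source features unchanged so the background stays consistent.

    k_*/v_*: (batch, tokens, dim) attention keys/values
    edit_mask: (tokens,) bool, True where the edit applies
    alpha: assumed weight of the target features in the edited region
    """
    m = edit_mask.view(1, -1, 1).to(k_src.dtype)  # broadcast to (1, tokens, 1)
    k = m * (alpha * k_tgt + (1 - alpha) * k_src) + (1 - m) * k_src
    v = m * (alpha * v_tgt + (1 - alpha) * v_src) + (1 - m) * v_src
    return k, v
```

Blending rather than fully replacing the source KV in the edited region is what lets the target prompt take effect there, while the untouched source features elsewhere preserve the background.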

Excessive source image information injection phenomenon in RF-Solver

We validate this phenomenon by visualizing the attention from the source and target text tokens to the visual tokens at the initial step and during the sampling stage. In RF-Solver, the attention from the source text token to the visual tokens remains higher than that from the target text token. After removing the attention injection, however, the attention from "black" and "orange" to the visual tokens returns to similar levels, but some subject attributes (e.g., pose) change accordingly.
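As a rough sketch of the kind of probe behind this visualization: assuming access to the softmaxed attention weights of a joint text+visual block as a (heads, query, key) tensor with visual tokens occupying the last key positions (both assumptions of ours), the per-token map could be read out as follows.

```python
import torch

def text_to_visual_attention(attn, text_token_idx, num_visual_tokens):
    # attn: (heads, q_len, k_len) softmaxed attention weights from a
    # joint text+visual attention block (layout assumed, see above).
    # Select the query row of one text token (e.g., "black" or "orange")
    # restricted to the visual-token keys, then average over heads.
    row = attn[:, text_token_idx, -num_visual_tokens:]  # (heads, vis)
    return row.mean(dim=0)  # head-averaged attention map over visual tokens
```

Comparing this map for the source-prompt token against the target-prompt token across steps is what surfaces the imbalance described above.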

Pipeline of ProEdit

Pipeline of our ProEdit. The mask extraction module identifies the edited region based on source and target prompts during the first inversion step. After obtaining the inverted noise, we apply Latents-Shift to perturb the initial distribution in the edited region, reducing source image information. In selected sampling steps, we fuse source and target attention features in the edited region while directly injecting source features in non-edited regions to achieve accurate attribute editing and background preservation simultaneously.
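For the Latents-Shift step, a hedged sketch follows, under our assumption that it interpolates the inverted latent toward fresh Gaussian noise inside the edit mask; the `strength` parameter and the function signature are illustrative, not the paper's exact formulation.

```python
import torch

def latents_shift(z_inv, edit_mask, strength=0.5, generator=None):
    """Illustrative Latents-Shift (assumed semantics): perturb the inverted
    latent only in the edited region so sampling there is less anchored to
    the source image, while the rest of the latent is left untouched.

    z_inv: inverted latent, e.g. (batch, tokens, dim)
    edit_mask: bool mask broadcastable to z_inv, True in the edited region
    strength: in [0, 1]; 0 keeps the inverted latent, 1 replaces it with noise
    """
    noise = torch.randn(z_inv.shape, dtype=z_inv.dtype,
                        device=z_inv.device, generator=generator)
    m = edit_mask.to(z_inv.dtype)
    shifted = (1 - strength) * z_inv + strength * noise
    return m * shifted + (1 - m) * z_inv
```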

Text-driven Image / Video Editing

FLUX 🎨 Flow-based Image Generation Model


Background: Trees ➡ Mountain

− Mint Leaves

+ Reading Book

Umbrella

Bench ➡ Sofa

Tiger ➡ Cat

Cat ➡ Fox

Shirt ➡ Sweater

Cat ➡ Dog

HunyuanVideo 🎥 Flow-based Video Generation Model


+ Roof Rack

+ Crown

Red Car ➡ Black Car

Deer ➡ Cow


Editing By Instruction

Qualitative results of image editing driven by editing instructions. To lower the barrier to using our method and make it more user-friendly, we introduce a large language model, Qwen3-8B, to enable editing from natural-language instructions. The actual input instruction is shown above each source image and its corresponding edited image.

BibTeX

If you find our work useful, please consider citing our paper:

@misc{ouyang2025proedit,
  title={ProEdit: Inversion-based Editing From Prompts Done Right},
  author={Ouyang, Zhi and Zheng, Dian and Wu, Xiao-Ming and Jiang, Jian-Jian and Lin, Kun-Yu and Meng, Jingke and Zheng, Wei-Shi},
  year={2025},
  eprint={2512.22118},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.22118}
}