We propagate user-provided, single-view 2D edits to the multi-view representation of a 3D asset. This enables fast, mask-free, high-fidelity, and consistent 3D editing with intuitive control.
Teaser comparisons: source view vs. edited view for five example objects.
We present EditP23, a method for mask-free 3D editing that propagates 2D image edits to multi-view representations in a 3D-consistent manner. In contrast to traditional approaches that rely on text-based prompting or explicit spatial masks, EditP23 enables intuitive edits by conditioning on a pair of images: an original view and its user-edited counterpart. These image prompts guide an edit-aware flow in the latent space of a pre-trained multi-view diffusion model, coherently propagating the edit across views. Operating in a feed-forward manner, without per-edit optimization, our method preserves the object's identity in both structure and appearance. We demonstrate its effectiveness across diverse object categories and editing scenarios, achieving high fidelity to the source object without requiring masks.
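The abstract describes the mechanism only at a high level. As a rough illustration of the general idea, the sketch below shows how a source/edited conditioning pair could steer a pre-trained multi-view diffusion model during a single feed-forward denoising pass. It is not the authors' implementation: the model interface (encode_image, denoise_step, scheduler_step, timesteps) and the edit_strength parameter are hypothetical placeholders standing in for whatever backbone and API are actually used.

```python
# Conceptual sketch only: propagate a single-view 2D edit to multiple views
# with a pre-trained multi-view diffusion model. All interfaces below are
# hypothetical placeholders, not EditP23's actual code.
import torch


@torch.no_grad()
def propagate_edit(model, src_view, edited_view, num_views=6, num_steps=50,
                   edit_strength=1.0):
    """Feed-forward edit propagation: no per-edit optimization.

    src_view, edited_view: (3, H, W) tensors forming the conditioning pair.
    Returns latents for `num_views` novel views that reflect the edit.
    """
    # Encode both conditioning images; their difference acts as an
    # edit signal in the model's latent space (assumed mechanism).
    z_src = model.encode_image(src_view)
    z_edit = model.encode_image(edited_view)
    edit_direction = z_edit - z_src

    # Start multi-view generation from shared noise so structure and
    # identity stay aligned across the source and edited branches.
    latents = torch.randn(num_views, *z_src.shape)

    for t in model.timesteps(num_steps):
        # Denoise once conditioned on the source view (identity anchor)
        # and once with the edit signal injected into the condition.
        eps_src = model.denoise_step(latents, t, condition=z_src)
        eps_edit = model.denoise_step(latents, t,
                                      condition=z_src + edit_direction)

        # Steer the multi-view prediction toward the edited appearance
        # while keeping the source prediction as the reference.
        eps = eps_src + edit_strength * (eps_edit - eps_src)
        latents = model.scheduler_step(eps, t, latents)

    return latents
```

In this sketch the blend of the two noise predictions plays the role of the "edit-aware flow": the source-conditioned branch preserves identity, and the offset toward the edited condition carries the user's change consistently into every generated view.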
Results of our method across diverse object categories. Each block compares a source object (top) with its edited versions (below). The leftmost column shows the conditioning views (source and target) used to prompt the edit, while the remaining columns present novel viewpoints of the result. Our approach consistently applies the desired edit while preserving the object’s structure and identity across all viewpoints.
Panel labels: Original → Pixar style / Wings; Original → Old / Zombie; Original → Fantasy; Original → Donut; Original → Tail; Original → Oreo.
Additional edited results: Cartoonish car, Car with wings, Fox with open eyes, Golden R2D2, Robot with sunglasses, Terrier with beanie, Batman with backpack, Grogu with red robe, Fox with tuxedo, Gothic cathedral, Terrier with Paddington's hat, Vespa, Grogu in the-force pose, Superman in Superman pose, Batmobile, Lego figure of Grogu.