We propagate user-provided, single-view 2D edits to the multi-view representation of a 3D asset. This enables fast, mask-free, high-fidelity, and consistent 3D editing with intuitive control.
[Teaser figure: five pairs of source views and their edited counterparts]
We present EditP23, a method for mask-free 3D editing that propagates 2D image edits to multi-view representations in a 3D-consistent manner. In contrast to traditional approaches that rely on text-based prompting or explicit spatial masks, EditP23 enables intuitive edits by conditioning on a pair of images: an original view and its user-edited counterpart. These prompts guide an edit-aware flow in the latent space of a pre-trained multi-view diffusion model, coherently propagating the edit across views. Operating in a feed-forward manner, without per-edit optimization, our method preserves the object's identity in both structure and appearance. We demonstrate its effectiveness across diverse object categories and editing tasks, achieving high fidelity to the source object without requiring masks.
Results of our method across diverse object categories. Each block compares a source object (top) with its edited versions (below). The leftmost column shows the conditioning views (source and target) used to prompt the edit, while the remaining columns present novel viewpoints of the result. Our approach consistently applies the desired edit while preserving the object's structure and identity across all viewpoints.
[Result panel labels: Original → Pixar style, Wings · Original → Old, Zombie · Original → Fantasy · Original → Donut · Original → Tail · Original → Oreo · Original → Vintage, Modern · Original → Plush, Pixar]
The EditP23 pipeline propagates your 2D edit into a full, 3D-consistent object modification. The process is designed to be intuitive, requiring only a single edited view to guide the entire 3D update. Here's how it unfolds:
The process begins with a 3D object, which is rendered into a multi-view grid (mv-grid) of six different viewpoints, along with an additional fixed prompt view.
The user can take the prompt view and modify it with any preferred 2D editing tool, such as painting or generative AI. This user-edited image becomes the target view, which guides the 3D edit.
The core of the method is a technique called "edit-aware denoising". At each denoising step, the system runs two parallel predictions within a multi-view diffusion model: one conditioned on the original prompt view and one conditioned on the user-edited target view. The difference between these predictions defines an edit-aware direction in latent space, steering the multi-view latents so the edit is propagated coherently across all views (see the sketch below).
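To make this step concrete, here is a minimal sketch of what one edit-aware denoising step could look like, assuming a diffusers-style scheduler (whose step() call returns an object with a prev_sample field) and a denoiser that takes the multi-view latents, a timestep, and a conditioning image. The names, signature, and guidance form are illustrative assumptions, not the exact formulation used by EditP23.

import torch


@torch.no_grad()
def edit_aware_step(denoiser, scheduler, latents, t, source_view, edited_view, edit_scale=1.0):
    # Two parallel predictions from the same multi-view diffusion model:
    # one conditioned on the original prompt view (identity-preserving) and
    # one conditioned on the user-edited view (edit-carrying).
    eps_source = denoiser(latents, t, cond=source_view)
    eps_edited = denoiser(latents, t, cond=edited_view)
    # Their difference acts as an edit-aware direction that steers the
    # multi-view latents toward the user's edit while keeping views consistent.
    eps = eps_source + edit_scale * (eps_edited - eps_source)
    # Standard reverse-diffusion update (diffusers-style scheduler assumed).
    return scheduler.step(eps, t, latents).prev_sample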
Once the diffusion process completes, the final edited multi-view grid is obtained. This grid is then converted into a textured 3D mesh using a reconstruction module such as InstantMesh. The output is a fully edited 3D object.
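For orientation, the overall flow might be orchestrated roughly as follows. Every helper here (render_mv_grid, apply_user_edit, init_latents, decode_latents, reconstruct_mesh) is a hypothetical placeholder for the rendering, 2D editing, latent initialization, decoding, and InstantMesh-style reconstruction stages, and edit_aware_step is the sketch shown above.

def editp23_pipeline(mesh, render_mv_grid, apply_user_edit, init_latents,
                     denoiser, scheduler, decode_latents, reconstruct_mesh,
                     num_views=6):
    # Render the source object into a six-view mv-grid plus a fixed prompt view.
    mv_grid, prompt_view = render_mv_grid(mesh, num_views)
    # The user edits only the prompt view with any preferred 2D tool.
    edited_view = apply_user_edit(prompt_view)
    # Initialize latents from the source mv-grid (hypothetical helper).
    latents = init_latents(mv_grid)
    # Feed-forward edit-aware denoising: no per-edit optimization.
    for t in scheduler.timesteps:
        latents = edit_aware_step(denoiser, scheduler, latents, t,
                                  prompt_view, edited_view)
    # Decode the edited mv-grid and reconstruct a textured mesh (e.g., InstantMesh-style).
    edited_grid = decode_latents(latents)
    return reconstruct_mesh(edited_grid)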
[Results gallery: Cartoonish car · Car with wings · Fox with open eyes · Golden R2D2 · Robot with sunglasses · Terrier with beanie · Batman with backpack · Grogu with red robe · Fox with tuxedo · Gothic cathedral · Terrier with Paddington's hat · Vespa · Grogu in the-force pose · Superman in Superman pose · Batmobile · Lego figure of Grogu]