Object manipulation in images aims not only to edit an object's appearance but also to endow objects with motion. Previous methods struggle to handle static editing and dynamic motion generation within a single framework, and fall short of realism in both object appearance and scene lighting. In this work, we introduce OMG3D, a novel framework that integrates precise geometric control with the generative power of diffusion models, achieving significant improvements in visual quality. Our framework converts 2D objects into 3D, enabling user-directed modifications and lifelike motions at the geometric level. To address texture realism, we propose CustomRefiner, a texture refinement module that pretrains a customized diffusion model to align the style and perspective of coarse renderings with the original image. Additionally, we introduce IllumiCombiner, a lighting processing module that estimates and adjusts background lighting to match human visual perception, resulting in more realistic illumination. Extensive experiments demonstrate the outstanding visual performance of our approach in both static and dynamic scenarios. Remarkably, all steps can be run on a single NVIDIA 3090 GPU. The code and project page will be released upon acceptance of the paper.
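The abstract describes a three-stage workflow: lift the 2D object to 3D for geometric editing, refine the coarse rendering's texture with a customized diffusion model (CustomRefiner), and harmonize the object's lighting with the estimated background illumination (IllumiCombiner). The skeleton below is a minimal, purely illustrative sketch of that control flow; every function and field name is an assumption for exposition, not the authors' actual API.

```python
# Hypothetical pipeline skeleton for the OMG3D workflow sketched in the
# abstract. All names (EditRequest, lift_object_to_3d, refine_texture,
# harmonize_lighting) are illustrative assumptions, not the real interface.

from dataclasses import dataclass


@dataclass
class EditRequest:
    image_path: str          # source image containing the object
    object_prompt: str       # which object to lift, e.g. "pumpkin"
    motion_prompt: str = ""  # optional motion, e.g. "jumping on a stump"


def lift_object_to_3d(req: EditRequest) -> dict:
    # Stage 1: convert the 2D object into a coarse 3D representation so
    # the user can pose, rotate, or animate it at the geometric level.
    return {"mesh": f"mesh({req.object_prompt})", "pose": "identity"}


def refine_texture(scene: dict, req: EditRequest) -> dict:
    # Stage 2 (CustomRefiner): a customized diffusion model aligns the
    # coarse rendering's style and perspective with the original image.
    scene["texture"] = f"refined({scene['mesh']})"
    return scene


def harmonize_lighting(scene: dict, req: EditRequest) -> dict:
    # Stage 3 (IllumiCombiner): estimate the background lighting and
    # relight the object so it matches the scene's illumination.
    scene["lighting"] = "background-matched"
    return scene


def run_pipeline(req: EditRequest) -> dict:
    scene = lift_object_to_3d(req)
    scene = refine_texture(scene, req)
    return harmonize_lighting(scene, req)


if __name__ == "__main__":
    result = run_pipeline(
        EditRequest("pumpkin.png", "pumpkin", "jumping on a stump"))
    print(result["lighting"])  # background-matched
```

The staged structure mirrors the abstract's claim that static edits (stages 1 and 2 alone) and dynamic motion (adding a motion prompt in stage 1) share the same geometric backbone.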
[Teaser figure] Qualitative comparison on three text descriptions — "A rotated pumpkin jumping on a stump," "An elephant is walking on the ground," and "A boxtoy greeting on the keyboard" — showing results by DynamiCrafter, Image Sculpting, Pika Video Model, SVD, and ours.