TL;DR: Our proposed Reference-based Painterly Inpainting framework (RefPaint) lets users control the strength of reference semantics and background style during inpainting. Compared to Stable Diffusion, which relies on text prompts as the reference, RefPaint captures the reference information more faithfully and generates styles consistent with the background painting.
Have you ever imagined how it would look if we placed new objects into paintings? For example, what would it look like if we placed a basketball into Claude Monet's ``Water Lilies, Evening Effect''?
We propose Reference-based Painterly Inpainting, a novel task that crosses the wild reference domain gap and implants novel objects into artworks. Although previous works have examined reference-based inpainting, they are not designed for large domain discrepancies between the target and the reference, such as inpainting an artistic image using a photorealistic reference. This paper proposes a novel diffusion framework, dubbed RefPaint, to ``inpaint more wildly'' by taking such references with large domain gaps. Built on an image-conditioned diffusion model, RefPaint introduces a ladder-side branch and a masked fusion mechanism to incorporate the inpainting mask. By decomposing the CLIP image embeddings at inference time, one can manipulate the strength of semantic and style information with ease. Experiments demonstrate that our proposed RefPaint framework produces significantly better results than existing methods. Our method enables creative painterly image inpainting with reference objects that would otherwise be difficult to achieve.
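As a rough illustration of the inference-time control mentioned above, one could blend the CLIP image embedding of the reference (carrying object semantics) with that of the background (carrying painterly style) before conditioning the diffusion model. The sketch below is only an assumption of how such a blend might look; `clip_model` (e.g., an OpenAI CLIP image encoder), `w_sem`, and `w_style` are illustrative names, not the paper's actual interface.

```python
# Minimal sketch (not the paper's exact procedure): blend reference semantics
# with background style in CLIP space to form one conditioning vector.
import torch
import torch.nn.functional as F

@torch.no_grad()
def blended_condition(clip_model, ref_image, bg_image, w_sem=1.0, w_style=0.5):
    """Combine reference semantics and background style into a single CLIP-space condition."""
    e_ref = F.normalize(clip_model.encode_image(ref_image), dim=-1)  # object semantics
    e_bg = F.normalize(clip_model.encode_image(bg_image), dim=-1)    # painterly style
    # Larger w_sem preserves the reference object; larger w_style pulls the
    # condition toward the background painting's style.
    return F.normalize(w_sem * e_ref + w_style * e_bg, dim=-1)
```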
Given an input quadruplet consisting of an object-centric reference image, a background image, and their corresponding binary masks, the goal is to inpaint the reference object into the masked region of the background. We incorporate the additional mask information through a ladder-side branch and a masked fusion block, and the framework is trained in a self-supervised manner; a sketch of the fusion step follows below.
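To make the masked fusion idea concrete, here is a minimal sketch of one way ladder-side features could be merged with the main diffusion branch under the inpainting mask. The module, its arguments, and the fusion rule are assumptions for illustration, not the actual RefPaint implementation.

```python
# Illustrative sketch only: fuse main-branch features with ladder-side
# (background) features using a resized binary inpainting mask.
import torch.nn as nn
import torch.nn.functional as F

class MaskedFusion(nn.Module):
    """Inject ladder-side background features outside the inpainting region."""

    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)  # align side features

    def forward(self, main_feat, side_feat, mask):
        # mask: (B, 1, H, W) binary map, 1 inside the region to be inpainted.
        mask = F.interpolate(mask, size=main_feat.shape[-2:], mode="nearest")
        # Keep generated content inside the mask; add background context outside it.
        return main_feat + (1.0 - mask) * self.proj(side_feat)
```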
Figure panels (left to right): Original, Reference, RefPaint, Stable Diffusion.
Visual results of our Reference-based Painterly Inpainting framework (RefPaint) using random inpainting masks and random reference objects from the COCO Captions dataset. Blue bounding boxes mark the edited regions to be inpainted; red boundaries indicate the reference objects.
If you find our work helpful, please cite:
@article{xu2023refpaint,
author = {},
title = {Reference-based Painterly Inpainting via Diffusion: Crossing the Wild Reference Domain Gap},
journal = {arXiv preprint},
year={2023}
}