Pose-guided person image synthesis in the non-iconic views
Published in IEEE Transactions on Image Processing, 2020
Generating realistic images under the guidance of reference images and human poses is challenging. Despite the success of previous work on synthesizing person images in iconic views, no effort has been made towards pose-guided image synthesis in non-iconic views. In particular, we find that previous models cannot handle this more complex task, in which the person images are captured in non-iconic views by commercially available digital cameras. To this end, we propose a new framework, the Multi-branch Refinement Network (MR-Net), which utilizes several visual cues, including target person poses, the foreground person body, and parsed scene images. Furthermore, a novel Region of Interest (RoI) perceptual loss is proposed to optimize the MR-Net. Extensive experiments on two non-iconic datasets, Penn Action and BBC-Pose, as well as an iconic dataset, Market-1501, show the efficacy of the proposed model in tackling pose-guided person image generation from non-iconic views. The data, models, and code can be downloaded from https://github.com/loadder/MR-Net.
Recommended citation: Xu, C., Fu, Y., Wen, C., Pan, Y., Jiang, Y. G., & Xue, X. (2020). Pose-guided person image synthesis in the non-iconic views. IEEE Transactions on Image Processing, 29, 9060-9072.
Download Paper