      Disentangling the latent space of GANs for semantic face editing

      research-article
      PLOS ONE
      Public Library of Science


          Abstract

          Disentanglement is a critical issue in image editing research. To perform disentangled edits on images generated by generative models, this paper presents an unsupervised, model-agnostic, two-stage editing framework. The work addresses the problem of discovering interpretable, disentangled directions for editing image attributes in the latent space of generative models. Its primary objective is to overcome two limitations of previous research: (a) the discovered editing directions, while interpretable, are significantly entangled, i.e., changing one attribute affects the others; and (b) prior work applies direction discovery and direction disentanglement separately, so the two cannot work synergistically. More specifically, the paper proposes a two-stage training method that first discovers editing directions with semantic meaning, then perturbs the dimensions of each direction vector and adjusts it with a penalty mechanism, making the editing direction more disentangled. This enables clearly distinguishable image edits, such as changing age or facial expression in face images. Compared experimentally with other methods, the proposed method outperforms them both qualitatively and quantitatively in the interpretability, disentanglement, and distinguishability of the generated images. The implementation of our method is available at https://github.com/ydniuyongjie/twoStageForFaceEdit.
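          The core idea in the abstract can be sketched numerically: edit a latent code by moving it along a direction vector, and make the direction more disentangled by penalizing (here, simply zeroing) low-contribution dimensions so the edit touches fewer latent factors. This is a minimal illustrative sketch, not the authors' algorithm: the latent code, direction, threshold, and edit strength below are all hypothetical stand-ins, with random vectors in place of a real generator's latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent code (e.g. a 512-d vector, as in StyleGAN-style
# latent spaces); random here since no generator is loaded.
latent_dim = 512
w = rng.standard_normal(latent_dim)

# Stage 1 (sketch): a candidate editing direction, assumed discovered
# in an unsupervised way; here just a random unit vector.
direction = rng.standard_normal(latent_dim)
direction /= np.linalg.norm(direction)

# Stage 2 (sketch): crude stand-in for a penalty mechanism -- zero out
# dimensions with small contributions, then renormalize, so the edit
# perturbs fewer latent dimensions.
threshold = 0.05  # hypothetical penalty strength
sparse_direction = np.where(np.abs(direction) > threshold, direction, 0.0)
sparse_direction /= np.linalg.norm(sparse_direction)

# Apply the edit: move the latent code along the sparsified direction.
alpha = 3.0  # edit strength (e.g. degree of an "age" edit)
w_edited = w + alpha * sparse_direction

print(w_edited.shape, np.count_nonzero(sparse_direction))
```

In the actual method the disentangling adjustment is learned with a penalty during a second training stage rather than applied as a fixed threshold; the sketch only shows why a sparser direction changes fewer attributes at once.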


                Author and article information

                Contributors
                Roles: Formal analysis, Methodology, Software, Validation, Writing – original draft
                Roles: Conceptualization, Methodology
                Roles: Validation, Writing – review & editing
                Role: Editor
                Journal
                PLOS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                26 October 2023
                Volume 18(10): e0293496
                Affiliations
                [1 ] School of Information Science and Technology, Northwest University, Xi’an, China
                [2 ] College of Mathematics and Computer Science, Yan’an University, Yan’an, China
                Nanchang University, CHINA
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0001-9266-1444
                Article
                PONE-D-23-07425
                DOI: 10.1371/journal.pone.0293496
                PMCID: 10602338
                PMID: 37883462
                © 2023 Niu et al.

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 19 April 2023
                Accepted: 13 October 2023
                Page count
                Figures: 6, Tables: 1, Pages: 17
                Funding
                Funded by: National Natural Science Foundation of China
                Award ID: 61731015
                Award Recipient :
                Funded by: Innovative Research Group Project of the National Natural Science Foundation of China (funder ID: http://dx.doi.org/10.13039/100014718)
                Award ID: 62271393
                Award Recipient :
                This work was supported by the National Natural Science Foundation of China (NSFC) under Grants 62271393 and 61731015, and in part by the Shanxi Provincial Key Research and Development Project under Grant 2019ZDLGY10-01. The funder provided valuable advice during the research design process and supplied the hardware and computing environment needed to run the experiments.
                Categories
                Research Article
                Research and Analysis Methods > Imaging Techniques
                Biology and Life Sciences > Anatomy > Head > Face
                Medicine and Health Sciences > Anatomy > Head > Face
                Medicine and Health Sciences > Pharmacology > Drug Research and Development > Drug Discovery
                Research and Analysis Methods > Mathematical and Statistical Techniques > Statistical Methods > Forecasting
                Physical Sciences > Mathematics > Statistics > Statistical Methods > Forecasting
                Physical Sciences > Mathematics > Algebra > Linear Algebra > Vector Spaces
                Social Sciences > Linguistics > Semantics
                Biology and Life Sciences > Neuroscience > Cognitive Science > Cognitive Psychology > Learning
                Biology and Life Sciences > Psychology > Cognitive Psychology > Learning
                Social Sciences > Psychology > Cognitive Psychology > Learning
                Biology and Life Sciences > Neuroscience > Learning and Memory > Learning
                Biology and Life Sciences > Anatomy > Ocular System > Ocular Anatomy > Eye Lens
                Medicine and Health Sciences > Anatomy > Ocular System > Ocular Anatomy > Eye Lens
                Custom metadata
                All data and code files are available from the GitHub repository ( https://github.com/ydniuyongjie/twoStageForFaceEdit). The pre-trained model StyleGAN2 with a resolution of 256×256 can be obtained from the GitHub repository ( https://github.com/rosinality/stylegan2-pytorch).

