Created by Jiali Duan (Guest Lecture) on March 1st, 2019. 2nd-year PhD student at USC, supervised by C.-C. Jay Kuo
Research Interests
A counterfeiter-police game between two components: a generator and a discriminator
Competition drives both to improve, until counterfeits are indistinguishable from genuine currency
A min-max game between generator G and discriminator D
$\min_{G}\max_{D}V(D,G) =E_{x\sim p_{data}(x)}[logD(x)] + E_{z\sim p_{z}(z)}[log(1-D(G(z)))]$
Intuitively
We've seen discriminative models so far
Generative models
Train Discriminator
Train Generator
Training algorithm
GAN sample code
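The alternating updates can be sketched as a runnable toy: a 1-D GAN with a linear generator and a logistic-regression discriminator. Everything here (the N(3,1) data distribution, learning rate, batch size) is illustrative and not the lecture's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Real data ~ N(3, 1); generator g(z) = u*z + v; discriminator
# D(x) = sigmoid(w*x + c). Alternate SGD on the min-max objective,
# using the non-saturating generator loss -log D(G(z)).
u, v = 1.0, 0.0            # generator parameters
w, c = 0.0, 0.0            # discriminator parameters
lr, batch = 0.05, 64

for step in range(2000):
    x = rng.normal(3.0, 1.0, batch)   # real samples
    z = rng.normal(0.0, 1.0, batch)   # latent noise
    fake = u * z + v

    # Discriminator: ascend E[log D(x)] + E[log(1 - D(G(z)))]
    d_real, d_fake = sigmoid(w * x + c), sigmoid(w * fake + c)
    w += lr * (np.mean((1 - d_real) * x) - np.mean(d_fake * fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator: descend E[-log D(G(z))]
    d_fake = sigmoid(w * (u * z + v) + c)
    gg = -(1 - d_fake) * w            # d/dg of -log D(g)
    u -= lr * np.mean(gg * z)
    v -= lr * np.mean(gg)

print(v)  # the generator's bias should drift toward the real mean of 3
```

With the original E[log(1-D(G(z)))] generator loss instead, the update stalls once D becomes confident, which is exactly the vanishing-gradient issue analyzed in the next slides.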
Image to Image translation (CycleGAN)
Facial Expression Synthesis (GANimation)
Multi-domain translation (StarGAN)
Pose conditioned generation (PoseGAN)
Video to Video Generation (VideoGAN)
Interactive Editing (SC-FEGAN/GAN-Dissect)
3D model generation
Music Generation
Image Generation (Progressive GAN/BigGAN)
DeepFake
Difficulty and Problems
$ V(D,G) =E_{x\sim p_{data}(x)}[logD(x)] + E_{z\sim p_{z}(z)}[log(1-D(G(z)))]$
$\nabla_{\theta_{G}}V(G,D)=\nabla_{\theta_{G}}E_{z\sim p_{z}(z)}[log(1-D(G(z)))]$
Recall that:
$\nabla_{a}log(1-\sigma(a))=\frac{-\nabla_{a}\sigma (a)}{1-\sigma (a)}= \frac{-\sigma(a)(1-\sigma(a))}{1-\sigma(a)}=-\sigma(a)$
Therefore:
$\nabla_{a}V(G,D)=-D(G(z))\approx 0$ when D is confident: a confident D assigns $D(G(z))\approx 0$ to fakes, so almost no gradient flows back to G
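The identity and its consequence can be checked numerically; a small self-contained sanity check (the test points are arbitrary):

```python
import numpy as np

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
f = lambda a: np.log(1.0 - sigmoid(a))  # generator's saturating loss term

# Central-difference check of d/da log(1 - sigmoid(a)) = -sigmoid(a)
eps = 1e-6
for a in (-2.0, 0.0, 1.5):
    numeric = (f(a + eps) - f(a - eps)) / (2 * eps)
    assert abs(numeric - (-sigmoid(a))) < 1e-5

# When D confidently rejects fakes, sigmoid(a) = D(G(z)) -> 0,
# so the gradient reaching the generator vanishes.
print(-sigmoid(-10.0))  # ≈ 0: almost no learning signal
```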
It can be proved that the Nash equilibrium of this min-max game is $p_{g}=p_{data}$, at which point $D(x)=\frac{1}{2}$ everywhere
Optimizing $E_{z\sim p_{z}(z)}[log(1-D(G(z)))]$:
when D is optimal, minimizing the loss is equal to minimizing the JS divergence (Arjovsky & Bottou, 2017)
$L(D^{*}, g_{\theta})=2JSD(P_{r}\|P_{g})-2log2$
The JS divergence between $P_{r}$ and $P_{g}$ is (almost) always $log2$, because two distributions supported on low-dimensional manifolds almost never overlap non-negligibly (Arjovsky & Bottou, 2017, Theorems 2.1-2.3)
Thus, this gives rise to the vanishing-gradient problem
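The saturation at log 2 is easy to reproduce for two discrete distributions with disjoint support (the grid and supports below are arbitrary):

```python
import numpy as np

def kl(p, q):
    m = p > 0
    return np.sum(p[m] * np.log(p[m] / q[m]))

def jsd(p, q):
    mid = 0.5 * (p + q)
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

# Two uniform distributions on a grid, with non-overlapping supports
p = np.zeros(100); p[10:20] = 0.1   # "P_r"
q = np.zeros(100); q[60:70] = 0.1   # "P_g"
print(jsd(p, q), np.log(2))  # both ≈ 0.6931: JSD is stuck at log 2
```

A constant divergence means zero gradient for G no matter how close the two supports are, which motivates switching to a weaker metric such as the Wasserstein distance later in the slides.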
One suggested solution (the non-saturating loss) is: $E_{z\sim p_{z}(z)}[-log(D(G(z)))]$
However, when D is optimal, minimizing this loss is equivalent to minimizing the KL divergence while simultaneously maximizing the JS divergence (Arjovsky & Bottou, 2017, Theorem 2.5):
$KL(P_{g}\|P_{r})-2JSD(P_{g}\|P_{r})$
Note that both divergences are non-negative but enter with opposite signs: the loss pulls $P_{g}$ toward $P_{r}$ (KL term) while pushing them apart (JSD term)
Thus, this gives rise to unstable training; the reverse KL term also penalizes mode dropping only weakly, encouraging mode collapse
But let's consider an extreme case where G is trained extensively without updating D. The generated images converge to the single optimal image x* that fools D the most.
$x^{*}=\arg\max_{x}D(x)$
In this extreme, x* is independent of z: every latent code collapses to the same output.
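This collapse is easy to visualize with D frozen. The bump-shaped score below is purely illustrative; gradient ascent from several different samples (playing the role of G(z) for different z) lands on the same maximizer:

```python
import numpy as np

D = lambda x: np.exp(-(x - 2.0) ** 2)    # toy score, unique peak at 2
dD = lambda x: -2.0 * (x - 2.0) * D(x)   # its derivative

xs = np.array([0.5, 1.0, 3.5])           # outputs for different z
for _ in range(500):
    xs = xs + 0.1 * dD(xs)               # gradient ascent on D(x)
print(xs)  # all entries land at x* = 2, independent of the start
```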
Conditional GANs (Mirza & Osindero, 2014)
$V(D,G) = E_{x\sim p_{data}(x)}[logD(x|y)] + E_{z\sim p_{z}(z)}[log(1-D(G(z|y)))]$
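In practice the conditioning is usually implemented by concatenating a label vector y to both networks' inputs. A shape-only sketch (dimensions and one-hot encoding are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, z_dim, n_classes, x_dim = 4, 16, 10, 784

z = rng.normal(size=(batch, z_dim))                        # noise
y = np.eye(n_classes)[rng.integers(0, n_classes, batch)]   # one-hot labels

g_input = np.concatenate([z, y], axis=1)        # G sees (z, y)
x_fake = rng.normal(size=(batch, x_dim))        # stand-in for G(g_input)
d_input = np.concatenate([x_fake, y], axis=1)   # D sees (x, y)

assert g_input.shape == (batch, z_dim + n_classes)
assert d_input.shape == (batch, x_dim + n_classes)
```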
Improved GAN (Salimans et al., 2016)
iGAN (Zhu et al., 2016)
Pix2pix (Isola et al., 2017)
LAPGAN (Denton et al., 2015)
Matching-aware Discriminator (Reed et al., 2016)
$L_{D} = log(s_{r}) + (log(1-s_{w}) + log(1-s_{f}))/2$, where $s_{r}$ scores {real image, matching text}, $s_{w}$ {real image, mismatched text}, and $s_{f}$ {fake image, matching text}
StackGAN (Zhang et al., 2016)
DCGAN (Radford et al., 2016)
pix2pix (Isola et al., 2017)
GP-GAN (Wu et al., 2017)
Illustrations
Symptoms
Reasons
$\min_{G}\max_{D} \neq \max_{D}\min_{G}$
Mode Regularized GANs (Che et al., 2017)
Add an additional encoder E that maps data back to latent space, so that $G(E(x))$ must reconstruct real samples (a mode-covering regularizer)
The end result: fake samples become harder for D to tell apart from real ones
Mode Regularized GANs (Che et al., 2017)
Also proposed Manifold-Diffusion GANs (MDGAN)
Energy-based GANs (Zhao et al., 2017)
Digression: Another view of GAN
$ V(D,G) = E_{x\sim p_{data}(x)}[logD(x)] + E_{z\sim p_{z}(z)}[log(1-D(G(z)))]$
$ V(D,G) = E_{x\sim p_{data}(x)}[log(1-D(x))] + E_{z\sim p_{z}(z)}[logD(G(z))]$
Fake = 1, Real = 0 (labels swapped relative to the standard objective above)
Energy-based GANs (Zhao et al., 2017)
Modified Game Plans
$D^{*} = \arg\min_{D}E_{x\sim p_{data}(x)}[log(1-D(x))] + E_{z\sim p_{z}(z)}[logD(G(z))]$
Energy-based GANs (Zhao et al., 2017)
$D(x)=||Dec(Enc(x))-x||_{MSE}$
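A linear sketch of the autoencoder-as-energy idea (the random weights and dimensions are illustrative; EBGAN trains Enc/Dec so that real data reconstructs well, i.e. gets low energy):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 32)) / np.sqrt(32)   # encoder: 32-d -> 8-d code

def energy(x):
    code = W @ x          # Enc(x)
    recon = W.T @ code    # Dec(Enc(x)), decoder tied to the encoder here
    return np.mean((recon - x) ** 2)   # MSE reconstruction "energy"

x = rng.normal(size=32)
print(energy(x))   # training pushes this down for real x, up for fakes
```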
DiscoGAN (Kim et al., 2017)
CycleGAN (Zhu et al., 2017)/DualGAN (Yi et al., 2017)
CycleGAN (Zhu et al., 2017)
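The idea shared by DiscoGAN/CycleGAN/DualGAN is a cycle-consistency penalty: translating X to Y and back should return the input. With toy stand-ins for the two learned generators (chosen so the cycle is exact), the L1 loss reads:

```python
import numpy as np

G = lambda x: 2.0 * x    # stand-in for G: X -> Y
F = lambda y: 0.5 * y    # stand-in for F: Y -> X

x = np.array([1.0, -2.0, 3.0])
y = np.array([4.0, 0.5, -1.0])
# L_cyc = E||F(G(x)) - x||_1 + E||G(F(y)) - y||_1
l_cyc = np.mean(np.abs(F(G(x)) - x)) + np.mean(np.abs(G(F(y)) - y))
print(l_cyc)  # → 0.0: this pair of maps is perfectly cycle-consistent
```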
Wasserstein GANs (Arjovsky et al., 2017)
Wasserstein-1 Distance (Earth-Mover Distance):
$W(P_{r},P_{g})=\inf_{\gamma \in \Pi (P_{r},P_{g})}E_{(x,y)\sim \gamma}[||x-y||]$
Why is it superior to the KL or JS divergence?
Wasserstein GANs (Arjovsky et al., 2017)
$W(P_{r},P_{g})=\inf_{\gamma \in \Pi (P_{r},P_{g})}E_{(x,y)\sim \gamma}[||x-y||]$
Intuitively, $\gamma(x,y)$ indicates how much "mass" must be transported from x to y in order to transform the distribution from $P_{r}$ to $P_{g}$
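In 1-D this has a closed form: the optimal plan matches sorted samples, so the empirical W1 is the mean gap between order statistics. A quick check (the two Gaussians are chosen for illustration):

```python
import numpy as np

def w1(xs, ys):
    # 1-D empirical Wasserstein-1: match the sorted samples
    return np.mean(np.abs(np.sort(xs) - np.sort(ys)))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 10000)
b = rng.normal(5.0, 1.0, 10000)   # same shape, shifted by 5
print(w1(a, b))  # ≈ 5: every unit of mass travels a distance of about 5
```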
Wasserstein GANs (Arjovsky et al., 2017)
Continuous everywhere and differentiable almost everywhere (under mild assumptions on the generator)
Wasserstein GANs (Arjovsky et al., 2017)
By applying the Kantorovich-Rubinstein duality (Villani, 2008), the WGAN objective becomes:
$\min_{G}\max_{D} E_{x\sim P_{r}}[D(x)] - E_{\hat{x}\sim P_{g}}[D(\hat{x})]$
Wasserstein GANs (Arjovsky et al., 2017)
$\min_{G}\max_{D} E_{x\sim P_{r}}[D(x)] - E_{\hat{x}\sim P_{g}}[D(\hat{x})]$
However, the duality above holds only when D ranges over K-Lipschitz functions:
$|f(x_{1})-f(x_{2})|\leq K|x_{1}-x_{2}|$
To (roughly) enforce this, WGAN clamps the weights of D to a compact space $[-c,c]$
Wasserstein GANs (Arjovsky et al., 2017)
WGAN removes the sigmoid layer in D: the critic now outputs an unbounded score (a regression-style output) rather than a probability
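A minimal critic update with weight clipping (a linear critic on 1-D data; the distributions, c, and the learning rate are illustrative, and the bias is dropped because it cancels in the objective):

```python
import numpy as np

# Critic f(x) = w*x, trained to maximize E[f(real)] - E[f(fake)],
# with w clipped to [-c, c] after every step -- WGAN's crude way of
# keeping f K-Lipschitz.
rng = np.random.default_rng(0)
w, c, lr = 0.0, 0.01, 0.001
real = lambda n: rng.normal(4.0, 1.0, n)
fake = lambda n: rng.normal(0.0, 1.0, n)

for _ in range(500):
    x, g = real(64), fake(64)
    w += lr * (np.mean(x) - np.mean(g))  # ascend E[f(x)] - E[f(g)]
    w = np.clip(w, -c, c)                # weight clipping

est = np.mean(w * real(1000)) - np.mean(w * fake(1000))
print(w, est)  # w saturates at c = 0.01; est ≈ c * |4 - 0| = 0.04
```

Because clipping only enforces some unknown Lipschitz constant K, the critic estimates the Wasserstein distance up to the scale K (here c·4 rather than the true distance 4), which is why WGAN loss curves are meaningful only up to a constant factor.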
Wasserstein GANs (Arjovsky et al., 2017)
PortraitGAN (Duan et al., 2018)
Motivation
PortraitGAN (Duan et al., 2018)
PortraitGAN (Duan et al., 2018)
PortraitGAN (Duan et al., 2018)
Robot Learning via Human Adversarial Games (Duan et al., 2019)
System Overview
Robot Learning via Human Adversarial Games (Duan et al., 2019)
$r=R^{R}(s,a^{R},s^{+})-\alpha R^{H}(s^{+},a^{H},s^{++})$
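Reading of the reward: the robot is rewarded for its own task outcome and penalized, scaled by α, for whatever the human adversary subsequently gains. A one-line illustration (the function name and numbers are hypothetical):

```python
def combined_reward(r_robot, r_human, alpha=0.5):
    # r = R^R(s, a^R, s+) - alpha * R^H(s+, a^H, s++)
    return r_robot - alpha * r_human

print(combined_reward(1.0, 0.5))  # → 0.75
```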
Robot Learning via Human Adversarial Games (Duan et al., 2019)
Robot Learning via Human Adversarial Games (Duan et al., 2019)
Robot Learning via Human Adversarial Games (Duan et al., 2019)
https://github.com/davidsonic/Interactive-mujoco_py
- GAN is a milestone in computer vision
- GAN is a fun and fast-developing research field
Many thanks to Prof. Keith Chugg, Brandon Franzke, C.-C. Jay Kuo, and Stefanos Nikolaidis
Courtesy to all authors mentioned in the slides: Ian Goodfellow, Yanran Li, Binglin, Shashank, Bhargav, Jon Krohn, Francois Chollet, etc.