Semantic Image Generation
PISE: Person Image Synthesis and Editing with Decoupled GAN
Task
•
Person image synthesis
Contributions
•
Propose a two-stage model with per-region control to decouple the shape and style of clothing.
•
Propose joint global and local per-region encoding and normalization to predict a reasonable clothing style for invisible regions and to preserve the original clothing style in the target image.
•
Propose a spatial-aware normalization to retain the spatial context relationship in the source image, and transfer it by modulating the scale and bias of the generated image feature.
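A minimal sketch of the "modulate the scale and bias" idea, assuming a single-channel feature map stored as nested lists. The `gamma` and `beta` maps are hypothetical inputs here; in a real model they would presumably be predicted from the source image features by learned layers, which these notes do not describe.

```python
import math

def instance_normalize(feature, eps=1e-5):
    """Normalize a single-channel H x W feature map to zero mean, unit variance."""
    flat = [v for row in feature for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = math.sqrt(var + eps)
    return [[(v - mean) / std for v in row] for row in feature]

def spatial_modulate(generated, gamma, beta):
    """Per-pixel modulation: out = gamma * normalize(generated) + beta."""
    normalized = instance_normalize(generated)
    return [[g * n + b for n, g, b in zip(n_row, g_row, b_row)]
            for n_row, g_row, b_row in zip(normalized, gamma, beta)]

# Toy 2 x 2 example. gamma and beta are made-up values (assumption),
# standing in for maps derived from the source image.
generated = [[1.0, 2.0], [3.0, 4.0]]
gamma = [[1.0, 1.0], [2.0, 2.0]]
beta = [[0.0, 0.5], [0.0, 0.5]]
out = spatial_modulate(generated, gamma, beta)
```

Because `gamma` and `beta` vary per pixel rather than per channel, the spatial layout carried by those maps survives the normalization step.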
Related
•
Image Generator SEAN
Image Synthesis
"Image Generators with Conditionally-Independent Pixel Synthesis"
Task: Image Synthesis
Intuition
•
Recent methods rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner.
•
Color value at a pixel = G(random latent vector, pixel position)
(No spatial convolutions or other operations that propagate information across pixels!)
•
Position encoding
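The intuition above can be sketched as a tiny per-pixel MLP. This is an illustration under simplifying assumptions, not the paper's architecture: the weights are random instead of trained, the latent vector is simply concatenated with a Fourier position encoding, and all function names are hypothetical.

```python
import math
import random

def fourier_features(x, y, num_freqs=4):
    """Encode a coordinate in [-1, 1]^2 as sines/cosines of increasing frequency."""
    feats = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        feats += [math.sin(freq * x), math.cos(freq * x),
                  math.sin(freq * y), math.cos(freq * y)]
    return feats

def make_layer(n_in, n_out, rng):
    """Random dense layer (weights, biases); stands in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(layer, inputs):
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def generate_pixel(z, x, y, layers):
    """Color at (x, y) = G(z, position): an MLP applied to this pixel alone."""
    h = list(z) + fourier_features(x, y)
    for layer in layers[:-1]:
        h = [max(0.0, v) for v in dense(layer, h)]  # ReLU
    rgb = dense(layers[-1], h)
    return [0.5 * (math.tanh(v) + 1.0) for v in rgb]  # squash to [0, 1]

def generate_image(z, layers, height, width):
    """Each pixel is computed independently; none reads another pixel's features."""
    image = []
    for i in range(height):
        y = 2.0 * i / (height - 1) - 1.0
        row = []
        for j in range(width):
            x = 2.0 * j / (width - 1) - 1.0
            row.append(generate_pixel(z, x, y, layers))
        image.append(row)
    return image

rng = random.Random(0)
latent_dim, hidden = 8, 16
in_dim = latent_dim + 4 * 4  # latent vector + Fourier features
layers = [make_layer(in_dim, hidden, rng),
          make_layer(hidden, hidden, rng),
          make_layer(hidden, 3, rng)]
z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
image = generate_image(z, layers, height=8, width=8)
```

Since `generate_pixel` depends only on `z` and the coordinate, any single pixel can be recomputed in isolation and matches the value in the full image.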
Result
FID
Precision & Recall
Segmentation
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
Intuitions
•
Recent works focus only on mining “local” context.
◦
e.g., dependencies between pixels within individual images
◦
captured by context-aggregation modules (dilated convolution, neural attention) or by structure-aware optimization objectives (IoU-like losses).
•
Thus, the "global" context of the training data is ignored.
◦
e.g., rich semantic relations between pixels across different images.
Proposed Method
•
Inspired by unsupervised contrastive representation learning
•
A pixel-wise contrastive algorithm for semantic segmentation in the fully supervised setting.
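A rough sketch of what a supervised pixel-wise contrastive loss can look like, assuming an InfoNCE-style objective in which pixels of the same class (from any image in the batch) are positives and pixels of other classes are negatives. The function name, the temperature value, and the toy embeddings are illustrative assumptions.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    """InfoNCE-style loss over pixel embeddings pooled from several images.

    embeddings: one feature vector per sampled pixel
    labels: ground-truth class id of each pixel
    """
    feats = [l2_normalize(e) for e in embeddings]
    losses = []
    for i, (anchor, cls) in enumerate(zip(feats, labels)):
        positives = [f for j, f in enumerate(feats) if j != i and labels[j] == cls]
        negatives = [f for j, f in enumerate(feats) if labels[j] != cls]
        if not positives or not negatives:
            continue  # this anchor has nothing to contrast against
        neg_sum = sum(math.exp(dot(anchor, n) / temperature) for n in negatives)
        per_positive = []
        for p in positives:
            pos = math.exp(dot(anchor, p) / temperature)
            per_positive.append(-math.log(pos / (pos + neg_sum)))
        losses.append(sum(per_positive) / len(per_positive))
    return sum(losses) / len(losses) if losses else 0.0

# Pixels 0-1 are imagined to come from one image and pixels 2-3 from another.
labels = [0, 1, 0, 1]
well_separated = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
mixed_up = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_good = pixel_contrastive_loss(well_separated, labels)
loss_bad = pixel_contrastive_loss(mixed_up, labels)
```

The loss is small when same-class pixels from different images sit close together in the embedding space (`loss_good`) and large when they do not (`loss_bad`).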
Representation Learning / Latent Space Discovery
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
Learning Statistical Texture for Semantic Segmentation
Task
•
Semantic segmentation
Proposed Method
Quantization and Count Operator (QCO)
•
First, quantize the input feature into multiple levels (texture statistics).
•
Then, count the intensity of each level for texture feature encoding.
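The two steps above can be illustrated with a simplified sketch that hard-assigns scalar values to uniform bins and then counts them. This is an assumption-laden toy: the operator in the paper works on learned feature maps and may use a soft, differentiable assignment, which is not reproduced here.

```python
def quantize_and_count(values, num_levels=4):
    """Assign each value to one of num_levels uniform bins, then count per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_levels or 1.0  # avoid zero width for constant input
    counts = [0] * num_levels
    level_of = []
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        level = min(int((v - lo) / width), num_levels - 1)
        level_of.append(level)
        counts[level] += 1
    total = len(values)
    histogram = [c / total for c in counts]  # normalized count per level
    return level_of, histogram

# Toy scalar "feature" (made-up values for illustration).
feature = [0.05, 0.10, 0.15, 0.40, 0.45, 0.80, 0.95, 1.00]
levels, histogram = quantize_and_count(feature, num_levels=4)
```

Here `levels` records which quantization level each value landed in, and `histogram` is the counted (normalized) intensity of each level, i.e. a statistical description of the input.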
Texture Enhancement Module (TEM)
•
Inspired by histogram equalization,
•
TEM is designed to build a graph to propagate information of all original quantization levels for texture enhancement.
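For reference, a sketch of classic histogram equalization, the operation that TEM is said to be inspired by. This is only the textbook technique on integer levels, not the graph-based module itself; the function name and toy data are illustrative.

```python
def equalize_histogram(levels, num_levels):
    """Classic histogram equalization for integer levels in [0, num_levels)."""
    counts = [0] * num_levels
    for v in levels:
        counts[v] += 1
    # Cumulative distribution of the level counts.
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(levels)
    if total == cdf_min:
        return list(levels)  # only one level present, nothing to spread
    # Remap each level so the occupied levels spread over the full range.
    mapping = [round((c - cdf_min) / (total - cdf_min) * (num_levels - 1))
               for c in cdf]
    return [mapping[v] for v in levels]

# Low-contrast input: most values crowd into the lowest levels.
pixels = [0, 0, 0, 1, 1, 1, 2, 3]
equalized = equalize_histogram(pixels, num_levels=8)
```

The remapping stretches crowded low levels across the available range, which is the contrast-enhancing behavior the module borrows as motivation.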
Pyramid Texture Feature Extraction Module (PTFEM)
•
Exploits texture information at multiple scales using a texture feature extraction unit and a pyramid structure.