pix2pixhd
Pix2Pix HD Implementation.
See: “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs” [1]
Global Generator + Local Enhancer
[1] High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs: https://arxiv.org/abs/1711.11585
Classes
- GlobalGenerator – Global Generator from the pix2pixHD paper.
- LocalEnhancer – Local Enhancer module of the Pix2PixHD architecture.
- ResNetBlock – ResNet Blocks.
class ashpy.models.convolutional.pix2pixhd.GlobalGenerator(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks=9, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)
Bases: ashpy.models.convolutional.interfaces.Conv2DInterface
Global Generator from pix2pixHD paper.
- G1^F: Convolutional frontend (downsampling)
- G1^R: ResNet Block
- G1^B: Convolutional backend (upsampling)
__init__(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks=9, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)
Global Generator from Pix2PixHD.
Parameters:
- input_res (int) – input resolution.
- min_res (int) – minimum resolution reached by the downsampling.
- initial_filters (int) – number of initial filters.
- filters_cap (int) – maximum number of filters.
- channels (int) – output channels.
- normalization_layer (tf.keras.layers.Layer) – normalization layer used by the global generator; can be Instance Norm, Layer Norm, or Batch Norm.
- non_linearity (tf.keras.layers.Layer) – non-linearity used in the global generator.
- num_resnet_blocks (int) – number of ResNet blocks.
- kernel_size_resnet (int) – kernel size used in the ResNet convolutional layers.
- kernel_size_front_back (int) – kernel size used by the convolutional frontend and backend.
- num_internal_resnet_blocks (int) – number of blocks used by the internal ResNet.
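To make the interaction between input_res, min_res, initial_filters and filters_cap more concrete, here is a minimal sketch. It assumes that each downsampling step halves the resolution and doubles the number of filters, clamped to filters_cap. That schedule is a common convention for this kind of generator, not something this page states, and the helper name plan_downsampling is hypothetical, not part of ashpy.

```python
def plan_downsampling(input_res=512, min_res=64, initial_filters=64, filters_cap=512):
    """Return the (resolution, filters) pair after each assumed stride-2 step.

    Assumption (not taken from the ashpy source): every step halves the
    resolution and doubles the filter count, never exceeding filters_cap.
    """
    if min_res > input_res:
        raise ValueError("min_res must not exceed input_res")

    stages = []
    res, filters = input_res, initial_filters
    while res > min_res:
        res //= 2
        filters = min(filters * 2, filters_cap)
        stages.append((res, filters))
    return stages


# With the default arguments listed above, three halvings reach min_res=64.
for res, filters in plan_downsampling():
    print(f"resolution {res}x{res}, filters {filters}")
```

Under this assumption, lowering filters_cap only flattens the filter counts; the number of downsampling steps is determined by input_res and min_res alone.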
class ashpy.models.convolutional.pix2pixhd.LocalEnhancer(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks_global=9, num_resnet_blocks_local=3, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)
Bases: tensorflow.python.keras.engine.training.Model
Local Enhancer module of the Pix2PixHD architecture.
Example
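    import tensorflow as tf
    from ashpy.models.convolutional.pix2pixhd import LocalEnhancer

    # Instantiate the model with its default arguments.
    model = LocalEnhancer()

    # Call the model on a batch containing one 512x512 image with 3 channels.
    inputs = tf.ones((1, 512, 512, 3))
    output = model(inputs)

    # The output shape is the same as the input shape.
    print(output.shape)

(1, 512, 512, 3)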
__init__(input_res=512, min_res=64, initial_filters=64, filters_cap=512, channels=3, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, num_resnet_blocks_global=9, num_resnet_blocks_local=3, kernel_size_resnet=3, kernel_size_front_back=7, num_internal_resnet_blocks=2)
Build the LocalEnhancer module of the Pix2PixHD architecture.
See High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [2] for more details.
Parameters:
- input_res (int) – input resolution.
- min_res (int) – minimum resolution reached by the global generator.
- initial_filters (int) – number of initial filters.
- filters_cap (int) – maximum number of filters.
- channels (int) – number of channels.
- normalization_layer (tf.keras.layers.Layer) – normalization layer (e.g. Instance Normalization, Batch Normalization, or Layer Normalization).
- non_linearity (tf.keras.layers.Layer) – non-linearity used in Pix2Pix HD.
- num_resnet_blocks_global (int) – number of residual blocks used in the global generator.
- num_resnet_blocks_local (int) – number of residual blocks used in the local generator.
- kernel_size_resnet (int) – kernel size used in the ResNet blocks.
- kernel_size_front_back (int) – kernel size used for the front and back convolutions.
- num_internal_resnet_blocks (int) – number of internal blocks of the ResNet.
[2] High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs https://arxiv.org/abs/1711.11585
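In the paper, the global generator operates on a version of the input downsampled by a factor of 2 along each spatial dimension, while the local enhancer works at full resolution. The sketch below illustrates that relationship with a plain 2x2 average pooling in NumPy. The pooling variant and the helper name avg_pool_2x are assumptions made for illustration; this page does not say how ashpy performs the downsampling.

```python
import numpy as np


def avg_pool_2x(x):
    """Halve height and width of an NHWC batch by 2x2 average pooling.

    Hypothetical helper: assumes even height and width.
    """
    n, h, w, c = x.shape
    # Group pixels into non-overlapping 2x2 windows and average each window.
    return x.reshape(n, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))


full_res = np.ones((1, 512, 512, 3))  # what the local enhancer would see
coarse = avg_pool_2x(full_res)        # what the global generator would see
print(coarse.shape)  # (1, 256, 256, 3)
```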
class ashpy.models.convolutional.pix2pixhd.ResNetBlock(filters, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, kernel_size=3, num_blocks=2)
Bases: tensorflow.python.keras.engine.training.Model
ResNet Blocks.
The number of input filters is the same as the number of output filters.
__init__(filters, normalization_layer=<class 'ashpy.layers.instance_normalization.InstanceNormalization'>, non_linearity=<class 'tensorflow.python.keras.layers.advanced_activations.ReLU'>, kernel_size=3, num_blocks=2)
Build the ResNet block, composed of num_blocks internal blocks.
Each block is composed of:
- Conv2D with strides 1 and padding “same”
- Normalization Layer
- Non Linearity
The final result is the sum of the ResNet output and the input.
Parameters:
- filters (int) – initial filters (same as the output filters).
- normalization_layer (tf.keras.layers.Layer) – normalization layer used by the residual block.
- non_linearity (tf.keras.layers.Layer) – non-linearity used in the ResNet block.
- kernel_size (int) – kernel size used in the ResNet block.
- num_blocks (int) – number of blocks; each block is composed of a convolution, a normalization layer and a non-linearity.
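The structure described above (convolution, normalization, non-linearity, repeated num_blocks times, then a skip connection) can be sketched in plain NumPy. This is an illustrative re-implementation under stated assumptions, not the ashpy code: it works on a single unbatched (H, W, C) tensor, uses instance normalization without learned scale or offset, and the helper names conv2d_same, instance_norm, relu and resnet_block are hypothetical.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Stride-1 convolution with zero 'same' padding.

    x: (H, W, C_in); kernel: (k, k, C_in, C_out) with odd k.
    """
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, kernel.shape[3]))
    for i in range(k):
        for j in range(k):
            # Shifted (H, W, C_in) window times the (C_in, C_out) kernel slice.
            out += padded[i:i + h, j:j + w] @ kernel[i, j]
    return out


def instance_norm(x, eps=1e-5):
    """Normalize each channel over the spatial dimensions (no learned params)."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def relu(x):
    return np.maximum(x, 0.0)


def resnet_block(x, kernels):
    """Apply conv -> norm -> non-linearity once per kernel, then add the input."""
    out = x
    for kernel in kernels:
        out = relu(instance_norm(conv2d_same(out, kernel)))
    return out + x  # skip connection: requires C_out == C_in


rng = np.random.default_rng(0)
filters, kernel_size, num_blocks = 4, 3, 2
x = rng.normal(size=(8, 8, filters))
kernels = [
    rng.normal(scale=0.1, size=(kernel_size, kernel_size, filters, filters))
    for _ in range(num_blocks)
]
y = resnet_block(x, kernels)
print(y.shape)  # (8, 8, 4): same shape as the input
```

Because the result is added to the input, every convolution has to keep the channel count at filters, which is why the input and output filters coincide.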