Attention

Inheritance Diagram

[Inheritance diagram of ashpy.layers.attention.Attention]

class ashpy.layers.attention.Attention(filters)[source]

Bases: tensorflow.python.keras.engine.training.Model

Attention Layer from Self-Attention GAN [1].

First we extract features from the previous layer:

\[f(x) = W_f x\]
\[g(x) = W_g x\]
\[h(x) = W_h x\]

Then we calculate the importance matrix, where \(s_{ij} = f(x_i)^{\top} g(x_j)\):

\[\beta_{j,i} = \frac{\exp(s_{ij})}{\sum_{i=1}^{N}\exp(s_{ij})}\]

\(\beta_{j,i}\) indicates the extent to which the model attends to the \(i^{th}\) location when synthesizing the \(j^{th}\) region.

Then we calculate the output of the attention layer \((o_1, ..., o_N) \in \mathbb{R}^{C \times N}\):

\[o_j = \sum_{i=1}^{N} \beta_{j,i} h(x_i)\]

Finally we combine the (scaled) attention and the input to get the final output of the layer:

\[y_i = \gamma o_i + x_i\]

where \(\gamma\) is a learnable scalar initialized to 0, so at the start of training the layer simply returns its input.
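
To see how the four steps fit together, here is a minimal, self-contained sketch of the same computation in plain TensorFlow. It is an illustration only, not ashpy's actual implementation: using 1x1 convolutions for \(W_f\), \(W_g\), \(W_h\) and reducing the f and g feature maps to filters // 8 channels follows the SAGAN paper [1].

    import tensorflow as tf

    class SelfAttentionSketch(tf.keras.Model):
        """Illustrative SAGAN-style self-attention (not ashpy's code)."""

        def __init__(self, filters):
            super().__init__()
            # 1x1 convolutions play the role of W_f, W_g, W_h; SAGAN
            # reduces f and g to filters // 8 channels, hence the
            # preference for filters being a multiple of 8.
            self.f = tf.keras.layers.Conv2D(filters // 8, 1)
            self.g = tf.keras.layers.Conv2D(filters // 8, 1)
            self.h = tf.keras.layers.Conv2D(filters, 1)
            # gamma is learnable and starts at 0, so initially y = x.
            self.gamma = tf.Variable(0.0, trainable=True)

        def call(self, x):
            shape = tf.shape(x)
            batch, n = shape[0], shape[1] * shape[2]   # N = height * width
            f = tf.reshape(self.f(x), (batch, n, -1))  # (B, N, C // 8)
            g = tf.reshape(self.g(x), (batch, n, -1))  # (B, N, C // 8)
            h = tf.reshape(self.h(x), (batch, n, -1))  # (B, N, C)
            s = tf.matmul(g, f, transpose_b=True)      # s_ij = f(x_i)^T g(x_j)
            beta = tf.nn.softmax(s)                    # importance matrix
            o = tf.matmul(beta, h)                     # o_j = sum_i beta_ji h(x_i)
            return self.gamma * tf.reshape(o, shape) + x  # y_i = gamma o_i + x_i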

Examples

  • Direct Usage:

    import tensorflow as tf
    from ashpy.layers.attention import Attention

    x = tf.ones((1, 10, 10, 64))

    # instantiate the attention layer as a model
    attention = Attention(64)

    # evaluate by passing x
    output = attention(x)

    # the output shape is the same as the input shape
    print(output.shape)

    (1, 10, 10, 64)
    
  • Inside a Model:

    import tensorflow as tf
    from ashpy.layers.attention import Attention

    def MyModel():
        inputs = tf.keras.layers.Input(shape=[None, None, 64])
        attention = Attention(64)
        return tf.keras.Model(inputs=inputs, outputs=attention(inputs))

    x = tf.ones((1, 10, 10, 64))
    model = MyModel()
    output = model(x)

    print(output.shape)

    (1, 10, 10, 64)
    
[1] Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena. Self-Attention Generative Adversarial Networks. https://arxiv.org/abs/1805.08318

Methods

__init__(filters) – Build the Attention Layer.
call(inputs[, training]) – Perform the computation.

Attributes

activity_regularizer – Optional regularizer function for the output of this layer.
dtype
dynamic
inbound_nodes – Deprecated, do NOT use! Only for compatibility with external Keras.
input – Retrieves the input tensor(s) of a layer.
input_mask – Retrieves the input mask tensor(s) of a layer.
input_shape – Retrieves the input shape(s) of a layer.
input_spec – Gets the network’s input specs.
layers
losses – Losses which are associated with this Layer.
metrics – Returns the model’s metrics added using compile, add_metric APIs.
metrics_names – Returns the model’s display labels for all outputs.
name – Returns the name of this module as passed or determined in the ctor.
name_scope – Returns a tf.name_scope instance for this class.
non_trainable_variables
non_trainable_weights
outbound_nodes – Deprecated, do NOT use! Only for compatibility with external Keras.
output – Retrieves the output tensor(s) of a layer.
output_mask – Retrieves the output mask tensor(s) of a layer.
output_shape – Retrieves the output shape(s) of a layer.
run_eagerly – Settable attribute indicating whether the model should run eagerly.
sample_weights
state_updates – Returns the updates from all layers that are stateful.
stateful
submodules – Sequence of all sub-modules.
trainable
trainable_variables – Sequence of variables owned by this module and its submodules.
trainable_weights
updates
variables – Returns the list of all layer variables/weights.
weights – Returns the list of all layer variables/weights.
__init__(filters)[source]

Build the Attention Layer.

Parameters: filters (int) – Number of filters (channels) of the input tensor. It should preferably be a multiple of 8, since the SAGAN formulation [1] reduces the f and g feature maps to filters/8 channels.
Return type: None
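
A small, hedged illustration of that preference (the filters // 8 reduction is the SAGAN convention sketched earlier; it is assumed here, not documented ashpy behavior):

    from ashpy.layers.attention import Attention

    # 64 is a multiple of 8: a SAGAN-style reduction keeps the inner
    # f/g feature maps at an integral 64 // 8 = 8 channels.
    attention = Attention(64)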
call(inputs, training=False)[source]

Perform the computation.

Parameters:
  • inputs (tf.Tensor) – Input tensor for the computation.
  • training (bool) – Whether to run in training or evaluation mode.
Return type: tf.Tensor
Returns: The output tensor, with the same shape as inputs.
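
For completeness, a minimal sketch of calling the layer directly, reusing x and attention from the Direct Usage example above (Keras forwards the training flag from __call__ to call):

    # evaluation mode (the default)
    output = attention(x, training=False)

    # training mode, e.g. inside a custom training loop
    output = attention(x, training=True)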