Attention
class ashpy.layers.attention.Attention(filters)
Bases: tensorflow.python.keras.engine.training.Model
Attention Layer from Self-Attention GAN [1].
First we extract features from the previous layer:
\[f(x) = W_f x\]
\[g(x) = W_g x\]
\[h(x) = W_h x\]
Then we calculate the importance matrix:
\[\beta_{j,i} = \frac{\exp(s_{ij})}{\sum_{i=1}^{N}\exp(s_{ij})}\]
where \(s_{ij} = f(x_i)^T g(x_j)\), and \(\beta_{j,i}\) indicates the extent to which the model attends to the \(i^{th}\) location when synthesizing the \(j^{th}\) region.
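The importance matrix can be sketched in plain Python. This is only an illustration, not the layer's actual implementation: it takes \(s_{ij} = f(x_i)^T g(x_j)\) as in the Self-Attention GAN paper, and the helper names (`matvec`, `attention_map`) and the toy weights are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def attention_map(xs, W_f, W_g):
    """Return beta, where beta[j][i] is the weight region j puts on location i."""
    f = [matvec(W_f, x) for x in xs]
    g = [matvec(W_g, x) for x in xs]
    n = len(xs)
    beta = []
    for j in range(n):
        # s_ij = f(x_i)^T g(x_j) for every location i, with j fixed.
        scores = [sum(a * b for a, b in zip(f[i], g[j])) for i in range(n)]
        # Subtracting the maximum leaves the softmax unchanged but avoids overflow.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        beta.append([e / total for e in exps])
    return beta


# Three locations with two channels each (toy values, not from the library).
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_f = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection weights
W_g = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection weights
beta = attention_map(xs, W_f, W_g)
for row in beta:
    print([round(b, 3) for b in row])
```

Each row of `beta` sums to 1, because the normalization runs over the attended location \(i\) for a fixed region \(j\).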
Then we calculate the output of the attention layer \((o_1, ..., o_N) \in \mathbb{R}^{C \times N}\):
\[o_j = \sum_{i=1}^{N} \beta_{j,i} h(x_i)\]
Finally we combine the (scaled) attention and the input to get the final output of the layer:
\[y_i = \gamma o_i + x_i\]
where \(\gamma\) is initialized as 0.
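As a rough sketch of the last two steps, the snippet below applies a hand-written importance matrix to some feature vectors and then adds the scaled result back to the input. The function names (`attend`, `residual`) and all values are hypothetical, and \(W_h\) is assumed to be the identity so that \(h(x_i) = x_i\). It is not the layer's actual code.

```python
def attend(beta, hs):
    """Compute o_j = sum_i beta[j][i] * h(x_i) for every region j."""
    channels = len(hs[0])
    return [
        [sum(beta[j][i] * hs[i][c] for i in range(len(hs))) for c in range(channels)]
        for j in range(len(beta))
    ]


def residual(xs, outs, gamma):
    """Compute y_i = gamma * o_i + x_i for every location i."""
    return [[gamma * o + x for o, x in zip(o_i, x_i)] for o_i, x_i in zip(outs, xs)]


# Two locations with two channels each (toy values).
xs = [[1.0, 2.0], [3.0, 4.0]]
hs = xs  # assumption: W_h is the identity, so h(x_i) = x_i
beta = [[0.5, 0.5], [0.0, 1.0]]  # hand-picked weights; each row sums to 1

o = attend(beta, hs)
y0 = residual(xs, o, gamma=0.0)  # gamma at its initial value
y1 = residual(xs, o, gamma=1.0)  # gamma after (hypothetical) training

print(o)
print(y0)
print(y1)
```

With \(\gamma = 0\) the output equals the input, so under this formula the layer starts out as an identity mapping and the attention term only contributes as \(\gamma\) moves away from zero during training.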
Examples
Direct Usage:
    import tensorflow as tf
    from ashpy.layers.attention import Attention

    x = tf.ones((1, 10, 10, 64))

    # instantiate attention layer as model
    attention = Attention(64)

    # evaluate passing x
    output = attention(x)

    # the output shape is
    # the same as the input shape
    print(output.shape)
Inside a Model:
    def MyModel():
        inputs = tf.keras.layers.Input(shape=[None, None, 64])
        attention = Attention(64)
        return tf.keras.Model(inputs=inputs, outputs=attention(inputs))

    x = tf.ones((1, 10, 10, 64))
    model = MyModel()
    output = model(x)

    print(output.shape)
(1, 10, 10, 64)
[1] Self-Attention Generative Adversarial Networks https://arxiv.org/abs/1805.08318
Methods
__init__(filters)
    Build the Attention Layer.
call(inputs[, training])
    Perform the computation.
Attributes
activity_regularizer
    Optional regularizer function for the output of this layer.
dtype
dynamic
inbound_nodes
    Deprecated, do NOT use! Only for compatibility with external Keras.
input
    Retrieves the input tensor(s) of a layer.
input_mask
    Retrieves the input mask tensor(s) of a layer.
input_shape
    Retrieves the input shape(s) of a layer.
input_spec
    Gets the network’s input specs.
layers
losses
    Losses which are associated with this Layer.
metrics
    Returns the model’s metrics added using compile, add_metric APIs.
metrics_names
    Returns the model’s display labels for all outputs.
name
    Returns the name of this module as passed or determined in the ctor.
name_scope
    Returns a tf.name_scope instance for this class.
non_trainable_variables
non_trainable_weights
outbound_nodes
    Deprecated, do NOT use! Only for compatibility with external Keras.
output
    Retrieves the output tensor(s) of a layer.
output_mask
    Retrieves the output mask tensor(s) of a layer.
output_shape
    Retrieves the output shape(s) of a layer.
run_eagerly
    Settable attribute indicating whether the model should run eagerly.
sample_weights
state_updates
    Returns the updates from all layers that are stateful.
stateful
submodules
    Sequence of all sub-modules.
trainable
trainable_variables
    Sequence of variables owned by this module and its submodules.
trainable_weights
updates
variables
    Returns the list of all layer variables/weights.
weights
    Returns the list of all layer variables/weights.