Procs
proc forward[T](self: GRULayer[T]; input, hidden0: Variable): tuple[ output, hiddenN: Variable]
Inputs:
- input: Input tensor of shape [sequence/timesteps, batch, features]
- hidden0: the initial hidden state, of shape [num_stacked_layers, batch, hidden_size]
Outputs:
- output of shape [sequence/timesteps, batch, num_directions * hidden_size]. output contains the output features hiddenT for each timestep T
- hiddenN of shape [num_stacked_layers * num_directions, batch, hidden_size]. hiddenN contains the hidden state for the last timestep, i.e. T == sequence/timesteps length of the input
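A minimal usage sketch of the shapes described above. This is a hedged illustration, not a verbatim Arraymancer example: the helper procs `newContext`, `variable`, `zeros`, and `randomTensor` are assumed from Arraymancer's public API, and `num_directions` is 1 since bidirectional support is not implemented.

```nim
import arraymancer

let
  seqLen = 5        # sequence/timesteps
  batchSize = 2
  features = 4
  hiddenSize = 8
  layers = 1        # num_stacked_layers

let ctx = newContext Tensor[float32]
let gruLayer = ctx.init(GRULayer[float32], features, hiddenSize, layers)

let
  # Input of shape [sequence/timesteps, batch, features]
  input = ctx.variable randomTensor[float32](seqLen, batchSize, features, 1'f32)
  # Initial hidden state of shape [num_stacked_layers, batch, hidden_size]
  hidden0 = ctx.variable zeros[float32](layers, batchSize, hiddenSize)

let (output, hiddenN) = gruLayer.forward(input, hidden0)
# output:  [seqLen, batchSize, hiddenSize]  (num_directions = 1)
# hiddenN: [layers, batchSize, hiddenSize]
```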
proc gru[TT](input, hidden0: Variable[TT]; W3s0, W3sN, U3s: Variable[TT]; bW3s, bU3s: Variable[TT]): tuple[output, hiddenN: Variable[TT]]
⚠️ API subject to change to match CuDNN's.
Bidirectional support is not implemented.
Inputs:
- input: Input tensor of shape [sequence/timesteps, batch, features]
- hidden0: the initial hidden state, of shape [num_stacked_layers, batch, hidden_size]
- Input weights W3s of shapes:
  - W3s0: [3 * hidden_size, features] for the first layer
  - W3sN: [num_stacked_layers - 1, 3 * hidden_size, num_directions * hidden_size] for the following layers
- A series of hidden state weights U3s of shape [num_stacked_layers, 3 * hidden_size, hidden_size]
- A series of biases for the input and hidden state weights, each of shape [num_stacked_layers, 1, 3 * hidden_size]
Outputs:
- output of shape [sequence/timesteps, batch, num_directions * hidden_size]. output contains the output features hiddenT for each timestep T
- hiddenN of shape [num_stacked_layers * num_directions, batch, hidden_size]. hiddenN contains the hidden state for the last timestep, i.e. T == sequence/timesteps length of the input
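The weight and bias shapes expected by `gru` can be sketched as allocations. This is a shape-only illustration under stated assumptions: `newContext`, `variable`, and `zeros` are assumed from Arraymancer's API, and `num_directions` is 1 since bidirectional support is not implemented.

```nim
import arraymancer

let
  layers = 2        # num_stacked_layers
  features = 4
  hiddenSize = 8

let ctx = newContext Tensor[float32]

let
  # First-layer input weights: [3 * hidden_size, features]
  W3s0 = ctx.variable zeros[float32](3 * hiddenSize, features)
  # Following layers: [num_stacked_layers - 1, 3 * hidden_size, num_directions * hidden_size]
  W3sN = ctx.variable zeros[float32](layers - 1, 3 * hiddenSize, hiddenSize)
  # Hidden state weights: [num_stacked_layers, 3 * hidden_size, hidden_size]
  U3s  = ctx.variable zeros[float32](layers, 3 * hiddenSize, hiddenSize)
  # Biases for input and hidden state weights: [num_stacked_layers, 1, 3 * hidden_size]
  bW3s = ctx.variable zeros[float32](layers, 1, 3 * hiddenSize)
  bU3s = ctx.variable zeros[float32](layers, 1, 3 * hiddenSize)
```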
proc init[T](ctx: Context[Tensor[T]]; layerType: typedesc[GRULayer[T]]; numInputFeatures, hiddenSize, layers: int): GRULayer[T]
Creates a gated recurrent unit (GRU) layer. Inputs:
- ``numInputFeatures``: number of features of the input
- ``hiddenSize``: size of the hidden layer(s)
- ``layers``: number of stacked layers
Returns the created GRULayer.
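For example, a two-layer GRU can be created from an autograd context (a hedged sketch; `newContext` is assumed from Arraymancer's autograd module):

```nim
import arraymancer

let ctx = newContext Tensor[float32]
# 10 input features, hidden size 20, 2 stacked layers
let gru = ctx.init(GRULayer[float32], 10, 20, 2)
```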