
src/arraymancer/nn_primitives/nnp_gru


Procs

proc gru_backward[T: SomeFloat](dInput, dHidden0, dW3s0, dW3sN, dU3s, dbW3s,
                                dbU3s: var Tensor[T];
                                dOutput, dHiddenN: Tensor[T];
                                cached_inputs: seq[Tensor[T]];
                                cached_hiddens: seq[seq[Tensor[T]]];
                                W3s0, W3sN, U3s, rs, zs, ns, Uhs: Tensor[T])
⚠️ API subject to change to match CuDNN's
proc gru_cell_backward[T: SomeFloat](dx, dh, dW3, dU3, dbW3, dbU3: var Tensor[T];
                                     dnext: Tensor[T]; x, h, W3, U3: Tensor[T];
                                     r, z, n, Uh: Tensor[T])
Input:
  • dx, dh, dW3, dU3: respectively gradients of x, h, W3 and U3 (the forward-pass inputs listed below)
  • dbW3 and dbU3: gradients of the biases for W3 and U3 weights
  • dnext: gradient flowing back from the next layer
  • x, h, W3, U3: inputs saved from the forward pass
  • r, z, n, Uh: intermediate results saved from the forward pass, of shape [batch_size, hidden_size]
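The hidden-state part of this backward pass can be sketched in NumPy. This is an illustration, not Arraymancer's code: it assumes the CuDNN-style gate equations (h' = (1-z)*n + z*h, n = tanh(Wn·x + bWn + r ∘ Uh), Uh = Un·h + bUn) that the saved Uh tensor suggests, and the function name and restriction to dh are this sketch's own; the real proc also accumulates dx, dW3, dU3 and the bias gradients:

```python
import numpy as np

def gru_cell_backward_dh(dnext, h, U3, r, z, n, Uh):
    # Gradient of the loss w.r.t. the previous hidden state h only,
    # recovered from the intermediates saved by the forward pass.
    # U3 stacks the reset/update/candidate recurrent weights row-wise.
    H = h.shape[1]
    Ur, Uz, Un = U3[:H], U3[H:2*H], U3[2*H:]
    dn_pre = dnext * (1.0 - z) * (1.0 - n**2)   # back through tanh
    dz_pre = dnext * (h - n) * z * (1.0 - z)    # back through update gate
    dr_pre = dn_pre * Uh * r * (1.0 - r)        # back through reset gate
    # direct path (z*h) plus the three gate paths into h
    return dnext * z + (dn_pre * r) @ Un + dr_pre @ Ur + dz_pre @ Uz
```

The three `*_pre` terms are the gradients at each gate's pre-activation; multiplying them by the corresponding recurrent weight block routes them back to h.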
proc gru_cell_forward[T: SomeFloat](input, W3, U3, bW3, bU3: Tensor[T];
                                    r, z, n, Uh, hidden: var Tensor[T])
Input:

  • input: input tensor of shape [batch_size, features]
  • W3: concatenated weights (reset, update and candidate gates) applied to the input, of shape [3 * hidden_size, features]
  • U3: concatenated weights applied to the hidden state, of shape [3 * hidden_size, hidden_size]
  • bW3, bU3: the corresponding biases, of shape [1, 3 * hidden_size]
Output:

  • r, z, n, Uh: intermediate tensors saved for backpropagation, of shape [batch_size, hidden_size]
  • y == h'(t): The next hidden state of the GRU Cell. (GRU output and next hidden state are the same)

⚠️ Input/output updated in-place:

  • hidden: h(t) -> h'(t); the hidden state of shape [batch_size, hidden_size] is both an input and an output

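The cell step can be sketched in NumPy. This is an illustration rather than Arraymancer's code: it assumes the CuDNN-style gate equations that the saved Uh tensor suggests, with the reset/update/candidate gate weights stacked row-wise in W3 and U3:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell_forward(x, h, W3, U3, bW3, bU3):
    # x: [batch, features], h: [batch, hidden]
    # W3: [3*hidden, features], U3: [3*hidden, hidden], biases: [1, 3*hidden]
    H = h.shape[1]
    Wr, Wz, Wn = W3[:H], W3[H:2*H], W3[2*H:]
    Ur, Uz, Un = U3[:H], U3[H:2*H], U3[2*H:]
    bWr, bWz, bWn = np.split(bW3, 3, axis=1)
    bUr, bUz, bUn = np.split(bU3, 3, axis=1)
    r = sigmoid(x @ Wr.T + bWr + h @ Ur.T + bUr)   # reset gate
    z = sigmoid(x @ Wz.T + bWz + h @ Uz.T + bUz)   # update gate
    Uh = h @ Un.T + bUn                            # saved for backprop
    n = np.tanh(x @ Wn.T + bWn + r * Uh)           # candidate state
    h_next = (1.0 - z) * n + z * h                 # h'(t), also the output
    return h_next, (r, z, n, Uh)
```

gru_cell_inference computes the same step but skips storing r, z, n and Uh, which is why it is cheaper when no backward pass follows.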
proc gru_cell_inference[T: SomeFloat](input: Tensor[T];
                                      W3, U3, bW3, bU3: Tensor[T];
                                      hidden: var Tensor[T])
Input:

  • input: input tensor of shape [batch_size, features]
  • W3, U3: concatenated gate weights applied to the input and to the hidden state
  • bW3, bU3: the corresponding biases
Output (in-place):

  • y == h'(t): The next hidden state of the GRU Cell. (GRU output and next hidden state are the same)

⚠️ Input/Output updated in-place:

  • hidden: h(t) -> h'(t); the hidden state is both an input and an output

This is an optimized function when backpropagation is not needed.

proc gru_forward[T: SomeFloat](input: Tensor[T]; W3s0, W3sN: Tensor[T];
                               U3s, bW3s, bU3s: Tensor[T];
                               rs, zs, ns, Uhs: var Tensor[T];
                               output, hidden: var Tensor[T];
                               cached_inputs: var seq[Tensor[T]];
                               cached_hiddens: var seq[seq[Tensor[T]]])

⚠️ API subject to change to match CuDNNs

Bidirectional support is not implemented

Inputs:

  • input: input tensor for the full sequence, of shape [timesteps, batch_size, features]
  • W3s0: input weights of the first layer
  • W3sN: input weights of the subsequent stacked layers
  • U3s, bW3s, bU3s: hidden-state weights and biases of all stacked layers

Outputs:

  • output: the hidden state of the last layer for every timestep
  • hidden: the final hidden state of each stacked layer
  • rs, zs, ns, Uhs, cached_inputs, cached_hiddens: intermediate results saved for gru_backward

⚠️ Input/Output updated in-place: all var parameters are overwritten with the results

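The overall loop structure can be sketched in NumPy: the cell step is applied across timesteps, with each stacked layer consuming the state of the layer below. This is a sketch under assumptions (CuDNN-style gate equations, time-major input, W3s0 used by the first layer and W3sN[l-1] by layer l), not the proc's actual implementation, and it omits the cached tensors:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h, W3, U3, bW3, bU3):
    # one GRU cell step; gates stacked row-wise in W3/U3
    H = h.shape[1]
    Wr, Wz, Wn = W3[:H], W3[H:2*H], W3[2*H:]
    Ur, Uz, Un = U3[:H], U3[H:2*H], U3[2*H:]
    bWr, bWz, bWn = np.split(bW3, 3, axis=1)
    bUr, bUz, bUn = np.split(bU3, 3, axis=1)
    r = sigmoid(x @ Wr.T + bWr + h @ Ur.T + bUr)
    z = sigmoid(x @ Wz.T + bWz + h @ Uz.T + bUz)
    n = np.tanh(x @ Wn.T + bWn + r * (h @ Un.T + bUn))
    return (1.0 - z) * n + z * h

def gru_forward(inp, W3s0, W3sN, U3s, bW3s, bU3s, hidden):
    # inp:    [timesteps, batch, features] (time-major, an assumption)
    # hidden: [layers, batch, hidden] initial state; updated copy returned
    timesteps = inp.shape[0]
    layers = hidden.shape[0]
    hidden = hidden.copy()
    output = np.empty((timesteps, hidden.shape[1], hidden.shape[2]))
    for t in range(timesteps):
        x = inp[t]
        for l in range(layers):
            # first layer maps features -> hidden, later layers hidden -> hidden
            W3 = W3s0 if l == 0 else W3sN[l - 1]
            hidden[l] = gru_step(x, hidden[l], W3, U3s[l], bW3s[l], bU3s[l])
            x = hidden[l]            # layer l's state feeds layer l+1
        output[t] = x                # last layer's state is the output
    return output, hidden
```

gru_inference has the same structure; gru_forward additionally stores every intermediate (rs, zs, ns, Uhs, cached_inputs, cached_hiddens) so gru_backward can replay the computation.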
proc gru_inference[T: SomeFloat](input: Tensor[T]; W3s0, W3sN: Tensor[T];
                                 U3s, bW3s, bU3s: Tensor[T];
                                 output, hidden: var Tensor[T])

Bidirectional support is not implemented

Inputs:

  • input: input tensor for the full sequence, of shape [timesteps, batch_size, features]
  • W3s0, W3sN: input weights of the first and of the subsequent stacked layers
  • U3s, bW3s, bU3s: hidden-state weights and biases of all stacked layers

Outputs:

  • output: the hidden state of the last layer for every timestep
  • hidden: the final hidden state of each stacked layer

⚠️ Input/Output updated in-place: output and hidden are overwritten with the results
