
src/arraymancer/nn_primitives/nnp_softmax_cross_entropy


Procs

proc softmax_cross_entropy[T](input, target: Tensor[T]): T

Softmax function + Cross-Entropy loss fused in one layer.

Input:

  • A Tensor of predicted values (logits)
  • A Tensor of target values

Returns:

  • The cross-entropy loss, computed after applying a softmax activation to the input.

Softmax_cross_entropy measures the cross-entropy error for multiclass classification. Classes are mutually exclusive (only one label is true), but the truth labels (target) need not be one-hot: they may be arbitrary probability distributions over the classes.
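
For reference, a standard formulation of the fused operation is shown below (whether the batch reduction is a mean or a sum over the N samples should be checked against the source):

$$
\text{loss} \;=\; -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c}\,\ln\!\left(\frac{e^{x_{i,c}}}{\sum_{k} e^{x_{i,k}}}\right)
$$

where x_i is the input row (logits) and y_i the target distribution for sample i.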

Note: if your labels are one-hot-encoded, it is more efficient to pass them as sparse labels to sparse_softmax_cross_entropy than to feed the one-hot vectors to softmax_cross_entropy.

For example, if your true probabilities are (car: 0.10, airplane: 0.60, bike: 0.05, bus: 0.25), you have to use softmax_cross_entropy.

However, if your true probabilities are (car: 0, airplane: 1, bike: 0, bus: 0) (a one-hot-encoded vector), you should prefer sparse_softmax_cross_entropy.
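
A minimal usage sketch (the values and variable names below are hypothetical; the [batchsize, features] layout follows the shape convention noted under the backward proc below):

import arraymancer

# Hypothetical logits for a batch of 2 samples over 4 classes (car, airplane, bike, bus)
let predicted = @[@[0.1, 2.0, 0.3, 0.5],
                  @[1.5, 0.2, 0.1, 0.4]].toTensor

# Soft target probabilities, one distribution per sample (each row sums to 1)
let truth = @[@[0.10, 0.60, 0.05, 0.25],
              @[0.25, 0.25, 0.25, 0.25]].toTensor

# Fused softmax activation + cross-entropy loss, returns a scalar
let loss = softmax_cross_entropy(predicted, truth)
echo loss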

proc softmax_cross_entropy_backward[T](gradient: Tensor[T] or T;
                                       cached_tensor: Tensor[T];
                                       target: Tensor[T]): Tensor[T] {.noinit.}
Derivatives of softmax_cross_entropy

Input:

  • The input gradient as a scalar or a Tensor
  • A cache tensor that contains data saved during the forward pass
  • The target values

Shape:

  • Both the cached tensor and the target should have shape [batchsize, features], i.e. the number of samples is the first dimension
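
A hedged backward-pass sketch (hypothetical values; the incoming gradient of a scalar loss is 1, and the result has the same shape as the cached input):

import arraymancer

# Hypothetical forward-pass data: logits and dense targets for 2 samples, 4 classes
let logits = @[@[0.1, 2.0, 0.3, 0.5],
               @[1.5, 0.2, 0.1, 0.4]].toTensor
let truth  = @[@[0.10, 0.60, 0.05, 0.25],
               @[0.25, 0.25, 0.25, 0.25]].toTensor

# The cached tensor is the input seen during the forward pass
let grad_input = softmax_cross_entropy_backward(1.0, logits, truth)
echo grad_input.shape   # same shape as the logits: [2, 4]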
proc sparse_softmax_cross_entropy[T; Idx: SomeNumber or byte or char or enum](
    input: Tensor[T]; target: Tensor[Idx]): T

Softmax function + Cross-Entropy loss fused in one layer.

Input:

  • A Tensor of predicted values (logits)
  • A Tensor of sparse target labels (one class index per sample)

Returns:

  • The cross-entropy loss, computed after applying a softmax activation to the input.

sparse_softmax_cross_entropy measures the cross-entropy error for multiclass classification. Classes are mutually exclusive (only one label is true).

Important: a one-hot vector (0, 0, 1) means label 2 is true, i.e. labels start at 0.

Note: if your labels are one-hot-encoded, it is more efficient to pass them as sparse labels to sparse_softmax_cross_entropy than to feed the one-hot vectors to softmax_cross_entropy.

For example, if your true probabilities are (car: 0.10, airplane: 0.60, bike: 0.05, bus: 0.25), you have to use softmax_cross_entropy.

However, if your true probabilities are (car: 0, airplane: 1, bike: 0, bus: 0) (a one-hot-encoded vector), you should prefer sparse_softmax_cross_entropy.
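
A minimal sparse usage sketch (hypothetical values; the target is assumed to be a rank-1 integer tensor with one class index per sample):

import arraymancer

# Hypothetical logits for 2 samples over 4 classes (car, airplane, bike, bus)
let predicted = @[@[0.1, 2.0, 0.3, 0.5],
                  @[1.5, 0.2, 0.1, 0.4]].toTensor

# Sparse labels: sample 0 is "airplane" (index 1), sample 1 is "bike" (index 2)
let labels = @[1, 2].toTensor

let loss = sparse_softmax_cross_entropy(predicted, labels)
echo loss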

proc sparse_softmax_cross_entropy_backward[T;
    Idx: SomeNumber or byte or char or enum](gradient: Tensor[T] or T;
    cached_tensor: Tensor[T]; target: Tensor[Idx]): Tensor[T] {.noinit.}
Derivatives of sparse_softmax_cross_entropy

Input:

  • The input gradient as a scalar or a Tensor
  • A cache tensor that contains data saved during the forward pass
  • The target values

Shape:

  • The cached tensor should have shape [batchsize, features]; the target holds one label index per sample
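
Analogous to the dense backward pass, a hedged sketch (hypothetical values):

import arraymancer

# Hypothetical forward-pass data: logits and sparse labels for 2 samples, 4 classes
let logits = @[@[0.1, 2.0, 0.3, 0.5],
               @[1.5, 0.2, 0.1, 0.4]].toTensor
let labels = @[1, 2].toTensor

# The cached tensor is the input seen during the forward pass
let grad_input = sparse_softmax_cross_entropy_backward(1.0, logits, labels)
echo grad_input.shape   # same shape as the logits: [2, 4]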
