src/arraymancer/stats/kde

Types

KernelFunc = proc (x, x_i, bw: float): float {.inline.}
KernelKind = enum
  knCustom = "custom", knBox = "box", knTriangular = "triangular",
  knTrig = "trigonometric", knEpanechnikov = "epanechnikov", knGauss = "gauss"

Procs

proc boxKernel(x, x_i, bw: float): float {.inline, ...raises: [], tags: [],
    forbids: [].}
proc epanechnikovKernel(x, x_i, bw: float): float {.inline, ...raises: [],
    tags: [], forbids: [].}
proc gaussKernel(x, x_i, bw: float): float {.inline, ...raises: [], tags: [],
    forbids: [].}
proc kde[T: SomeNumber; U: int | Tensor[SomeNumber] | openArray[SomeNumber]](
    t: Tensor[T]; kernel: static KernelFunc; kernelKind = knCustom;
    adjust: float = 1.0; samples: U = 1000; bw: float = NaN; normalize = false;
    cutoff: float = NaN; weights: Tensor[T] = newTensor[T](0)): Tensor[float]

Returns the kernel density estimation for the 1D tensor t. The returned Tensor[float] contains samples elements. The input will be converted to float.

By default the bandwidth is estimated using Silverman's rule of thumb.
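For reference, Silverman's rule of thumb as commonly stated is (the exact constants used internally may differ):

```latex
bw = 0.9 \cdot \min\!\left(\hat\sigma,\ \mathrm{IQR}/1.34\right) \cdot n^{-1/5}
```

where \hat\sigma is the sample standard deviation, IQR the interquartile range and n the number of data points.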

adjust can be used to scale the automatic bandwidth calculation. Note that this rule assumes the data is roughly normally distributed. To override the automatic bandwidth calculation, hand the bw manually. If normalize is true, the result is normalized such that the integral over it equals 1.

By default the evaluation points are samples linearly spaced points in the range [min(t), max(t)]. If desired, the evaluation points can be given explicitly by handing a Tensor[float] | openArray[float] as samples.

The kernel is the kernel function that will be used. Unless you want to use a custom kernel function, call the convenience wrapper below, which only takes a KernelKind (either as a string or directly as an enum value) and defaults to a gaussian kernel.

Custom kernel functions are supported by handing a function of signature

KernelFunc = proc(x, x_i, bw: float): float

to this procedure and setting the kernelKind to knCustom. This also requires handing a cutoff, which defines the window of s[j] - t[i] <= cutoff, where s[j] is the j-th sample and t[i] the i-th input value. For efficiency, only points within this window are included in the kernel summation. Choose the cutoff such that the contribution of the custom kernel outside that range is very small (or 0).
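A minimal sketch of handing a custom kernel, assuming `import arraymancer` exposes `kde` (otherwise import `arraymancer/stats/kde` directly); the kernel, data and cutoff value here are illustrative only:

```nim
import arraymancer

# Hypothetical custom kernel (uniform on |x - x_i| <= bw), matching the
# KernelFunc signature `proc (x, x_i, bw: float): float`.
proc myKernel(x, x_i, bw: float): float {.inline.} =
  if abs(x - x_i) <= bw: 0.5 / bw
  else: 0.0

let t = @[0.1, 0.4, 0.35, 0.9, 1.2].toTensor
# `cutoff` bounds the window s[j] - t[i]; contributions outside it are skipped.
let estimate = kde(t, myKernel, kernelKind = knCustom, cutoff = 0.5)
```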

proc kde[T: SomeNumber; U: KernelKind | string;
         V: int | Tensor[SomeNumber] | openArray[SomeNumber]](t: Tensor[T];
    kernel: U = "gauss"; adjust: float = 1.0; samples: V = 1000;
    bw: float = NaN; normalize = false; weights: Tensor[T] = newTensor[T](0)): Tensor[
    float]

This is a convenience wrapper around the kde proc defined above. It takes the kernel either as a string corresponding to the string value of the KernelKind enum or as a KernelKind value directly, so no kernel procedure has to be handed manually.

By default a gaussian kernel is used.
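A hypothetical usage sketch of the wrapper, assuming `import arraymancer` exposes `kde`; the input data is made up for illustration:

```nim
import arraymancer

let xs = @[1.0, 1.2, 0.8, 2.5, 2.7, 2.4, 1.1].toTensor
# Gaussian kernel by default, 1000 linearly spaced evaluation points:
let density = kde(xs)
# Selecting the kernel by string or by enum value:
let dBox = kde(xs, "box")
let dEpa = kde(xs, knEpanechnikov, samples = 500, normalize = true)
```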

proc triangularKernel(x, x_i, bw: float): float {.inline, ...raises: [],
    tags: [], forbids: [].}
proc trigonometricKernel(x, x_i, bw: float): float {.inline, ...raises: [],
    tags: [], forbids: [].}

Templates

template makeKernel(fn: untyped): untyped