Fork me on GitHub

src/arraymancer/spatial/kdtree

  Source Edit

Types

KDTree[T] = ref object
  data*: Tensor[T]           ## k-d data stored in this tree
  leafSize*: int             ## maximum size of elements in a leaf node, default: 16
  k*: int                    ## dimension of a single data point
  n*: int                    ## number of data points
  maxes*: Tensor[T]          ## maximum values along each dimension of `n` data points
  mins*: Tensor[T]           ## minimum values along each dimension of `n` data points
  tree*: Node[T]             ## the root node of the tree
  size*: int                 ## number of nodes in the tree
  
  Source Edit
Node[T] = ref object
  level*: int                ## the level this node is at in the tree
  id*: int                   ## a unique id for this node
  case kind*: TreeNodeKind   ## is it a leaf node or an inner node?
  of tnInner:
      lesser*: Node[T]       ## nodes on the "lesser" side of `split`
      greater*: Node[T]      ## nodes on the "greater" side of `split`
      split_dim*: int        ## the dimension this node splits the space at
      split*: float          ## the value at which the space is split in `split_dim`
    
  of tnLeaf:
      children*: int         ## number of indices stored in this node
      idx*: Tensor[int]      ## the indices of the input data stored in this node
    
  
  Source Edit
TreeNodeKind = enum
  tnLeaf, tnInner
  Source Edit

Procs

proc clone[T](kd: KDTree[T]): KDTree[T]
  Source Edit
proc clone[T](n: Node[T]): Node[T]
  Source Edit
proc kdTree[T](data: Tensor[T]; leafSize = 16; copyData = true;
               balancedTree: static bool = true): KDTree[T]

Builds a k-d tree based on the input data.

data must be of shape (n, k) where n is the number of input data points and k the dimension of each point.

leafSize is the maximum number of elements to be stored in a single leaf node.

If balancedTree is true, we split along the most separated axis at the median point. Otherwise we split in the middle. The former leads to slightly longer build times, as we have to compute the median (which implies sorting the data along the axis), but results in a more balanced tree.

  Source Edit
proc query[T](tree: KDTree[T]; x: Tensor[T]; k = 1; eps = 0.0;
              metric: typedesc[AnyMetric] = Euclidean; p = 2.0;
              distanceUpperBound = Inf): tuple[dist: Tensor[T], idx: Tensor[int]]

Queries the k-d tree at point x for the k closest neighbors.

distanceUpperBound can be set to stop the search even if less than k neighbors have been found within that hyperradius.

eps is the relative epsilon for distance comparison by which distanceUpperBound is scaled.

metric is the distance metric to be used to compute the distances between points. By default we use the Euclidean metric.

If the Minkowski metric is used, p is the used power. This affects the way distances between points are computed. For certain values other metrics are recovered:

  • p = 1: Manhattan distance
  • p = 2: Euclidean distance
  Source Edit
proc query_ball_point[T](tree: KDTree[T]; x: Tensor[T]; radius: float;
                         eps = 0.0; metric: typedesc[AnyMetric] = Euclidean;
                         p = 2.0): tuple[dist: Tensor[T], idx: Tensor[int]]

Queries the k-d tree around point x for all points within the hyperradius radius.

eps is the relative epsilon for distance comparison by which distanceUpperBound is scaled.

metric is the distance metric to be used to compute the distances between points. By default we use the euclidean metric.

If the Minkowski metric is used, p is the used power. This affects the way distances between points are computed. For certain values other metrics are recovered:

  • p = 1: Manhattan distance
  • p = 2: Euclidean distance
  Source Edit
Arraymancer Technical reference Tutorial Spellbook (How-To's) Under the hood