Interface ComputeClusters<P>

All Known Implementing Classes:
ExpectationMaximizationGmm_F64, StandardKMeans, StandardKMeans_MT

public interface ComputeClusters<P>
Given a set of points in N-dimensional space, computes a unique assignment of each point to a cluster. The clusters are chosen to minimize some distance function between each cluster and the points assigned to it.
  • Method Summary

    Modifier and Type Method Description
    AssignCluster<P> getAssignment()
    Returns a class which is used to assign a point to a cluster.
    double getDistanceMeasure()
    Returns the sum of the distances between each point and its assigned cluster.
    void initialize(long randomSeed)
    Must be called first to initialize internal data structures.
    ComputeClusters<P> newInstanceThread()
    Creates a new instance which has the same configuration and can be run in parallel.
    void process(LArrayAccessor<P> points, int numCluster)
    Computes a set of clusters which segment the points into numCluster sets.
    void setVerbose(boolean verbose)
    If set to true then information about status will be printed to standard out.
  • Method Details

    • initialize

      void initialize(long randomSeed)
      Must be called first to initialize internal data structures. Only needs to be called once.
      Parameters:
      randomSeed - Seed for any random number generators used internally.
    • process

      void process(LArrayAccessor<P> points, int numCluster)
      Computes a set of clusters which segment the points into numCluster sets.
      Parameters:
      points - Set of points which are to be clustered. Not modified.
      numCluster - Number of clusters it will use to split the points.
    • getAssignment

      AssignCluster<P> getAssignment()

      Returns a class which is used to assign a point to a cluster. Should only be invoked after process(org.ddogleg.struct.LArrayAccessor<P>, int) has been called.

      WARNING: The returned data structure is recycled each time clusters are computed. Create a copy if you wish to avoid having it modified.

      Returns:
      Instance of AssignCluster.
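
The call order described above (initialize once, then process, then getAssignment) and the recycling warning can be sketched roughly as follows. This is a toy 1D k-means written for illustration only, not the library's code. `Assigner`, `CenterAssigner`, and `ToyClusterer` are hypothetical stand-ins, the `assign` and `copy` method names are assumptions, and a plain `List<Double>` takes the place of `LArrayAccessor<P>`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class ClusterLifecycleSketch {
    /** Hypothetical stand-in for AssignCluster<P>; method names are assumptions. */
    interface Assigner {
        int assign(double point);
        Assigner copy();
    }

    /** Assigns a point to the nearest center. The live instance is overwritten on every run. */
    static class CenterAssigner implements Assigner {
        double[] centers = new double[0];

        public int assign(double point) { return nearest(centers, point); }

        public Assigner copy() {
            CenterAssigner c = new CenterAssigner();
            c.centers = centers.clone(); // detached from later process() calls
            return c;
        }
    }

    /** Toy 1D k-means that follows the documented call order. */
    static class ToyClusterer {
        private Random rand;
        private final CenterAssigner assigner = new CenterAssigner(); // recycled

        void initialize(long randomSeed) { rand = new Random(randomSeed); }

        void process(List<Double> points, int numCluster) {
            if (rand == null) throw new IllegalStateException("call initialize() first");
            double[] centers = new double[numCluster];
            for (int i = 0; i < numCluster; i++)
                centers[i] = points.get(rand.nextInt(points.size()));
            for (int iter = 0; iter < 20; iter++) {
                double[] sum = new double[numCluster];
                int[] count = new int[numCluster];
                for (double p : points) {
                    int c = nearest(centers, p);
                    sum[c] += p;
                    count[c]++;
                }
                for (int c = 0; c < numCluster; c++)
                    if (count[c] > 0) centers[c] = sum[c] / count[c];
            }
            assigner.centers = centers; // the live assigner now reflects this run
        }

        Assigner getAssignment() { return assigner; }
    }

    /** Index of the center closest to p; ties go to the lower index. */
    static int nearest(double[] centers, double p) {
        int best = 0;
        for (int c = 1; c < centers.length; c++)
            if (Math.abs(p - centers[c]) < Math.abs(p - centers[best])) best = c;
        return best;
    }

    /** Returns {copy label before 2nd run, copy label after 2nd run, live label after 2nd run}. */
    static int[] demo() {
        ToyClusterer alg = new ToyClusterer();
        alg.initialize(0xBEEF);                              // once, before any process() call
        alg.process(Arrays.asList(0.0, 0.1, 10.0, 10.1), 2);
        Assigner frozen = alg.getAssignment().copy();        // snapshot of the first result
        Assigner live = alg.getAssignment();
        int copyBefore = frozen.assign(10.0);
        alg.process(Arrays.asList(0.0, 0.1, 0.2, 0.3), 1);   // second run overwrites live state
        return new int[]{copyBefore, frozen.assign(10.0), live.assign(10.0)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

In this sketch the copy keeps answering from the first run, while the reference obtained directly from getAssignment() reflects whichever run happened last.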
    • getDistanceMeasure

      double getDistanceMeasure()

      Returns the sum of the distances between each point and its assigned cluster. Can be used to evaluate the quality of fit for all the clusters. Values can only be compared when the same number of clusters is used.

      NOTE: The distance measure itself is not specified here and is application specific.
      Returns:
      Sum of the distances between each point and its respective cluster.
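
One way the distance measure might be used is to run several random seeds with a fixed number of clusters and keep the lowest score. The sketch below assumes a toy 1D k-means whose score is the sum of squared distances to the nearest center; the real measure is implementation specific. `ToyClusterer` and `bestScore` are hypothetical names, and `List<Double>` stands in for `LArrayAccessor<P>`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class BestSeedSketch {
    /** Toy 1D k-means standing in for a ComputeClusters implementation. */
    static class ToyClusterer {
        private Random rand;
        private double distance;

        void initialize(long randomSeed) { rand = new Random(randomSeed); }

        void process(List<Double> points, int numCluster) {
            if (rand == null) throw new IllegalStateException("call initialize() first");
            double[] centers = new double[numCluster];
            for (int i = 0; i < numCluster; i++)
                centers[i] = points.get(rand.nextInt(points.size()));
            for (int iter = 0; iter < 20; iter++) {
                double[] sum = new double[numCluster];
                int[] count = new int[numCluster];
                for (double p : points) {
                    int c = nearest(centers, p);
                    sum[c] += p;
                    count[c]++;
                }
                for (int c = 0; c < numCluster; c++)
                    if (count[c] > 0) centers[c] = sum[c] / count[c];
            }
            // Assumed score: sum of squared distances from each point to its nearest center.
            distance = 0;
            for (double p : points) {
                double d = p - centers[nearest(centers, p)];
                distance += d * d;
            }
        }

        double getDistanceMeasure() { return distance; }
    }

    /** Index of the center closest to p; ties go to the lower index. */
    static int nearest(double[] centers, double p) {
        int best = 0;
        for (int c = 1; c < centers.length; c++)
            if (Math.abs(p - centers[c]) < Math.abs(p - centers[best])) best = c;
        return best;
    }

    /** Tries each seed with the same numCluster and returns the lowest score found. */
    static double bestScore(List<Double> points, int numCluster, long[] seeds) {
        double best = Double.MAX_VALUE;
        for (long seed : seeds) {
            ToyClusterer alg = new ToyClusterer();
            alg.initialize(seed);
            alg.process(points, numCluster);
            best = Math.min(best, alg.getDistanceMeasure());
        }
        return best;
    }

    public static void main(String[] args) {
        List<Double> points = Arrays.asList(0.0, 0.1, 10.0, 10.1);
        long[] seeds = {1L, 2L, 3L};
        System.out.println("2 clusters: " + bestScore(points, 2, seeds));
        System.out.println("1 cluster:  " + bestScore(points, 1, seeds));
    }
}
```

With this toy score, adding clusters tends to lower the total regardless of fit quality, which is one reason scores from runs with different cluster counts are not directly comparable.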
    • setVerbose

      void setVerbose(boolean verbose)
      If set to true then information about status will be printed to standard out. By default verbose is off.
      Parameters:
      verbose - true for verbose mode. False for quiet mode.
    • newInstanceThread

      ComputeClusters<P> newInstanceThread()
      Creates a new instance which has the same configuration and can be run in parallel. Some components can be shared as long as they are read only and thread safe.
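
A rough sketch of how newInstanceThread() might be used: each concurrent task gets its own instance created from a configured prototype, so no mutable state is shared between threads. `ToyClusterer`, its `maxIterations` setting, and `scoreInParallel` are hypothetical, and `List<Double>` stands in for `LArrayAccessor<P>`. Calling initialize on every new instance is an assumption; the documentation above does not say whether it is required.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelClusterSketch {
    /** Toy 1D k-means standing in for a ComputeClusters implementation. */
    static class ToyClusterer {
        private final int maxIterations; // configuration carried over to new instances
        private Random rand;
        private double distance;

        ToyClusterer(int maxIterations) { this.maxIterations = maxIterations; }

        /** Same configuration, fresh mutable state. */
        ToyClusterer newInstanceThread() { return new ToyClusterer(maxIterations); }

        void initialize(long randomSeed) { rand = new Random(randomSeed); }

        void process(List<Double> points, int numCluster) {
            if (rand == null) throw new IllegalStateException("call initialize() first");
            double[] centers = new double[numCluster];
            for (int i = 0; i < numCluster; i++)
                centers[i] = points.get(rand.nextInt(points.size()));
            for (int iter = 0; iter < maxIterations; iter++) {
                double[] sum = new double[numCluster];
                int[] count = new int[numCluster];
                for (double p : points) {
                    int c = nearest(centers, p);
                    sum[c] += p;
                    count[c]++;
                }
                for (int c = 0; c < numCluster; c++)
                    if (count[c] > 0) centers[c] = sum[c] / count[c];
            }
            // Assumed score: sum of squared distances to the nearest center.
            distance = 0;
            for (double p : points) {
                double d = p - centers[nearest(centers, p)];
                distance += d * d;
            }
        }

        double getDistanceMeasure() { return distance; }
    }

    /** Index of the center closest to p; ties go to the lower index. */
    static int nearest(double[] centers, double p) {
        int best = 0;
        for (int c = 1; c < centers.length; c++)
            if (Math.abs(p - centers[c]) < Math.abs(p - centers[best])) best = c;
        return best;
    }

    /** Clusters each data set on its own worker instance and returns the scores in order. */
    static List<Double> scoreInParallel(ToyClusterer prototype, List<List<Double>> dataSets, int numCluster) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, Math.min(4, dataSets.size())));
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int i = 0; i < dataSets.size(); i++) {
                final List<Double> points = dataSets.get(i);
                final long seed = i;
                final ToyClusterer worker = prototype.newInstanceThread(); // one instance per task
                Callable<Double> task = () -> {
                    worker.initialize(seed);
                    worker.process(points, numCluster);
                    return worker.getDistanceMeasure();
                };
                futures.add(pool.submit(task));
            }
            List<Double> scores = new ArrayList<>();
            for (Future<Double> f : futures) scores.add(f.get());
            return scores;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<Double>> sets = Arrays.asList(
                Arrays.asList(0.0, 0.1, 10.0, 10.1),
                Arrays.asList(5.0, 5.0, 5.0, 5.0));
        System.out.println(scoreInParallel(new ToyClusterer(20), sets, 2));
    }
}
```

The prototype itself is never run here; it only carries the configuration, and every task operates on the instance it was handed.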