Package org.ddogleg.clustering
Interface ComputeClusters<P>
- All Known Implementing Classes:
ExpectationMaximizationGmm_F64, StandardKMeans, StandardKMeans_MT
public interface ComputeClusters<P>
Given a set of points in N-dimensional space, computes a unique cluster assignment for each point.
The clusters are chosen to minimize some distance function between each cluster and the points assigned
to it.
-
Method Summary
Modifier and Type, Method, Description:
AssignCluster<P> getAssignment() - Returns a class which is used to assign a point to a cluster.
double getDistanceMeasure() - Returns the sum of the distances between each point and its assigned cluster.
void initialize(long randomSeed) - Must be called first to initialize internal data structures.
ComputeClusters<P> newInstanceThread() - Creates a new instance which has the same configuration and can be run in parallel.
void process(LArrayAccessor<P> points, int numCluster) - Computes a set of clusters which segment the points into numCluster sets.
void setVerbose(boolean verbose) - If set to true then information about status will be printed to standard out.
-
Method Details
-
initialize
void initialize(long randomSeed) Must be called first to initialize internal data structures. Only needs to be called once.
- Parameters:
randomSeed - Seed for any random number generators used internally.
-
process
void process(LArrayAccessor<P> points, int numCluster) Computes a set of clusters which segment the points into numCluster sets. The number of clusters and the number of points must each be 1 or more; otherwise the behavior is undefined.
- Parameters:
points - Set of points which are to be clustered. Not modified.
numCluster - Number of clusters used to split the points.
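The required call order (initialize once, then process, then query the result) can be illustrated with a small self-contained sketch. `ToyKMeans` below is a hypothetical one-dimensional stand-in written for this example; it is not a DDogleg class, it takes a plain `double[]` instead of an `LArrayAccessor<P>`, and its `assign` method only plays the role of the returned assignment object. Throwing when `initialize` was skipped is a choice made for the toy; the interface itself does not specify what happens in that case.

```java
import java.util.Random;

final class CallOrderSketch {
    /** Toy 1-D k-means; a hypothetical stand-in, NOT a DDogleg class. */
    static final class ToyKMeans {
        private Random rand;        // created by initialize()
        private double[] centers;   // one center per cluster, set by process()
        private double distanceSum; // summed squared distance from the last pass

        void initialize(long randomSeed) {
            rand = new Random(randomSeed);
        }

        void process(double[] points, int numCluster) {
            // Toy choice: fail loudly if initialize() was skipped.
            if (rand == null)
                throw new IllegalStateException("initialize() must be called first");
            centers = new double[numCluster];
            for (int i = 0; i < numCluster; i++)
                centers[i] = points[rand.nextInt(points.length)]; // random starting centers
            for (int iter = 0; iter < 20; iter++) {
                double[] sum = new double[numCluster];
                int[] count = new int[numCluster];
                distanceSum = 0;
                for (double p : points) {
                    int best = assign(p);
                    sum[best] += p;
                    count[best]++;
                    distanceSum += (p - centers[best]) * (p - centers[best]);
                }
                for (int i = 0; i < numCluster; i++)
                    if (count[i] > 0) centers[i] = sum[i] / count[i]; // empty clusters keep their center
            }
        }

        /** Index of the nearest center. */
        int assign(double p) {
            int best = 0;
            for (int i = 1; i < centers.length; i++)
                if (Math.abs(p - centers[i]) < Math.abs(p - centers[best])) best = i;
            return best;
        }

        double getDistanceMeasure() {
            return distanceSum;
        }
    }

    public static void main(String[] args) {
        double[] points = {0.0, 0.1, 0.2, 10.0, 10.1, 10.2};
        ToyKMeans alg = new ToyKMeans();
        alg.initialize(0xBEEF);   // 1) seed the random number generator, once
        alg.process(points, 2);   // 2) compute the clusters
        // 3) query the results only after process() has run
        System.out.println(alg.assign(0.1) == alg.assign(0.0));
        System.out.println(alg.assign(0.1) != alg.assign(10.0));
        System.out.println(alg.getDistanceMeasure());
    }
}
```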
-
getAssignment
AssignCluster<P> getAssignment() Returns a class which is used to assign a point to a cluster. Only invoked after process(org.ddogleg.struct.LArrayAccessor<P>, int) has been called. WARNING: The returned data structure is recycled each time clusters are computed. Create a copy if you wish to avoid having it modified.
- Returns:
Instance of AssignCluster.
-
getDistanceMeasure
double getDistanceMeasure() Returns the sum of the distances between each point and its assigned cluster. Can be used to evaluate the quality of fit for all the clusters. Values can only be compared between runs which used the same number of clusters.
NOTE: The specific distance measure is not specified and is application specific.
- Returns:
Sum of the distances between each point and its respective cluster.
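One way to use this value is to run the clustering several times with different random seeds and keep the run with the lowest score. The sketch below assumes that a lower summed distance means a better fit and that every run uses the same number of clusters. The `bestSeed` helper and the `runAndScore` callback are hypothetical names introduced for this example; in real use the callback would call initialize with the seed, then process, then getDistanceMeasure.

```java
import java.util.function.LongToDoubleFunction;

final class BestSeedSketch {
    /**
     * Returns the seed with the lowest score. Assumes lower is better and that
     * every score came from a run with the same number of clusters.
     */
    static long bestSeed(long[] seeds, LongToDoubleFunction runAndScore) {
        if (seeds.length == 0)
            throw new IllegalArgumentException("need at least one seed");
        long best = seeds[0];
        double bestScore = runAndScore.applyAsDouble(best);
        for (int i = 1; i < seeds.length; i++) {
            double score = runAndScore.applyAsDouble(seeds[i]);
            if (score < bestScore) {
                bestScore = score;
                best = seeds[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up scores for illustration; a real callback would run the clustering.
        LongToDoubleFunction fakeScore = seed -> Math.abs(seed - 2) + 0.5;
        System.out.println(bestSeed(new long[]{1, 2, 3}, fakeScore));
    }
}
```

The same-cluster-count restriction matters because, for summed-distance scores of this kind, adding clusters generally lowers the sum regardless of whether the fit is more meaningful.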
-
setVerbose
void setVerbose(boolean verbose) If set to true then information about status will be printed to standard out. By default verbose is off.
- Parameters:
verbose - true for verbose mode. False for quiet mode.
-
newInstanceThread
ComputeClusters<P> newInstanceThread() Creates a new instance which has the same configuration and can be run in parallel. Some components can be shared as long as they are read only and thread safe.
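The intended usage pattern, one instance per concurrent task, can be sketched as follows. `ToyScorer` is a hypothetical stand-in written for this example and is not part of DDogleg; its `exponent` field represents read-only configuration that is copied to each new instance, while `lastScore` represents mutable state that must not be shared between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerThreadInstanceSketch {
    /** Hypothetical stand-in; only mirrors the newInstanceThread() idea. */
    static final class ToyScorer {
        private final double exponent; // read-only configuration, copied to new instances
        private double lastScore;      // mutable state, private to each instance

        ToyScorer(double exponent) {
            this.exponent = exponent;
        }

        ToyScorer newInstanceThread() {
            return new ToyScorer(exponent);
        }

        /** Toy "clustering": summed distance of every point from the mean. */
        void process(double[] points) {
            double mean = 0;
            for (double p : points) mean += p;
            mean /= points.length;
            double sum = 0;
            for (double p : points) sum += Math.pow(Math.abs(p - mean), exponent);
            lastScore = sum;
        }

        double getDistanceMeasure() {
            return lastScore;
        }
    }

    /** Scores each data set on a thread pool, giving every task its own instance. */
    static double[] scoreAll(ToyScorer prototype, List<double[]> dataSets) {
        int threads = Math.max(1, Math.min(4, dataSets.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (double[] data : dataSets) {
                ToyScorer local = prototype.newInstanceThread(); // never share the prototype
                Callable<Double> task = () -> {
                    local.process(data);
                    return local.getDistanceMeasure();
                };
                futures.add(pool.submit(task));
            }
            double[] scores = new double[futures.size()];
            for (int i = 0; i < scores.length; i++) scores[i] = futures.get(i).get();
            return scores;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<double[]> dataSets = new ArrayList<>();
        dataSets.add(new double[]{1, 3});
        dataSets.add(new double[]{0, 0, 6});
        for (double s : scoreAll(new ToyScorer(2.0), dataSets)) System.out.println(s);
    }
}
```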
-