Developer Guide for Intel® Data Analytics Acceleration Library 2016 Update 4
This mode assumes that the data set is split into nblocks blocks across computation nodes.
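As a rough illustration of that assumption, the sketch below splits a toy data set into nblocks contiguous blocks, one per node, and derives the total row count that the distributed mode needs. The helper `split_into_blocks` and the contiguous, near-equal split are assumptions made for this example; the guide does not prescribe how the data is partitioned, and none of this is library API.

```python
def split_into_blocks(rows, nblocks):
    """Split a list of rows into nblocks contiguous, nearly equal blocks.

    Hypothetical helper for illustration only; not part of the library.
    """
    if nblocks <= 0:
        raise ValueError("nblocks must be positive")
    base, extra = divmod(len(rows), nblocks)
    blocks, start = [], 0
    for i in range(nblocks):
        # The first `extra` blocks receive one additional row each.
        size = base + (1 if i < extra else 0)
        blocks.append(rows[start:start + size])
        start += size
    return blocks


# Toy 10 x 2 data set split across 4 hypothetical nodes.
data = [[float(r), float(r) * 2.0] for r in range(10)]
blocks = split_into_blocks(data, nblocks=4)

# Total number of rows over all blocks (the quantity nRowsTotal refers to).
n_rows_total = sum(len(block) for block in blocks)
```

Each element of `blocks` plays the role of the i-th data block held by one node.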
Centroid initialization for K-Means clustering in the distributed processing mode has the following parameters:
| Parameter | Default Value | Description |
|---|---|---|
| computeStep | Not applicable | The parameter required to initialize the algorithm. Can be: |
| algorithmFPType | double | The floating-point type that the algorithm uses for intermediate computations. Can be float or double. |
| method | defaultDense | Available initialization methods for K-Means clustering: For more details, see the algorithm description. |
| nClusters | Not applicable | The number of clusters. Required. |
| nRowsTotal | 0 | The total number of rows in all input data sets on all nodes. Required in the distributed processing mode. |
| seed | 777 | The seed for generating random numbers. |
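To make the defaults in the table concrete, the following sketch mirrors the parameters in a plain Python dataclass. The class `InitParameters`, its field names, and the `validate_distributed` check are hypothetical and exist only for this example; in particular, treating a non-positive nRowsTotal as an error is an assumption drawn from the "Required in the distributed processing mode" note, not documented library behavior.

```python
from dataclasses import dataclass


@dataclass
class InitParameters:
    """Hypothetical mirror of the parameter table; not the library's class."""
    n_clusters: int                 # nClusters: required, no default
    n_rows_total: int = 0           # nRowsTotal: default 0
    seed: int = 777                 # seed: default 777
    method: str = "defaultDense"    # method: default defaultDense
    fp_type: str = "double"         # algorithmFPType: float or double

    def validate_distributed(self):
        # Assumption: nClusters must be a positive count.
        if self.n_clusters <= 0:
            raise ValueError("nClusters must be positive")
        # Assumption: the default of 0 means "not set" and must be replaced
        # with the real total before running in the distributed mode.
        if self.n_rows_total <= 0:
            raise ValueError("nRowsTotal must be set in the distributed mode")
        if self.fp_type not in ("float", "double"):
            raise ValueError("algorithmFPType must be float or double")


# Example: 3 clusters over a data set of 10 rows in total.
params = InitParameters(n_clusters=3, n_rows_total=10)
params.validate_distributed()
```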
Centroid initialization for K-Means clustering follows the general schema described in Algorithms.
In Step 1, which runs on each local node, centroid initialization for K-Means clustering accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
| Input ID | Input |
|---|---|
| data | Pointer to the n_i x p numeric table that represents the i-th data block on the local node. The input can be an object of any class derived from NumericTable. |
| inputCentroids | Pointer to the nClusters x p numeric table with the initial cluster centroids. This input can be an object of any class derived from NumericTable. |
In Step 1, centroid initialization for K-Means clustering calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
| Result ID | Result |
|---|---|
| nPartialClusters | Pointer to the 1 x 1 numeric table that contains the number of clusters computed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except CSRNumericTable, PackedTriangularMatrix, and PackedSymmetricMatrix. |
| partialClusters | Pointer to the nClusters x p numeric table with cluster centroids computed on the local node. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
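The shapes of the two local results can be illustrated with a small stand-alone sketch. It assumes one simple deterministic rule: a node contributes those of its rows whose global index is below nClusters. That rule, the function `step1_local`, and its `offset` argument (the global index of the block's first row) are assumptions for illustration only; this is not the library's implementation, and the library's initialization methods may select centroids differently.

```python
def step1_local(block, offset, n_clusters, p):
    """Hypothetical local step: return (nPartialClusters, partialClusters).

    block      -- list of rows held by this node, each row of length p
    offset     -- assumed global index of the block's first row
    n_clusters -- requested number of clusters
    """
    # partialClusters is always nClusters x p; unused rows stay zero-filled.
    partial_clusters = [[0.0] * p for _ in range(n_clusters)]
    count = 0
    for local_idx, row in enumerate(block):
        if offset + local_idx >= n_clusters:
            break  # rows past the first nClusters global rows are not used
        partial_clusters[count] = list(row)
        count += 1
    # nPartialClusters is a 1 x 1 table holding the number of valid rows.
    n_partial_clusters = [[count]]
    return n_partial_clusters, partial_clusters


# A node holding global rows 3..5 of a data set, with nClusters = 4:
block = [[3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
n_partial, partial = step1_local(block, offset=3, n_clusters=4, p=2)
```

Here only global row 3 falls below nClusters, so the node reports a single valid centroid and leaves the remaining rows of its nClusters x p table unused.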
In Step 2, centroid initialization for K-Means clustering accepts the input from each local node described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.
| Input ID | Input |
|---|---|
| partialResults | A collection that contains results computed in Step 1 on local nodes (three numeric tables from each local node). |
In Step 2, centroid initialization for K-Means clustering calculates the results described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.
| Result ID | Result |
|---|---|
| centroids | Pointer to the nClusters x p numeric table with cluster centroids. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable. |
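One way to picture how the collected partial results could become the final nClusters x p table is sketched below. It assumes each node supplies a pair made of a 1 x 1 count and an nClusters x p table whose first rows are valid, and that the combining step simply concatenates the valid rows in node order. The function `step2_master` and this merge rule are hypothetical; the sketch uses only the two Step 1 results listed in this section and does not reproduce the library's internal logic.

```python
def step2_master(partial_results, n_clusters):
    """Hypothetical combining step: merge per-node results into centroids.

    partial_results -- list of (n_partial_clusters, partial_clusters) pairs,
                       one per node, in node order
    n_clusters      -- requested number of clusters
    """
    centroids = []
    for n_partial, partial_clusters in partial_results:
        valid = n_partial[0][0]  # read the count from the 1 x 1 table
        centroids.extend(list(row) for row in partial_clusters[:valid])
    if len(centroids) < n_clusters:
        raise ValueError("nodes supplied fewer than nClusters centroids")
    return centroids[:n_clusters]  # nClusters x p table


# Two nodes, nClusters = 3: the first supplies two valid rows, the second one.
partial_results = [
    ([[2]], [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]),
    ([[1]], [[2.0, 2.0], [0.0, 0.0], [0.0, 0.0]]),
]
centroids = step2_master(partial_results, n_clusters=3)
```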