Developer Guide for Intel® Data Analytics Acceleration Library 2016 Update 4

Batch Processing

Input

Centroid initialization for K-Means clustering accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

Input ID: data

Input: Pointer to the n x p numeric table with the data to be clustered. The input can be an object of any class derived from NumericTable.

Parameters

Centroid initialization for K-Means clustering has the following parameters:

Parameter: algorithmFPType
Default Value: double
Description: The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

Parameter: method
Default Value: defaultDense
Description: Available initialization methods for K-Means clustering:

  • defaultDense - uses the first nClusters points as the initial centroids

  • deterministicCSR - uses the first nClusters points as the initial centroids for data in a CSR numeric table

  • randomDense - uses nClusters randomly selected points as the initial centroids

  • randomCSR - uses nClusters randomly selected points as the initial centroids for data in a CSR numeric table

For more details, see the algorithm description.

Parameter: nClusters
Default Value: Not applicable
Description: The number of clusters. Required.

Parameter: seed
Default Value: 777
Description: The seed for generating random numbers.
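
The difference between the deterministic and random dense methods can be illustrated with a minimal sketch in plain Python. It does not call the library and is not the library's implementation. The function name init_centroids, the list-of-rows data layout, the requirement that nClusters does not exceed the number of rows, and the choice of distinct random rows are all assumptions made for illustration. Only the method names and the default seed of 777 come from the table above.

```python
import random


def init_centroids(data, n_clusters, method="defaultDense", seed=777):
    """Pick n_clusters rows of `data` (n rows of p numbers) as initial centroids.

    Hypothetical helper for illustration only; not a library API.
    """
    n = len(data)
    # Assumption: the number of clusters cannot exceed the number of rows.
    if not 1 <= n_clusters <= n:
        raise ValueError("n_clusters must be between 1 and the number of rows")

    if method == "defaultDense":
        # Deterministic: take the first n_clusters observations.
        indices = list(range(n_clusters))
    elif method == "randomDense":
        # Random: n_clusters distinct rows chosen by a seeded generator
        # (distinctness is an assumption of this sketch).
        rng = random.Random(seed)
        indices = rng.sample(range(n), n_clusters)
    else:
        raise ValueError(f"unsupported method: {method}")

    # Copy the rows so the centroids do not alias the input data.
    return [list(data[i]) for i in indices]
```

With a fixed seed, the random variant in this sketch returns the same rows on every call, which is the usual reason for exposing a seed parameter.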

Output

Centroid initialization for K-Means clustering calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

Result ID: centroids

Result: Pointer to the nClusters x p numeric table with the cluster centroids. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedTriangularMatrix, PackedSymmetricMatrix, and CSRNumericTable.
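
Because the result cannot be a CSRNumericTable, centroids taken from CSR input end up in a dense nClusters x p table. The following plain-Python sketch shows one way the first nClusters rows of a CSR matrix could be expanded into dense rows. It does not use the library. The function name first_rows_from_csr, the three-array CSR representation, and the zero-based indexing are assumptions for illustration; the library's CSRNumericTable may use a different index base.

```python
def first_rows_from_csr(values, col_indices, row_offsets, n_cols, n_clusters):
    """Expand the first n_clusters rows of a CSR matrix into dense rows.

    Hypothetical helper for illustration only; assumes zero-based indices,
    with row i stored in values[row_offsets[i]:row_offsets[i + 1]].
    """
    n_rows = len(row_offsets) - 1
    if not 1 <= n_clusters <= n_rows:
        raise ValueError("n_clusters must be between 1 and the number of rows")

    centroids = []
    for i in range(n_clusters):
        # Start from an all-zero row of length p, then fill in stored entries.
        row = [0.0] * n_cols
        for k in range(row_offsets[i], row_offsets[i + 1]):
            row[col_indices[k]] = values[k]
        centroids.append(row)
    return centroids
```

For example, a 3 x 3 matrix with rows (1, 0, 2), (0, 0, 3), and (4, 5, 0) is stored as values [1, 2, 3, 4, 5], column indices [0, 2, 2, 0, 1], and row offsets [0, 2, 3, 5]. Requesting two clusters yields the first two rows in dense form.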