Developer Guide for Intel® Data Analytics Acceleration Library 2016 Update 4

Batch Processing

Algorithm Input

The multivariate outlier detection algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

Input ID: data

Input: Pointer to the n x p numeric table with the data for outlier detection. The input can be an object of any class derived from the NumericTable class.

Algorithm Parameters

The multivariate outlier detection algorithm has the following parameters, which depend on the value of the computation method parameter, method:

Parameter: algorithmFPType
method: defaultDense or baconDense
Default Value: double
Description: The floating-point type that the algorithm uses for intermediate computations. Can be float or double.

Parameter: method
method: Not applicable
Default Value: defaultDense
Description: Available methods for multivariate outlier detection:

  • defaultDense - Performance-oriented computation method
  • baconDense - Blocked Adaptive Computationally-efficient Outlier Nominators (BACON) method

Parameter: initializationProcedure
method: defaultDense
Default Value: Not applicable
Description: The procedure for setting initial parameters of the algorithm (the vector of means, the variance-covariance matrix, and the scalar that defines the outlier region). You are responsible for defining this procedure.

Parameter: initializationProcedure
method: baconDense
Default Value: baconMedian
Description: The initialization method. Can be:

  • baconMedian - Median-based method.
  • defaultDense - Mahalanobis distance-based method.

Parameter: alpha
method: baconDense
Default Value: 0.05
Description: One-tailed probability that defines the (1 - α) quantile of the χ² distribution with p degrees of freedom. Recommended value: α/n, where n is the number of observations.

Parameter: accuracyThreshold
method: baconDense
Default Value: 0.005
Description: The stopping criterion. The algorithm terminates when the size of the basic subset changes by less than the threshold.

Algorithm Output

The multivariate outlier detection algorithm calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

Result ID: weights

Result: Pointer to the n x 1 numeric table of zeros and ones. A one in the i-th position indicates that the i-th feature vector is an outlier. By default, the result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
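To make the shape of the weights result concrete, the following C++ sketch builds an n x 1 vector of zeros and ones from a vector of means, a covariance matrix, and a scalar threshold. It is an illustration under stated assumptions, not the library's implementation: the function name outlierWeights is hypothetical, the caller passes the inverse of the covariance matrix to keep the code short, and a feature vector is assumed to be an outlier when its squared Mahalanobis distance exceeds the threshold.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library).
// data:      n x p observations, row-major
// mean:      vector of means, length p
// invCov:    INVERSE of the p x p variance-covariance matrix, row-major
// threshold: scalar that defines the outlier region
// ASSUMPTION: outlier <=> squared Mahalanobis distance > threshold.
std::vector<int> outlierWeights(const std::vector<double>& data,
                                std::size_t p,
                                const std::vector<double>& mean,
                                const std::vector<double>& invCov,
                                double threshold)
{
    if (p == 0) {
        return std::vector<int>();
    }
    const std::size_t n = data.size() / p;
    std::vector<int> weights(n, 0);
    std::vector<double> d(p);

    for (std::size_t i = 0; i < n; ++i) {
        // Center the i-th feature vector.
        for (std::size_t j = 0; j < p; ++j) {
            d[j] = data[i * p + j] - mean[j];
        }
        // Squared Mahalanobis distance: d^T * invCov * d.
        double dist2 = 0.0;
        for (std::size_t r = 0; r < p; ++r) {
            double acc = 0.0;
            for (std::size_t c = 0; c < p; ++c) {
                acc += invCov[r * p + c] * d[c];
            }
            dist2 += d[r] * acc;
        }
        // One marks an outlier, zero marks a regular observation.
        weights[i] = dist2 > threshold ? 1 : 0;
    }
    return weights;
}
```

With three two-dimensional observations (0, 0), (1, 1), and (10, 10), zero means, an identity inverse covariance matrix, and a threshold of 9, only the third observation is flagged.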

Examples

C++:

Java*: