Batch Processing

Algorithm Input

The association rules algorithm accepts the input described below. Pass the Input ID as a parameter to the methods that provide input for your algorithm. For more details, see Algorithms.

Input ID	Input
data	Pointer to the `n` x 2 numeric table t with the mining data. Each row consists of two integers: the transaction ID, a number between 0 and nTransactions-1, and the item ID, a number between 0 and nUniqueItems-1. The input can be an object of any class derived from NumericTable except PackedTriangularMatrix and PackedSymmetricMatrix.
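
The row layout described above can be illustrated with a minimal sketch in standard C++. It does not call the library: the names Row and flattenTransactions are hypothetical, and copying the resulting rows into a NumericTable is left out. The sketch assumes that the position of each transaction in the outer vector serves as its transaction ID.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// One row of the n x 2 input table: (transaction ID, item ID).
// Row and flattenTransactions are illustrative names, not library API.
using Row = std::pair<int, int>;

// Flattens per-transaction item lists into rows ordered by transaction ID,
// then by item ID. The index of each inner vector is used as its transaction ID.
std::vector<Row> flattenTransactions(const std::vector<std::vector<int>>& transactions)
{
    std::vector<Row> rows;
    for (std::size_t t = 0; t < transactions.size(); ++t)
    {
        std::vector<int> items = transactions[t];
        std::sort(items.begin(), items.end());  // item IDs ascending within a transaction
        for (int item : items)
            rows.emplace_back(static_cast<int>(t), item);
    }
    return rows;
}
```

For three transactions holding items {2, 0}, {1}, and {0, 1, 2}, the sketch yields six rows, starting with (0, 0), (0, 2), (1, 1).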

Algorithm Parameters

The association rules algorithm has the following parameters:

Parameter	Default Value	Description
algorithmFPType	double	The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
method	defaultDense	The computation method used by the algorithm. The only method supported so far is Apriori.
minSupport	0.01	Minimal support, a number in the [0,1) interval.
minConfidence	0.6	Minimal confidence, a number in the [0,1) interval.
nUniqueItems	0	The total number of unique items. If set to zero, the library automatically determines the number of unique items from the input data.
nTransactions	0	The total number of transactions. If set to zero, the library automatically determines the number of transactions from the input data.
discoverRules	true	A flag that enables generation of the rules from large item sets.
itemsetsOrder	itemsetsUnsorted	The sort order of returned item sets: itemsetsUnsorted - not sorted; itemsetsSortedBySupport - sorted by support in descending order.
rulesOrder	rulesUnsorted	The sort order of returned rules: rulesUnsorted - not sorted; rulesSortedByConfidence - sorted by confidence in descending order.
minItemsetSize	0	The minimal size of item sets to be included in the array of results. A value of zero imposes no limit on the minimal size of item sets.
maxItemsetSize	0	The maximal size of item sets to be included in the array of results. A value of zero imposes no limit on the maximal size of item sets.
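
To make the threshold parameters concrete, here is a minimal sketch in standard C++ of the quantities they constrain. It uses the textbook definitions (relative support as the fraction of transactions that contain an item set, and confidence of X => Y as the count for X and Y together divided by the count for X) and assumes the library follows them. All function names are hypothetical, and the sketch does not reproduce the Apriori search itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Itemset = std::vector<int>;  // item IDs in ascending order

// Counts the transactions that contain every item of `itemset`.
// Both the transactions and the item set must be sorted in ascending order.
std::size_t supportCount(const std::vector<Itemset>& transactions, const Itemset& itemset)
{
    std::size_t count = 0;
    for (const Itemset& t : transactions)
        if (std::includes(t.begin(), t.end(), itemset.begin(), itemset.end()))
            ++count;
    return count;
}

// Relative support: the fraction of transactions that contain the item set.
double support(const std::vector<Itemset>& transactions, const Itemset& itemset)
{
    if (transactions.empty()) return 0.0;
    return static_cast<double>(supportCount(transactions, itemset)) / transactions.size();
}

// Confidence of the rule X => Y, where `xy` is the sorted union of X and Y.
double confidence(const std::vector<Itemset>& transactions, const Itemset& x, const Itemset& xy)
{
    const std::size_t nX = supportCount(transactions, x);
    if (nX == 0) return 0.0;
    return static_cast<double>(supportCount(transactions, xy)) / nX;
}

// Size filter using the convention from the table: zero means "no limit".
bool sizeAllowed(std::size_t size, std::size_t minItemsetSize, std::size_t maxItemsetSize)
{
    if (minItemsetSize != 0 && size < minItemsetSize) return false;
    if (maxItemsetSize != 0 && size > maxItemsetSize) return false;
    return true;
}
```

With the four transactions {0, 1}, {0, 1, 2}, {0, 2}, and {1}, the item set {0, 1} has a relative support of 0.5, and the rule {0} => {1} has a confidence of 2/3, which exceeds the default minConfidence of 0.6.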

Algorithm Output

The association rules algorithm calculates the result described below. Pass the Result ID as a parameter to the methods that access the results of your algorithm. For more details, see Algorithms.

Result ID	Result
largeItemsets	Pointer to the numeric table with large item sets. The number of rows in the table equals the number of items in the large item sets. Each row contains two integers: the ID of the large item set, a number between 0 and `nLargeItemsets`-1, and the ID of the item, a number between 0 and nUniqueItems-1. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
largeItemsetsSupport	Pointer to the `nLargeItemsets` x 2 numeric table of support values. Each row contains two integers: the ID of the large item set, a number between 0 and `nLargeItemsets`-1, and the support value, the number of times the item set occurs in the array of transactions. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
antecedentItemsets	Pointer to the `nAntecedentItems` x 2 numeric table that contains the left-hand-side (`X`) part of the association rules. Each row contains two integers: the rule ID, a number between 0 and `nAntecedentItems`-1, and the item ID, a number between 0 and nUniqueItems-1. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
consequentItemsets	Pointer to the `nConsequentItems` x 2 numeric table that contains the right-hand-side (`Y`) part of the association rules. Each row contains two integers: the rule ID, a number between 0 and `nConsequentItems`-1, and the item ID, a number between 0 and nUniqueItems-1. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
confidence	Pointer to the `nRules` x 1 numeric table that contains confidence values of rules, floating-point numbers between 0 and 1. The confidence value in the `i`-th position corresponds to the rule with the index `i`. By default, this result is an object of the HomogenNumericTable class, but you can define the result as an object of any class derived from NumericTable except PackedSymmetricMatrix, PackedTriangularMatrix, and CSRNumericTable.
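
Because the rule tables store one (rule ID, item ID) pair per row, a caller typically has to regroup the rows before the rules become readable. The following is a minimal sketch in standard C++, assuming the table contents have already been copied into plain vectors. Row, Rule, groupById, and assembleRules are hypothetical names rather than library API.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Row = std::pair<int, int>;  // (item set ID or rule ID, item ID)

// Groups (ID, item ID) rows by their first column, preserving row order.
std::map<int, std::vector<int>> groupById(const std::vector<Row>& rows)
{
    std::map<int, std::vector<int>> groups;
    for (const Row& r : rows)
        groups[r.first].push_back(r.second);
    return groups;
}

struct Rule
{
    std::vector<int> antecedent;  // left-hand side X
    std::vector<int> consequent;  // right-hand side Y
    double confidence;
};

// Joins the antecedent rows, the consequent rows, and the confidence column
// by rule ID. The i-th confidence value belongs to the rule with index i.
std::vector<Rule> assembleRules(const std::vector<Row>& antecedentRows,
                                const std::vector<Row>& consequentRows,
                                const std::vector<double>& confidence)
{
    std::map<int, std::vector<int>> lhs = groupById(antecedentRows);
    std::map<int, std::vector<int>> rhs = groupById(consequentRows);
    std::vector<Rule> rules;
    for (std::size_t i = 0; i < confidence.size(); ++i)
    {
        const int id = static_cast<int>(i);
        rules.push_back(Rule{lhs[id], rhs[id], confidence[i]});
    }
    return rules;
}
```

The same groupById helper can regroup the largeItemsets rows by item set ID, after which the largeItemsetsSupport rows can be matched by the same ID.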

Note

The library requires transactions, and the items within each transaction, to be passed in ascending order.
Numbering of rules starts at 0.
The library calculates the sizes of numeric tables intended for results in a call to the algorithm. Avoid allocating the memory in numeric tables intended for results because, in general, it is impossible to accurately estimate the required memory size. If the memory interfaced by the numeric tables is allocated and its amount is insufficient to store the results, the algorithm returns an error.
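
The ordering requirement in the first note can be checked before the data is handed to the algorithm. Below is a minimal sketch in standard C++ with hypothetical names (Row, isAscending). It assumes that a repeated item within one transaction also counts as an ordering violation, which the text above does not state.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Row = std::pair<int, int>;  // (transaction ID, item ID)

// Returns true when transaction IDs never decrease from row to row and
// item IDs strictly increase within each transaction.
// Treating duplicate items as a violation is an assumption of this sketch.
bool isAscending(const std::vector<Row>& rows)
{
    for (std::size_t i = 1; i < rows.size(); ++i)
    {
        const Row& prev = rows[i - 1];
        const Row& cur = rows[i];
        if (cur.first < prev.first) return false;
        if (cur.first == prev.first && cur.second <= prev.second) return false;
    }
    return true;
}
```

Running such a check on the rows before filling the input table makes an ordering problem visible early instead of leaving it to the algorithm.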

Examples

C++: apriori_batch.cpp

Java*: AprioriBatch.java