Determining the maximum number of transaction records that the Apriori algorithm can scan in 90 seconds

CHRISTIAN DALE P. CELESTIAL, BASSY LEIAH V. IBARRETA, RUTH SP G. TIRON, MARIA MILAGROSA A. NULLA, and ZENNIFER L. OBERIO

Philippine Science High School – Western Visayas Campus, Brgy. Bito-on, Jaro, Iloilo City 5000, Department of Science and Technology, Philippines


Abstract

The Apriori algorithm is a data mining algorithm used to find frequent itemsets. It is simple to use, but its main disadvantage is its inefficiency when scanning large databases. Studies of the algorithm have focused on improving its efficiency on large databases, but no definite value has yet been established for the maximum number of transactions that the Apriori algorithm can process in 90 seconds, the tolerable offline waiting time for the human attention span. The methods of this study consisted of hardware and database acquisition, program implementation, data collection, and data analysis. Five hundred transactions were first scanned using the algorithm. It was determined that the classic Apriori algorithm can process 1,310 transaction records in 90 seconds, with a percentage prediction error of 0%. The percentage prediction error was computed from the actual and output frequencies.

Keywords: Apriori algorithm, data mining, frequent itemsets, percentage prediction error, accuracy
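
For readers unfamiliar with the procedure summarized in the abstract, the following is a minimal sketch of a classic level-wise Apriori pass, a wall-clock timer, and a percentage prediction error calculation. It is illustrative only and is not the implementation used in the study. The function names (apriori, timed_apriori, percentage_prediction_error), the absolute support threshold min_support, and the error formula |actual - output| / actual x 100 are all assumptions, since the abstract does not specify them.

```python
from itertools import combinations
import time


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least
    min_support transactions (min_support is an absolute count, an
    assumption made for this sketch)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items with one scan of the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: one more full scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def timed_apriori(transactions, min_support):
    """Run apriori and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = apriori(transactions, min_support)
    return result, time.perf_counter() - start


def percentage_prediction_error(actual, output):
    """Assumed formula: absolute difference relative to the actual
    frequency, expressed as a percentage. actual must be nonzero."""
    return abs(actual - output) / actual * 100.0
```

Under these assumptions, the largest database size that fits a time budget could be estimated by calling timed_apriori on progressively larger subsets of the records and noting where the elapsed time crosses 90 seconds. The repeated full scan in the count step is the source of the inefficiency on large databases that the abstract mentions.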