A Multidimensional Evaluation Framework for Parallel Frequent Pattern Mining Algorithms on Big Data Platforms
DOI: https://doi.org/10.57159/jcmm.4.6.25220
Keywords: Hadoop YARN, Parallel FP-Growth, Resource Adaptation, Communication Overhead, Fault Tolerance
Abstract
Despite ongoing theoretical refinements in parallel frequent pattern mining algorithms, practical implementations still suffer from inefficient resource scheduling, low inter-node communication efficiency, and limited system robustness. To address the lack of comprehensive and systematic testing methodologies for existing algorithms, this paper proposes a testing framework for data mining algorithms on big data platforms. The methodology centers on three dimensions (resource adaptability, communication efficiency, and system robustness) and establishes a quantifiable, reproducible experimental evaluation framework. To validate its effectiveness, the KVBFP algorithm is used as the experimental subject, and three sets of experiments are designed and run in a Hadoop cluster environment: resource adaptability testing, communication frequency testing, and stability testing. The first set of experiments accurately measures the algorithm's resource consumption across different clusters. The second set shows that KVBFP reduces communication frequency by 51.5% compared with the PFP algorithm. The third set demonstrates that the algorithm's recovery time remains within 30 seconds under fault conditions. By evaluating the algorithm along all three dimensions, this paper provides a quantitative reference for applying data mining algorithms in real-world scenarios.
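As a rough illustration of how the three dimensions could be recorded and compared between two algorithms, the following minimal Python sketch may help. It is not the authors' framework: the `RunMetrics` fields, the helper names, and all numeric values are hypothetical placeholders rather than measurements from the paper; only the 30-second recovery bound is taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical per-run measurements; field names are illustrative only."""
    peak_memory_mb: float     # resource adaptability dimension
    comm_rounds: int          # communication frequency dimension
    recovery_seconds: float   # robustness dimension (after an injected fault)


def relative_reduction(baseline: float, candidate: float) -> float:
    """Return the percentage by which `candidate` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline * 100.0


def recovers_within(metrics: RunMetrics, limit_seconds: float = 30.0) -> bool:
    """Check a run against a recovery-time bound (30 s, as in the abstract)."""
    return metrics.recovery_seconds <= limit_seconds


# Placeholder values chosen for easy arithmetic, NOT results from the paper.
baseline = RunMetrics(peak_memory_mb=4096.0, comm_rounds=200, recovery_seconds=45.0)
candidate = RunMetrics(peak_memory_mb=3072.0, comm_rounds=100, recovery_seconds=20.0)

comm_reduction = relative_reduction(baseline.comm_rounds, candidate.comm_rounds)
print(f"communication reduction: {comm_reduction:.1f}%")
print(f"candidate recovers in time: {recovers_within(candidate)}")
```

Keeping the three measurements in one record per run makes it straightforward to report each dimension separately while still comparing algorithms on the same cluster configuration.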
License
Copyright (c) 2025 Journal of Computers, Mechanical and Management

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The Journal of Computers, Mechanical and Management applies the CC Attribution-NonCommercial 4.0 International License to its published articles. While retaining copyright ownership of the content, the journal permits downloading, reusing, reprinting, modifying, distributing, and copying of the articles, as long as the original authors and source are appropriately cited. Proper attribution is ensured by citing the original publication.