TY - GEN
T1 - Big Data-Driven Phishing Detection in Smart Devices Using Chi-Square and Optimized Gradient Boosting
AU - Gaurav, Akshat
AU - Gupta, Brij B.
AU - Hsu, Ching-Hsien
AU - Chui, Kwok Tai
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
PY - 2026
Y1 - 2026
N2 - Phishing attacks are a major cybersecurity threat, especially on smart devices, where attackers exploit vulnerabilities to steal sensitive information. As phishing techniques grow more complex, robust detection methods become critical. This paper presents a Big Data-based model for identifying phishing on smart devices. After data preprocessing with PySpark and Chi-Square feature selection, the proposed model optimizes a Gradient Boosting classifier using the Probabilistic Bees Algorithm (BeesA). Evaluated on a dataset of more than 11,000 webpages, the model achieves high accuracy, recall, and F1-score. A comparative study against conventional classifiers such as Random Forest, SVM, and Naive Bayes shows that the proposed model performs better. The results show that integrating Big Data methods with sophisticated optimization algorithms improves phishing detection in smart device scenarios.
AB - Phishing attacks are a major cybersecurity threat, especially on smart devices, where attackers exploit vulnerabilities to steal sensitive information. As phishing techniques grow more complex, robust detection methods become critical. This paper presents a Big Data-based model for identifying phishing on smart devices. After data preprocessing with PySpark and Chi-Square feature selection, the proposed model optimizes a Gradient Boosting classifier using the Probabilistic Bees Algorithm (BeesA). Evaluated on a dataset of more than 11,000 webpages, the model achieves high accuracy, recall, and F1-score. A comparative study against conventional classifiers such as Random Forest, SVM, and Naive Bayes shows that the proposed model performs better. The results show that integrating Big Data methods with sophisticated optimization algorithms improves phishing detection in smart device scenarios.
KW - Big Data
KW - Gradient Boosting
KW - Phishing Detection
KW - Probabilistic Bees Algorithm
KW - Smart Devices
UR - https://www.scopus.com/pages/publications/105011948568
U2 - 10.1007/978-981-96-6294-4_16
DO - 10.1007/978-981-96-6294-4_16
M3 - Conference contribution
AN - SCOPUS:105011948568
SN - 9789819662937
T3 - Communications in Computer and Information Science
SP - 204
EP - 217
BT - Ubi-Media Computing, Pervasive Systems, Algorithms and Networks - 13th International Conference, Ubi-Media 2025, and 17th International Symposium, I-SPAN 2025, Proceedings
A2 - Hui, Lin
A2 - Hsu, Ching-Hsien
A2 - Ruengittinun, Somchoke
T2 - 13th International Conference on Ubi-Media Computing, Ubi-Media 2025 and 17th International Symposium on Pervasive Systems, Algorithms, and Networks, I-SPAN 2025
Y2 - 19 January 2025 through 23 January 2025
ER -