Aliyu, Saeed Hubairik and Naseer, Naeem and Muhammad, Bilal (2025) Detecting Malicious URLs: A Machine Learning Approach using Feature Engineering and Ensemble Models. International Journal of Innovative Science and Research Technology, 10 (5): 25may1412. pp. 2947-2956. ISSN 2456-2165
IJISRT25MAY1412.pdf - Published Version
Abstract
This work considers the use of machine learning to classify URLs into four categories: benign, defacement, phishing, and malware. The dataset contains 651,191 URLs: 428,103 benign, 96,457 defacement, 94,111 phishing, and 32,520 malware. Three machine learning models were compared: a CatBoost classifier, a Snapshot Ensemble, and a Stacked Ensemble with Snapshots. The CatBoost classifier reached an accuracy of about 96%, with per-class precision ranging from 91% to 97% and recall from 82% to 99%, thus handling class imbalance rather well. The Snapshot Ensemble scored an accuracy of about 95.83%, performing well on the classification task while handling model complexity and generalization effectively. The Stacked Ensemble with Snapshots achieved a somewhat lower accuracy of 91.30%, with highly variable performance across the different classes. These results show the power of ensemble techniques in enhancing classification performance and addressing class imbalance. Future research should be directed toward refining feature engineering techniques and real-time detection capabilities, while maintaining high ethical standards in the use of public, readily available data, further contributing to the development of URL classification and thus to cybersecurity as a whole.
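To make the class imbalance mentioned in the abstract concrete, the following minimal sketch computes each class's share of the dataset from the reported counts, along with inverse-frequency ("balanced") class weights. The weighting formula is an assumption added for illustration; the abstract does not state whether the authors applied class weights or any other rebalancing step.

```python
# Class counts as reported in the abstract.
counts = {
    "benign": 428_103,
    "defacement": 96_457,
    "phishing": 94_111,
    "malware": 32_520,
}

total = sum(counts.values())  # matches the 651,191 URLs stated in the abstract

# Fraction of the dataset belonging to each class.
shares = {label: n / total for label, n in counts.items()}

# Inverse-frequency weights: total / (n_classes * class_count).
# Assumption: illustrative heuristic only, not a method confirmed by the paper.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(f"{label:<11} share={shares[label]:.3f}  weight={weights[label]:.2f}")
```

Under this heuristic the rarest class (malware) receives the largest weight and the most common class (benign) the smallest, which is one common way to counteract a skewed label distribution during training.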
| Item Type: | Article |
|---|---|
| Subjects: | R Medicine > R Medicine (General) |
| Divisions: | Faculty of Engineering, Science and Mathematics > School of Engineering Sciences |
| Depositing User: | Editor IJISRT Publication |
| Date Deposited: | 19 Jun 2025 10:44 |
| Last Modified: | 19 Jun 2025 10:44 |
| URI: | https://eprint.ijisrt.org/id/eprint/1269 |