Using a Machine Learning Model for Malicious URL Type Detection

Suet Ping Tung, Ka Yan Wong, Ievgeniia Kuzminykh, Taimur Bakhshi, Bogdan Ghita

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review


The World Wide Web, beyond its benefits, has also become a major platform for online criminal activity. Traditional protection methods against malicious URLs, such as blacklisting, remain a valid option but cannot detect previously unknown sites, so new methods for automatic detection based on machine learning are being developed. This paper extends the existing state of the art by proposing an alternative machine learning approach that uses a set of 14 lexical and host-based features while focusing on the typical mechanisms employed by malicious URLs. The proposed method uses random forest and decision tree classifiers as its core mechanisms and is evaluated on a combined dataset of benign and malicious URLs, on which it achieves an accuracy of over 97%.
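As an illustration of what lexical features of a URL can look like, here is a minimal Python sketch. The abstract does not enumerate the paper's 14 features, so the function name `lexical_features` and every feature in it are assumptions chosen for illustration, not the authors' feature set. Host-based features, which typically require network lookups, are left out.

```python
import re
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a URL string.

    These features are hypothetical examples; they are NOT the
    14 features used in the paper, which the abstract does not list.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Overall size of the URL and of its host part
        "url_length": len(url),
        "host_length": len(host),
        # Many dots in the host can indicate deeply nested subdomains
        "num_dots_in_host": host.count("."),
        # Character counts over the whole URL
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        # '@' can be used to hide the real destination host
        "has_at_symbol": "@" in url,
        # Host written as a dotted IPv4 address instead of a domain name
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        # Number of non-empty path segments
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for name, value in lexical_features("http://192.168.0.1/login/update.php").items():
        print(f"{name}: {value}")
```

Feature dictionaries of this kind could be converted to numeric vectors and passed to a random forest or decision tree classifier, the two model types named in the abstract.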
Original language: English
Title of host publication: Internet of Things, Smart Spaces, and Next Generation Networks and Systems
Subtitle of host publication: 21st International Conference, NEW2AN 2021, and 14th Conference, ruSMART 2021, St. Petersburg, Russia, August 26–27, 2021, Proceedings
Editors: Yevgeni Koucheryavy, Sergey Balandin, Sergey Andreev
Number of pages: 13
ISBN (Electronic): 978-3-030-97777-1
ISBN (Print): 978-3-030-97776-4
Publication status: Published - 16 Mar 2022
Externally published: Yes
Event: 21st International Conference on Next Generation Wired/Wireless Networks and Systems - Online, St. Petersburg, Russian Federation
Duration: 30 Aug 2021 → 31 Aug 2021

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 21st International Conference on Next Generation Wired/Wireless Networks and Systems
Abbreviated title: NEW2AN 2021
Country/Territory: Russian Federation
City: St. Petersburg


  • Malicious URL
  • Web security
  • Machine Learning
  • Phishing
  • Spamming
  • Malware
  • Lexical Feature
  • Traffic

