
MALGRA: Machine Learning and N-Gram Malware Feature Extraction and Detection System

  • Muhammad Ali
  • Stavros Shiaeles*
  • Gueltoum Bendiab
  • Bogdan Ghita
  • *Corresponding author for this work
  • University of Portsmouth

Research output: Contribution to journal › Article › peer-review


Abstract

Detection and mitigation of modern malware are critical for the normal operation of an organisation. Traditional defence mechanisms are becoming increasingly ineffective due to the techniques used by attackers, such as code obfuscation, metamorphism, and polymorphism, which strengthen the resilience of malware. In this context, the development of adaptive, more effective malware detection methods has been identified as an urgent requirement for protecting the IT infrastructure against such threats, and for ensuring security. In this paper, we investigate an alternative method for malware detection that is based on N-grams and machine learning. We use a dynamic analysis technique to extract an Indicator of Compromise (IOC) for malicious files, which are represented using N-grams. The paper also proposes TF-IDF as a novel alternative for identifying the most significant N-gram features for training a machine learning algorithm. Finally, the paper evaluates the proposed technique using various supervised machine-learning algorithms. The results show that Logistic Regression, with a score of 98.4%, provides the best classification accuracy when compared to the other classifiers used.
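
The N-gram and TF-IDF steps described in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the behaviour traces, the token names, and the helper functions `ngrams`, `tfidf`, and `top_features` are hypothetical, and the weighting used here (relative term frequency multiplied by log(N/df)) is only one common TF-IDF formulation, since the abstract does not specify which variant the paper uses.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=2):
    """Compute TF-IDF weights of n-grams for each document.

    docs: list of token lists, one per analysed sample.
    Returns a list of {ngram: weight} dicts, one per document.
    """
    counts = [Counter(ngrams(d, n)) for d in docs]
    # Document frequency: the number of samples in which each n-gram occurs.
    df = Counter(g for c in counts for g in c)
    num_docs = len(docs)
    weights = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against empty documents
        weights.append({
            g: (k / total) * math.log(num_docs / df[g])
            for g, k in c.items()
        })
    return weights


def top_features(weights, k):
    """Rank n-grams by their highest TF-IDF weight across all documents."""
    best = {}
    for w in weights:
        for g, v in w.items():
            best[g] = max(best.get(g, 0.0), v)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(best, key=lambda g: (-best[g], g))[:k]


# Hypothetical behaviour traces (assumed for illustration only).
samples = [
    ["CreateFile", "WriteFile", "RegSetValue", "CreateProcess"],
    ["CreateFile", "WriteFile", "CloseHandle"],
    ["CreateFile", "ReadFile", "CloseHandle"],
]

weights = tfidf(samples, n=2)
selected = top_features(weights, k=3)
unigram_weights = tfidf(samples, n=1)
```

In this formulation, an N-gram that occurs in every sample receives a weight of zero, so only N-grams that discriminate between samples survive the ranking. The selected N-grams could then serve as the feature columns for a supervised classifier such as Logistic Regression.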
Original language: English
Pages (from-to): 1777-1777
Number of pages: 0
Journal: Electronics
Volume: 9
Issue number: 11
Early online date: 26 Oct 2020
DOIs
Publication status: Published - 26 Oct 2020

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 9 - Industry, Innovation, and Infrastructure
