Research Output

Improving Classification of Metamorphic Malware by Augmenting Training Data with a Diverse Set of Evolved Mutant Samples

  Detecting metamorphic malware provides a challenge to machine-learning models as trained models might not generalise to future mutant variants of the malware. To address this, we explore whether machine-learning models can be improved by augmenting training data-sets with samples of potential variants. These variants are generated using an evolutionary algorithm that evolves a behaviourally diverse set of mutants, optimised to avoid detection by a large set of existing detection-engines. Using features calculated from the behavioural trace of a sample as input, we evaluate the ability of five machine-learning methods to detect the new variants, show that the detection rate is considerably improved by including the new samples as training data, and that the classifiers still generalise over a range of malware. We then repeat this experiment using a sequence-based deep-learning method as the classifier, which is shown to out-perform the feature-based classifiers.

  • Date:

    20 March 2020

  • Publication Status:

    Accepted

  • Funders:

    Edinburgh Napier Funded

Citation

Babaagba, K., Tan, Z., & Hart, E. (in press). Improving Classification of Metamorphic Malware by Augmenting Training Data with a Diverse Set of Evolved Mutant Samples

Authors

Keywords

Machine-learning; Evolutionary computing; Malware and Computer security

Monthly Views:

Available Documents