Description
The research project, Feature Extraction and Analysis of Binaries for Classification, examines the features shared by unlabeled binary samples in order to classify them as benign or malicious software using several different methods. Because manually analyzing or reverse engineering a binary to determine its function is time-consuming, the ability to gather features and then classify samples automatically, without explicitly programming the solution, is highly valuable. An online service can perform this classification, but that is not always viable, depending on the sensitivity of the binary. With Python 3 and the pefile library, we can gather the necessary features and then select among classifier models from the scikit-learn machine learning library. This work addresses the problem of local automated classification: we present several classifier models, datasets, and methods that classify unknown binaries as malware or benignware with a high degree of accuracy.
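As a rough illustration of the workflow described above, the following minimal sketch reads a handful of PE header fields with pefile and passes the resulting rows to a scikit-learn classifier. The description does not list the project's actual features or models, so the feature set, the choice of a random forest, and the helper names features_from_pe, extract_features, and train_classifier are all assumptions made for this example.

```python
# Illustrative sketch only: the feature list and the classifier choice are
# assumptions, not the project's actual configuration.

FEATURE_NAMES = [
    "number_of_sections",
    "size_of_code",
    "address_of_entry_point",
    "dll_characteristics",
    "mean_section_entropy",
    "max_section_entropy",
    "imported_dll_count",
    "imported_symbol_count",
]


def features_from_pe(pe):
    """Turn a parsed pefile.PE object into a flat numeric feature row."""
    # get_entropy() is provided by pefile for each section; fall back to 0.0
    # for files that have no sections at all.
    entropies = [section.get_entropy() for section in pe.sections] or [0.0]
    # pefile only sets DIRECTORY_ENTRY_IMPORT when the file has an import table.
    imports = getattr(pe, "DIRECTORY_ENTRY_IMPORT", [])
    return [
        pe.FILE_HEADER.NumberOfSections,
        pe.OPTIONAL_HEADER.SizeOfCode,
        pe.OPTIONAL_HEADER.AddressOfEntryPoint,
        pe.OPTIONAL_HEADER.DllCharacteristics,
        sum(entropies) / len(entropies),
        max(entropies),
        len(imports),
        sum(len(entry.imports) for entry in imports),
    ]


def extract_features(path):
    """Parse the PE file at `path` locally and return its feature row."""
    import pefile  # third-party: pip install pefile

    pe = pefile.PE(path)
    try:
        return features_from_pe(pe)
    finally:
        pe.close()


def train_classifier(feature_rows, labels):
    """Fit one possible scikit-learn model (a random forest, chosen here
    only as an example) on labeled feature rows."""
    from sklearn.ensemble import RandomForestClassifier  # pip install scikit-learn

    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(feature_rows, labels)
    return model
```

With labeled samples available, extract_features would be called once per file to build the training rows, and the fitted model's predict method would then label the feature row of an unknown binary. Everything runs locally, which matches the motivation of keeping sensitive binaries off online services.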
Publication Date
Spring 4-9-2020
Disciplines
Computer Sciences | Databases and Information Systems | Data Science | Theory and Algorithms
Recommended Citation
Flack, Micah, "Feature Extraction and Analysis of Binaries for Classification" (2020). Annual Research Symposium. 28.
https://scholar.dsu.edu/research-symposium/28
Included in
Databases and Information Systems Commons, Data Science Commons, Theory and Algorithms Commons