Meta-Architecture Binary Code Analysis: Multi-ISA Deep Learning Analysis Leveraging Single-ISA Data

Researchers from George Mason University and Virginia Tech

Prompted by the increased use of deep learning, researchers propose using meta-architecture binary code analysis, in which a model trained on abundant data for a high-resource Instruction Set Architecture (ISA) can make predictions for other ISAs without modifications.

Funded by the CCI Hub

Project Investigators

Principal Investigator (PI): Lannan Lisa Luo, George Mason University Department of Computer Science
Co-PI: Qiang Zeng, George Mason University Department of Computer Science
Co-PI: Peng Gao, Virginia Tech Department of Computer Science

Rationale

Deep learning in binary code analysis has gained attention due to its high performance.

Training a deep learning model requires a substantial amount of data, presenting a challenge for the wide range of ISAs that face data scarcity issues.

Additionally, training a separate deep learning model for each individual ISA demands significant time and effort (e.g., for data collection, labeling and cleaning, and parameter tuning).

Projected Outcomes

This research could create several commercial and economic development opportunities. For example:

Commercial tools and services: The proposed tools and models can be readily adopted by enterprises to fortify cybersecurity defenses.
Boosting the workforce: Solutions based on binary code analysis are becoming more common, creating jobs in the cybersecurity workforce and expanding the industry.