Mon Mar 16 2020

Benchmarking of Algorithms with One Clear Winner?


One of the biggest opportunities in cybersecurity right now is a comprehensive mapping of vulnerabilities (CVEs) to the MITRE ATT&CK® framework. Below, we explain what MITRE ATT&CK and CVEs are, describe the size of the opportunity, and benchmark the current competitors that map CVEs to MITRE ATT&CK. We use a simple house-robbery analogy to make the article more accessible.

Analogy: It turns out that most house robberies start with entry through a side window rather than the front door. Knowing that, and knowing that I've left 3 windows open in my house, helps me prioritize my defense strategy. First off, I should close the windows.


What is MITRE ATT&CK®?

“MITRE ATT&CK® is a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community.”

Analogy: In other words, MITRE ATT&CK would be the playbook of steps that robbers would take in order to rob a house. (Identify unoccupied houses. Determine the access route to a window. Find the valuables etc.)

What are CVEs (Common Vulnerabilities and Exposures)?

“A CVE is a dictionary that provides definitions for publicly disclosed cybersecurity vulnerabilities and exposures. CVE Entries are comprised of an identification number, a description, and at least one public reference.”

Analogy: In our house robbery analogy, vulnerabilities are the weaknesses in your house security: an unlocked door, broken cameras, or your birthday as the safe combination.

Why would you map CVEs to MITRE ATT&CK?

When you have multiple vulnerabilities, how should you allocate your time and resources? By combining knowledge of your situation (CVEs) with an understanding of the most commonly followed criminal practices (MITRE ATT&CK), you can assign a risk factor to each vulnerability and act accordingly. Currently, however, only 4% of CVEs are manually mapped to MITRE ATT&CK, which leaves security professionals unable to prioritize CVEs based on their role in the attack chain. Because different APT groups (advanced persistent threats, i.e. organized hackers) take different paths through the ATT&CK framework, some controls can have a disproportionate impact given the vulnerabilities (CVEs) currently present in your system.

Analogy: Knowing that your house has a broken camera and open windows allows you to prioritize closing the windows, since most attackers would use them to break in. However, if the mapping between your situation (CVE) and the attacker playbook (MITRE) is off by even 5%, you might prioritize a safety action that is less likely to prevent a robbery.
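To make the prioritization idea concrete, here is a minimal Python sketch. It assumes each CVE has already been mapped to a single ATT&CK technique and that you have some estimate of how often attackers use each technique. The `MappedCve` class, the `prioritize` function, the CVE IDs, and the frequency values are all hypothetical and serve only as illustration. T1190 (Exploit Public-Facing Application) and T1068 (Exploitation for Privilege Escalation) are real ATT&CK technique IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedCve:
    cve_id: str
    technique_id: str  # ATT&CK technique this CVE is mapped to


def prioritize(cves, technique_frequency):
    """Order CVEs so those mapped to the most frequently used techniques come first.

    CVEs whose technique has no known frequency are treated as 0.0 and sink to the end.
    """
    return sorted(
        cves,
        key=lambda cve: technique_frequency.get(cve.technique_id, 0.0),
        reverse=True,
    )


# Hypothetical numbers: share of observed attacks that used each technique.
technique_frequency = {"T1190": 0.60, "T1068": 0.25}

# Placeholder CVE IDs, not real entries.
cves = [
    MappedCve("CVE-0000-0001", "T1068"),
    MappedCve("CVE-0000-0002", "T1190"),
    MappedCve("CVE-0000-0003", "UNMAPPED"),
]

ranked = prioritize(cves, technique_frequency)
print([cve.cve_id for cve in ranked])
# ['CVE-0000-0002', 'CVE-0000-0001', 'CVE-0000-0003']
```

In the analogy, the open window (the CVE mapped to the most common technique) ends up at the top of the list, and the weakness with no known mapping ends up at the bottom.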

How can you implement this mapping?

We compared 6 competing solutions on price, speed, and quality. They fall into three groups:

  • Most vendors have already started thinking about this and have manually mapped the most common CVEs seen in the wild to the corresponding MITRE ATT&CK techniques.

  • Some vendors have built complicated heuristics that map all currently existing CVEs. 

  • Some vendors have begun using sophisticated machine learning techniques, including deep learning and NLP.
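As a rough illustration of the heuristic approach, the sketch below maps a CVE description to ATT&CK techniques by keyword matching. This is not any vendor's actual method. The `RULES` table and the `map_description` function are hypothetical, and a production heuristic would need far more rules. The technique IDs themselves (T1190, T1068, T1203) are real ATT&CK identifiers.

```python
# Hypothetical keyword rules: technique ID -> phrases that suggest it.
RULES = {
    # Exploit Public-Facing Application
    "T1190": ("remote attacker", "web server", "sql injection"),
    # Exploitation for Privilege Escalation
    "T1068": ("privilege escalation", "gain root", "elevation of privilege"),
    # Exploitation for Client Execution
    "T1203": ("crafted file", "crafted document", "malicious pdf"),
}


def map_description(description):
    """Return the sorted list of technique IDs whose keywords appear in the text."""
    text = description.lower()
    return sorted(
        technique
        for technique, keywords in RULES.items()
        if any(keyword in text for keyword in keywords)
    )


example = "A local user can trigger a buffer overflow leading to privilege escalation."
print(map_description(example))
# ['T1068']
```

Rules like these are easy to inspect, which is where the interpretability of heuristics comes from. The trade-off is that a description using wording no rule anticipates gets no mapping at all.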

Which one is best?

Summary: There is no single best vendor. Manual mapping of attacks seen in the wild yields a small list of roughly 4,000 CVEs with high accuracy. Heuristics scale to all 100k CVEs and offer strong model interpretability. Machine learning techniques have the advantage of performing well on newly emerging CVEs that the heuristics have not seen before.


  • The most expensive solution is not always the best. One of the manual mappings charges $$$ for the full service even though it covers only 4k CVEs. The best machine learning solution maps 100k+ CVEs for the same price ($$$).

  • The variability among vendors is significant. The 2 manual mappings vary widely in quality depending on whether qualified security analysts or Amazon Mechanical Turk workers performed the work. The natural language processing technique is almost twice as accurate as the deep learning technique, which is held back by the limited data set size.

  • For some solutions, results change over time. While heuristics perform well on existing CVEs, newly emerging CVEs are better mapped by the natural language techniques, which could indicate that the heuristics overfit to existing CVEs.


The best choice 1) depends on your specific circumstances and 2) evolves over time as new CVEs are released. Our meta-APIs meet both of these needs. We find your best fit the first time and continue to reassess as new APIs are released (all without concerted effort on your part).

Next steps?

If you are a vendor that displays CVEs in its product, review these APIs to improve your product and mapping accuracy, and contact [email protected] to unlock more details (purchase the model code from one of the vendors, buy a data dump, or receive your API keys or login credentials for unlimited requests). If you have your own solution for mapping CVEs to MITRE ATT&CK, please contact us to be included in the continuous benchmarking, or share your own model API with [email protected]. You will receive 90% of the revenue we make from it, and you keep your intellectual property because we don’t require code submission.

Open Source 

To promote transparency in our work, we share a subset of our validation data on our GitHub. You can always contribute to it with a pull request. To avoid fraud by vendors, we use a carefully curated holdout data set for the actual benchmarking.
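For readers who want to score their own mapping against labelled data, here is a minimal sketch of one way to do it: micro-averaged precision and recall over (CVE, technique) pairs. The metric choice, the `score` function, and the sample data are assumptions made for illustration. They do not describe our actual benchmarking procedure or the format of the published validation data.

```python
def score(predicted, holdout):
    """Micro-averaged precision and recall of predicted (CVE, technique) pairs.

    Both arguments map a CVE ID to a set of ATT&CK technique IDs.
    Predictions for CVEs that are not in the holdout set are ignored.
    """
    predicted_pairs = {
        (cve, technique)
        for cve, techniques in predicted.items()
        if cve in holdout
        for technique in techniques
    }
    true_pairs = {
        (cve, technique)
        for cve, techniques in holdout.items()
        for technique in techniques
    }
    hits = len(predicted_pairs & true_pairs)
    precision = hits / len(predicted_pairs) if predicted_pairs else 0.0
    recall = hits / len(true_pairs) if true_pairs else 0.0
    return precision, recall


# Hypothetical labels and predictions with placeholder CVE IDs.
holdout = {"CVE-A": {"T1190"}, "CVE-B": {"T1068", "T1203"}}
vendor = {"CVE-A": {"T1190"}, "CVE-B": {"T1068", "T1059"}}

precision, recall = score(vendor, holdout)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example the vendor gets two of its three predicted pairs right and finds two of the three labelled pairs, so both precision and recall are 2/3.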

For more details, contact us at [email protected]