Causal factors analysis of vulnerability exploitation
Causal factor and root cause analyses are commonly used to understand why a problem happened in the first place. Identifying the underlying causes of an observed issue is one of the most effective ways to prevent its recurrence. In fact, the accumulated understanding of past security failures and threats should help us build better processes.
In practice, limited resources, cost constraints, and market pressure have hindered this effort. In addition, the fact that security researchers are continually struggling with the same classes of security issues points to a knowledge gap and a prioritization crisis. This article is a preliminary analysis of the factors likely to cause exploitation in the wild, intended in particular to better inform the actions needed at the design level.
Introduction
When it comes to cybersecurity, companies are failing to learn from previous mistakes, and more often than not, corrections are cosmetic, superficial, or merely workarounds. Since 2017, the number of reported vulnerabilities has hit an all-time high every single year [1]. Over that same period, between 1,100 and 1,900 (declared) data breaches hit the US every year [2].
These failures aren't solely the product of technical mistakes; they also stem from other misinformed decisions, including business ones. The fact of the matter is that the same mistakes keep happening: we fail to look deeply into the chains of cause and effect that created these situations and to implement what is necessary to break them.
Beyond the structural flaws and governance issues, there is a lack of understanding of, and disregard for (and sometimes outright resistance to), important aspects that would help companies better digest and operationalize the information available from vulnerability disclosures and the threat landscape.
What drives security incidents and causes data breaches hasn't fundamentally changed over the years (cf. the yearly DBIR reports, for instance [7]). Stepping back and looking at the aspects associated with a vulnerability in code, and likely to cause a breach, should lead to the right questions being asked from the outset, i.e., at the design level. We believe that the right framework for this is threat modeling, which helps organizations better operationalize available threat landscape information.
Data sources
This article analyzes the following data sources (covering the past decade):
- CVE list, from NVD (disclosed vulnerabilities) [3]
- Exploit DB (exploit availability) [4]
- Known Exploited Vulnerabilities catalog, from CISA (exploitation in the wild) [5]
- AlienVault's threat intelligence feed (exploitation in the wild) [6]
The combined dataset is analyzed to derive the top security issues and their prevalence, and to show the impact of exploitability, among other vulnerability features, on actual exploitation in the wild.
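As a rough illustration, these sources can be joined on CVE identifiers to form a single dataset. The sketch below shows one way to do this; the file names and column names are assumptions, not the actual pipeline used here.

    import pandas as pd

    # File and column names are illustrative assumptions.
    nvd = pd.read_csv('nvd_cves.csv')        # CVE list from NVD
    edb = pd.read_csv('exploitdb.csv')       # Exploit DB entries
    kev = pd.read_csv('cisa_kev.csv')        # CISA Known Exploited Vulnerabilities
    otx = pd.read_csv('alienvault_otx.csv')  # AlienVault threat intel indicators

    df = nvd.copy()
    # Treatment: is a public exploit known for this CVE?
    df['exploit_available'] = df['cve_id'].isin(edb['cve_id']).astype(int)
    # Outcome: was the CVE observed being exploited in the wild?
    df['exploited_in_wild'] = (df['cve_id'].isin(kev['cve_id'])
                               | df['cve_id'].isin(otx['cve_id'])).astype(int)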
Data analysis
This data analysis aims to infer potential causes of exploitation based on the impact of multiple factors, e.g., known exploitability, the type of weakness, or the severity of vulnerabilities. The purpose is twofold. First, to help with prioritization: which factors should primarily be taken into account when prioritizing vulnerability patching or control implementation? Second, to determine which actions and countermeasures are most needed at the design level to effectively reduce risk and improve product quality from the outset. To that end, causal inference [8] was used.
Causal inference is the process of determining whether an observed association truly reflects a cause-and-effect relationship. Using a causal approach in cybersecurity is becoming necessary to help the community better identify the appropriate actions to defend against threats and/or to prioritize. It can also help produce better analytics and machine learning models, e.g., by considering causality during feature selection.
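For reference, the two effect measures reported below can be written in standard potential-outcomes notation, where Y(1) and Y(0) denote the outcome with and without the treatment (here, exploitation in the wild with and without a known exploit), and X denotes the covariates:

    ATE = E[Y(1) − Y(0)]
    CATE(x) = E[Y(1) − Y(0) | X = x]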
CVSS severity [9], for instance, is frequently used in vulnerability prioritization. It is, however, often criticized for failing to represent real-world impact and for overusing the higher severity ranges (high and critical). In fact, close to 40% of the vulnerabilities disclosed over the past decade carry high or critical severity, largely because scoring assumes the worst-case exploitation scenario.
A helpful addition would be to determine which exposure feature has the greatest impact on real-world exploitation. Using causal inference, we estimate the impact of known exploitability on whether or not exploitation in the wild happens. We use CausalNLP [10], which supports text, in this case the vulnerability/weakness description, as a covariate. The underlying analysis uses meta-learners to estimate the causal effects of interest [11].
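That ~40% figure is straightforward to check against the combined dataset; a sketch, assuming a 'severity' column holding NVD's qualitative ratings:

    # Share of disclosed vulnerabilities rated high or critical (~40%)
    share = df['severity'].isin(['HIGH', 'CRITICAL']).mean()
    print(f"{share:.0%} of CVEs are high or critical severity")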
Causal impact of exploit availability on exploitation in the wild
The experiment setup is as follows (a code sketch follows the list):
- Outcome: exploitation in the wild
- Treatment: known exploit availability (assumed to affect the outcome)
- Covariates: description text, weakness (CWE), severity range (low, high, or critical), and attack complexity (low or otherwise)
- Meta-learner / base learner: X-learner with LightGBM
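Expressed in code, this setup might look like the following, based on CausalNLP's documented interface; the dataframe and column names (exploit_available, exploited_in_wild, description, etc.) are illustrative assumptions rather than the study's actual schema.

    from causalnlp import CausalInferenceModel
    from lightgbm import LGBMClassifier

    cm = CausalInferenceModel(
        df,                                 # combined vulnerability dataset
        metalearner_type='x-learner',       # meta-learner from the setup above
        learner=LGBMClassifier(),           # LightGBM base learner
        treatment_col='exploit_available',  # known exploit availability
        outcome_col='exploited_in_wild',    # exploitation in the wild
        text_col='description',             # description text as a covariate
        include_cols=['cwe', 'severity_range', 'attack_complexity_low'],
    )
    cm.fit()
    print(cm.estimate_ate())                # overall Average Treatment Effect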
Key findings
The overall ATE (Average Treatment Effect) shows that known exploitability increases the probability of exploitation by only 3.4 percentage points in this dataset (covering all vulnerabilities disclosed in the past decade). This jumps to 16 percentage points (as a Conditional Average Treatment Effect) when the weakness associated with the vulnerability is CWE-416 (Use After Free). Table 2 ranks CWEs by the importance of their causal relationship to exploitation.
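In CausalNLP, a conditional estimate of this kind can be obtained by passing a boolean mask over the fitted dataframe; the 'cwe' column name is an assumption about the dataset schema.

    # CATE for vulnerabilities whose weakness is CWE-416 (Use After Free)
    cate_uaf = cm.estimate_ate(cm.df['cwe'] == 'CWE-416')
    print(cate_uaf)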
It is also possible to measure the causal effect conditioned on description text. For instance, the CATE (Conditional Average Treatment Effect) for vulnerabilities with 'use after free' in the description is 12.9 percentage points, almost three times higher than for vulnerabilities with 'buffer overflow' in their description, which amounts to 4.7. This suggests a hierarchy among memory-safety issues, where use-after-free bugs carry more weight than buffer overflows in terms of causal effect on exploitation in the wild.
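Conditioning on description text can be sketched the same way, assuming the raw text column is retained in cm.df:

    # CATE for vulnerabilities mentioning 'use after free' vs. 'buffer overflow'
    print(cm.estimate_ate(cm.df['description'].str.contains('use after free')))
    print(cm.estimate_ate(cm.df['description'].str.contains('buffer overflow')))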
The fitted model can readily be used to predict the effect for new observations, using the inputs described above, and the predictions can drive prioritization based on the estimated increase in exploitation probability. An example is presented in Table 1, where two CVEs are identical in the CVSS metrics used here but can be differentiated based on their descriptions and CWEs. Based on the causal prediction, the second CVE would be prioritized over the first one for patching. More broadly, from a prevention perspective, design and development teams can use this type of prediction to prioritize areas of focus when implementing countermeasures and to create robust tests for a more secure final product.
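Scoring new observations might then look as follows, assuming a dataframe df_new that mirrors the training schema:

    # Predicted increase in exploitation probability for new CVEs;
    # CVEs with higher predicted effects would be patched first.
    effects = cm.predict(df_new)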