Causal factors analysis of vulnerability exploitation
Causal factor and root cause analyses are commonly used to understand why a problem happened in the first place. Identifying the underlying causes of an observed issue is one of the most effective ways to prevent its recurrence. In fact, the accumulated understanding of past security failures and threats should help us build better processes.
In practice, limited resources, cost constraints, and market pressure have hindered this effort. In addition, the fact that security researchers are continually struggling with the same classes of security issues points to a knowledge gap and a prioritization crisis. This article is a preliminary analysis of the factors likely to cause exploitation in the wild, intended in particular to better inform the actions needed at the design level.
Introduction
When it comes to cybersecurity, companies are failing to learn from previous mistakes, and more often than not, corrections are cosmetic, superficial, or merely workarounds. Since 2017, the number of reported vulnerabilities has hit an all-time high every single year [1]. Over that same period, between 1,100 and 1,900 (declared) data breaches hit the US every year [2].
These failures aren't solely the product of technical mistakes; they also stem from other misinformed decisions, including business ones. The fact of the matter is that the same mistakes keep happening: we fail to look deeply into the chains of cause and effect that created these situations and to implement what is necessary to break them.
Beyond the structural flaws and governance issues, there is a lack of understanding of, and disregard for (and sometimes outright resistance to), important aspects that would help companies better digest and operationalize the information available from vulnerability disclosures and the threat landscape.
What drives security incidents and causes data breaches hasn't fundamentally changed over the years (cf. the yearly DBIR reports, for instance [7]). Stepping back and looking at the aspects associated with a vulnerability in code, and likely to cause a breach, should lead to the right questions being asked from the outset, i.e., at the design level. We believe that the right framework for this is threat modeling, which helps organizations better operationalize available threat landscape information.
Data sources
This article analyzes the following data sources (covering the past decade):
- CVE list, from NVD (disclosed vulnerabilities) [3]
- Exploit DB (exploit availability) [4]
- Known Exploited Vulnerabilities catalog, from CISA (exploitation in the wild) [5]
- AlienVault's threat intelligence feed (exploitation in the wild) [6]
The combined dataset is analyzed to derive the top security issues and their prevalence, and to show the impact of exploitability, among other vulnerability features, on actual exploitation in the wild.
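As a rough illustration, these sources can be joined on CVE identifiers to form a single dataset. The sketch below shows one way to do this; the file names and column names are assumptions, not the actual pipeline used here.

    import pandas as pd

    # File and column names are illustrative assumptions.
    nvd = pd.read_csv('nvd_cves.csv')        # CVE list from NVD
    edb = pd.read_csv('exploitdb.csv')       # Exploit DB entries
    kev = pd.read_csv('cisa_kev.csv')        # CISA Known Exploited Vulnerabilities
    otx = pd.read_csv('alienvault_otx.csv')  # AlienVault threat intel indicators

    df = nvd.copy()
    # Treatment: is a public exploit known for this CVE?
    df['exploit_available'] = df['cve_id'].isin(edb['cve_id']).astype(int)
    # Outcome: was the CVE observed being exploited in the wild?
    df['exploited_in_wild'] = (df['cve_id'].isin(kev['cve_id'])
                               | df['cve_id'].isin(otx['cve_id'])).astype(int)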
Data analysis
This data analysis aims to infer potential causes of exploitation based on the impact of multiple factors, e.g., known exploitability, the type of weakness, or the severity of vulnerabilities. The purpose is twofold. First, to help with prioritization: which factors should primarily be taken into account when prioritizing vulnerability patching or control implementation? Second, to determine which actions and countermeasures are most needed at the design level to effectively reduce risk and improve product quality from the outset. To that end, causal inference [8] was used.
Causal inference is the process of determining whether an observed association truly reflects a cause-and-effect relationship. Using a causal approach in cybersecurity is becoming necessary to help the community better identify the appropriate actions to defend against threats and/or to prioritize. It can also help produce better analytics and machine learning models, e.g., by considering causality during feature selection.
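For reference, the two effect measures reported below can be written in standard potential-outcomes notation, where Y(1) and Y(0) denote the outcome with and without the treatment (here, exploitation in the wild with and without a known exploit), and X denotes the covariates:

    ATE = E[Y(1) − Y(0)]
    CATE(x) = E[Y(1) − Y(0) | X = x]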
CVSS severity [9], for instance, is frequently used in vulnerability prioritization. It is, however, often criticized for failing to represent real-world impact and for overusing the higher severity ranges (high and critical). In fact, close to 40% of the vulnerabilities disclosed over the past decade carry high or critical severity, largely because scoring assumes the worst-case exploitation scenario.
A helpful addition would be to determine which exposure feature has the greatest impact on real-world exploitation. Using causal inference, we estimate the impact of known exploitability on whether or not exploitation in the wild happens. We use CausalNLP [10], which supports text, in this case the vulnerability/weakness description, as a covariate. The underlying analysis uses meta-learners to estimate the causal effects of interest [11].
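That ~40% figure is straightforward to check against the combined dataset; a sketch, assuming a 'severity' column holding NVD's qualitative ratings:

    # Share of disclosed vulnerabilities rated high or critical (~40%)
    share = df['severity'].isin(['HIGH', 'CRITICAL']).mean()
    print(f"{share:.0%} of CVEs are high or critical severity")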
Causal impact of exploit availability on exploitation in the wild
The experiment setup is as follows (a code sketch follows the list):
- Outcome: exploitation in the wild
- Treatment: known exploit availability (assumed to affect the outcome)
- Covariates: description text, weakness (CWE), severity range (low, high, or critical), and attack complexity (low or otherwise)
- Meta-learner / base learner: X-learner with LightGBM
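Expressed in code, this setup might look like the following, based on CausalNLP's documented interface; the dataframe and column names (exploit_available, exploited_in_wild, description, etc.) are illustrative assumptions rather than the study's actual schema.

    from causalnlp import CausalInferenceModel
    from lightgbm import LGBMClassifier

    cm = CausalInferenceModel(
        df,                                 # combined vulnerability dataset
        metalearner_type='x-learner',       # meta-learner from the setup above
        learner=LGBMClassifier(),           # LightGBM base learner
        treatment_col='exploit_available',  # known exploit availability
        outcome_col='exploited_in_wild',    # exploitation in the wild
        text_col='description',             # description text as a covariate
        include_cols=['cwe', 'severity_range', 'attack_complexity_low'],
    )
    cm.fit()
    print(cm.estimate_ate())                # overall Average Treatment Effect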
Key findings
The overall ATE (Average Treatment Effect) shows that known exploitability increases the probability of exploitation by only 3.4 percentage points in this dataset (covering all vulnerabilities disclosed in the past decade). This jumps to 16 percentage points (as a Conditional Average Treatment Effect) when the weakness associated with the vulnerability is CWE-416 (Use After Free). Table 2 ranks CWEs by the importance of their causal relationship to exploitation.
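In CausalNLP, a conditional estimate of this kind can be obtained by passing a boolean mask over the fitted dataframe; the 'cwe' column name is an assumption about the dataset schema.

    # CATE for vulnerabilities whose weakness is CWE-416 (Use After Free)
    cate_uaf = cm.estimate_ate(cm.df['cwe'] == 'CWE-416')
    print(cate_uaf)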
It is also possible to measure the causal effect conditioned on description text. For instance, the CATE (Conditional Average Treatment Effect) for vulnerabilities with 'use after free' in the description is 12.9 percentage points, almost three times higher than for vulnerabilities with 'buffer overflow' in their description, which amounts to 4.7. This suggests a hierarchy among memory-safety issues, where use-after-free bugs carry more weight than buffer overflows in terms of causal effect on exploitation in the wild.
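Conditioning on description text can be sketched the same way, assuming the raw text column is retained in cm.df:

    # CATE for vulnerabilities mentioning 'use after free' vs. 'buffer overflow'
    print(cm.estimate_ate(cm.df['description'].str.contains('use after free')))
    print(cm.estimate_ate(cm.df['description'].str.contains('buffer overflow')))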
The fitted model can readily be used to predict the effect for new observations, using the inputs described above, and the predictions can drive prioritization based on the estimated increase in exploitation probability. An example is presented in Table 1, where two CVEs are identical in the CVSS metrics used here but can be differentiated based on their descriptions and CWEs. Based on the causal prediction, the second CVE would be prioritized over the first one for patching. More broadly, from a prevention perspective, design and development teams can use this type of prediction to prioritize areas of focus when implementing countermeasures and to create robust tests for a more secure final product.
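Scoring new observations might then look as follows, assuming a dataframe df_new that mirrors the training schema:

    # Predicted increase in exploitation probability for new CVEs;
    # CVEs with higher predicted effects would be patched first.
    effects = cm.predict(df_new)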