Our strong performance in the AV-Comparatives EPR Test further confirms the ROI of ESET XDR-enabling products and services. Can the test also contextualize ESET’s performance in round five of the MITRE Engenuity ATT&CK® Evaluations: Enterprise?
Introduction
In the recently published Endpoint Prevention and Response (EPR) Comparative Report 2023 by AV-Comparatives, ESET PROTECT Enterprise version 10.1 scored highly in the EPR CyberRisk Quadrant™. Deemed a Strategic Leader along with only three other vendors, ESET sits at the top of the prevention and response capability ranking with the highest detection rate.
For some ESET partners, and even our sales leads, this result in the AV-Comparatives EPR Test stands in apparent contrast to the substep visibility coverage we provided in the 2023 round of the MITRE Engenuity ATT&CK® Evaluations: Enterprise. Some might argue that the statistical results for the detection and protection scenarios of the evaluation look quite different. Do testers, vendors, and end users all live in separate universes where math isn’t a universal language?
In our view, the APT-minded testers at MITRE Engenuity have not created a performance ranking (nor a competitive test) in their examination of protection and detection across the cyberattack chain, at least not one that can be easily compared to other tests. Readers moving from the ATT&CK Evaluations to the AV-Comparatives EPR Test will need to shift their attention to how effectively each test simulates real-world conditions and, importantly, to the different audiences the tests serve – in the case of the EPR Test, both end users and participating vendors.
Comparing audiences
The two tests discussed here have unique audiences in mind. AV-Comparatives has built its EPR Test to evaluate vendors' products against expectations and, critically, to give buyers transparent access to standardized, agreed-on outcomes via a methodology that best mimics real-world performance. As part of the service to buyers, third-party testers typically provide their own analysis of the raw data, which is what AV-Comparatives offers in its EPR Report.
In contrast, MITRE Engenuity does not provide comparative analyses of vendor performance in the ATT&CK Evaluations. Indeed, MITRE Engenuity states:
These evaluations are not a competitive analysis. We show the detections we observed without providing a “winner.” Because there is no singular way for analyzing, ranking, or rating the solutions, we instead show how each vendor approaches threat defense within the context of ATT&CK.
This statement makes it abundantly clear that the ATT&CK Evaluations are not a ranking but a resource, first, for vendors, and second, for the security staff at organizations, who can engage thoroughly with the raw data provided.
We can put this another way: third-party tests are both a product and a service that provides its users – both vendors and end-user businesses – with input to make educated decisions concerning product R&D or business cost and real-world performance, respectively.
Participating in the ATT&CK Evaluations and the EPR Test
Another key point is who participates in these tests. The 2023 round of the MITRE Engenuity ATT&CK® Evaluations: Enterprise tested 29 vendors1 against two attack scenarios built from techniques used by the Turla threat group. Ultimately, this test helps vendors validate their approach to optimized detection and protection, including the reasoning behind detecting, or not detecting, specific substeps in the evaluation.
Conversely, the 2023 AV-Comparatives EPR Test evaluated 12 vendors with a highly comprehensive approach that deployed 50 targeted attack scenarios using a diversity of techniques mapped to the ATT&CK knowledge base. These scenarios were not communicated ahead of the test, a fact that further contrasts the AV-Comparatives approach with that of MITRE Engenuity. The resultant EPR CyberRisk Quadrant factors in product efficacy in breach prevention, the calculated savings, and the costs of the product: its purchase price, operational inaccuracy, and workflow delays.
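To illustrate the trade-off the quadrant captures, here is a minimal sketch in Python. The formula, weights, and all cost figures below are our own assumptions for illustration only; they are not AV-Comparatives' actual methodology.

```python
# Illustrative sketch only: a hypothetical way to weigh prevention
# efficacy against cost, in the spirit of the EPR CyberRisk Quadrant.
# Every figure and the formula itself are assumptions, not the
# AV-Comparatives methodology.

def cyber_risk_score(prevention_rate: float,
                     purchase_cost: float,
                     accuracy_cost: float,
                     workflow_cost: float,
                     breach_cost: float = 1_000_000.0) -> float:
    """Expected total cost: residual breach risk plus product costs.
    Lower is better."""
    residual_risk = (1.0 - prevention_rate) * breach_cost
    return residual_risk + purchase_cost + accuracy_cost + workflow_cost

# A pricier product preventing 99% of breaches can still undercut a
# cheaper one preventing only 90%, once residual risk is counted.
strong = cyber_risk_score(0.99, purchase_cost=60_000,
                          accuracy_cost=5_000, workflow_cost=5_000)
weak = cyber_risk_score(0.90, purchase_cost=30_000,
                        accuracy_cost=20_000, workflow_cost=10_000)
assert strong < weak
```

The point of the sketch is simply that prevention efficacy and cost must be read together, which is exactly what the quadrant lets a buyer do at a glance.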
Despite the high value of AV-Comparatives’ approach, historically, several participating vendors have opted not to reveal their names. Furthermore, a slew of other vendors who participated in the ATT&CK Evaluations have decided not to participate in the EPR Test. Potential end users of XDR products can rightly question the costs of these untested products and whether those costs potentially degrade the apparent detection capability shown in the ATT&CK Evaluations.
In developing our own capabilities and reviewing our performance in past ATT&CK Evaluations, we opine that the usual "price" for a claim of 100% detection or protection in an evaluation is likely paid in false positives. However, our product, ESET PROTECT Enterprise, completed the AV-Comparatives test without producing any false positives – a bit of an ESET obsession, actually. The fact is, false positives drive real-world costs – costs that could even exceed those of a real compromise – because IT staff may have to spend many hours handling them.
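A back-of-the-envelope sketch shows how quickly false-positive triage costs accumulate. The triage time and hourly rate below are hypothetical assumptions, not measured figures.

```python
# Hypothetical estimate of false-positive triage cost.
# The 30-minute triage time and $80/hour analyst rate are
# illustrative assumptions, not data from either test.

def fp_cost(false_positives: int,
            minutes_per_triage: float = 30.0,
            hourly_rate: float = 80.0) -> float:
    """Estimated analyst cost of triaging a batch of false positives."""
    return false_positives * (minutes_per_triage / 60.0) * hourly_rate

# 500 false alerts at 30 minutes each, at $80/hour:
monthly = fp_cost(500)
assert monthly == 20_000.0
```

Under these assumptions, even a modest stream of false alerts consumes tens of thousands of dollars of analyst time, which is why a zero-false-positive result matters in cost terms, not just as a point of pride.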
Our participation in the AV-Comparatives EPR Test is important to help ESET communicate our value to IT security decision-makers. These decision-makers love quadrants such as the EPR CyberRisk Quadrant because they simplify understanding the capability of different products. We can imagine a CISO thinking: “Give me the product with the best detection and protection capability. Oh wait, the CFO reminded me to factor in cost!”
Figure 1. The EPR CyberRisk Quadrant provides quick and easy access to prevention and response vs. cost metrics
AV-Comparatives’ EPR CyberRisk Quadrant is a great resource for organizations to start evaluating and shortlisting XDR solutions on prevention and TCO metrics, before necessarily diving into the depths of technology and implementation.
Comparing methodologies
While AV-Comparatives uses a real-world methodology and provides analysis, MITRE Engenuity provides the emulation plan and results as raw data. If, as an organization, you have the resources and skilled IT staff, you could even rerun an ATT&CK Evaluation in your network, or substeps of it, to obtain highly relevant, real-world feedback on the actual effectiveness and cost of various solutions.
However, without such a personalized rerun, our assessment is that the ATT&CK Evaluations are unsuitable as the primary basis for a purchase decision. Instead, the goal of the evaluations is to provide trained eyes a resource to understand the specific levels of substep visibility that a product offers. Whether the provided level of visibility is suitable for an end user is hotly debated, but our main response is that you don’t need 100% visibility, nor do you need technique detections for every substep. What is necessary, however, is seeing enough relevant substeps – as not every substep is equally important to determine whether an attack is happening – then mitigate and/or stop it.
The DIY analysis required to really extract value from the ATT&CK Evaluations means you must be prepared to dive independently into the emulation plan and carefully consider each substep and what it means for your organization. Critically, the more you understand adversarial techniques, the challenges of reconstructing attack chains, and the commonality of events in your environment, the better your analysis and takeaways will be. Thus, the expertise required to interpret and assess the evaluations immediately puts decision-makers, who typically prefer digestible executive summaries, at arm's length.
The measures taken by AV-Comparatives to approximate a real-world environment as closely as possible, complete with commercially available and open-source attack tools and with tactics, techniques, and procedures (TTPs) assembled from MITRE's ATT&CK knowledge base, underline that this testing serves businesses and institutions that rely on endpoint protection, detection, and response capabilities against real-world attacks.
Figure 2. MITRE ATT&CK TTPs employed by AV-Comparatives 2023 EPR Test (Source)
The thoughtful approach taken by AV-Comparatives to quantify product performance beyond protection and detection across the attack chain pays big dividends in avoiding the problems caused by false positives.
A standout example taken from the results of the Turla ATT&CK Evaluation is the number of alerts allowed to be generated without penalty. One vendor had over one million alerts per attack chain. Another vendor’s dashboard showed almost 6.7 million suspicious events. In contrast, ESET Inspect displayed around 6,000 detections (including both endpoint and rule-based detections) on Day One of the evaluation and around 2,000 on Day Two. Keep in mind that the test environments had four or five machines, and MITRE Engenuity did not test products “with a battery of clean scenarios” as AV-Comparatives did!
Betting on both horses
As a vendor, we have clear motivations to maintain our participation in both the MITRE Engenuity ATT&CK Evaluations and the AV-Comparatives EPR Test, both of which are executed very professionally. Both the EPR Test, with its dual technical and business audience, and the ATT&CK Evaluations, with their (ideally) technical audience, promote advanced security practice.
Betting on multiple horses is nothing new. The high level of engagement by ESET malware researchers and security analysts with the ATT&CK knowledge base has helped drive product R&D around improvements to our EPP, EDR, and threat intelligence products for many years. ESET began contributing to MITRE ATT&CK very early on and now stands among the top 10 out of more than 350 contributors to the ATT&CK knowledge base. Thus, participating in the ATT&CK Evaluations continues the critical dialogue between several teams to balance visibility into and usability of ATT&CK techniques and procedures.
With the internal dialogue established and sustained via our engagement with MITRE Engenuity, our parallel participation in AV-Comparatives' EPR Test provides the necessary balance of user-centric, real-world needs. The result is a composite use of both tests: each has significant merit, and they are (different) horses for courses.
Conclusion
With an aim to not only lead in prevention and response, but also to deliver a competitive total cost of ownership score, ESET sees decision-makers as the key readers of the AV-Comparatives EPR Report.
The stimulating dialogue instigated by the MITRE Engenuity ATT&CK Evaluations is a whole other animal. For enterprises, institutions, and other select businesses with SOC teams or skilled security staff in-house, we encourage you to continue leveraging the ATT&CK knowledge base and looking more deeply at the ATT&CK Evaluations. However, we see the true value there as a trigger for innovation, experimentation, and constant improvement.
1 The MITRE Engenuity ATT&CK Evaluations: Enterprise began with 30 vendors; MITRE Engenuity reports that 29 completed the evaluation.