THE LIMITS OF EDR THREAT HUNTING
I’ve been busy since my last post. I left my position as a threat hunter, and I’ve started working for a new employer as a security engineer. In light of this change I thought it’d be prudent to share the lessons I’ve learned while working in the Endpoint Detection and Response (EDR) space and from the EDR-based threat hunting I’ve been doing for the last few years.
If you’re an offensive security professional, then this post can help you adapt to the changing defensive landscape by teaching you something new about EDR vendor dynamics, threat hunting program design, and the challenges threat hunters face. Although I won’t be posting any EDR evasions or bypasses, I will be describing the very real economic, technical, and human limitations of mainstream hunting programs. While those don’t automatically tell you how to implement a specific offensive technique, you’ll likely get more long-term mileage from seeing the bigger picture more clearly.
If you’re on ‘team blue’ and you’re unfamiliar with EDR tech (or you’re an IT worker/decision maker) then you’ll likely learn enough from this post to better plan an EDR deployment, to think about improvements you can make to your existing threat hunting program, to understand what EDR can and can’t do for your broader security program, and to ask EDR vendors more critical questions before you spend your limited time and money.
My goal with this post is to share any wisdom gained from my first-hand experience that could benefit other infosec professionals. Furthermore, I write this post with the utmost respect for my former employer, the EDR industry, and threat hunters working tirelessly to secure the world, and I hope my comments are received as well-intentioned, constructive, and sincere.
My experience in EDR-based threat hunting has consisted of providing hunting services to customers as an outside vendor, and so the following points refer specifically to the threat hunting discipline by means of commercial EDR products one would purchase. Not all of this post will apply to one’s use of in-house or open source tools.
The limits of EDR threat hunting can generally be described in terms of three areas: economic, technical, and human.
Before we can begin considering the technical limitations of EDR technologies we should examine the business environments and economic context in which these tools are designed, built, tested, and sold. Organizations that buy commercial EDR products may be at a security disadvantage due to market forces that act on for-profit, multi-customer EDR companies, because economic interests don’t always align to produce solutions that maximize customer security. Creating and maintaining useful detection-based security software requires lots of talented, hard-working people (with a deep understanding of both offensive operations and sound programming practices), and employing them effectively is both expensive and error-prone.
Money Before All Else
Companies have to make money. They have mouths to feed. Regardless of VC funding, present-day market share, or general popularity, in the long term, they have to satisfy customer demand adequately enough to continuously bring in revenue. Therefore, it’s generally considered wise for businesses to listen to customer feedback, adapt to changing conditions in the market, and ultimately produce and deliver products with features that align to the benefits customers are seeking. Security software vendors are no exception.
EDR customers act as the final arbiter when it comes to determining the market value of a given EDR product, but it turns out that even technically competent buyers don’t always know the most relevant threats in play or the detection engineering needed to truly mitigate them. Why? Because this is an area of constant research that evolves at breakneck speed and requires highly technical, competent, well-resourced people to devote their educations, careers, and lives to mastering an increasingly complex body of knowledge. In my experience, it’s the lack of this necessary talent in-house that leads organizations to answer their initial ‘build versus buy’ question by purchasing commercial EDR solutions in the first place.
With respect to those customers, vendors already have to fight to maintain a sense of identity within a volatile, competitive market, but they also face the following dilemma:
- Build features that uninformed customers demand, and risk making an ineffective product, or
- Ignore some customer demands, and risk the loss of revenue and an increase in customer churn
The problem is that many customers don’t know which features make for a quality EDR solution. It’s particularly bad when large customer organizations (representing a significant source of an EDR vendor’s revenue) influence the direction of the product’s development from a place of ignorance. Features that have a direct security benefit (new data collection capabilities, new behavioral detection logic, forensics and IR-supporting capabilities, etc.) are all-too-readily sidelined by the demands of less-than-savvy customers in favor of other considerations like UI/UX design, reporting capabilities, and pseudo-‘easy button’, check-the-box ideas. My point is not that any of these are invalid, but rather that vendors have to make difficult decisions quite often, and there is plenty of opportunity for poor decision quality.
Photo by peasap (modified) / CC BY 2.0
Vendors increase their market share, maximize profits, and survive by how well they deliver solutions to the problems customers merely believe they have, irrespective of whether those beliefs reflect reality. The financial success and prominence of certain EDR companies can lead bystanders to the mistaken conclusion that their products have proven effectiveness at stopping and mitigating cyberattacks. Meanwhile, attackers don’t care that vendor X just got VC funding or company Y just had an IPO. In the long term, insofar as hackers have fresh TTPs of which EDR customers are unaware or unconcerned, attackers will stay ahead of the defenders’ R&D cycles.
Busyness is not Effectiveness
If a company is to develop quality, detection-oriented security software in the long run, there needs to be a culture that takes the security of their customers seriously, values R&D as an irreplaceable endeavor (read: sees investing in R&D as synonymous with maintaining customer security), and generally supports and appreciates R&D personnel and their efforts. If the company’s leadership neglects or undervalues R&D (e.g. via inadequate staffing or insufficient budgeting) then the software becomes more or less irrelevant the very first time attackers find suitable TTPs that circumvent or evade the existing detections.
However, every company, including EDR vendors, has a finite number of employees. An organization can only employ a limited number of software developers, QA engineers, and security researchers, and it’s impossible for any single group to discover every new vulnerability, uncover every novel attack technique, track every threat group and malware campaign, and independently have all the answers to combat each issue. Therefore, the development of detection-oriented security tooling, like EDR, absolutely must involve integrating and implementing the findings of outside researchers who choose to share their work with the public.
The breadth of security research published online, released at infosec conferences, and shared within tight-knit local hacker communities and online forum/chat groups is staggering! This body of publicly-shared knowledge is constantly growing with every contribution, and the rate of growth, the volume of new findings, the cadence of their release, and the gravity of each discovery is unpredictable. I personally think it’s a good problem to have, but obviously anyone can quickly bite off more than they can chew, and that includes the R&D teams of EDR companies.
Meanwhile, personnel costs (i.e. salaries) often represent one of the largest expenses an organization can have, so obviously it’d be wasteful to pay employees who go underutilized. It’s entirely reasonable for any profit-seeking enterprise to pursue maximum output from each person on the payroll, and it’s thought that this can be accomplished by keeping everyone busy. But if an EDR company’s R&D staff is always tied up with its own internal projects and ideas, then it will never have the available resources to respond and adapt to new, outside research quickly enough. Similarly, if the R&D team is always working at full capacity, then it can’t help but neglect the requests, advice, and feedback from other employees in the company who rely on the EDR product for security analysis and incident response.
Traditional business orthodoxy (informed by completely sound economic considerations), if followed dogmatically, can work against the long-term interests of cybersecurity companies and their missions to provide sufficient detection technologies in a timely fashion. EDR vendors who are shortsightedly preoccupied with penny-pinching via mandated, constant busyness will continue to handicap defenders.
Infinite Treadmills Fail Open
For detection-based security tech, one thing is certain: if you’re treading water, then you’re drowning. Your product must adapt to new threats and evolve its capabilities to stay relevant. But merely acknowledging that your tooling needs to keep pace is not the same as having a specific, actionable strategy for making it happen. Effective R&D leadership requires striking a balance between competing priorities and sources of work. Leaders need to take all of the available information (market trends, customer needs, product feedback, threat actor activity, and new security research findings) into account when planning which projects to pursue and how many resources to invest in each.
Actually doing so, and doing it to a high degree of competence, demands a state of constant vigilance and a ceaselessly action-oriented attitude in the face of such a complicated informational landscape. Companies that buy EDR software are implicitly trusting their vendor to fulfill these responsibilities, to be sleepless watchmen seeing and knowing every twist and turn as it unfolds, but companies are just people, capable of error and limited in awareness. This methodology ultimately fails, and the product’s capacity to protect its users disappears, upon the first lapse in situational awareness.
“Infinite Loop” by Phil (pdsphil) / CC BY 2.0
Detection tech, such as EDR, is always on thin ice. Attackers don’t even necessarily need to pivot to new TTPs faster than the R&D team to be successful; they just need to identify the detections that R&D didn’t prioritize, forgot about, or otherwise never got around to implementing. Relying on constant, unerring, action-oriented vigilance is unreliable. The infinite treadmill of detection-based security has no off button.
As with the prior section, we can’t fully appreciate the cognitive and other human-related limitations of building a threat hunting program on EDR tooling until we grasp the technical challenges (and their present-day solutions) that hold threat hunters back and degrade their analytical and defensive capabilities. Computer technology isn’t magic. There will always be natural, physical limits on what computers can do, and the software we make to run on those computers is no exception. On top of that, engineering decisions (read: critical, resource-using judgment calls) always have to be made when designing any software, and for EDR solutions, those choices can have lasting, negative consequences.
Complete Visibility Isn’t
There’s no shortage of EDR companies offering ‘complete visibility’ into the happenings on endpoints, but what does it actually mean to achieve this, and what engineering work would have to occur to make it possible? To what aspects of an endpoint would one have total visibility? Certainly, one cannot record every CPU instruction, every bus signal, every hardware event, every disk read/write, and every datum (past and present) from all memory pages throughout the entire runtime of a modern computer. If anyone ever actually attempted this type of ‘complete visibility’, it would cripple a computer’s performance in every measurable way, and such an attempt probably couldn’t even get off the ground if implemented in software alone.
Generally speaking, today’s endpoint monitoring tech focuses on OS-level events, and I think this is a reasonable aim. But even in this saner context can ‘complete visibility’ truly be achieved? Do mainstream EDR tools trace every system call made by every process? Do they collect every file system event for all files? Can they observe both user-mode and kernel-mode activity? Is every network connection for every protocol type recorded? Every part of every packet? For closed-source operating systems, do modern, commercial EDR products collect details for every API call? Did the R&D team reverse engineer all of the undocumented API functions? Is every single registry read and write operation logged so that the entire registry could be forensically reproduced in its entirety?
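To make the visibility gap concrete, here’s a hypothetical back-of-the-envelope simulation (not drawn from any real product) of one common compromise: a collector that samples the process list periodically instead of tracing process creation events, and therefore misses most short-lived processes entirely:

```python
import random

# Simulated ground truth: 1,000 processes with (start, end) times in ms.
# Mean lifetime is ~50 ms, far shorter than the polling interval below.
random.seed(7)
processes = []
for _ in range(1_000):
    start = random.uniform(0, 10_000)
    processes.append((start, start + random.expovariate(1 / 50)))

POLL_INTERVAL_MS = 1_000  # collector snapshots the process list once per second

def seen_by_poller(start, end, interval=POLL_INTERVAL_MS):
    # A polling collector only observes a process that is alive at a tick.
    first_tick_after_start = (int(start // interval) + 1) * interval
    return first_tick_after_start <= end

observed = sum(seen_by_poller(s, e) for s, e in processes)
print(f"observed {observed}/{len(processes)} short-lived processes")
```

Real agents use event-driven kernel callbacks rather than polling for exactly this reason, but the same arithmetic applies to any data source that gets sampled rather than traced.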
Photo by Korall / CC BY 3.0
Leaving aside the moral hazard of, at best, marketing a product in an inadvertently misleading way or, at worst, promising customers something that’s impossible to deliver, it doesn’t improve anyone’s security to set them up to depend on something that isn’t there. Defenders need to be acutely aware of the visibility limitations inherent in the specific EDR platform they rely upon (and present-day EDR as a whole), because attackers know these blind spots, and they exploit them to avoid observation altogether.
Data Retention is Expensive
Expensive? Wait, how is this a technical limitation? Shouldn’t I have included this in the prior section? Well, yes and no. The underlying expense of data storage is a financial constraint, sure, but the engineering and design decisions EDR vendors make to overcome this challenge are time-expensive too. Certain designs waste analysts’ time and ability to focus due to poor storage performance. Whether you fail to retain enough data by choosing storage with insufficient capacity, or you fail to make the data available in a time-efficient manner, you damage the effectiveness of the entire EDR product and tie the hands of the analysts who rely on it.
We know that any given EDR tool can only record some reduced set of endpoint activity data, but even so, endpoints can produce an absolutely voluminous quantity of output that one needs to store. Going beyond visibility constraints, vendors proceed to eliminate entire classes of activity data to reduce the burden. Even when the product has the ability to ingest a certain type of data provided by endpoint OSes, EDR companies choose not to collect it for fear of storage expenses. Furthermore, most EDR products ship data off of endpoints over the network to some central server(s), so making these cuts also consumes less network bandwidth, further incentivizing this design.
EDR architecture isn’t freed from the current state of storage media technologies. Fast-access storage is expensive, while less-costly storage is quite slow. One approach taken is to keep recently-collected data (or data with a high access frequency) in memory and to transition older (or less-frequently accessed) elements to slower, disk-based storage. In my experience, this design is actually pretty useful for day-to-day SOC monitoring use cases. Sadly, it quickly falls apart for threat hunters who need to analyze vast quantities of historical data and can’t wait around for slow queries to complete.
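As an illustration of that trade-off, here’s a toy sketch of a two-tier store (not any vendor’s actual design; the window size and event schema are invented). Recent events answer queries cheaply; historical questions must either pay the cold-tier penalty or go unanswered:

```python
from collections import deque

class TieredEventStore:
    """Toy two-tier event store: recent events live in a fast 'hot' tier,
    older events are demoted to a simulated slow 'cold' (disk/archive) tier."""

    def __init__(self, hot_window_s=3600):
        self.hot_window_s = hot_window_s
        self.hot = deque()   # (timestamp, event) pairs, oldest first
        self.cold = []       # stand-in for slower disk-based storage

    def ingest(self, event, ts):
        self.hot.append((ts, event))
        # Demote anything older than the hot window, relative to newest event.
        while self.hot and self.hot[0][0] < ts - self.hot_window_s:
            self.cold.append(self.hot.popleft())

    def query(self, predicate, include_cold=False):
        # SOC-style queries over recent data are cheap; hunting queries over
        # history must pay the cold-tier penalty (or go without).
        tiers = list(self.hot) + (self.cold if include_cold else [])
        return [event for _, event in tiers if predicate(event)]

store = TieredEventStore(hot_window_s=100)
store.ingest({"proc": "old.exe"}, ts=0)
store.ingest({"proc": "new.exe"}, ts=1_000)  # demotes the older event
print(store.query(lambda e: True))                      # hot tier only
print(store.query(lambda e: True, include_cold=True))   # full history
```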
Photo by Jennifer Williams / CC BY-SA 2.0
Vendors (whether they possess or lack this type of caching system) still have to make judgment calls on how long to retain historical data. They have to choose either the exact retention duration for various types of endpoint activity data (e.g. process execution, network connections, registry activity), or at the very least, they build in an order of precedence where certain data categories get stored for longer even though they’ll eventually ‘age out’. Some data points might be retained indefinitely, but if the customer isn’t responsible for their own storage costs, EDR vendors will almost always implement some kind of ‘data expiration date’, albeit one that’s imprecise to calculate and sometimes undocumented.
Attackers know every organization has finite retention capabilities, and they adjust their tactics accordingly. For example, if a hacker can’t avoid creating a certain artifact on an endpoint (that gets collected by the EDR solution), he or she can temporarily postpone further actions on that endpoint (i.e. put command and control on ice) until the retention duration (whether known or inferred) has passed.
Event Correlation is Hard
Proper visibility, collection, and storage are merely the requisite foundation upon which security analysis, and ultimately, actual threat detection can occur. Useful event correlation, at the appropriate semantic level for productive investigation, is an absolutely crucial capability for a security program to make sense of the flood of incoming data. But associating the multitude of disparate endpoint activity data points requires even more engineering work by people with expertise in attacker TTPs and offensive operational methodology. If an EDR product can’t adequately group together distinct events and other artifacts, then defenders won’t be able to readily discern malicious activities from benign ones, let alone reconstruct an attacker’s actions.
EDR tools err on the side of not correlating events enough. Not unlike a poorly-tuned SIEM, data about endpoint execution is too frequently left fragmented. It takes extra work for an analyst to manually, mentally associate actions that are, in reality, related. For example, associating the atypical behaviors of an otherwise legitimate operating system process with the prior execution of a suspicious office document macro that arrived from an unknown sender is unnecessarily frustrating without some automation. Similarly, noticing that an anomalous outbound TCP connection from an EDR-monitored host corresponds to an equally unusual inbound TCP connection on another EDR-equipped host on the LAN, and piecing the story together as a case of malicious lateral movement, can be a laborious nightmare for an analyst to reconstruct manually in lieu of computer-based correlation.
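That lateral-movement case is essentially a join over connection events, the kind of correlation a backend could automate. A minimal sketch, with a hypothetical event schema and invented hosts (the field names and time window are assumptions, not any product’s format):

```python
from datetime import datetime, timedelta

# Hypothetical flattened connection events from two EDR-monitored hosts.
events = [
    {"host": "WS-01", "dir": "out", "laddr": "10.0.0.5", "raddr": "10.0.0.9",
     "port": 445, "ts": datetime(2019, 6, 1, 14, 3, 12)},
    {"host": "FS-02", "dir": "in", "laddr": "10.0.0.9", "raddr": "10.0.0.5",
     "port": 445, "ts": datetime(2019, 6, 1, 14, 3, 13)},
]

def correlate_lateral(events, window=timedelta(seconds=5)):
    """Pair each outbound connection with a matching inbound connection on
    another monitored host: mirrored endpoints, same port, close in time."""
    outbound = [e for e in events if e["dir"] == "out"]
    inbound = [e for e in events if e["dir"] == "in"]
    pairs = []
    for o in outbound:
        for i in inbound:
            mirrored = ((o["laddr"], o["raddr"], o["port"])
                        == (i["raddr"], i["laddr"], i["port"]))
            if mirrored and abs(o["ts"] - i["ts"]) <= window:
                pairs.append((o["host"], i["host"]))
    return pairs

print(correlate_lateral(events))  # → [('WS-01', 'FS-02')]
```

The point is not the (deliberately naive) nested loop, but that without this kind of automated join, an analyst has to perform it mentally across two separate result sets.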
Photo by Staff Sgt. Jeremiah Runser
However, over-grouping can be problematic too. It’s desirable for the central server component of an EDR platform to link together execution activities from the past with new events as they happen (especially across multiple endpoints), but if the engineers designing the grouping logic don’t comprehend analyst needs and priorities in addition to attacker TTPs, then the product may end up being too ‘trigger happy’ in associating events that, in reality, are not related. For example, some attack tools execute payloads on victim machines in the form of base64-encoded PowerShell commands. If an EDR developer implemented grouping logic that associates all instances, past and present, of base64-encoded PowerShell execution that occur on all endpoints, and the product thereby sends an analyst a single alert for this massive ‘event’, then it’s easy to imagine how difficult the analyst’s job becomes, and how pointlessly counterproductive grouping the events turned out to be. In this example, not only would separate and distinct attacker tools (and therefore attacks, and maybe even threat actors) be presented to the analyst as being ‘the same’, but completely benign programs that use this execution method would also get mixed in.
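The whole difference comes down to the grouping key. A sketch with invented alerts (the regex, hostnames, and field names are illustrative only, not a recommended detection):

```python
import re
from collections import defaultdict

# Naive illustrative pattern for encoded PowerShell; real detections are
# far more careful about obfuscated flag spellings.
ENCODED_PS = re.compile(r"powershell(\.exe)?\s.*-enc", re.IGNORECASE)

alerts = [
    {"host": "WS-01", "parent": "winword.exe", "cmd": "powershell -enc aQBlAHgA..."},
    {"host": "WS-07", "parent": "cmd.exe", "cmd": "powershell.exe -enc UwB0AGEA..."},
    {"host": "WS-01", "parent": "winword.exe", "cmd": "powershell -enc aQBlAHgA..."},
]

def group(alerts, key_fn):
    groups = defaultdict(list)
    for alert in alerts:
        if ENCODED_PS.search(alert["cmd"]):
            groups[key_fn(alert)].append(alert)
    return groups

# Over-grouping: one global bucket for the technique -> one giant 'event'.
naive = group(alerts, key_fn=lambda a: "encoded-powershell")
# Scoped: key on host and parent process, so unrelated activity stays apart.
scoped = group(alerts, key_fn=lambda a: (a["host"], a["parent"]))

print(len(naive), len(scoped))  # 1 group vs. 2 groups
```

With the global key, two unrelated intrusions (and any benign admin scripts) collapse into one alert; the scoped key keeps the Word-spawned activity on WS-01 distinct from whatever happened on WS-07.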
The goal of detecting malicious activity in computer systems rests on our capacity to correlate events and find out what they mean in the big picture. But this is a fast-changing field of constant discovery, so a build-and-forget expert system has a short shelf life. Staying relevant demands consistent integration of new findings that are based on what attackers are actually doing. Maybe someday machine learning can assist researchers and threat hunters in discovering otherwise hidden patterns of relevance here, but, in my view, the industry hasn’t truly achieved this quite yet.
Now that we’ve reviewed the business environment in which commercial EDR technology is produced, the challenges of endpoint detection engineering, and the pitfalls of current-day tooling, we can now better appreciate the struggles and weaknesses that human operators (SOC analysts, threat hunters, researchers, etc.) of EDR tools strain to overcome every day. Well-equipped defenders represent the most resilient form of protection from sophisticated attackers. As a threat hunter with years of experience, I have firsthand knowledge of how EDR vendors that fail to adequately address the obstacles holding back defenders ultimately prevent them from fulfilling their mission of consistently stopping cyberattacks.
Analysts Lack Context
EDR threat hunters are always working with incomplete information, and that’s okay. Modern EDR tools can’t feasibly collect all events, let alone from every endpoint, and most R&D teams are probably at least a few weeks or months behind on whipping up new detections to match updated attacker TTPs. But hunters can only create hunting queries to search for data that their platform has actually collected. Even if a hunter runs every query available across the entire fleet, they most likely lack visibility into some attacker activities. Moreover, it’s also likely that older data has ‘aged out’ of storage (or takes too long to restore or replay from slower, archive-speed storage), so any sort of temporal or statistical analysis will be spotty.
Experienced attackers attempt to camouflage their actions and tell-tale artifacts by impersonating user activity, application behaviors, organization-specific business processes, or the operating system itself. EDR solutions generally focus on detecting maliciousness at the operating system level, and so good threat hunters are familiar with OS internals. That’s a huge discipline on its own, but if an organization’s security personnel don’t also learn those other areas (user workflows, application behaviors, business processes), then attackers can exploit this knowledge gap by identifying where hunters cannot intelligently scrutinize their impersonations. An effective security program should include training for threat hunters on the organization’s diverse technology use cases, typical user workflows, and department-specific business processes that involve computers or sensitive information. Full situational awareness can’t occur if an organization merely writes and disseminates a corporate security policy.
Threat hunters have tons of data to analyze, and they have to make endless judgment calls with a downright risky lack of context. Is a strange, new behavior the result of an operating system/application/business process update? Can it be safely ignored/filtered/whitelisted, or does it warrant spending precious minutes to go down the rabbit hole? Some behaviors aren’t outright malicious when considered in isolation (e.g. remote desktop usage), but they could be revealed to be part of an attack (e.g. stolen credentials/user impersonation) if analysts know what tasks users are and aren’t supposed to perform on their computers, during which hours, and from which locations.
The undesirable analytical limitations that a lack of necessary context creates are amplified when threat hunters work in isolation or are otherwise kept from or uninvolved with other business departments and personnel who make decisions or act on their analysis. When evaluating a specific organization’s threat hunting program, it’s helpful to learn how exactly threat hunters interact with the incident response team. Can and does a conclusion from a threat hunter trigger the incident response playbook? If the analysis reached a conclusion that was invalidated by IR work, is there a process to circle back, share feedback, and make improvements? Threat hunters are confronted with commonplace, organization-level obstacles that hinder internal information sharing of this kind, but external personnel (i.e. threat hunters hired through a managed security service provider, such as the EDR vendor itself) face an even steeper uphill battle: they must learn all of this detailed, organization-specific information (use cases, policies, processes, etc.) and make contact with all of the necessary people for every customer they serve.
Fatigue is a Real Problem
If we’re talking about moderately-sized organizations with more than a handful of endpoints and users, then the amount of data a threat hunter has to manually review and understand, even with limited collection capabilities, is massive. Threat hunters must choose which datasets to prioritize since there is always more data than people to look at it. It’s wise to start with the query results that are most likely to reveal malicious activity and work one’s way toward event data with a decreasing likelihood of doing so. But this approach doesn’t free one from the need to still review all the data. Someone new to threat hunting may, completely understandably, get overwhelmed with the amount of data that demands manual, human analysis.
EDR tools don’t improve this situation much either. Most EDR platforms present event data in a sub-optimal way in some respect. Without getting into the weeds of UI/UX best practices, suffice it to say that outputting large volumes of query results in a listed, text-based format doesn’t make for an intuitive, workable analysis process. Threat hunters need to remove known-benign events from result sets with ease, perform statistical analysis, and identify anomalous activities and deviations from baseline trends. Depending on how deep the analysis process within a given hunting program is supposed to go, the hunting team may also need to dissect non-behavioral data (compiled binary executables, registry hives, etc.) and integrate and iterate on each new, relevant finding within the same workflow. Few turnkey endpoint-focused detection tools implement the principles of good data visualization that would alleviate pain points for threat hunters by speeding up analysis, and I haven’t met a threat hunter who’s said his or her data presentation setup and analysis workflow is pain free.
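One of the most common manual workflows here is frequency ‘stacking’: filter out known-benign results, count what remains, and review the rare tail first. A minimal sketch with fabricated command lines and an arbitrary rarity threshold:

```python
from collections import Counter

# Fabricated process-execution query results (command-line strings).
results = (
    ["C:\\Windows\\System32\\svchost.exe -k netsvcs"] * 480
    + ["C:\\Program Files\\App\\updater.exe /check"] * 35
    + ["C:\\Users\\Public\\x.exe --beacon 10.0.0.99"]  # the long tail
)

KNOWN_BENIGN = {"C:\\Windows\\System32\\svchost.exe -k netsvcs"}

def stack(results, known_benign=KNOWN_BENIGN, rare_threshold=5):
    """Frequency-stack command lines: drop known-benign entries, then
    surface the rare remainder for manual review first."""
    counts = Counter(r for r in results if r not in known_benign)
    return [cmd for cmd, n in counts.most_common() if n <= rare_threshold]

print(stack(results))  # the one-off binary floats to the top of the pile
```

Most hunters end up rebuilding some version of this in spreadsheets or scripts precisely because the tooling’s raw text output doesn’t support it natively.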
Threat hunters are supposed to be hunting threats. They typically shouldn’t be doing vulnerability management, risk assessments, security policy work, or penetration testing activities. In ill-defined security programs, though, these non-hunting responsibilities are foisted on already-too-busy hunting team personnel. A better program will clearly delineate responsibilities and, ideally, expect its threat hunters to exclusively hunt for threats. Otherwise, a ‘threat hunter’ working in a wear-every-hat, do-everything work environment might burn out and become incapable of the constant, unerring diligence required of most security work.
Collaboration is Errant
Fortunately, threat hunters don’t have to be constrained by the finite informational resources of their given employer. Utilizing shared threat information in the form of indicator-of-compromise (IOC) data such as known-malicious domain names, IP addresses, file hashes, and so on is a great way to cut through the noise of endless datasets to find active threats and respond quickly.
Integrating threat feeds of constantly-shifting IOC data is rife with engineering issues, though. Reputation data has a very brief period of relevance for threat hunting purposes. For example, if an attacker uses a shared cloud hosting provider for command-and-control servers, it’s likely that the IP addresses of those servers will end up in a consumable feed of threat info. But attackers burn their infrastructure quickly and often, and there’s a delay between when the servers in question get shut down and when their addresses are finally removed from the list. During that time, the data is erroneous, but some EDR products may blindly ingest the data and improperly alert analysts anyway.
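One mitigation is to treat indicators as perishable: attach a time-to-live so an address that the feed stops asserting eventually stops matching. A toy sketch (the TTL value, IP, and schema are assumptions, not any product’s behavior):

```python
from datetime import datetime, timedelta

class IOCFeed:
    """Toy IOC store that expires indicators after a TTL, so burned and
    recycled infrastructure eventually stops generating false positives."""

    def __init__(self, ttl=timedelta(days=7)):
        self.ttl = ttl
        self.indicators = {}  # ioc -> last time the upstream feed asserted it

    def ingest(self, ioc, seen_at):
        self.indicators[ioc] = seen_at

    def is_malicious(self, ioc, now):
        seen = self.indicators.get(ioc)
        return seen is not None and now - seen <= self.ttl

feed = IOCFeed()
feed.ingest("198.51.100.7", seen_at=datetime(2019, 6, 1))
print(feed.is_malicious("198.51.100.7", now=datetime(2019, 6, 3)))   # True
print(feed.is_malicious("198.51.100.7", now=datetime(2019, 6, 20)))  # False
```

Expiry trades one failure mode for another, of course: too short a TTL and you miss long-lived infrastructure; too long and the stale-data flood returns.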
The curators of IOC data who provide threat feeds are also capable of errors, such as including benign websites on lists of attacker domains. I’ve seen hiccups like this unfold far too frequently, and the outcome for threat hunters is wasted time and reduced ability to focus. Every error has the potential to bring smooth threat hunting operations to a halt with a flood of false positives. Since most commercial EDR tools incorporate IOC feeds like this, it stands to reason that security programs depending on commercial EDR tools are generally plagued by these deficiencies.
Photo by Airman 1st Class Jesenia Landaverde
The Detection Paradigm is Fragile
Security by means of detection (i.e. identifying all relevant threats consistently and on time) has a lot of overhead, and there’s plenty that can go wrong. Reliably stopping cyberattacks by discovering them every time is akin to expecting to achieve security by means of correctness of implementation. For software, correctness generally means the code is completely free of bugs, but for detection tools like EDR, correctness also means the product never experiences any collection gaps or blind spots. The foundation of detection-based security rests on the need for relative perfection, but any honest observer knows that all modern, non-trivial software is far from perfect. In my view, attempting to solve cybersecurity problems exclusively through attack detection requires far too many lines of code to ever achieve meaningful correctness.
Security architects are better served by examining the trust they place in various hardware and software components, reducing the degree to which each is trusted, implementing strong isolation between components of varying levels of trust, and deploying reduced-scope detection techniques to monitor component interaction and the isolation itself.
Do you agree or disagree with my conclusions or the limits of EDR threat hunting as listed? Feel free to let me know your thoughts and feedback. Thanks for reading!