CrowdStrike BSOD episode – How could primitives such as eBPF alleviate such concerns?
The CrowdStrike incident exposes the flaws of traditional security. Discover how AccuKnox’s eBPF-based architecture provides unmatched security and reliability, preventing catastrophic failures and downtime.
On July 19, 2024, CrowdStrike released a faulty update to its Falcon Sensor endpoint security agent. The update caused Blue Screen of Death (BSOD) crashes on Microsoft Windows machines and disrupted millions of Windows computers worldwide. Affected machines were forced into a boot loop that left them unusable. The incident highlights the risk of traditional security architectures whose agents require full access to the kernel: a single fault in such an agent can take down the whole system, as the widespread BSOD errors across organizations and critical infrastructure showed. Such incidents underscore the need for a modernized and resilient approach to endpoint security.
Impacts of the BSOD (Blue Screen of Death) Incident
- Airports, including LAX, Hong Kong Airport, and Dubai Airport, experienced major operational delays. Passengers were stranded while flights were grounded or delayed because reservations and check-ins could not be processed. Delays like this cost airlines thousands of dollars per minute, before even counting the cascading effects on connecting flights and passenger inconvenience.
- Financial institutions, including the Central Bank of Israel, saw transactions and services affected. Transaction failures and delays in the banking sector potentially affected millions of dollars in daily transactions.
- Supermarkets: the majority in Sydney operated on a cash-only basis because their credit card processing systems crashed.
- Municipalities and government services, including 911 operations, were disrupted, which threatened the very core of public safety and emergency response.
How did the incident happen?
The incident was traced to a faulty update to the CrowdStrike Falcon sensor. The update contained a critical flaw that, once deployed, caused systems to enter an endless boot loop. The flaw was particularly dangerous because the agent had full access to the Windows kernel, which amplified the damage.
Source: https://www.reddit.com/r/crowdstrike/comments/1e6vmkf/bsod_error_in_latest_crowdstrike_update/
3 Steps to Avoid Such Incidents
- Rigorous Testing: The incident highlights the crucial need for comprehensive and rigorous testing before releasing updates. Implementing extensive testing protocols, including regression testing, stress testing, and compatibility testing, could have identified the flaw before deployment.
- Layered Security Architecture: Utilizing a layered security architecture that limits the agent’s access to critical system components can mitigate the risk of such widespread failures. By restricting kernel access, any potential issues would be contained and less likely to cause catastrophic system failures.
- Real-Time Monitoring and Rollback Mechanisms: Implementing robust real-time monitoring and automated rollback mechanisms can help quickly identify and mitigate the impact of faulty updates. Immediate rollback of problematic updates would prevent widespread disruptions.
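The third step above can be sketched in a few lines. This is a minimal illustration of a staged rollout with a health gate and automatic rollback, not a description of how CrowdStrike or AccuKnox ship updates. The `Host` class, the `staged_rollout` function, the wave sizes, and the `is_healthy` callback are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    version: str = "1.0"    # currently installed version
    previous: str = "1.0"   # version to restore on rollback


def staged_rollout(hosts, new_version, is_healthy, stages=(0.1, 0.5, 1.0)):
    """Deploy new_version in growing waves (hypothetical wave sizes).

    After each wave, every updated host is health-checked. On the first
    failure all updated hosts are rolled back and the rollout stops.
    Returns True if the rollout completed, False if it was rolled back.
    """
    updated = []
    done = 0
    for fraction in stages:
        target = max(1, int(len(hosts) * fraction))
        for host in hosts[done:target]:
            host.previous, host.version = host.version, new_version
            updated.append(host)
        done = max(done, target)
        # Health gate: do not widen the rollout if any updated host is unhealthy.
        if not all(is_healthy(h) for h in updated):
            for h in updated:
                h.version = h.previous
            return False
    return True
```

With ten hosts and the default waves, a bad update reaches only the first host before the health gate fails and the change is reverted. The remaining nine hosts never receive it, which is the point of limiting the blast radius.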
In the worst case, a failing security agent should only cause the security application itself to stop working. In this case, however, the Falcon sensor used a kernel driver that had full access to the kernel. Using a kernel driver or kernel module is a huge risk for two reasons:
a) any fault in the driver/module can crash the whole system (which is what happened this week)
b) any compromise of the kernel driver/module could give an attacker full privileges.
Thus, in general, the use of kernel drivers/modules should be avoided; where it cannot be avoided, an organization needs to ensure that deployment best practices are followed.
Deployment is more difficult because the agents are installed on end-user devices. In contrast to cloud services, these changes cannot be reversed automatically. Furthermore, the blast radius is very large: it spans every device worldwide that runs the agent. At AccuKnox, because the agent runs in kernel mode, testing must be rigorous, and code reviews are thorough and critical. Enough guardrails are in place to prevent any accidental updates to the host operating system. This is highlighted in the image below.
Why Leverage eBPF?
To mitigate the risks associated with traditional agent-based security architectures, organizations can explore eBPF (extended Berkeley Packet Filter) technology. The whole premise of eBPF is that kernel modules should be avoided in production environments. eBPF is highly constrained and provides only a tiny fraction (less than 0.1%) of the capabilities of a kernel module, but in return it ensures that its in-kernel operations are safe: a kernel verifier vets every program before its instructions are allowed to execute in the kernel. This property lets users inject eBPF bytecode dynamically into production servers with far less risk of runtime impact. eBPF offers a more secure and less intrusive approach to endpoint security:
- Minimal Kernel Access: eBPF programs run in a sandbox within the kernel, reducing the risk of system crashes or instability caused by faulty code.
- Dynamic Tracing: eBPF enables dynamic tracing of system events and performance metrics, providing real-time visibility into system behavior without the need for heavyweight agents.
- Efficient Monitoring: eBPF-based monitoring solutions can efficiently collect and process data from various system events, reducing the performance impact on endpoints.
- Scalability: eBPF’s lightweight nature allows for easy deployment and scalability across large networks, making it suitable for enterprise-level security solutions.
By adopting eBPF-based security architectures, organizations can benefit from enhanced security, improved system stability, and reduced performance impact on endpoints, mitigating the risks associated with traditional agent-based approaches.
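To make the idea of load-time verification concrete, here is a toy sketch of a verifier for a made-up three-instruction language. It is not the Linux eBPF verifier, which is far more sophisticated, and the instruction set, `CTX_SIZE`, `MAX_INSNS`, `verify`, and `load_program` are all assumptions invented for this illustration. It only shows the principle described above: a program is checked for out-of-bounds memory access and for loops before it is ever allowed to run, and it is rejected rather than executed if a check fails.

```python
CTX_SIZE = 64       # hypothetical size of the context buffer, in bytes
MAX_INSNS = 4096    # hypothetical instruction limit for this toy


def verify(program):
    """Statically check a toy program; nothing is executed.

    A program is a list of tuples: ("load", offset), ("jump", target),
    or ("exit",). Returns (True, "ok") or (False, reason).
    """
    if not program or len(program) > MAX_INSNS:
        return False, "program is empty or too long"
    if program[-1][0] != "exit":
        return False, "last instruction must be exit"
    for pc, insn in enumerate(program):
        op = insn[0]
        if op == "load":
            offset = insn[1]
            # Every memory access must stay inside the context buffer.
            if not 0 <= offset < CTX_SIZE:
                return False, f"insn {pc}: out-of-bounds load at offset {offset}"
        elif op == "jump":
            target = insn[1]
            # Only forward jumps are allowed, so the program must terminate.
            if not pc < target < len(program):
                return False, f"insn {pc}: jump to {target} is not a forward jump"
        elif op != "exit":
            return False, f"insn {pc}: unknown opcode {op!r}"
    return True, "ok"


def load_program(program):
    """Accept a program only if it passes verification."""
    ok, reason = verify(program)
    if not ok:
        raise ValueError(f"rejected at load time: {reason}")
    return program
```

In this sketch, `[("load", 8), ("exit",)]` is accepted, while `[("load", 128), ("exit",)]` and `[("jump", 0), ("exit",)]` are rejected before execution. A kernel module has no equivalent gate: faulty code is only discovered when it runs.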
AccuKnox’s eBPF-Based Architecture
AccuKnox’s recent blog on the role of eBPF-based agents explains in depth how we operate in isolated environments, significantly reducing the risk of system-wide disruptions. Traditional agents require full kernel access and can therefore make critical errors; AccuKnox’s eBPF-based solution avoids this problem because its in-kernel code is confined to the verified eBPF sandbox instead of having full kernel access. This isolation ensures that such issues are contained and do not escalate to affect the entire system.
Solid Reliability
Our architecture is designed for maximum reliability. The isolation provided by eBPF means that, even in the unlikely event of a bug, the fault remains contained, localized, and therefore manageable. This is in sharp contrast to the wide impact seen in the CrowdStrike incident, where kernel-level access allowed a single fault to cause extensive damage.
Performance Efficiency
eBPF programs run natively inside kernel space, delivering very high performance without compromising system stability. In high-demand environments such as Kubernetes, this efficiency is of paramount importance for network packet processing and runtime security. Tools such as Cilium use eBPF to provide high-performance networking and to address the scaling problems inherent in Kubernetes environments.
AccuKnox’s Journey with eBPF
Our journey with eBPF began with a collaboration with SRI International on improving network and system security. Out of that collaboration came KubeArmor, an enforcement tool for runtime security policies in Kubernetes pods. KubeArmor also integrates with Linux security modules, such as AppArmor and SELinux, for comprehensive security.
The Open-Source Advantage
KubeArmor, a well-known open-source project created and actively maintained by AccuKnox developers, follows a model in which system improvements and policy recommendations come from the community. This kind of collaboration ensures that KubeArmor keeps evolving with changing security threats and incorporates best practices from practitioners worldwide.
AccuKnox CNAPP
Powered by KubeArmor Open-Source
- We created KubeArmor in 2021
- We scaled from 0 to 750K+ downloads
- We are major contributors and maintainers of KubeArmor
- 10+ industry adopters
- 1300+ GitHub stars
Addressing Challenges
While developing KubeArmor, we ran into platform-specific problems with SELinux, which led us to move towards BPF-LSM for its greater compatibility. Most major Linux distributions now include BPF-LSM, and it has been shown to stack with AppArmor and SELinux. This has improved workload hardening considerably and keeps our security solutions versatile and effective across different scenarios.
Inline Mitigation (vs) Post-Attack Mitigation
One of the standout features of KubeArmor, which powers AccuKnox CNAPP under the hood, is its inline mitigation capability. In contrast to the mitigation strategies of tools like Falco and Tetragon, which act after a possible attack has already taken place, KubeArmor prevents an attack in real time. This proactive approach reduces the potential damage and downtime from security breaches.
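The difference between the two modes can be shown with a toy simulation. This sketch does not reflect the internals of KubeArmor, Falco, or Tetragon. The policy set `BLOCKED_PATHS` and the functions `run_inline` and `run_detect_only` are hypothetical and exist only to contrast checking a policy before an action runs with alerting after it has already run.

```python
BLOCKED_PATHS = {"/usr/bin/apt", "/bin/nc"}   # hypothetical policy


def run_inline(requests):
    """Inline mode: the policy is checked before each action.

    Blocked actions never execute. Returns (executed, denied).
    """
    executed, denied = [], []
    for path in requests:
        if path in BLOCKED_PATHS:
            denied.append(path)      # stopped before it runs
        else:
            executed.append(path)
    return executed, denied


def run_detect_only(requests):
    """Post-attack mode: every action executes, violations are flagged later.

    Returns (executed, alerts).
    """
    executed = list(requests)        # nothing is stopped
    alerts = [path for path in executed if path in BLOCKED_PATHS]
    return executed, alerts
```

For the request list `["/bin/ls", "/bin/nc"]`, the inline mode executes only `/bin/ls` and denies `/bin/nc`, while the detect-only mode executes both and raises an alert for `/bin/nc` afterwards, when the damage may already be done.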
Key Takeaways
✔ AccuKnox’s adoption of eBPF is a future-proof approach to endpoint security.
✔ We are delivering one of the most solid, reliable, safe, and resilient solutions against challenges faced by traditional architectures. Through rigorous testing processes, continuous improvement, and open-source collaboration, we help our customers receive nothing less than the best of security solutions.
✔ The CrowdStrike incident reminds us how important innovative and resilient security architectures are in today’s world.
✔ Further development and fine-tuning of our technologies will ensure superior protection for our customers. AccuKnox has been setting new standards in cybersecurity.
✔ We at AccuKnox pride ourselves on leading with eBPF-based solutions that bring enhanced security, reliability, and performance to modern computing environments. Talk to us if you are planning to evaluate a comprehensive CNAPP security solution.