The Zeek Network Security Monitor provides a powerful open-source platform for network traffic analysis. However, from its network vantage point, Zeek lacks access to host-level semantics, such as the process and user accounts that are responsible for any connections observed. The new Zeek Agent fills this gap by interfacing Zeek directly with your endpoints, providing semantic context that’s highly valuable for making security decisions. The agent collects endpoint data through custom probes and, optionally, by interfacing to osquery and making most of its tables available to Zeek.
Traditionally, network security monitors only receive network traffic passively intercepted between hosts (endpoints). While that vantage point is very powerful—the network does not lie!—it does not provide the complete picture, and it can be challenging to understand the broader context of who is doing what on the endpoints. This approach makes analyzing encrypted traffic particularly challenging: Since passive monitoring cannot assess the actual content of the communication, defenders often remain in the dark about its legitimacy.
The Zeek Agent closes this gap by adding an endpoint-level vantage point to host activity. Just like Zeek itself, the policy-neutral agent does not perform any detection. Instead, it collects a stream of host-side events (“new process,” “socket opened,” “user logged in”) and feeds those events into Zeek’s standard processing pipeline, where they become available to Zeek scripts just like traditional network-derived events. By representing both network and host activity inside the same event abstraction, this setup lets users deploy all of Zeek’s powerful machinery to cross-correlate the two vantage points. For example, Zeek could now tag network connections with the endpoint-side services that initiated them (e.g., sshd). The agent can also let Zeek create new logs that record endpoint information, as seen in this example:
Zeek logs of network connections and listening ports, enriched with relevant process context from the endpoint (provided by the Zeek Agent)
A short history and background of the project
In 2018 at the University of Hamburg, Steffen Haas developed an initial prototype of the agent. Originally called zeek-osquery, this prototype was a powerful demonstration of the agent approach, but it had certain technical limitations that precluded production usage.
In 2019, Corelight hired Trail of Bits to update and improve Haas’ zeek-osquery software prototype. While the prototype was developed as a set of patches that implemented functionality directly within the osquery core, Trail of Bits suggested a different approach that would be more suitable for long-term development, and a better match for existing deployments. The new version began as a port of the existing code to an osquery extension which could be packaged and distributed independently. Eventually, this process evolved into a clean rewrite to produce an entirely new agent that can operate both in a standalone fashion and with osquery. The agent’s built-in data sources leverage the same underlying process-monitoring capabilities as osquery, but in a way more compatible with how Linux systems are configured for real-world use. Trail of Bits also designed the agent to easily support more custom data sources in the future.
How the Zeek Agent works
Like osquery, the agent provides system information in the form of a database, using the SQLite library. Table plugins publish the actual data, and are allocated and registered during startup. Once they export the necessary methods to report the schema and generate data, the internal framework will create the required SQLite table abstractions and attach them to the database.
Most data sources will inspect the system at query time and report their findings at that specific point in time. However, some tables may want to keep listening for system activity even when no query is being executed. This is commonly referred to as an evented table, which typically uses threads/callbacks to continuously record system events in the background. The process_events table works exactly like this, and allows Zeek scripts to look at past process executions.
Additional data sources can be imported from osquery, if it happens to be installed and running on the same system, thanks to the extensions socket. With this design, everything appears to come from a single unified database, allowing users to seamlessly join built-in and osquery tables together.
The tables can be accessed via scripts as soon as the agent connects to a Zeek server instance. Data can be requested either by running a single-shot SQL statement or as a scheduled query that runs automatically and reports data at specified intervals.
On the Zeek side, scripts fully control the data requested from the agent by defining corresponding SQL queries. Results stream into Zeek continuously, and transparently turn into standard Zeek events that handlers can hook into, just as if the events were derived from network traffic.
Right now, we welcome feedback on the Zeek Agent regarding usability, functionality, and deployment models. We’re curious to see what use-cases the Zeek community comes up with, and we encourage users to publish Zeek packages that leverage the agent’s capabilities. The best places to leave feedback are the agent’s GitHub issues, the #zeek-agent channel on the new Zeek Slack, and the Zeek mailing list.
The agent is an open-source project, so we also appreciate code contributions; just file GitHub pull requests. If you are interested in sponsoring specific work on the Zeek Agent, please contact Corelight.
We continue to extend the agent: We just completed an initial port to macOS, and we’re working on Windows support, as well. We will be extending the Zeek-side agent script framework, and we‘re also adding a multi-hop routing layer to Zeek’s underlying communication system, Broker, to facilitate deployment of the Zeek Agent across numerous endpoints.
About Trail of Bits Security Engineering Services
As your secure development partner, Trail of Bits has helped some of the world’s leading security software companies bring reliable products to market. Our engineering team’s goal is to write secure code and build tools that users can trust to protect their organizations and data. With our teams’ combined software development experience and research into system-level security, we regularly identify foundational gaps: missing capabilities, opportunities for improvement, and even potential vulnerabilities. We will review existing software architecture and provide recommendations or fixes, enhance feature sets or write new capabilities, and improve your security testing using the best available tools.
If your organization has security software plans but lacks the time or dedicated engineering resources to ensure the final product adheres to the best practices in secure coding, contact us. We would love to hear from you.
Corelight delivers powerful network traffic analysis (NTA) solutions that help organizations defend themselves more effectively by transforming network traffic into rich logs, extracted files, and security insights. Corelight Sensors are built on Zeek (formerly called “Bro”), the open-source network security monitoring framework that generates actionable, real-time data for thousands of security teams worldwide. Zeek has become the “gold standard’’ for incident response, threat hunting, and forensics in large enterprises and government agencies worldwide. Founded by the team behind Zeek, Corelight makes a family of virtual and physical network sensors that take the pain out of deploying open-source Zeek and expand its performance and capabilities. Corelight is based in San Francisco, California, and its global customers include Fortune 500 companies, large government agencies, and major research universities. For more information, visit https://www.corelight.com.