chatgpt – ODR

Today’s scenario is probably what the future looks like. A new Linux exploit, named “Dirty Frag”, was released yesterday after the embargo elapsed. No patch or broadly available detections yet exist; guidance from security vendors, at the time of this writing, is that they are “working on related detections.” We don’t have to wait for humans to write detection rules; we can find this using detection lattices without waiting for static rules to be written.

The Dirty Frag PoC lights up a detection lattice like a Christmas tree with at least three conventional behavioral detections, four machine learning detections, 21 unusual syscalls, and three signal fusions between these detection classes as shown below:

While none of these, by themselves, are sufficient to produce a critical alert, the combined set creates a strong alpha signal detection lattice. This lattice tells us that anomalous compiler activity, followed by anomalous process activity from the temp directory, was followed by a cluster of syscalls, and ultimately a root shell. Each of these alerts do sometimes come from benign outlier activity, but the confluence is too unusual to be coincidence.

There is one school of thought that this class of exploit is low or medium risk because it is not remotely exploitable without existing shell access. I think this kind of assumption works against us, and is one of the ways vuln management misses attack paths and things get popped. If an attacker is going to obtain a root shell through an exploit like this, they’re most likely going to get initial execution by looking for an RCE or RCI vuln in a web application (often a target rich environment) not a CVE in the web server process . Maybe something that has a CVE, maybe something bespoke in the application code that the owners have not noticed, or have given low priority because it yields low privilege execution. In order to see these kinds of attack paths, you need a fusion of CVE and appsec data. Explore your combined attack surface and ask two questions 1) What can I get right now? 2) What path can I take to get something valuable?

Also, I think the conventional “scan, attack, exploit, actions” model is not necessarily how operators work. It is not always feasible to achieve actions on objectives in linear time on a target of choice if the conditions are not quite right. So some crews are “collectors” in that they collect as much persistence as they can, in as many networks and cloud accounts as they can, so that when a privesc becomes available, they can jump on it and be the first. Sometimes we do see a conventional discovery, execution, initial access, persistence, privesc, lateral movement cycle in a short windows of time, but sometimes it comes from a crew or an operator who have been persisting in the environment for a while. So sometimes people are looking for discovery or initial access that happened a long time ago and think everything is fine because they are not detecting the start of a cycle.

Although I suppose model based vuln development changes all this in that the time frame will probably be compressed, and there are rapid cycles to find. The question I see is what happens if the cycles become faster than human response or even human cognition. We’re scaling up offensive (attack) art first, with AI, because the models are good at that, and nothing attracts eyeballs and sells things like dramatic attack case studies. We’re not really scaling up agentic defense yet.

This scenario – a new exploit without a patch or available detection rules – is probably going to more more and more common as AI-assisted vulnerability and exploit development continues to scale. As the volume and velocity increases, we will not always have time for humans to manually write detections. We will increasingly need robust detection lattices capable of identifying emerging threats without de novo signatures, exploit-specific rules, or even prior threat intelligence

So we are seeing more and more cases of malicious code injected into shared models, skills, tool, agents, and projects. I had a case where a shared model had a number of malicious code blocks that the users had not noticed due to the size of the codebase. At the time of this writing, we are seeing proliferation of malicious skills and prompts with a variety of payloads. When run in an IDE with AI tools, much of these produce execution and C2 activity that is detectable if something was watching closely enough. The challenge is that these events will be in a haystack of benign process and network events from tasks the AI tools were given permission for by the user. One of the purposes of the ODR (open DR) project is to hunt for this class of threat activity using anomaly detection techniques and a learning informed detection pipeline that is continuously updated.

One reason to run these hunts at the local level, in addition to conventional SOC and SIEM operations, is that the definition of what normal looks like, for a particular IDE, is largely in the head of the developer. Another reason is that much of the context is on the endpoint where the activity took place, and all of that state cannot generally be logged to a SIEM due to data volume and cost. At the same time, devs cannot monitor every action taken by their tools, and they are not trained in what to look for. This feels like a job for an autonomous agent pack.

Smith is an autonomous agent pack that will process most alerts and anomaly detections generated by ODR (open DR, also in this GitHub org.) ODR mainly looks for strange things happening in your AI dev tools at present but Smith will process most alerts generated by ODR. There is a show and tell video here: https://youtu.be/lsh3JRne9sg and the project lives in a repo here: https://github.com/opendr-io/agentic-park/tree/main/smith

Smith processes raw event and alert data from the ODR (open DR) project in order to investigate alerts and anomaly detections. It outputs an analysis of each alert. It lets you know when it thinks it has a high confidence detection and engages you in a collaborative analysis conversation in order to work out whether the detected activity is benign and expected or unexpected and potentially malicious.

Some sample alert data is provided so that it can do something out of the box. Sample event data is not provided due to its size, and the need to sanitize it, but can be generated by running openDR. It has a filter layer to try to stop prompt injections from reaching the agents and one filter test case alert is included in the sample data; you will see it get “intercepted” by the filter. That is an interesting area of research and we would like to hear from both offensive and defensive researchers as we add more filtering techniques and more detections.

Tag: chatgpt

Detection Lattices For Emerging Threats: “Dirty Frag”

Show and tell: agent smith