Education · 2026-06-26 · 9 min read

AI for Threat Intelligence: How LLMs and Agents Change CTI

AI is used in threat intelligence to automate the high-volume, low-judgement parts of the intelligence cycle, but the part that matters most is easy to miss. A serious deployment is not one model that summarises reports. It is a stack of specialised agents: some call external tools and enrich indicators, some extract entities from raw text, some apply analytical tradecraft like Analysis of Competing Hypotheses and the Admiralty scale, some hold and continuously update the entire body of knowledge on a threat, some judge whether an event is relevant to your specific organisation, and some produce the finished report, detection rule, or STIX bundle. Large language models and agents do this work in minutes rather than days. The analyst still owns the final judgement, the calls that carry consequences, and the accountability. The honest summary is that AI handles the mechanical work and runs the analytical scaffolding, so the human judgement that turns information into intelligence finally gets the time it deserves.

That is the short answer. The rest of this guide explains the categories of agent in an AI-run CTI operation, why tradecraft and persistent knowledge are the parts that turn automation into analysis, where AI fits in the lifecycle, and where the analyst stays firmly in control.

What AI in threat intelligence actually means

“AI threat intelligence” is a broad label, so it helps to separate the parts. Three things sit underneath it.

The first is machine learning for classification and prioritisation: scoring indicators, clustering related events, deduplicating reports, and flagging what looks relevant. This has been part of mature CTI tooling for years and is not new.

The second is generative AI, meaning large language models that can read unstructured text, summarise it, extract structured data from it, translate between formats, and draft prose. This is the part that changed the economics recently. An LLM can read a vendor blog, a forum post, and a CERT advisory, then produce a structured summary with the indicators pulled out and mapped to ATT&CK, in the time it takes to make coffee.

The third is agentic AI: LLMs given tools, memory, and a goal, able to run a multi-step task on their own. This is where the real shift happens, because an agent does not just summarise a report you paste in. It can pivot on an indicator across several services, apply a structured analytic technique, check a source's reliability before trusting it, and write the result up, calling each tool in turn without a human driving every click.

The mistake is to picture all of this as a single clever assistant. A serious deployment is not one agent but a stack of specialised ones, each trained for a single job and composed together on every event.

The kinds of agent in an AI-run CTI operation

This is the part most coverage of AI in threat intelligence skips, and it is the part that decides whether you get a summariser or an analyst. Liberty91 runs six kinds of agent, each doing one thing well.

Integration agents call the external tools. They enrich an IP or hash across VirusTotal, URLScan, Shodan, GreyNoise and AbuseIPDB, pull finished intelligence from sources like Google Threat Intelligence, CrowdStrike and Group-IB, and push results into MISP, your SIEM, or Slack.
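As a rough illustration of the fan-out an integration agent performs, the sketch below runs one indicator through several enrichment sources and keeps a failure in one source from blocking the rest. The `enrich` function and the stub lookups are hypothetical stand-ins. Real clients for VirusTotal, Shodan and the other services have their own APIs and authentication, which are deliberately not shown here.

```python
from typing import Callable, Dict

# An enricher takes an indicator and returns whatever that source knows about it.
Enricher = Callable[[str], dict]


def enrich(indicator: str, enrichers: Dict[str, Enricher]) -> dict:
    """Query every source for one indicator, isolating per-source failures."""
    results, errors = {}, {}
    for name, lookup in enrichers.items():
        try:
            results[name] = lookup(indicator)
        except Exception as exc:  # one failing source should not sink the rest
            errors[name] = f"{type(exc).__name__}: {exc}"
    return {"indicator": indicator, "results": results, "errors": errors}


# Hypothetical stubs standing in for real service clients.
def stub_reputation(indicator: str) -> dict:
    return {"malicious_votes": 7}


def stub_scanner(indicator: str) -> dict:
    raise TimeoutError("source did not respond")


report = enrich(
    "198.51.100.7",
    {"reputation": stub_reputation, "scanner": stub_scanner},
)
```

Recording errors next to results, rather than discarding them, lets a later stage see which sources were never consulted before it weighs the evidence.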

Entity-extraction agents each pull one kind of thing out of incoming reports: indicators of compromise, MITRE ATT&CK techniques, named threat actors and malware, the sectors and regions being targeted, and the assets and suppliers mentioned. One job each, done consistently.
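The simplest form of entity extraction is pattern matching, and a minimal sketch of it looks like the code below. It pulls IPv4 addresses, SHA-256 hashes and ATT&CK technique IDs out of raw text after undoing common defanging. The function names are illustrative, and the patterns are deliberately narrow. Named actors, malware families, sectors and suppliers need a model or a curated dictionary rather than a regular expression.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")
# ATT&CK technique IDs look like T1059 or, for sub-techniques, T1059.001.
ATTACK_TECHNIQUE = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def refang(text: str) -> str:
    """Undo two common defanging conventions: [.] for dots and hxxp for http."""
    return text.replace("[.]", ".").replace("hxxp", "http")


def extract_entities(text: str) -> dict:
    """Return de-duplicated, sorted entities found in a report."""
    clean = refang(text)
    return {
        "ipv4": sorted(set(IPV4.findall(clean))),
        "sha256": sorted({h.lower() for h in SHA256.findall(clean)}),
        "attack_techniques": sorted(set(ATTACK_TECHNIQUE.findall(clean))),
    }


sample = (
    "The loader beacons to 198[.]51[.]100[.]7 and runs PowerShell "
    "(T1059.001) after a phishing email (T1566)."
)
entities = extract_entities(sample)
```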

Tradecraft agents apply real analytical technique, which we cover in depth below. They run Analysis of Competing Hypotheses, rate a source and its data on the Admiralty scale before it is trusted, attach calibrated confidence and likelihood language, pivot across indicators, and run contrarian checks to fight bias. This is the layer that turns a summary into the basis for an assessment.

Knowledge agents are self-learning. Each builds and maintains the entire body of knowledge on one topic over time: Russian cybercrime, infostealers, a particular sector, a region, a named threat actor, or one of your intelligence requirements. Keeping that knowledge current and accessible is genuinely hard for a human, and it is exactly what an agent is good at.

Organisation agents are trained on your organisation, or on each organisation an MSSP serves. They know what kind of business it is, its assets, its suppliers, and its threat profile, and they judge the relevance of every event and every datapoint to that specific organisation. This is what makes the output tailored and actionable, rather than generic.
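One way to picture the relevance judgement is as an overlap between what an event touches and what the organisation is. The sketch below is a toy scoring under assumed names and weights (`OrgProfile`, `WEIGHTS`, `relevance` are all hypothetical), not a description of how any particular product scores events. A real organisation agent would reason over far richer context than set intersection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrgProfile:
    sectors: frozenset
    regions: frozenset
    assets: frozenset
    suppliers: frozenset


# Assumed weights: a hit on your own assets or suppliers counts for more
# than a sector or region match.
WEIGHTS = {"assets": 3, "suppliers": 3, "sectors": 2, "regions": 1}


def relevance(event: dict, org: OrgProfile):
    """Return (score, reasons) so the judgement stays explainable."""
    score, reasons = 0, []
    for facet, weight in WEIGHTS.items():
        overlap = set(event.get(facet, ())) & getattr(org, facet)
        if overlap:
            score += weight
            reasons.append(f"{facet}: {', '.join(sorted(overlap))}")
    return score, reasons


org = OrgProfile(
    sectors=frozenset({"finance"}),
    regions=frozenset({"EU"}),
    assets=frozenset({"Confluence"}),
    suppliers=frozenset({"Acme Payroll"}),
)
event = {"sectors": ["finance"], "regions": ["APAC"], "assets": ["Confluence"]}
score, reasons = relevance(event, org)
```

Returning the reasons alongside the score matters more than the number itself, because an analyst can check why an event was flagged.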

Production agents turn the finished analysis into the right product for each audience: a strategic brief for the board, a technical report for the SOC, and machine-ready outputs such as detection rules, indicators of compromise, STIX bundles and blocklists, all from the same underlying work. The audience does not have to be a person. A production agent can feed another system, or another agent, directly, so the intelligence flows straight into a business process. A development team, for instance, can have its CI/CD pipeline check every dependency against a malicious-package list the CTI team maintains, and fail the build if one shows up, with no human in the loop for that step.
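The CI/CD example can be sketched in a few lines. The code below assumes a pip-style requirements file and a plain list of blocked package names published by the CTI team. Both formats, the function names and the package name `evil-pkg` are assumptions made for illustration, and the parser is simplified (it ignores URL and editable installs).

```python
import re


def normalise(name: str) -> str:
    """Normalise a package name as PEP 503 does, so Evil_Pkg equals evil-pkg."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_requirements(text: str) -> set:
    """Collect normalised package names from pip-style requirement lines."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalise(match.group(0)))
    return names


def find_blocked(requirements_text: str, blocklist) -> list:
    """Return the dependencies that appear on the malicious-package list."""
    blocked = {normalise(name) for name in blocklist}
    return sorted(parse_requirements(requirements_text) & blocked)


def exit_code(requirements_text: str, blocklist) -> int:
    """Non-zero when a blocked package is present, which fails the build."""
    hits = find_blocked(requirements_text, blocklist)
    for name in hits:
        print(f"BLOCKED: {name} is on the malicious-package list")
    return 1 if hits else 0


requirements = "requests==2.32.0\nEvil_Pkg>=1.0  # pinned by mistake\n-r dev.txt\n"
status = exit_code(requirements, ["evil-pkg"])
```

In a pipeline step, the script would end with `sys.exit(exit_code(...))` so that the build system sees the failure.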

The point is that these are different jobs, and treating them as one undifferentiated “AI” is how teams end up disappointed. The value comes from composing them.

Tradecraft is the difference between summarising and analysing

Anyone can get a language model to summarise a threat report. That is not intelligence. Intelligence is what you get when the summary is put through the same analytical discipline a good analyst would apply, and this is where tradecraft agents earn their place.

A tradecraft agent can run Analysis of Competing Hypotheses, laying out the plausible explanations for a piece of activity and testing the evidence against each one rather than anchoring on the first read. It can apply the NATO Admiralty scale to rate both the reliability of a source and the credibility of its data before that data is ever onboarded, so a single unconfirmed forum post does not get treated as fact. It can attach calibrated confidence and likelihood language, the difference between “this is the actor” and “moderate confidence, single source”, which is the difference between an assessment a decision-maker can use and one that misleads them. It can pivot across indicators to expand a single lead into the surrounding infrastructure, map activity to the Diamond Model, run a devil's-advocate or contrarian pass to surface the disconfirming evidence a busy analyst might skip, and feed strategic horizon scanning that looks for the weak signals of what is coming next.
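To make the Admiralty step concrete, here is a minimal sketch of a gate that checks a two-character rating before data is onboarded. The letter and number scales are the standard ones (source reliability A to F, information credibility 1 to 6). The `triage` function, its three outcomes and the default thresholds are assumptions for illustration, and each team would set its own policy.

```python
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}
CREDIBILITY = {
    1: "Confirmed by other sources",
    2: "Probably true",
    3: "Possibly true",
    4: "Doubtful",
    5: "Improbable",
    6: "Truth cannot be judged",
}


def parse_rating(rating: str):
    """Split a rating such as 'B2' into (reliability, credibility)."""
    rating = rating.strip().upper()
    valid = (
        len(rating) == 2
        and rating[0] in RELIABILITY
        and rating[1].isdigit()
        and int(rating[1]) in CREDIBILITY
    )
    if not valid:
        raise ValueError(f"not an Admiralty rating: {rating!r}")
    return rating[0], int(rating[1])


def triage(rating: str, worst_reliability: str = "C", worst_credibility: int = 3) -> str:
    """Decide what to do with a datapoint. Thresholds are assumed defaults."""
    reliability, credibility = parse_rating(rating)
    if reliability == "F" or credibility == 6:
        return "needs-analyst"  # "cannot be judged" is not the same as "bad"
    if reliability <= worst_reliability and credibility <= worst_credibility:
        return "onboard"
    return "hold"
```

Under these defaults a single unconfirmed forum post rated E5 is held, a B2 report is onboarded, and anything rated F or 6 is routed to a person.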

The advantage is not that an agent invents these techniques. It is that it applies them consistently, on every event, even at three in the morning, when a time-pressured human team would skip straight to the conclusion.

Most teams already know the tradecraft, but few have the time to apply it to everything. An agent that runs the structured technique on every event raises the floor on quality across the whole operation.
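The bookkeeping behind Analysis of Competing Hypotheses is simple enough to sketch. Each piece of evidence is rated against each hypothesis as consistent ("C"), inconsistent ("I") or neutral ("N"), and hypotheses are ranked by how much evidence contradicts them, since the technique works by trying to disprove rather than confirm. The function names, the optional weights and the example evidence below are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# matrix[evidence][hypothesis] is "C" (consistent), "I" (inconsistent) or "N" (neutral)
Matrix = Dict[str, Dict[str, str]]


def rank_hypotheses(
    matrix: Matrix, weights: Optional[Dict[str, float]] = None
) -> List[Tuple[str, float]]:
    """Rank hypotheses by weighted inconsistency, least contradicted first."""
    weights = weights or {}
    scores: Dict[str, float] = {}
    for evidence, ratings in matrix.items():
        weight = weights.get(evidence, 1.0)
        for hypothesis, rating in ratings.items():
            scores.setdefault(hypothesis, 0.0)
            if rating == "I":
                scores[hypothesis] += weight
    return sorted(scores.items(), key=lambda item: item[1])


def diagnostic_evidence(matrix: Matrix) -> List[str]:
    """Evidence rated identically for every hypothesis cannot separate them."""
    return [e for e, ratings in matrix.items() if len(set(ratings.values())) > 1]


crime, espionage = "Financially motivated crime", "State espionage"
matrix = {
    "Ransom note left on hosts": {crime: "C", espionage: "I"},
    "Tooling sold on criminal forums": {crime: "C", espionage: "N"},
    "Activity during working hours": {crime: "C", espionage: "C"},
}
ranking = rank_hypotheses(matrix)
useful = diagnostic_evidence(matrix)
```

The ranking is the scaffolding, not the assessment. It shows the analyst which hypothesis has survived the evidence so far and which evidence actually discriminates, and the analyst decides what to conclude from it.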

Knowledge that stays current, and context on every datapoint

The second under-appreciated category is the self-maintaining knowledge agent, and it solves a problem every analyst recognises. The body of knowledge on any serious threat, say Russian cybercrime or the infostealer ecosystem, is large, fast-moving, and impossible for one person to hold in their head and keep current. Most of an analyst's expertise decays the moment they move on to the next topic.

A knowledge agent's expertise does not decay. It holds everything known about its topic and updates that picture continuously, so it can contextualise each new datapoint against the whole. In practice that looks like an assistant that reads a fresh report and says: this new malware resembles a backdoor previously used by a GRU-aligned group, so it is worth looking for this particular privilege-escalation technique, because that fits their established pattern. That kind of connection, drawn instantly from current and complete domain knowledge, is what a human analyst aspires to but rarely has the time or recall to manage for every event.

Pair that with the organisation agents, which assess whether any of it actually matters to you, and the output stops being a feed and becomes intelligence about your organisation specifically. We cover the standing-knowledge side of this on our intelligence requirements page.

Where AI fits in the intelligence lifecycle

The threat intelligence lifecycle has not changed. Direction, collection, processing, analysis, dissemination, and feedback are still the right way to think about the work. What has changed is which phases the agents can carry.

AI is strongest in the middle of the lifecycle and lighter at the two ends, where human direction and human accountability live. Direction stays human: deciding what the organisation needs to know is a stakeholder conversation. Collection is where integration and knowledge agents earn their keep first, monitoring hundreds of sources continuously and surfacing only what is relevant. Processing is the entity-extraction and integration layer: pulling indicators, enriching them, mapping to ATT&CK, converting to STIX. Analysis is the shared ground where tradecraft agents do their work and the analyst makes the call. Dissemination is the production agents, tailoring the output per audience. Feedback stays human, because judging whether the intelligence was useful is a relationship and a conversation.
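As an example of the processing phase, the sketch below converts one extracted IPv4 address into a STIX 2.1 Indicator object inside a bundle, using only the standard library. The property names follow the STIX 2.1 specification. The helper names are hypothetical, and in practice a maintained library such as the `stix2` package is the more usual choice because it validates objects for you.

```python
import ipaddress
import json
import uuid
from datetime import datetime, timezone


def stix_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp with milliseconds."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def ipv4_indicator(ip: str, seen_at: datetime) -> dict:
    """Build a STIX 2.1 Indicator for one IPv4 address."""
    ipaddress.IPv4Address(ip)  # raises ValueError on anything that is not an IPv4 address
    timestamp = stix_timestamp(seen_at)
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": timestamp,
        "modified": timestamp,
        "name": f"Suspicious IPv4 address {ip}",
        "pattern": f"[ipv4-addr:value = '{ip}']",
        "pattern_type": "stix",
        "valid_from": timestamp,
    }


def bundle(objects) -> dict:
    """Wrap STIX objects in a bundle for exchange."""
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": list(objects)}


seen = datetime(2026, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
indicator = ipv4_indicator("198.51.100.7", seen)
document = json.dumps(bundle([indicator]), indent=2)
```

Validating the address before building the pattern keeps malformed or hostile input out of the pattern string, which is part of what makes this step checkable.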

For the full breakdown of each phase, our threat intelligence lifecycle guide covers it in depth, and the four types of threat intelligence explains how strategic, operational, tactical and technical intelligence map onto these phases.

Where the analyst stays in control

Giving agents real tradecraft does not move the judgement to the machine. It moves the scaffolding to the machine and leaves the judgement where it belongs.

A tradecraft agent can lay out competing hypotheses, but the analyst decides which one to back and stakes their name on it. An agent can rate a source on the Admiralty scale, but a person still exercises the editorial judgement to know when a well-formatted report is simply wrong. An agent can draft an assessment of an actor's likely intent from how they have behaved before, and even propose recommendations, but it cannot carry the accountability for the call. Signing off on an assessment that goes to an executive and says “this matters, here is what we should do” is a human responsibility. And an agent has no access to the relationship-based intelligence that runs on human trust, the call from a peer or the tip from a researcher, which is often the most valuable intelligence a team has.

So the model is not AI replacing analysts. The agents do the mechanical work, draft the assessments, run the tradecraft consistently, and hold the knowledge current; the analyst keeps the pen on the final judgement, the confidence, the relevance, and anything that carries consequences. It is a force multiplier for the team you have, not a substitute for it.

What about hallucination?

It is the right question to ask of any AI system, and the honest answer is that an ungrounded language model will sometimes invent an indicator or assert a confident answer that is wrong. That is exactly why the agents are not left to free-associate, and why several defences run at once.

The first is grounding. Every assessment is anchored in provided source material, so the agents work from the reporting in front of them and the output stays traceable back to it, rather than drawing on whatever a model happened to absorb in training. The second is real knowledge: the knowledge agents supply current, sourced domain expertise to contextualise each datapoint, so a new event is judged against what is actually known rather than what sounds plausible. The third is tradecraft: source reliability on the Admiralty scale, calibrated confidence and likelihood, and Analysis of Competing Hypotheses keep the agents from over-claiming. On top of that, contrarian and devil's-advocate passes actively hunt for disconfirming evidence, and review stages catch what slips through. We do not put a human in the loop for you, and we do not sign off on your behalf. What we do is make the output checkable: every assessment is traceable to its sources and carries its confidence, so your own analyst can review it quickly and own the calls that carry consequences. No single guardrail is enough on its own. The point is that there are several, and they compound.
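One narrow but mechanical piece of grounding can be shown in code: checking that every indicator in a drafted assessment actually appears in the source material it was written from. This is a minimal sketch with hypothetical names, covering only IPv4 addresses and MD5 or SHA-256 hashes, and it assumes the sources have already been refanged. It is one review stage among several, not a complete defence.

```python
import re

INDICATOR = re.compile(
    r"\b(?:\d{1,3}\.){3}\d{1,3}\b"  # IPv4-shaped strings
    r"|\b[a-fA-F0-9]{64}\b"         # SHA-256
    r"|\b[a-fA-F0-9]{32}\b"         # MD5
)


def indicators_in(text: str) -> set:
    """Extract indicator-shaped strings, lower-cased for comparison."""
    return {match.lower() for match in INDICATOR.findall(text)}


def ungrounded_indicators(draft: str, sources) -> list:
    """Return indicators the draft asserts that no source contains."""
    known = set()
    for source in sources:
        known |= indicators_in(source)
    return sorted(indicators_in(draft) - known)


sources = ["The report notes beaconing to 198.51.100.7 over HTTPS."]
draft = "Command and control was observed at 198.51.100.7 and 203.0.113.9."
suspect = ungrounded_indicators(draft, sources)
```

Comparing extracted sets rather than searching for substrings avoids a false match where a short address happens to sit inside a longer one. Anything returned here is a candidate for removal or for analyst review before the assessment goes out.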

Build it yourself, or run it managed

There are two ways to get this. You can assemble it yourself: with a coding agent like Claude Code or Cursor, an analyst can run a stack of skills that each handle one part of the lifecycle, against their own API keys and their own MISP instance. We put a pack of these workflows, covering the tradecraft and integration building blocks, on GitHub under an open licence, so practitioners can run them on their own stack. You can browse them on the CTI Skills page.

Or you can run the managed version, where the same categories of agent, plus the self-maintaining knowledge agents and the organisation agents that judge relevance, run continuously against your intelligence requirements without you maintaining the plumbing. Either way the principle is the same: one analyst can now run workflows that used to need a team and a procurement cycle.

How to start using AI for threat intelligence

You do not need to rebuild your programme to get value. A sensible order looks like this:

1. Start with collection and triage. Point AI at the firehose of sources and let it surface what matters to your requirements. This is the fastest, safest win and the one that frees the most time.
2. Automate processing next. Indicator extraction, enrichment, ATT&CK mapping, and STIX conversion are low-risk because the output is checkable.
3. Add tradecraft, not just summarisation. Insist that the analysis applies a structured technique, a source rating, and calibrated confidence, so what you get is an assessment rather than a paraphrase.
4. Let knowledge accumulate. The value compounds when an agent holds a topic over time and contextualises each new event against it.
5. Keep humans on the final call, attribution, relevance, and feedback. These are the parts that make intelligence trustworthy, and they are the parts machines cannot carry.

This is the model Liberty91 is built around. The platform runs the full stack of agents against your intelligence requirements at machine speed, and the analyst stays in control of the calls that matter. The combination of AI and threat intelligence is not a story about replacing analysts. It is a story about giving them a tireless team that does the mechanical work, applies the tradecraft every time, and keeps the knowledge current, so the work that actually protects the organisation gets done.

Frequently asked questions

What is AI threat intelligence?

AI threat intelligence is the use of machine learning and large language models to automate the high-volume parts of the intelligence cycle: collecting from many sources, extracting indicators, mapping activity to frameworks like MITRE ATT&CK, and drafting reports. It lets a small team work at the speed and scale that used to require a large one. The analyst still owns the final judgement, the calls that carry consequences, and the accountability.

What are the categories of AI agents in threat intelligence?

A serious AI deployment is not one model but a stack of specialised agents. Liberty91 runs integration agents that call external tools, entity-extraction agents that pull indicators and techniques from raw text, tradecraft agents that apply structured analysis, self-maintaining knowledge agents that hold expertise on a topic, organisation agents that judge relevance to your business, and production agents that write the finished report or detection rule. Composing them is what turns automation into analysis.

What is a tradecraft agent in threat intelligence?

A tradecraft agent applies the analytical discipline a good analyst would use, consistently and on every event. It can run Analysis of Competing Hypotheses, rate a source on the NATO Admiralty scale before its data is trusted, attach calibrated confidence and likelihood language, and run a contrarian pass to surface disconfirming evidence. The agent supplies the scaffolding; the analyst still makes the call and signs their name to it.

What is a self-maintaining knowledge agent?

A self-maintaining knowledge agent holds and continuously updates the entire body of knowledge on one topic, such as Russian cybercrime or the infostealer ecosystem. Because its picture stays current, it can place each new datapoint in context and flag how a fresh report connects to an actor's established pattern. It gives an analyst the kind of complete, up-to-date recall that is hard for one person to maintain across many topics at once.

What is agentic threat intelligence?

Agentic threat intelligence uses large language models that are given tools, memory, and a goal so they can run a multi-step task on their own. Rather than summarising a report you paste in, an agent can pivot on an indicator across several services, apply a structured analytic technique, check a source's reliability, and write up the result. It calls each tool in turn without a human driving every click.

Can AI replace a threat intelligence analyst?

No. AI handles the mechanical work, drafts the assessments, and runs the analytical scaffolding, but it cannot carry the accountability of an assessment that goes to an executive, make the final call on what it means for you, or tap the relationship-based intelligence that runs on human trust. It is a force multiplier for the team you have, not a substitute for it.

Can I not just ask ChatGPT about a threat actor?

You can, but a general chatbot answers from whatever it absorbed in training, which may be out of date, incomplete, or simply wrong, and it knows nothing about your organisation. A CTI deployment grounds every assessment in current source material, contextualises it with knowledge agents that track the actor continuously, and applies tradecraft like source rating and calibrated confidence. The difference is between a plausible summary and an assessment you can act on.

How does Liberty91 prevent AI hallucination?

By not letting the agents free-associate. Every assessment is grounded in provided source material and stays traceable to it, knowledge agents supply real and current context, and tradecraft agents apply source reliability scoring and calibrated confidence. Contrarian and devil's-advocate passes plus review stages catch errors. We do not sign off on your behalf; because every assessment carries its sourcing and confidence, your own analyst can review and approve anything that carries consequences.

How does AI fit into the intelligence lifecycle?

AI is strongest in the middle phases of the lifecycle and lighter at the two ends. Collection, processing, and the analytical scaffolding of analysis are where integration, entity-extraction, and tradecraft agents do their work. Direction and feedback stay human, because deciding what the organisation needs to know and judging whether the intelligence was useful are stakeholder conversations.