Agentic AI security is a control problem at the level of action

Learn why securing agentic AI requires governance at the level of actions, workflows, and outcomes, including controls for context, multi-agent coordination, tool use, and autonomous decision-making.

Published on Jun 9, 2026 • 8 min read time

Topics

Agentic AI

AI Security

AI Governance

Zero Trust Architecture

Author(s)

Sanjay Bhakta

In March 2026, an internal AI agent at Meta triggered a security incident after taking an action it had not been explicitly instructed to perform. It had valid access to the environment it was operating in. It was connected to the right tools, operating within its permissions, and executing a workflow it was designed to support. There was no external breach in the conventional sense. The issue was that the agent interpreted context, made a decision, and acted in a way that exceeded its intended role.

From a system perspective, the behavior was coherent. From a security perspective, it exposed a gap. The controls in place governed access, not intent. They ensured the agent could only interact with approved systems, but they did not constrain how the agent interpreted signals or what combinations of actions it could take once inside the workflow.

Most enterprise AI security practices are built around models that produce outputs. Agentic systems operate inside workflows that produce outcomes. Security has to be structured around that change.

Security must define what an agent is allowed to do, not just what it can access

Security models for vision-language models and large language models are designed to control exposure. They define which data a model can access, how outputs are filtered, and how usage is monitored across users and interfaces. These controls assume that risk is introduced through interaction at the edges of the system.

Agentic AI operates differently. Agents are embedded inside workflows where they take inputs, evaluate conditions, and trigger actions. Security must define which actions the system can take and the exact conditions that allow them.

Consider an agent deployed in a security operations environment to triage potential threats. The agent ingests alerts, correlates them with historical patterns, and determines whether to escalate, suppress, or initiate a response. Governance for this system cannot stop at defining which logs or signals the agent can access. It must define how signals map to decisions, what confidence thresholds trigger escalation, and which actions can occur automatically versus requiring human review.

If those boundaries are not explicit, the agent can take actions that are technically permitted but operationally incorrect. It may suppress alerts that should be investigated or escalate benign activity in ways that disrupt operations. The issue is a lack of control at the task level.
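As a rough illustration of task-level control, the sketch below maps each proposed action to a confidence threshold and a flag for human review. The action names, thresholds, and the `authorize` function are hypothetical and chosen only for illustration; they are not taken from any specific product or from the incident described above.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SUPPRESS = "suppress"
    ESCALATE = "escalate"
    ISOLATE_HOST = "isolate_host"


@dataclass(frozen=True)
class ActionRule:
    min_confidence: float  # confidence the agent must report to propose this action
    requires_human: bool   # True: the agent may only recommend, not execute


# Hypothetical policy table; the thresholds are illustrative, not recommendations.
POLICY = {
    Action.SUPPRESS: ActionRule(min_confidence=0.95, requires_human=False),
    Action.ESCALATE: ActionRule(min_confidence=0.60, requires_human=False),
    Action.ISOLATE_HOST: ActionRule(min_confidence=0.90, requires_human=True),
}


def authorize(action: Action, confidence: float) -> str:
    """Return 'execute', 'human_review', or 'deny' for a proposed action."""
    rule = POLICY.get(action)
    if rule is None or confidence < rule.min_confidence:
        return "deny"  # the action is not taken; the alert stays in the queue
    return "human_review" if rule.requires_human else "execute"
```

In this sketch, suppression demands more confidence than escalation, on the assumption that wrongly silencing an alert costs more than wrongly raising one. A real policy would set those values per environment.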

Context and memory introduce persistent points of failure

Agentic AI relies on accumulated context to function across tasks. Systems store prior interactions, intermediate decisions, and environmental signals so that future actions reflect more than a single input. This allows agents to operate across workflows, but it also means that errors are not isolated. A compromised or misleading input can influence future behavior long after the initial interaction.

Take an agent responsible for monitoring activity in a public transit system. The agent reviews live video feeds from platforms and entrances, combines that with motion sensors and historical traffic patterns, and identifies conditions that require attention. For example, it may be configured to flag a backpack left unattended for more than three minutes during peak hours, or to detect crowd density that exceeds a defined threshold near stairwells or exits. Over time, it builds a baseline of what normal activity looks like at each station by time of day, day of week, and event schedule.

Now consider what happens if that context is gradually distorted. A series of inputs may indicate that bags left on platforms for longer periods are common and low risk, or that crowd surges near a specific entrance are routine during certain hours when they are not. The agent incorporates those signals into its baseline. It begins to extend the time threshold for unattended objects or raise the tolerance for crowd density before triggering an alert.

At that point, the system is still operating logically. It is applying its rules consistently against the baseline it has learned. The problem is that the baseline has shifted away from reality. A bag that should trigger an alert after three minutes may now be ignored for ten. A crowd condition that should prompt intervention may be treated as normal flow.

Each individual decision appears reasonable when viewed in isolation. The risk emerges from the accumulation of those decisions over time, as the system continues to act on a baseline that no longer reflects the conditions it was designed to monitor.

Controls need to address how context is established, how long it persists, and how it is validated. Without those safeguards, the system can continue to operate while gradually moving away from intended behavior.
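One way to picture those three safeguards is a learned threshold that expires old observations, cannot move far from an operator-set value, and flags itself for review when it reaches that limit. The sketch below assumes a hypothetical `GuardedThreshold` class, and its numbers (180 seconds, a 30-second drift limit, a one-hour window) are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class GuardedThreshold:
    policy_value: float  # value set by operators, e.g. 180 seconds
    max_drift: float     # largest allowed deviation from policy_value
    max_age_s: float     # observations older than this are discarded
    observations: list = field(default_factory=list)  # (timestamp, value) pairs

    def observe(self, timestamp: float, value: float) -> None:
        self.observations.append((timestamp, value))

    def current(self, now: float) -> float:
        # Expire stale context so old inputs cannot influence behavior forever.
        self.observations = [(t, v) for t, v in self.observations
                             if now - t <= self.max_age_s]
        if not self.observations:
            return self.policy_value
        learned = sum(v for _, v in self.observations) / len(self.observations)
        # Clamp the learned value so it cannot move far from the policy baseline.
        low = self.policy_value - self.max_drift
        high = self.policy_value + self.max_drift
        return min(max(learned, low), high)

    def needs_review(self, now: float) -> bool:
        """True when the learned value sits at a bound, a possible sign of drift."""
        return abs(self.current(now) - self.policy_value) >= self.max_drift
```

With `GuardedThreshold(180, 30, 3600)`, a run of inputs suggesting that ten-minute unattended bags are normal can raise the effective threshold to at most 210 seconds, and `needs_review` reports that the limit was reached.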

Multi-agent workflows require controls at the point of handoff

Agentic systems rarely operate as a single decision point. They are structured as workflows where multiple agents handle sensing, interpretation, decision-making, and execution. This structure enables more complex outcomes, but it also creates dependencies between agents. Each handoff becomes a point where assumptions are transferred.

Consider a traffic management system that uses multiple agents to monitor conditions, identify congestion, and adjust signals. One agent processes sensor data and flags a potential issue. A second agent evaluates the severity and recommends a response. A third agent executes changes to traffic controls.

If the initial signal is incorrect, that error does not remain contained. It is passed along as input to the next agent, which treats it as valid. The workflow reinforces the same assumption at every step. By the time the system acts, the decision reflects a chain of dependent judgments rather than a single point of failure.

Systems need clear agency perimeters and policies, backed by identity context (authentication and authorization), that govern how agents validate inputs from other agents, when uncertainty interrupts the workflow, and where human review is required. Without those controls, errors propagate through coordination rather than appearing as isolated mistakes.
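A handoff check along these lines might look like the following sketch, which uses Python's standard `hmac` module to authenticate the sending agent, an allow-list to authorize it, and a confidence floor to interrupt the workflow. The agent names, shared keys, and the 0.8 threshold are assumptions made for the example; a production system would rely on managed credentials rather than hard-coded secrets.

```python
import hashlib
import hmac
import json

# Hypothetical per-agent shared keys; real deployments would use managed credentials.
AGENT_KEYS = {"sensor-agent": b"sensor-secret", "severity-agent": b"severity-secret"}
# Which upstream agents each receiver accepts input from.
ALLOWED_SENDERS = {
    "severity-agent": {"sensor-agent"},
    "control-agent": {"severity-agent"},
}
MIN_CONFIDENCE = 0.8  # illustrative value


def sign(sender: str, payload: dict) -> str:
    """Compute an HMAC-SHA256 signature over a canonical JSON encoding."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(AGENT_KEYS[sender], body, hashlib.sha256).hexdigest()


def accept_handoff(receiver: str, sender: str, payload: dict, signature: str) -> str:
    """Return 'accept', 'human_review', or 'reject' for an inter-agent message."""
    if sender not in ALLOWED_SENDERS.get(receiver, set()) or sender not in AGENT_KEYS:
        return "reject"        # sender is outside the receiver's perimeter
    if not hmac.compare_digest(sign(sender, payload), signature):
        return "reject"        # payload was altered or the sender is not authentic
    if payload.get("confidence", 0.0) < MIN_CONFIDENCE:
        return "human_review"  # uncertainty interrupts the workflow
    return "accept"
```

Note that a signature only proves who sent a message and that it was not modified in transit. It does not prove the content is correct, which is why the confidence check and the human-review path remain separate steps.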

Tool use requires boundaries on sequences, not just permissions

The ability to use tools allows agentic systems to operate. Agents can update records, trigger workflows, interact with infrastructure, and coordinate across systems. These capabilities are governed today through permissions and access controls.

In agentic systems, that is not sufficient. Exposure emerges from how tools are used in sequence. An agent managing inventory may detect a shortage, place an order, reroute shipments, and update availability across systems. Each of these actions is valid within the system’s permissions. The issue arises when those actions are driven by incorrect assumptions or manipulated inputs, producing an outcome that disrupts operations.

Security controls need to adopt the MAESTRO (Multi-Agent Environment, Security, Threat, Risk, and Outcome) threat modeling framework, specifying not only which tools an agent can access but also how those tools can be combined, and identifying the types of vulnerabilities those combinations introduce. They must establish constraints on action sequences, define acceptable triggers for each step, and introduce checkpoints where automated execution pauses if conditions are not met.
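To make sequence constraints concrete, the sketch below validates a planned chain of tool calls against an allowed-transition table and pauses at designated checkpoints. The tool names and transitions are hypothetical, loosely based on the inventory example above, and this is not an implementation of MAESTRO itself.

```python
# Hypothetical tool names and transitions for the inventory example.
ALLOWED_NEXT = {
    "START": {"detect_shortage"},
    "detect_shortage": {"place_order"},
    "place_order": {"reroute_shipments", "update_availability"},
    "reroute_shipments": {"update_availability"},
    "update_availability": set(),
}
# Steps that pause for approval before they run.
CHECKPOINTS = {"reroute_shipments"}


def check_sequence(steps: list, approved: set = frozenset()) -> tuple:
    """Validate a planned tool sequence; return (status, index of problem or None)."""
    previous = "START"
    for i, step in enumerate(steps):
        if step not in ALLOWED_NEXT.get(previous, set()):
            return ("blocked", i)  # this ordering is not permitted
        if step in CHECKPOINTS and step not in approved:
            return ("paused", i)   # wait for human or policy approval
        previous = step
    return ("ok", None)
```

Here an agent that holds permission for every tool still cannot place an order without a detected shortage first, and rerouting shipments stops the plan until the step appears in `approved`.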

Securing AI agents requires Zero Trust architecture applied at the agent layer: least privilege access scoped to individual tasks, micro-segmentation between agent workloads, and attribute-based access control (ABAC) that enforces context-aware permissions at runtime. Security operations tooling including SIEM, SOAR, IDS, and IPS provides the monitoring and response layer that detects when agents deviate from expected behavior.
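A minimal sketch of task-scoped, attribute-based authorization is shown below. The `Request` fields, the task-to-tool grants, and the 0.7 risk cutoff are assumptions for illustration; real ABAC engines evaluate much richer policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    agent_id: str
    task: str          # the task the agent is currently assigned
    tool: str
    environment: str   # e.g. "prod" or "staging"
    risk_score: float  # runtime signal, 0.0 (low) to 1.0 (high)


# Hypothetical grants: each task maps to the only tools it may use (least privilege).
TASK_TOOLS = {
    "triage_alerts": {"read_logs", "create_ticket"},
    "patch_hosts": {"read_inventory", "apply_patch"},
}


def is_permitted(req: Request) -> bool:
    """Evaluate attributes at call time; deny by default."""
    if req.tool not in TASK_TOOLS.get(req.task, set()):
        return False  # tool is outside the task's scope
    if req.environment == "prod" and req.risk_score > 0.7:
        return False  # context-aware rule: block risky production actions
    return True
```

Because the decision depends on the current task and runtime attributes rather than on the agent's identity alone, the same agent can be allowed to call `apply_patch` while patching hosts and denied that call while triaging alerts.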

Physical systems extend the consequences of failure

The impact of agentic AI becomes more significant when it interacts with physical environments. In smart city deployments, transportation systems, and industrial operations, agents are connected to sensors, cameras, and edge devices. They interpret signals and trigger real-world responses.

An agent monitoring activity in a public space may reposition cameras, dispatch drones, or alert response teams based on detected patterns. A system managing road conditions may identify hazards such as debris or potholes and trigger maintenance workflows or traffic adjustments.

In these environments, operational boundaries need to specify the conditions under which action is taken. They must define what level of confidence is required, how conflicting signals are resolved, and when human oversight is necessary. They also need to account for the downstream effects of action, not just the initial decision.
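The sketch below shows one possible decision gate for a physical action: it requires a minimum number of confident sources, sends disagreement between confident sources to human review, and acts only when they agree. The function name, source names, and default thresholds are hypothetical.

```python
def decide(readings: dict, min_confidence: float = 0.85, min_sources: int = 2) -> str:
    """Decide on a physical action.

    readings maps a source name to a (hazard_detected, confidence) pair.
    Returns 'act', 'no_action', 'human_review', or 'hold'.
    """
    # Keep only the sources that meet the confidence requirement.
    confident = {source: hazard
                 for source, (hazard, conf) in readings.items()
                 if conf >= min_confidence}
    if len(confident) < min_sources:
        return "hold"          # not enough confident evidence to act
    votes = set(confident.values())
    if len(votes) > 1:
        return "human_review"  # confident sources disagree
    return "act" if votes.pop() else "no_action"
```

For example, `decide({"camera": (True, 0.9), "lidar": (False, 0.95)})` routes to human review rather than picking one sensor over the other.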

A system that acts on incomplete or incorrect information changes the state of the environment.

Agentic AI security starts with control over behavior

Agentic AI introduces risk at the level of behavior inside operational workflows. Systems can act with valid access, correct permissions, and coherent internal logic while still producing outcomes that fall outside intended boundaries. Failures persist through context, propagate across agents, and scale through tool use.

Agentic systems do not fail in isolation. They fail inside workflows that continue to operate. That makes detection slower, containment harder, and consequences more difficult to reverse.

Agentic AI requires control at the level of task, context, and consequence. It requires defining what an agent is allowed to do, under what conditions, and how its actions are validated as they move through a system.

In the next article, we will examine how governance models can be structured to provide that control, including how to define operational boundaries, validate agent behavior, and introduce checkpoints in workflows designed for autonomous action.

How Centific addresses agentic AI security

Centific’s AIDF platform supports agentic AI governance assessment, while Verity AI addresses agentic AI security implementation. Together they strengthen security posture across several dimensions: Zero Trust Architecture to reduce blast radius, decentralized identity management, dynamic policy-based access controls, authenticated delegation frameworks, continuous monitoring through SIEM, SOAR, IDS, and IPS, adaptive risk assessment and mitigation, agent discovery and trust, and vulnerability and threat management aligned to OWASP, MITRE, and NIST standards.
