AI agents are autonomous, reasoning entities that make decisions to achieve goals. AI automation follows pre-set, if-this-then-that rules for repetitive tasks. Agents excel in dynamic, unpredictable environments. Automation is best for consistent, high-volume workflows. Agents adapt. Automation is rigid. You need both.
Stourio is an orchestration layer that sits between your team, your systems, and your AI provider. It receives operational signals, uses an LLM to reason about them, then delegates work to either AI agents (for novel situations) or automation workflows (for known patterns).
The key insight: Stourio doesn't care which LLM powers it. You connect the AI provider that fits your needs, your budget, and your compliance requirements.
Any LLM with tool use / function calling support works. The orchestrator communicates via a standard interface: send context, receive a decision, execute the action.
Five layers, each with a clear responsibility. The orchestrator is the decision point. Everything else is either an input, a capability, a guardrail, or persistence.
Two channels feed into the orchestrator. Both are always active.
You (Chat interface). A direct conversation channel where you talk to Stourio in natural language. "Check why response times spiked in EU and fix it if it's the CDN again." This can be a web app, a Slack bot, a Teams integration, or a mobile app. Standard WebSocket or REST endpoint that forwards your message to the orchestrator along with conversation history.
Your systems (Webhooks & Redis Streams). Stourio listens to your operational infrastructure through a high-throughput webhook API. When Grafana, Datadog, or PagerDuty fires an alert, it hits POST /api/webhook. To prevent the orchestrator from crashing during an alert storm, the webhook immediately drops the payload into a Redis Stream and returns a 202 Accepted. A background consumer worker dequeues the signals, processes them through the rules engine and LLM, and acknowledges each one only after successful processing — at-least-once delivery, so no alerts are dropped.
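The ingest pattern can be sketched in-process like this (a real deployment would use redis-py's XADD / XREADGROUP / XACK against an actual Redis Stream; `FakeStream` below just models the consumer-group semantics):

```python
from collections import deque

class FakeStream:
    """Stand-in for a Redis Stream with a consumer group: delivered
    entries stay pending until explicitly ACKed (at-least-once)."""
    def __init__(self):
        self.entries = deque()
        self.pending = {}
        self._id = 0

    def xadd(self, payload: dict) -> str:
        self._id += 1
        entry_id = f"{self._id}-0"
        self.entries.append((entry_id, payload))
        return entry_id

    def xreadgroup(self):
        if not self.entries:
            return None
        entry_id, payload = self.entries.popleft()
        self.pending[entry_id] = payload  # redelivered on crash until ACKed
        return entry_id, payload

    def xack(self, entry_id: str):
        self.pending.pop(entry_id, None)

stream = FakeStream()

def webhook_handler(alert: dict) -> int:
    stream.xadd(alert)  # enqueue immediately, no inline processing
    return 202          # respond before the rules engine or LLM ever runs

def consume_one(process) -> bool:
    item = stream.xreadgroup()
    if item is None:
        return False
    entry_id, payload = item
    process(payload)        # rules engine + LLM would run here
    stream.xack(entry_id)   # ACK only after successful processing
    return True

status = webhook_handler({"source": "grafana", "alert": "latency_eu"})
handled = []
consume_one(handled.append)
```

If the consumer crashes between `xreadgroup` and `xack`, the entry stays pending and is redelivered — that is the at-least-once guarantee the text describes.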
The combination is what makes this an assistant, not a pipeline. You can ask questions, give commands, and have a conversation. Meanwhile, your systems feed signals that Stourio processes autonomously in the background.
The brain. Stourio receives an input (your message or a system signal), sends it to your connected LLM with the full context (your rules, available tools, conversation history), and gets back a decision.
The LLM doesn't execute anything directly. It returns one of five possible responses:
Gather more context — calls an MCP tool to investigate before deciding.
Delegate to an agent — routes to a specialized AI agent for reasoning-heavy work.
Trigger automation — fires a predefined workflow for a known pattern.
Ask you first — the feedback loop. For high-risk actions, Stourio comes back and asks for confirmation before proceeding.
Respond directly — answers your question or provides a status update.
The routing decision starts with a deterministic short-circuit. Incoming signals are first evaluated by a fast, deterministic rule engine (e.g., regex, exact event signatures). If a signal matches a known pattern, the system triggers the automation workflow directly, bypassing the LLM. If the situation is ambiguous or unmatched, the LLM evaluates the context to delegate to the correct agent. If the resulting action is high-risk, it asks the user first.
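A minimal sketch of that short-circuit, with made-up alert signatures and a stubbed LLM call:

```python
import re

# Known alert signatures map directly to automation workflow IDs
# (both the patterns and workflow IDs here are invented examples).
KNOWN_PATTERNS = {
    re.compile(r"^cdn_cache_miss_spike:"): "wf_purge_cdn",
    re.compile(r"^disk_full:/var/log"): "wf_rotate_logs",
}
HIGH_RISK = re.compile(r"(prod|multi-region)")

def route(signal: str, llm_decide) -> tuple[str, str]:
    # 1. Deterministic short-circuit: known signature -> automation, no LLM.
    for pattern, workflow_id in KNOWN_PATTERNS.items():
        if pattern.match(signal):
            return ("trigger_automation", workflow_id)
    # 2. Ambiguous signal: the LLM picks an agent.
    agent = llm_decide(signal)
    # 3. High-risk action: pause and ask the user before proceeding.
    if HIGH_RISK.search(signal):
        return ("ask_user", agent)
    return ("delegate_agent", agent)

decision = route("cdn_cache_miss_spike:eu-west", lambda s: "diagnose_repair")
```

The LLM stub (`lambda s: "diagnose_repair"`) is only reached when no deterministic pattern fires, which is exactly the restriction the risk table later calls "routing strictly to novel or ambiguous signals".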
Two lanes, each designed for a fundamentally different type of work.
AI Agents & The MCP Gateway. Agents are focused LLM loops with specialized roles (e.g., "Diagnose & Repair"). When an agent decides to take action or gather data, it does not execute code directly. Instead, it sends an HTTP POST request to a standalone MCP Gateway. This enforces an air-gapped security model: The Orchestrator ("Brain") lives on Server A and holds the LLM API keys. The MCP Gateway ("Muscle") lives on Server B and holds your AWS, database, and infrastructure credentials. If the LLM is compromised or hallucinates, it cannot touch your infrastructure directly; it can only request execution of strictly predefined tools on the Gateway.
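The air-gap can be sketched as follows (tool names are invented; in production `agent_call` would be an HTTP POST from Server A to the gateway on Server B, and the gateway alone would hold the infrastructure credentials):

```python
# Gateway side (Server B): only strictly predefined tools exist.
GATEWAY_TOOLS = {
    "get_cdn_status": lambda region: {"region": region, "healthy": True},
    "purge_cdn_cache": lambda region: {"region": region, "purged": True},
}

def gateway_execute(request: dict) -> dict:
    tool = GATEWAY_TOOLS.get(request.get("tool"))
    if tool is None:
        # Hallucinated or unknown tool: refuse, never improvise.
        return {"error": f"unknown tool {request.get('tool')!r}"}
    return tool(**request.get("args", {}))

# Agent side (Server A): can only *request* execution, never run code itself.
def agent_call(tool: str, **args) -> dict:
    # In production: HTTP POST to the standalone MCP Gateway.
    return gateway_execute({"tool": tool, "args": args})

ok = agent_call("purge_cdn_cache", region="eu-west")
denied = agent_call("rm_rf_everything")
```

Even a fully compromised agent can only pick from the gateway's menu; anything off-menu is rejected at the boundary.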
Agents are stored as templates in a library: a role description, a set of allowed tools, and constraints. The orchestrator selects and configures the right agent for the situation. Over time, you add new agent templates as you encounter new patterns. The system grows its capabilities through use.
Automation — for known, repeatable patterns. Standard workflow execution via an engine like Temporal, n8n, or plain API orchestration. The orchestrator triggers a predefined workflow by ID with parameters. The workflow runs its steps (health check, apply fix, validate, notify) and returns a result. Fast, consistent, no reasoning needed.
The bridge between the two: when automation encounters something unexpected or fails, it falls back to the agent lane. Patterns that agents solve repeatedly can be "promoted" to automation rules. The system learns which situations need thinking and which need executing.
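A sketch of the fallback bridge, with hypothetical workflow and agent names:

```python
def trigger_automation(workflow_id, params, workflows, fallback_agent):
    """Run a predefined workflow by ID; fall back to the agent lane on failure."""
    try:
        return {"lane": "automation", "result": workflows[workflow_id](params)}
    except Exception as exc:
        # The bridge: an unexpected failure escalates to a reasoning agent.
        return {"lane": "agent", "result": fallback_agent(workflow_id, params, exc)}

def wf_restart(params):
    return f"restarted {params['service']}"

def wf_flaky(params):
    raise RuntimeError("validate step failed")

workflows = {"wf_restart_service": wf_restart, "wf_flaky": wf_flaky}
agent = lambda wf, params, exc: f"investigating {wf}: {exc}"

fast = trigger_automation("wf_restart_service", {"service": "api"}, workflows, agent)
slow = trigger_automation("wf_flaky", {"service": "api"}, workflows, agent)
```

Promotion runs in the opposite direction: once an agent's fix for a pattern is stable, it becomes a new entry in `workflows` and stops needing reasoning at all.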
Every decision passes through three control mechanisms.
Your rules. Stored in a database, injected into the orchestrator's context on every call. Risk thresholds, blast radius limits, time-of-day restrictions, approval requirements. You define them through an admin interface. They're versioned for audit trail.
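A sketch of how rules might be stored and matched (the field names and rule contents are illustrative, not Stourio's actual schema):

```python
RULES = [
    # Versioned for the audit trail; "match" keys are exact-equality filters.
    {"id": 1, "version": 3, "match": {"environment": "production"},
     "require_approval": True},
    {"id": 2, "version": 1, "match": {"action": "scale_down"},
     "max_blast_radius": 1},
]

def applicable_rules(action: dict, rules=RULES) -> list[dict]:
    """Every rule whose match filters all agree with the proposed action."""
    return [r for r in rules
            if all(action.get(k) == v for k, v in r["match"].items())]

def needs_approval(action: dict, rules=RULES) -> bool:
    return any(r.get("require_approval") for r in applicable_rules(action, rules))

risky = {"environment": "production", "action": "restart"}
safe = {"environment": "staging", "action": "restart"}
```

Because rules are plain data, they can be injected into the orchestrator's context verbatim on every call and diffed between versions for the audit trail.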
Full visibility. Every orchestrator decision is logged: what input triggered it, what the LLM reasoned, what action was taken, what tools were used, what the outcome was. Agent execution traces record every sub-step. Everything is queryable: "show me all actions Stourio took on EU infrastructure last week."
Override anytime. A global circuit breaker. For AI agents, this is implemented as middleware that checks a Redis flag before every tool execution. For automation, the orchestrator actively sends cancellation requests to the external engines' APIs (Temporal, n8n) to terminate running workflows. It halts both reasoning and distributed execution.
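The agent-side middleware can be sketched like this (a plain dict stands in for Redis; the flag key is an invented example):

```python
class KillSwitch:
    """Global circuit breaker; in production the flag lives in Redis."""
    FLAG = "stourio:kill_switch"

    def __init__(self, store: dict):
        self.store = store

    def engaged(self) -> bool:
        return self.store.get(self.FLAG) == "1"

def run_tool(kill: KillSwitch, execute, tool: str, args: dict):
    # Checked before *every* tool execution, not once per agent run,
    # so a mid-flight agent halts at its very next step.
    if kill.engaged():
        raise RuntimeError("kill switch engaged: execution halted")
    return execute(tool, args)

store = {}
kill = KillSwitch(store)
result = run_tool(kill, lambda t, a: f"ran {t}", "restart_service", {})

store[KillSwitch.FLAG] = "1"  # operator flips the switch
try:
    run_tool(kill, lambda t, a: "should not run", "restart_service", {})
    halted = False
except RuntimeError:
    halted = True
```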
LLMs are stateless. Every call needs the full context. The persistence layer maintains continuity across conversations and actions.
| Store | Purpose | Recommended |
|---|---|---|
| Conversation state | Chat history for each orchestrator call | PostgreSQL |
| Agent state | Running agent context, distributed locking (mutex) to prevent race conditions | PostgreSQL + Redis (with Redlock) |
| Rule store | User-defined rules, versioned | PostgreSQL |
| Audit log | Every decision and action, immutable | PostgreSQL (append-only) |
| Signal queue | Incoming system events awaiting processing | Redis Streams or SQS |
| Session cache | Active sessions, kill switch flags | Redis |
Stourio communicates with your LLM through a standard interface. Every provider that supports tool use / function calling works the same way from the orchestrator's perspective: send a message with context and tool definitions, receive a response with either text or a tool call.
A provider adapter layer translates between Stourio's internal format and each provider's API, acting as a strict security boundary. It validates every LLM tool call against a predefined schema (e.g., using Zod or Pydantic) before execution, and sanitizes every raw MCP response before injecting it back into the context window. Switching providers means mapping to this secure boundary, not rewriting the system.
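A hand-rolled sketch of that validation boundary (production code would use Pydantic or Zod as the text suggests; the tool name and schema here are invented):

```python
# Allowed tools and their exact argument schemas.
TOOL_SCHEMAS = {
    "purge_cdn_cache": {"region": str, "soft": bool},
}

def validate_tool_call(call: dict):
    """Return (call, None) if valid, else (None, reason). Invalid calls are
    dropped and retried before they ever reach the execution layer."""
    schema = TOOL_SCHEMAS.get(call.get("name"))
    if schema is None:
        return None, f"unknown tool {call.get('name')!r}"
    args = call.get("arguments", {})
    if set(args) != set(schema):
        return None, "argument names do not match schema"
    for key, typ in schema.items():
        if not isinstance(args[key], typ):
            return None, f"{key} must be {typ.__name__}"
    return call, None

ok, err = validate_tool_call(
    {"name": "purge_cdn_cache", "arguments": {"region": "eu-west", "soft": True}})
bad, reason = validate_tool_call(
    {"name": "purge_cdn_cache", "arguments": {"region": "eu-west", "force": True}})
```

The hallucinated `force` argument never reaches an automation engine; it dies at the adapter with a machine-readable reason that can be fed back to the LLM for a retry.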
Different LLMs have different strengths. Models with strong reasoning (Claude Opus 4.6, GPT-5.2xhigh, Gemini 3.1 PRO) work better for the agent lane. Faster, cheaper models (Claude Haiku, Gemini 3 or GPT 5.2) work well for the orchestrator's routing decisions and simple automation triggers. You can use different models for different parts of the system.
Costs scale linearly with usage. The main cost driver is LLM token consumption. Using a smaller model for routing (orchestrator) and a larger model for reasoning (agents) optimizes the cost-to-quality ratio.
Not every decision should be autonomous. When the orchestrator encounters a high-risk action (as defined by your rules), it pauses and comes back to you with a structured plan and a confirmation request: what it wants to do, why, what the risk is and what the blast radius would be.
You approve, reject, or modify. Every approval request has a strict Time-to-Live (TTL). If unapproved within the window, the action defaults to 'Reject' to prevent stale execution. Upon approval, the orchestrator performs a rapid state re-validation via MCP tools to ensure the environment hasn't changed before executing the action. This isn't a limitation — it's the core safety mechanism that makes autonomous operations viable in production. Without it, you're trusting an LLM with your infrastructure. With it, you're trusting an LLM that asks before doing anything dangerous.
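The TTL-plus-revalidation flow can be sketched like this (a controllable clock stands in for wall time; the plan contents and status strings are illustrative):

```python
import time

class ApprovalRequest:
    """Approval with a strict TTL; defaults to reject when stale."""
    def __init__(self, plan: dict, ttl_seconds: float, now=time.time):
        self.plan = plan
        self.now = now
        self.expires_at = now() + ttl_seconds

    def resolve(self, approved: bool, revalidate) -> str:
        if self.now() > self.expires_at:
            return "rejected:expired"        # TTL elapsed: never run stale plans
        if not approved:
            return "rejected:user"
        if not revalidate(self.plan):
            return "rejected:state_changed"  # environment drifted since request
        return "approved:execute"

clock = {"t": 0.0}
req = ApprovalRequest({"action": "failover_eu"}, ttl_seconds=300,
                      now=lambda: clock["t"])

clock["t"] = 60   # approved one minute in: fresh
fresh = req.resolve(approved=True, revalidate=lambda plan: True)

clock["t"] = 600  # same approval arriving after the window: stale
stale = req.resolve(approved=True, revalidate=lambda plan: True)
```

The `revalidate` hook is where the post-approval state re-check via MCP tools would run; here it is stubbed to always pass.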
The threshold for "high-risk" is yours to define. Some teams want confirmation before any production change. Others only want it for actions that affect multiple regions. The rule engine handles this.
You don't build all five layers at once. Start with the smallest useful version and expand.
Phase 1: Orchestrator service + chat interface + one MCP server (pick your monitoring tool). At the end of this phase, you can talk to Stourio, it reads your alerts, and it reasons about them. No actions yet — just understanding and responding.
Phase 2: Rule engine + audit log + kill switch + feedback loop. This is the mandatory safety foundation. Rules are enforced, every routing decision is logged, and the distributed override mechanism is operational before any actions are allowed.
Phase 3: Two automation workflows for your most common known patterns + the Diagnose & Repair agent. Stourio can now fix known issues automatically and investigate unknown ones, operating strictly within the Phase 2 guardrails.
Phase 4: Additional MCP servers for your other systems + Escalate agent + Take Action agent. The full architecture becomes operational across multiple integrations, and your growing runbook library begins to outgrow what a standard context window can efficiently hold.
Phase 5: Agent template UI + pattern promotion (recurring agent solutions become automation rules). Admin interface for managing the agent library. The system learns from its own usage.
| Risk | Impact | Mitigation |
|---|---|---|
| LLM reasoning error | Wrong action executed | Guardrails layer, confirmation on high-risk, blast radius limits |
| LLM provider downtime | System stops reasoning | Queue signals, retry with backoff, fallback to automation-only mode |
| Prompt injection | Malicious signals manipulate the LLM | Sanitize all external inputs before including in LLM context |
| Agent loops | Agents calling agents indefinitely | Max depth limit (3-4 hops), timeout per agent execution |
| Rule conflicts | Contradictory rules cause unpredictable behavior | Validation on rule creation, priority ordering |
| Runaway Automation | Destructive workflows continue executing after orchestrator shutdown | Kill switch tied directly to external workflow engine cancellation APIs, not just local middleware |
| Stale Approvals | Executing an outdated plan on a changed infrastructure state causes secondary outages. | TTL on all approval requests + mandatory state re-validation post-approval |
| The Thundering Herd (Event Storms) | Multiple agents spawned for the same root cause collide and corrupt infrastructure state | Signal debouncing and correlation windows at the queue layer before orchestrator processing |
| Open-ended Command Execution | Agent hallucinates a destructive terminal command, wiping production data or infrastructure | Implement strict command allow-lists, ephemeral least-privilege credentials, and an absolute ban on raw shell access for all agents |
| Probabilistic Routing Drift | LLM misroutes a known urgent issue to a slow reasoning agent instead of instant automation, breaching MTTR | Implement a deterministic rules engine before the LLM to handle known alert signatures; restrict LLM routing strictly to novel or ambiguous signals |
| Agent State Collision (Race Conditions) | Concurrent agents read stale state and execute conflicting actions on the same infrastructure component | Implement strict distributed locking (e.g., Redis Redlock) on target infrastructure before an agent begins reasoning or execution |
| Malformed LLM Tool Calls | The LLM hallucinates incorrect parameters or invalid JSON, causing external automation engines to panic or execute broken workflows | Strict schema validation at the adapter boundary; drop and retry any tool call that fails schema enforcement before it reaches the execution layer |
This architecture is deliberately simple. Standard web infrastructure plus an LLM API. Specifically, you do not need: custom ML models or training, GPU infrastructure, complex multi-agent frameworks like LangGraph or CrewAI (direct API calls are simpler and more reliable), or Kubernetes (unless you choose to over-engineer from day one).
The entire system runs on application servers, a database, a cache, and API calls to your LLM provider. That's the point. The intelligence comes from the model. The value comes from the orchestration.