IronCurtain - A secure* runtime for autonomous AI agents with plain-English policy enforcement.

Pitch

IronCurtain gives autonomous AI agents a constrained environment to run in, with security policies derived from a plain-English document called a constitution. It is a research prototype that targets the problem of ambient authority: letting agents do useful work without handing them every privilege the user holds.

Description

IronCurtain is a runtime for autonomous AI agents that enforces human-readable security policies written as a constitution. As a research prototype, it aims to block unauthorized actions without taking away the agent's ability to work autonomously.

Overview

IronCurtain addresses ambient authority, a common problem in current AI agent frameworks: the agent runs with the same access privileges as the user, so a prompt injection or an unmonitored action can reach anything the user can. The usual remedies are restrictive sandboxes, which limit what the agent can do, and frequent approval prompts, which burden the user. Both reduce the agent's usefulness.

The Unique Approach

IronCurtain takes a different approach: users state their security intentions in plain English. They write a short document, called a constitution, that lists what their AI agents may and may not do. IronCurtain compiles this document into a deterministic security policy and enforces it at runtime, so the agent stays autonomous within clearly defined limits.
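
To make "compiled into a deterministic security policy" concrete, here is a minimal sketch of what a compiled rule set and its evaluation could look like. The source does not describe IronCurtain's actual policy format, so every name below (Rule, ToolCall, compiledPolicy, evaluate) and the first-match, default-deny semantics are assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration only; not IronCurtain's real policy format.
type Decision = "allow" | "deny" | "escalate";

interface Rule {
  tool: string;        // tool name the rule applies to, or "*" for any tool
  pathPrefix?: string; // optional: only match paths under this prefix
  decision: Decision;
}

interface ToolCall {
  tool: string;
  path?: string; // assumed to be already normalized (no "..", no symlinks)
}

// A constitution sentence such as
//   "The agent may read files in the workspace but must ask before deleting anything."
// might compile to an ordered rule list like this one.
const compiledPolicy: Rule[] = [
  { tool: "read_file", pathPrefix: "/workspace/", decision: "allow" },
  { tool: "delete_file", decision: "escalate" },
];

// Deterministic evaluation: the first matching rule wins, and anything
// that matches no rule is denied by default.
function evaluate(policy: Rule[], call: ToolCall): Decision {
  for (const rule of policy) {
    const toolMatches = rule.tool === "*" || rule.tool === call.tool;
    const pathMatches =
      rule.pathPrefix === undefined ||
      (call.path !== undefined && call.path.startsWith(rule.pathPrefix));
    if (toolMatches && pathMatches) return rule.decision;
  }
  return "deny";
}

console.log(evaluate(compiledPolicy, { tool: "read_file", path: "/workspace/notes.txt" })); // allow
console.log(evaluate(compiledPolicy, { tool: "read_file", path: "/etc/passwd" }));          // deny
console.log(evaluate(compiledPolicy, { tool: "delete_file", path: "/workspace/old.txt" })); // escalate
```

The point of the sketch is that the same call always yields the same decision: the language model may help write the rules, but it is not consulted when a rule is applied.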

Key Features

Untrusted Agents: IronCurtain operates on the principle that AI agents may be compromised, and thus applies strict security measures regardless of the agent's presumed reliability.
Natural Language Interface: Security policies are created using plain English, offering an intuitive method for establishing guidelines for agent behavior.
Semantic Interposition: Every interaction with system resources passes through a Model Context Protocol (MCP) layer, where the policy allows the request, denies it, or escalates it to the user for approval.
Layered Security: Agents operate within V8 isolates, thereby preventing unauthorized access to the host system and ensuring that all actions are scrutinized against established policy before execution.
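
The allow, deny, or escalate flow described under Semantic Interposition can be sketched as a wrapper around a tool handler. This is an illustration under stated assumptions, not IronCurtain's code: mediate, ToolRequest, and Outcome are hypothetical names, and the approval callback is synchronous only to keep the example short (a real approval prompt would almost certainly be asynchronous).

```typescript
// Hypothetical names throughout; a sketch of the interposition idea only.
type Decision = "allow" | "deny" | "escalate";

interface ToolRequest {
  tool: string;
  args: Record<string, unknown>;
}

type Outcome = { ok: true; result: string } | { ok: false; reason: string };

// Wraps a tool handler so that no request reaches it without a policy
// decision. Escalated requests run only if the user approves them.
function mediate(
  decide: (req: ToolRequest) => Decision,
  askUser: (req: ToolRequest) => boolean,
  handler: (req: ToolRequest) => string,
): (req: ToolRequest) => Outcome {
  return (req) => {
    const decision = decide(req);
    if (decision === "deny") {
      return { ok: false, reason: "denied by policy" };
    }
    if (decision === "escalate" && !askUser(req)) {
      return { ok: false, reason: "rejected by user" };
    }
    // Reached only for "allow", or for "escalate" after user approval.
    return { ok: true, result: handler(req) };
  };
}
```

Because the agent only ever holds the wrapped function, it has no path to the underlying handler that skips the policy check.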

Modes of Operation

IronCurtain supports multiple session modes to accommodate different operational needs:

Interactive Mode: Conduct multi-turn sessions where an agent can respond to tasks and pause for user approvals as needed.
Single-shot Mode: Send a single task command and exit after the agent completes the operation.
Workspace Mode: Assign a defined directory as the agent's working area, allowing it to operate within an established context while still adhering to security protocols.
Terminal Multiplexer: Use a terminal multiplexer to interact with Docker-based agents, with inline escalation handling and real-user input verification.

Security Model

IronCurtain is built on the principle that the AI model itself must not be what decides security. Instead, security is enforced by policies that govern resource access and operational boundaries. All actions, decisions, and interactions are logged to provide an audit trail.
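
As an illustration of the audit trail mentioned above, the sketch below records one entry per policy decision and serializes the log as one JSON object per line. The source does not specify IronCurtain's log format; AuditEntry, AuditLog, and the field names are assumptions chosen for the example.

```typescript
// Hypothetical audit-log shape; the real log format is not given in the source.
interface AuditEntry {
  timestamp: string; // ISO 8601, UTC
  tool: string;
  decision: "allow" | "deny" | "escalate";
  reason: string;
}

class AuditLog {
  private readonly entries: AuditEntry[] = [];

  // Appends one entry; `now` is injectable so the output is reproducible in tests.
  record(
    tool: string,
    decision: AuditEntry["decision"],
    reason: string,
    now: Date = new Date(),
  ): AuditEntry {
    const entry: AuditEntry = { timestamp: now.toISOString(), tool, decision, reason };
    this.entries.push(entry);
    return entry;
  }

  // One JSON object per line, which is easy to append to a file and to grep.
  toJsonLines(): string {
    return this.entries.map((e) => JSON.stringify(e)).join("\n");
  }
}
```

Logging the decision together with its reason means a reviewer can later reconstruct why an action ran, not just that it did.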

Developed for the Future

IronCurtain is an evolving project, open to contributions and feedback from the community. It combines strong security measures with ease of use while letting AI agents operate within defined limits, a step toward safer autonomous AI interactions.

For more detailed insights and updates, visit the IronCurtain website, or explore the project repository on GitHub.
