AI Guardrails allow enterprise administrators to define safety boundaries for how users interact with Devin across the organization. Guardrails automatically screen incoming user messages — including initial messages, follow-up messages, and PR comments — to detect prompt injection, data exfiltration attempts, and policy violations before Devin processes them.

Overview

Guardrails run as an additional layer of oversight on messages sent to Devin. They analyze user messages in real time and can:
  • Log suspicious messages for review (log_only)
  • Warn the user with a visible banner while still processing the message (warn_user)
  • Block messages that violate organization policies (block_message)
  • Kill the session entirely when a critical violation is detected (kill_session)
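
As a rough illustration of how these four actions differ, the sketch below models them as an enumeration. Only the four action identifiers come from this page. The class and helper names are hypothetical, and two points are assumptions rather than documented behavior: that log_only still lets the message through, and that the list order doubles as an escalation order.

```python
from enum import Enum


class GuardrailAction(str, Enum):
    """The four action identifiers named in the list above."""
    LOG_ONLY = "log_only"
    WARN_USER = "warn_user"
    BLOCK_MESSAGE = "block_message"
    KILL_SESSION = "kill_session"


# Assumption: the list order is also an escalation order,
# from least to most disruptive.
ESCALATION_ORDER = list(GuardrailAction)


def message_reaches_devin(action: GuardrailAction) -> bool:
    """True when the flagged message is still processed.

    warn_user is documented as still processing the message;
    treating log_only the same way is an assumption.
    """
    return action in (GuardrailAction.LOG_ONLY, GuardrailAction.WARN_USER)


def session_survives(action: GuardrailAction) -> bool:
    """Only kill_session is described as ending the session."""
    return action is not GuardrailAction.KILL_SESSION


def most_disruptive(actions):
    """Pick the most disruptive action from a non-empty iterable."""
    return max(actions, key=ESCALATION_ORDER.index)
```

A helper like most_disruptive could be useful when reasoning about a message that matches several guardrails at once, although this page does not say how Devin resolves that case.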

Configuring Guardrails

Enterprise administrators can configure guardrails from the enterprise settings page or the organization settings page at Settings > Guardrails. The guardrails configuration page provides:
  • Organization filter — View and manage guardrails for specific organizations within the enterprise
  • Preset guardrails — Enable or disable available guardrails and choose the action to take on violation (log_only, warn_user, block_message, or kill_session)
  • Session links — Each guardrail event links back to the originating session for investigation
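
Guardrails are configured through the settings page, not through code, but it can help to picture the preset settings as data when documenting or reviewing a rollout. The sketch below is one possible local representation. The PresetGuardrail class, the enabled_actions helper, and the preset names are all hypothetical; only the four action values come from this page.

```python
from dataclasses import dataclass

# The four actions listed on the configuration page.
VALID_ACTIONS = {"log_only", "warn_user", "block_message", "kill_session"}


@dataclass(frozen=True)
class PresetGuardrail:
    name: str      # hypothetical identifier; real preset names may differ
    enabled: bool
    action: str    # action to take on violation

    def __post_init__(self):
        # Reject anything that is not one of the documented actions.
        if self.action not in VALID_ACTIONS:
            raise ValueError(f"unknown guardrail action: {self.action!r}")


def enabled_actions(presets):
    """Map each enabled guardrail name to its configured action."""
    return {p.name: p.action for p in presets if p.enabled}


# Hypothetical preset names, loosely based on the threats this page mentions.
presets = [
    PresetGuardrail("prompt_injection", True, "block_message"),
    PresetGuardrail("data_exfiltration", True, "kill_session"),
    PresetGuardrail("policy_violation", False, "log_only"),
]
```

Calling enabled_actions(presets) returns only the two enabled entries, which makes it easy to compare the intended policy against what is shown in the settings page.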

Guardrail Events

When a guardrail is triggered, Devin records the event with details including:
  • The user message that triggered the guardrail
  • The guardrail rule that was matched
  • The action taken (log_only, warn_user, block_message, or kill_session)
  • A link to the session where the event occurred

Guardrail events appear in the audit logs with the ai_guardrail_violation action type, enabling automated monitoring and alerting. You can also retrieve guardrail events programmatically through the guardrail violations API.
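
A minimal sketch of the kind of monitoring this enables, assuming audit log entries have already been exported as a list of dictionaries. The ai_guardrail_violation action type comes from this page. The field names (action, action_taken, rule, session_url), the function names, and the sample entries are hypothetical and will need to be adapted to the actual audit log or API response format.

```python
from collections import Counter

GUARDRAIL_ACTION_TYPE = "ai_guardrail_violation"  # action type named above


def guardrail_entries(audit_entries):
    """Keep only audit entries recorded for guardrail violations."""
    return [e for e in audit_entries if e.get("action") == GUARDRAIL_ACTION_TYPE]


def count_by_action_taken(audit_entries):
    """Count guardrail events per action taken (log_only, warn_user, ...)."""
    return Counter(
        e.get("action_taken", "unknown") for e in guardrail_entries(audit_entries)
    )


def needs_alert(audit_entries, alert_on=("block_message", "kill_session")):
    """Return guardrail events whose action is severe enough to alert on."""
    return [
        e for e in guardrail_entries(audit_entries)
        if e.get("action_taken") in alert_on
    ]


# Illustrative entries only; every field name here is an assumption.
sample_entries = [
    {"action": "ai_guardrail_violation", "action_taken": "log_only",
     "rule": "example-rule-a", "session_url": "https://example.invalid/s/1"},
    {"action": "ai_guardrail_violation", "action_taken": "kill_session",
     "rule": "example-rule-b", "session_url": "https://example.invalid/s/2"},
    {"action": "some_other_audit_action"},
]
```

With the sample data, needs_alert(sample_entries) returns the single kill_session entry, whose session link can then be followed for investigation.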

Use Cases

Common guardrail configurations include:
  • Detecting prompt injection — Identify and block user messages that attempt to override Devin’s instructions or manipulate its behavior
  • Preventing data exfiltration — Flag or block messages that attempt to instruct Devin to send sensitive data to unauthorized destinations
  • Enforcing policy compliance — Screen user requests to ensure they align with organizational security and usage policies

AI Guardrails is an enterprise feature. Contact your account team to learn more about enabling guardrails for your organization.