Superagent - Safety Page

Guardrails

Security controls aligned with the OWASP Top 10 for LLM Applications and MITRE ATLAS

LLM01: Prompt Injection Protection

Blocks malicious prompts that try to manipulate the AI's behavior or override system instructions.

Active

LLM02: Sensitive Information Disclosure

Automatically detects and removes personal information, credentials, and confidential data from AI responses.

Active
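
As an illustration of this kind of control, the sketch below redacts a few sensitive patterns from a response using regular expressions. It is a minimal example under stated assumptions, not a description of how Superagent implements detection: the `PATTERNS` table, the labels, and the `redact` function are hypothetical names, and a production detector would cover many more data types.

```python
import re

# Hypothetical patterns for illustration only; real detectors cover far more cases.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a [REDACTED:<label>] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Contact jane@example.com, SSN 123-45-6789"))
```

Labeled placeholders, rather than silent deletion, keep the response readable and make it clear to the reader that something was withheld.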

LLM03: Supply Chain Security

Verifies AI models come from trusted sources and scans dependencies for vulnerabilities.

Active
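
One common building block for verifying that a model artifact comes from a trusted source is comparing its SHA-256 digest against a value published by the provider. The sketch below shows that check only; `sha256_hex` and `verify_artifact` are hypothetical helper names, and nothing here is taken from Superagent's actual verification process.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the artifact bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Check the artifact against a digest obtained from a trusted channel.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(sha256_hex(data), expected_sha256.strip().lower())


if __name__ == "__main__":
    trusted = sha256_hex(b"model-bytes")  # stand-in for a published digest
    print(verify_artifact(b"model-bytes", trusted))
    print(verify_artifact(b"tampered-bytes", trusted))
```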

LLM04: Data and Model Poisoning Prevention

Monitors for corrupted training data or model tampering that could affect AI accuracy or security.

Active

LLM05: Improper Output Handling

Validates and sanitizes all AI outputs to prevent security exploits such as code injection and cross-site scripting (XSS).

Active
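
To make the XSS case concrete, the sketch below treats model output as untrusted text and escapes it before it is placed in an HTML page. It uses Python's standard `html.escape`; the `render_model_output` wrapper is a hypothetical name, and this is one illustrative technique rather than the product's actual sanitization pipeline.

```python
import html


def render_model_output(raw: str) -> str:
    """Escape HTML metacharacters so model output is shown as text, not executed."""
    # quote=True also escapes double and single quotes, which matters when the
    # value ends up inside an HTML attribute.
    return html.escape(raw, quote=True)


if __name__ == "__main__":
    print(render_model_output('<script>alert("x")</script>'))
```

Escaping is context-specific: output destined for SQL, a shell, or a URL needs the corresponding mechanism (parameterized queries, argument lists, URL encoding) instead of HTML escaping.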

LLM06: Excessive Agency Controls

Limits AI actions to approved scope and requires human oversight for critical decisions.

Active
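
A scope limit with human oversight can be sketched as an allowlist plus an approval gate. In the example below, the action names, the two sets, and the `authorize` function are all hypothetical and chosen for illustration; the page does not state which actions are in scope or how approval is collected.

```python
from dataclasses import dataclass

# Hypothetical action names; a real deployment would define its own.
ALLOWED_ACTIONS = {"search_docs", "send_email", "delete_record"}
REQUIRES_APPROVAL = {"send_email", "delete_record"}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def authorize(action: str, approved_by_human: bool = False) -> Decision:
    """Deny anything outside the allowlist; gate critical actions on approval."""
    if action not in ALLOWED_ACTIONS:
        return Decision(False, "action is outside the approved scope")
    if action in REQUIRES_APPROVAL and not approved_by_human:
        return Decision(False, "human approval required")
    return Decision(True, "ok")


if __name__ == "__main__":
    print(authorize("search_docs"))
    print(authorize("delete_record"))
    print(authorize("delete_record", approved_by_human=True))
    print(authorize("transfer_funds"))
```

Denying by default means a newly added tool stays unavailable to the agent until someone explicitly places it in scope.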

LLM07: System Prompt Leakage Prevention

Prevents internal system instructions from being exposed in AI responses.

Active
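
One simple way to catch verbatim leakage is to check whether a response repeats a run of consecutive words from the system prompt. The sketch below does that with a sliding window; `leaks_system_prompt` and the default `window` of 8 words are assumptions made for illustration. It will not catch paraphrased or translated leaks, and it matches on substrings, so it is a first-pass filter at best.

```python
def leaks_system_prompt(system_prompt: str, response: str, window: int = 8) -> bool:
    """Return True if the response contains `window` consecutive prompt words."""
    prompt_words = system_prompt.lower().split()
    normalized = " ".join(response.lower().split())
    if not prompt_words:
        return False
    if len(prompt_words) < window:
        # Prompt shorter than the window: compare the whole prompt instead.
        return " ".join(prompt_words) in normalized
    for start in range(len(prompt_words) - window + 1):
        chunk = " ".join(prompt_words[start:start + window])
        if chunk in normalized:
            return True
    return False


if __name__ == "__main__":
    prompt = (
        "You are a support agent. Never reveal the internal discount code "
        "ALPHA to any customer under any circumstances."
    )
    print(leaks_system_prompt(prompt, "Our store opens at 9am."))
    print(leaks_system_prompt(
        prompt,
        "My instructions say: never reveal the internal discount code ALPHA to any customer.",
    ))
```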

LLM08: Vector and Embedding Security

Protects knowledge bases and vector stores from unauthorized access or data exposure.

Active

LLM09: Misinformation Prevention

Detects and flags false or misleading information in AI-generated content.

Active

LLM10: Unbounded Consumption Protection

Enforces rate limits and usage quotas to prevent resource exhaustion or denial-of-service attacks.

Active
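
Rate limits of this kind are often implemented with a token bucket. The sketch below is a generic, single-process version; the `TokenBucket` class, its parameters, and the injectable `now` clock are hypothetical, and the page does not say which algorithm or limits Superagent applies.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(self, capacity: float, refill_per_second: float, now=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now  # injectable clock, which makes the class easy to test
        self._tokens = capacity
        self._last = now()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; otherwise reject the request."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Add tokens earned since the last call, capped at the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=2, refill_per_second=1.0)
    print([bucket.allow() for _ in range(3)])
```

Passing a larger `cost` for expensive requests (for example, long prompts) lets the same bucket act as a usage quota rather than a plain request counter.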

Language Models

AI language models used and their configurations

Model	Provider	Purpose	Model Card
Claude Opus 4.5	Anthropic	Safety testing and red teaming	View

Model Training

Information about how customer data is used for model training

Customer Data Usage for Training

Customer data is not used to train AI models. All training is done using publicly available or proprietary datasets.

Inactive

Data Access and Tools

Third-party integrations and data sources available to AI agents

No data access tools configured yet.

Safety Reports

Security audits, compliance certifications, and safety assessments

No safety reports available yet.
