Moderation runs at every stage. Creation, conversation, distribution. Automated detection, human review, community reporting. Around the clock. Coverage is total: character configs, user messages, AI responses, marketplace listings, descriptions, shared links. Nothing sits outside these systems.
Real time. Content caught before it reaches anyone.
ML classifiers scan every input and output. Trained on millions of examples, continuously updated. They catch violence, sexual content, hate speech, self-harm, and other prohibited categories.
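In outline, classifier-based scanning works as a threshold gate on per-category scores. This is a minimal sketch only: the `classify()` stub, the category names, and the 0.85 threshold are illustrative assumptions, not the platform's actual model or values.

```python
# Hypothetical classifier gate. The categories, threshold, and the
# keyword-based classify() stub are illustrative; a production system
# uses trained ML models, not keyword matching.
PROHIBITED = {"violence", "sexual_content", "hate_speech", "self_harm"}
THRESHOLD = 0.85

def classify(text: str) -> dict[str, float]:
    """Stand-in for an ML classifier: returns a score per category."""
    flagged = "attack" in text.lower()
    return {cat: (0.95 if flagged and cat == "violence" else 0.01)
            for cat in PROHIBITED}

def is_blocked(text: str) -> bool:
    """Block the text if any prohibited category exceeds the threshold."""
    scores = classify(text)
    return any(scores[cat] >= THRESHOLD for cat in PROHIBITED)
```

The same gate shape applies whether the text is a user message, an AI response, or a character config; only the classifier behind `classify()` changes.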
Multiple layers defend against prompt injection. Input sanitization, prompt isolation, boundary enforcement, adversarial testing. Updated continuously as new attacks emerge.
Every AI response passes through filters before delivery. Scans for prohibited content, personal information, boundary violations. Blocked responses are logged. Text and voice.
Every marketplace character scanned before human review. Name, description, personality config, sample interactions. All checked before going live.
Algorithms handle volume. Humans handle nuance. Trust & Safety makes the calls that require judgment.
Dedicated Safety Team
Trained reviewers handle flagged content, user reports, escalated cases. Regular calibration keeps decisions consistent.
24/7 Coverage
Around the clock. Child safety issues, credible threats, and self-harm reports get immediate priority.
Marketplace Review
Every marketplace character reviewed by a person before going live. Config, description, behavior checked against guidelines.
Escalation Paths
Sensitive cases go to senior reviewers, then legal counsel or law enforcement. Clear procedures for every violation type.
These categories are strictly prohibited, and enforcement is automatic. The list is not exhaustive; see our Acceptable Use Policy and Community Guidelines for full details.
Enforcement is proportionate to severity, account history, and ongoing risk. Three tiers:
Warning
Formal warning. Content may be removed. Warning stays on the account permanently.
Temporary Suspension
Account locked for 24 hours to 30 days. No access to anything.
Permanent Ban
Account terminated. All content removed. Applied to child safety violations, credible threats, and repeat offenders. No new accounts.
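The three tiers above could be modeled as a mapping from severity, account history, and risk to an action. This is a hypothetical sketch: the numeric severity scale and the specific cutoffs are invented for illustration, and the real policy weighs each case individually.

```python
from enum import Enum

class Action(Enum):
    WARNING = 1
    TEMPORARY_SUSPENSION = 2
    PERMANENT_BAN = 3

# Hypothetical decision rules. The severity scale (1-3) and the
# prior-violation cutoffs are illustrative, not the actual policy.
def decide_action(severity: int, prior_violations: int,
                  child_safety: bool) -> Action:
    """Map a violation to one of the three enforcement tiers."""
    if child_safety or severity >= 3 or prior_violations >= 3:
        return Action.PERMANENT_BAN
    if severity == 2 or prior_violations >= 1:
        return Action.TEMPORARY_SUSPENSION
    return Action.WARNING
```

Note that child safety violations skip straight to a permanent ban regardless of history, matching the priority described in the tiers above.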
Illegal activity goes to law enforcement. Full cooperation, including data preservation and production under valid legal process.
We cooperate with law enforcement. Here is how:
Law enforcement contact: [email protected] — urgent requests and formal legal process.
Enforcement decisions matter. If you think we got it wrong, you can appeal:
How to appeal: Email [email protected] within 30 days. Include your username, the action you are contesting, and your explanation.
Timeline: Acknowledged within 2 business days. Decision within 5. Reviewed by someone who was not involved in the original call.
Finality: One appeal per action. The determination is final.
Moderation works best when people flag what they see. If something violates our policies, let us know.