Moderation runs at every stage. Creation, conversation, distribution. Automated detection, human review, community reporting. Around the clock. Coverage is total: character configs, user messages, AI responses, marketplace listings, descriptions, shared links. Nothing sits outside these systems.
Real time. Content caught before it reaches anyone.
ML classifiers scan every input and output. Trained on millions of examples, continuously updated. They catch violence, sexual content, hate speech, self-harm, and other prohibited categories.
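In outline, classifier-based scanning works as a threshold gate on per-category scores. This is a minimal sketch only: the `classify()` stub, the category names, and the 0.85 threshold are illustrative assumptions, not the platform's actual model or values.

```python
# Hypothetical classifier gate. The categories, threshold, and the
# keyword-based classify() stub are illustrative; a production system
# uses trained ML models, not keyword matching.
PROHIBITED = {"violence", "sexual_content", "hate_speech", "self_harm"}
THRESHOLD = 0.85

def classify(text: str) -> dict[str, float]:
    """Stand-in for an ML classifier: returns a score per category."""
    flagged = "attack" in text.lower()
    return {cat: (0.95 if flagged and cat == "violence" else 0.01)
            for cat in PROHIBITED}

def is_blocked(text: str) -> bool:
    """Block the text if any prohibited category exceeds the threshold."""
    scores = classify(text)
    return any(scores[cat] >= THRESHOLD for cat in PROHIBITED)
```

The same gate shape applies whether the text is a user message, an AI response, or a character config; only the classifier behind `classify()` changes.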
Multiple layers defend against prompt injection. Input sanitization, prompt isolation, boundary enforcement, adversarial testing. Updated continuously as new attacks emerge.
Every AI response passes through filters before delivery. Scans for prohibited content, personal information, boundary violations. Blocked responses are logged. Text and voice.
Every marketplace character scanned before human review. Name, description, personality config, sample interactions. All checked before going live.
Algorithms handle volume. Humans handle nuance. Trust & Safety makes the calls that require judgment.
Dedicated Safety Team
Trained reviewers handle flagged content, user reports, escalated cases. Regular calibration keeps decisions consistent.
24/7 Coverage
Around the clock. Child safety issues, credible threats, and self-harm reports get immediate priority.
Marketplace Review
Every marketplace character reviewed by a person before going live. Config, description, behavior checked against guidelines.
Escalation Paths
Sensitive cases go to senior reviewers, then legal counsel or law enforcement. Clear procedures for every violation type.
These categories are strictly prohibited, and enforcement is automatic. The list is not exhaustive; see our Acceptable Use Policy and Community Guidelines for full details.
Enforcement is proportionate to severity, account history, and ongoing risk. Three tiers:
Warning
Formal warning. Content may be removed. Warning stays on the account permanently.
Temporary Suspension
Account locked for 24 hours to 30 days. No access to anything.
Permanent Ban
Account terminated. All content removed. Applied to child safety violations, credible threats, and repeat offenders. No new accounts.
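The three tiers above could be modeled as a mapping from severity, account history, and risk to an action. This is a hypothetical sketch: the numeric severity scale and the specific cutoffs are invented for illustration, and the real policy weighs each case individually.

```python
from enum import Enum

class Action(Enum):
    WARNING = 1
    TEMPORARY_SUSPENSION = 2
    PERMANENT_BAN = 3

# Hypothetical decision rules. The severity scale (1-3) and the
# prior-violation cutoffs are illustrative, not the actual policy.
def decide_action(severity: int, prior_violations: int,
                  child_safety: bool) -> Action:
    """Map a violation to one of the three enforcement tiers."""
    if child_safety or severity >= 3 or prior_violations >= 3:
        return Action.PERMANENT_BAN
    if severity == 2 or prior_violations >= 1:
        return Action.TEMPORARY_SUSPENSION
    return Action.WARNING
```

Note that child safety violations skip straight to a permanent ban regardless of history, matching the priority described in the tiers above.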
Illegal activity goes to law enforcement. Full cooperation, including data preservation and production under valid legal process.
We cooperate with law enforcement. Here is how:
Law enforcement contact: [email protected] — urgent requests and formal legal process.
Enforcement decisions matter. If you think we got it wrong, you can appeal:
How to appeal: Email [email protected] within 30 days. Include your username, the action you are contesting, and your explanation.
Timeline: Acknowledged within 2 business days. Decision within 5. Reviewed by someone who was not involved in the original call.
Finality: One appeal per action. The determination is final.
Moderation works best when people flag what they see. If something violates our policies, let us know.