Trust & Safety
1. OUR COMMITMENT
User safety and platform integrity are central to Dyva+. We are building an environment where people can create, use, and share AI companions without exploitation, abuse, or harm.
Our Trust & Safety program proactively identifies risks, responds quickly to reports, and continuously improves as the platform grows. We use a combination of automated detection, human review, and transparent policies.
This page covers our safety practices, reporting process, enforcement, and protections for vulnerable users. For specific content rules, see our Acceptable Use Policy.
2. CONTENT MODERATION
We use a layered approach combining automated systems with human oversight to detect and address policy violations.
- Automated detection systems. Machine learning classifiers and rule-based filters run in real time to catch harmful content. These systems analyze text, images, and character configs against our policies, flagging anything that may violate our Acceptable Use Policy. These systems are continuously updated to address new abuse patterns.
- Human review. Flagged and reported content is escalated to our Trust & Safety team. Trained reviewers evaluate it against our policies, considering context, intent, and potential harm. Human review is the final decision point for all enforcement actions.
- Proactive marketplace monitoring. All Marketplace characters are reviewed before publication. We evaluate configs, descriptions, sample interactions, and behavior for compliance with our guidelines. Characters that do not meet standards are rejected with feedback and can be resubmitted after fixes.
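The layered flow above, where automated classifiers flag content and humans make the final call, can be sketched as a simple triage function. The category names and score thresholds below are hypothetical illustrations, not Dyva+'s actual classifiers or cutoffs.

```python
# Hypothetical thresholds for illustration; real systems use tuned,
# per-category values and many more signals.
URGENT_THRESHOLD = 0.90   # escalate to the front of the human review queue
REVIEW_THRESHOLD = 0.60   # send to the standard human review queue

def triage(scores: dict[str, float]) -> str:
    """Route content by its highest classifier score.

    Automated systems only flag; per the policy, a human reviewer
    remains the final decision point for enforcement.
    """
    top = max(scores.values(), default=0.0)
    if top >= URGENT_THRESHOLD:
        return "urgent_review"
    if top >= REVIEW_THRESHOLD:
        return "review"
    return "allow"

results = [
    triage({"violence": 0.10, "spam": 0.05}),
    triage({"harassment": 0.72}),
    triage({"child_safety": 0.99}),
]
print(results)  # ['allow', 'review', 'urgent_review']
```

Keeping the automated layer as a router rather than a decision-maker matches the policy's guarantee that human review is the final step for all enforcement actions.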
3. AI SAFETY MEASURES
As an AI platform, we implement specific measures to address risks unique to AI-generated content and adversarial attacks.
- Prompt injection prevention. We deploy multiple layers of defense against prompt injection: input sanitization, system prompt isolation, context boundary enforcement, and adversarial testing. These prevent users from manipulating AI characters into violating their boundaries or our policies.
- Output filtering. AI responses pass through filters that detect and block prohibited content -- explicit violence, sexual content involving minors, instructions for illegal activities, and personally identifiable information. This applies to both text and voice outputs.
- Rate limiting. We limit message volume, character creation, and API requests to prevent automated abuse and spam. Limits are adjusted dynamically based on usage patterns and detected abuse.
- Content boundaries. Each character operates within boundaries set by its Creator, further constrained by platform safety rules. Creators can configure topics and interaction styles, but cannot override platform-level prohibitions. Platform safety always takes precedence.
- Adversarial use monitoring. We monitor for adversarial patterns: systematic filter bypass attempts, coordinated exploitation campaigns, and novel attack vectors. Our security team runs regular red-team exercises to find and fix vulnerabilities before they can be exploited.
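The rate limiting described above is commonly implemented as a token bucket: each account holds a bucket of tokens that refills at a steady rate, and each request spends one token. This is a minimal sketch of the general technique, not Dyva+'s actual limiter; the capacity and refill rate are placeholder values.

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: holds up to `capacity` tokens,
    refilled continuously at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# With rate=0 (no refill), a bucket of 3 allows exactly 3 immediate requests.
bucket = TokenBucket(capacity=3, rate=0.0)
results = [bucket.allow() for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Adjusting limits "dynamically based on usage patterns," as the policy describes, would correspond to changing `capacity` and `rate` per account in response to detected abuse.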
4. USER REPORTING
We rely on our community to help spot violations. We provide multiple reporting channels and review every report.
- In-app report button. Every character page, listing, profile, and chat interface has a report button. In-app reports go directly to our Trust & Safety queue with full context.
- Email reporting. Send detailed reports to [email protected]. Email is best for complex situations that need additional context, evidence, or descriptions of behavior patterns across multiple interactions.
- What happens when you report. Reports are assigned to a Trust & Safety team member who evaluates the content against our policies, considers context, and decides on the appropriate action. You will be notified of the outcome, subject to privacy and legal constraints.
- Review timelines. Reports are prioritized by severity. Imminent threats, child exploitation, and emergencies: reviewed within 24 hours. All other reports: within 72 hours. High report volume may extend standard timelines, but we aim for consistency.
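The severity-based timelines above amount to a simple mapping from report category to review deadline. This sketch encodes only the two tiers the policy states (24 hours for urgent categories, 72 hours otherwise); the category names are illustrative.

```python
from datetime import timedelta

# Per the policy: imminent threats, child exploitation, and emergencies
# are reviewed within 24 hours; all other reports within 72 hours.
URGENT_CATEGORIES = {"imminent_threat", "child_exploitation", "emergency"}

def review_deadline(category: str) -> timedelta:
    hours = 24 if category in URGENT_CATEGORIES else 72
    return timedelta(hours=hours)

print(review_deadline("child_exploitation"))  # 1 day, 0:00:00
print(review_deadline("spam"))                # 3 days, 0:00:00
```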
5. ENFORCEMENT ACTIONS
When we confirm a violation, we take action proportionate to the severity of the violation, your account history, and the potential for ongoing harm:
- Content removal. The violating content is removed and, where applicable, deleted. You are notified of what was removed and which policy it violated.
- Account warnings. A formal warning documenting the violation and expected corrective behavior. Warnings are permanently recorded and factor into future decisions.
- Temporary suspension. Access suspended for 24 hours to 30 days depending on severity. During suspension, you cannot access your account, interact with characters, or use the API.
- Permanent ban. Account permanently terminated and all content removed. Reserved for the most serious violations: child safety offenses, credible threats of violence, and repeated violations after prior enforcement. Banned users may not create new accounts.
- Law enforcement referral. For illegal content or activity -- CSAM, credible threats of violence, terrorism, or other criminal conduct -- we report to law enforcement and cooperate fully with investigations. We may preserve and produce user data in response to valid legal process.
6. APPEAL PROCESS
Enforcement decisions matter. We provide a fair process for users who believe an action was taken in error.
- How to appeal. Email [email protected] within 30 days of the action. Include your username, the action you are contesting, any reference numbers, and why you believe it was incorrect.
- Review timeline. A senior Trust & Safety team member who was not involved in the original decision reviews your appeal. We acknowledge receipt within 2 business days and issue a decision within 5 business days. Complex cases may take longer -- we will let you know.
- One appeal per enforcement action. Each action may be appealed once. The appeal decision is final. We will send the outcome and reasoning to your account email.
7. PROTECTING MINORS
Protecting minors is a top priority. We take active steps to prevent child exploitation and comply with all applicable laws.
- Age verification. You must be at least 13 (or older if required in your jurisdiction) to create an account. We verify age during registration and may request additional verification at any time. Accounts with misrepresented ages are immediately terminated.
- Parental controls. We are developing parental controls so parents and guardians can manage their child's activity, including restricting access to certain characters and content. These will be announced as they become available.
- COPPA compliance. We comply with COPPA and do not knowingly collect personal information from children under 13 without parental consent. If we discover such data, we immediately delete it and terminate the account. Parents who believe their child provided information without consent should contact [email protected].
- Reporting child safety issues. If you encounter content that exploits, endangers, or harms a minor, report it immediately via the in-app button or email [email protected] with subject line "Child Safety Report." These reports get highest priority and are reviewed within 24 hours. Confirmed violations are reported to NCMEC and law enforcement.
8. TRANSPARENCY
Transparency builds trust. We provide visibility into how we enforce our policies and how our safety systems work.
- Transparency reports. We publish periodic reports with aggregate data on content moderation and enforcement. These are available on our website.
- Types of content actioned. Reports break down enforcement by category -- violence, hate speech, harassment, child safety, spam, IP violations, and AI-specific violations -- so the public can see what challenges we face and how we handle them.
- Volume of reports. Reports include total reports received, reports reviewed, percentage resulting in action, average response times, proactive detections, appeals received, and reversal rates.
9. CONTACT
For safety questions, concerns, or reports, email [email protected].
For general legal inquiries, contact [email protected]. For copyright issues, see our DMCA & Copyright Policy.