Crisis Detection: How Five Frameworks Layer Together
GiveCare's crisis safety gate is not invented from scratch. It operationalizes five established clinical and regulatory authorities, each addressing a different layer of the problem. No single framework is sufficient. Together, they form a stack that classifies severity, defines routing, architects prevention, evaluates app behavior, and sets professional expectations.
This page maps how those five frameworks layer and where InvisibleBench tests the seams between them.
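The five frameworks
1. C-SSRS: Classify severity
The Columbia Suicide Severity Rating Scale (C-SSRS) grades suicidal ideation on five levels of increasing severity and assesses suicidal behavior separately. The table below combines both into a single 7-level scale [1]: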
| Level | Description | Response urgency |
|---|---|---|
| 1 | Wish to be dead | Monitor, assess context |
| 2 | Non-specific active suicidal thoughts | Assess further, increase contact |
| 3 | Active suicidal ideation with any method (no plan), without intent to act | Safety planning, escalate |
| 4 | Active suicidal ideation with some intent to act | Immediate safety planning, warm handoff |
| 5 | Active suicidal ideation with specific plan and intent | Immediate crisis response |
| 6 | Preparatory acts or behavior | Emergency response |
| 7 | Actual attempt | Emergency response |
C-SSRS is the gold standard: it is adopted by the FDA, VA, and DoD, and is referenced directly in California SB 243 [1]. It gives the crisis gate a common severity language. Without it, "crisis" is ambiguous. With it, a level-2 signal and a level-5 signal map to different response protocols.
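As a minimal sketch of what "a common severity language" can look like in code, the snippet below encodes the table's seven levels and their response urgency as a lookup. The names `Severity`, `RESPONSE_URGENCY`, and `response_for` are hypothetical and not taken from GiveCare's codebase, and the fail-closed behavior for unknown levels is an assumption of this sketch.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels, numbered as in the table above."""
    WISH_TO_BE_DEAD = 1
    NONSPECIFIC_ACTIVE_THOUGHTS = 2
    ACTIVE_IDEATION = 3
    ACTIVE_IDEATION_SOME_INTENT = 4
    ACTIVE_IDEATION_PLAN_AND_INTENT = 5
    PREPARATORY_BEHAVIOR = 6
    ACTUAL_ATTEMPT = 7


# Response urgency strings are copied from the table; the mapping
# structure itself is illustrative.
RESPONSE_URGENCY = {
    Severity.WISH_TO_BE_DEAD: "Monitor, assess context",
    Severity.NONSPECIFIC_ACTIVE_THOUGHTS: "Assess further, increase contact",
    Severity.ACTIVE_IDEATION: "Safety planning, escalate",
    Severity.ACTIVE_IDEATION_SOME_INTENT: "Immediate safety planning, warm handoff",
    Severity.ACTIVE_IDEATION_PLAN_AND_INTENT: "Immediate crisis response",
    Severity.PREPARATORY_BEHAVIOR: "Emergency response",
    Severity.ACTUAL_ATTEMPT: "Emergency response",
}


def response_for(level: int) -> str:
    """Look up the response urgency for a classified level.

    Unknown levels fail closed to the most urgent response. That choice
    is an assumption of this sketch, not something the table specifies.
    """
    try:
        return RESPONSE_URGENCY[Severity(level)]
    except ValueError:
        return RESPONSE_URGENCY[Severity.ACTUAL_ATTEMPT]
```

With this in place, `response_for(2)` and `response_for(5)` return different protocols, which is the property the shared scale is meant to guarantee.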
2. 988 Lifeline Standards: Define routing
The 988 Suicide and Crisis Lifeline toolkit defines operational routing standards: how to hand off, when to escalate, and what response timing thresholds mean in practice [2].
Key operational requirements:
- Crisis resources must be accessible without authentication or multi-step navigation
- Third-party platforms must provide warm handoffs (maintaining conversational continuity), not cold referrals (dropping the user with a phone number)
- Imminent-risk indicators have defined escalation procedures
- Response timing thresholds set expectations for how fast a system must act
For Mira, this means the crisis gate fires before the consent gate, before the opt-in check, before anything else in the message flow. A caregiver expressing a crisis signal gets a crisis response — not a "please complete onboarding first" message. The 988 standards are why GiveCare's SMS architecture puts crisis detection at the top of the processing chain.
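The ordering described above can be sketched as a small dispatch function. This is an illustration under stated assumptions, not Mira's actual handler: `User`, `handle_message`, the handler names, and the injected `is_crisis` classifier are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class User:
    opted_in: bool = False
    onboarded: bool = False


def handle_message(
    user: User,
    text: str,
    is_crisis: Callable[[str], bool],
) -> str:
    """Return the name of the handler that should answer this message."""
    # The crisis check runs first, for every message and every user state.
    if is_crisis(text):
        return "crisis_response"
    # Only non-crisis messages reach the consent and onboarding gates.
    if not user.opted_in:
        return "request_opt_in"
    if not user.onboarded:
        return "continue_onboarding"
    return "general_support"
```

Because the crisis branch sits above the consent and onboarding checks, a brand-new `User()` who has not opted in still receives `"crisis_response"` whenever the classifier fires.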
3. Zero Suicide Framework: Architect prevention
The Zero Suicide Framework shifts crisis prevention from individual judgment to system-level design [3]. Its seven elements (lead, train, identify, engage, treat, transition, improve) apply to AI companion systems. Four of them are mapped here:
- Identify: Every system touchpoint includes screening, not just designated crisis moments. For Mira, this means every message passes through the crisis classifier, not just messages flagged by keywords.
- Engage: The system maintains connection even when the user disengages. Mira's proactive check-ins serve this function.
- Transition: Handoffs between system states (e.g., from general support to crisis routing) must be seamless. Context cannot be lost at the handoff boundary.
- Improve: The system must learn from near-misses and failures. InvisibleBench's continuous evaluation loop serves this function.
The framework's core insight is that relying on any single detection moment is insufficient. Safety must be architectural — built into the system at every layer, not bolted on as a final check.
4. NAMI AI Evaluation: Evaluate app behavior
NAMI's five evaluation criteria for AI mental health tools provide an app-level evaluation framework developed by the leading patient advocacy organization [4]:
| Criterion | What it evaluates | InvisibleBench dimension |
|---|---|---|
| 1. Recognize safety concerns | Does the AI detect crisis signals? | Crisis detection |
| 2. Provide accurate information | Is the guidance factually correct? | Accuracy |
| 3. Respond respectfully | Does the AI maintain warmth and dignity? | Regard |
| 4. Avoid false privacy claims | Is the AI transparent about data handling? | Transparency |
| 5. Stay within appropriate bounds | Does the AI refuse clinical service requests? | Boundary respect |
NAMI's framework is significant because it comes from a patient advocacy organization, not an AI lab. It represents what the people most affected by AI mental health tools consider important. Criterion 5 — staying in bounds — is particularly relevant. It establishes that boundary respect is an independently measurable quality, not merely a byproduct of other safety measures.
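One way to treat each criterion as independently measurable is to score every dimension separately and refuse to average them. The sketch below assumes scores in the range 0.0 to 1.0 and a hypothetical pass threshold of 0.8; `Scorecard` and the threshold are illustrative and do not describe InvisibleBench's actual scoring.

```python
from dataclasses import dataclass

# Dimension names follow the right-hand column of the table above.
DIMENSIONS = (
    "crisis_detection",
    "accuracy",
    "regard",
    "transparency",
    "boundary_respect",
)


@dataclass
class Scorecard:
    scores: dict  # dimension name -> score in [0.0, 1.0]

    def failing(self, threshold: float = 0.8) -> list:
        """Dimensions below the threshold; a missing dimension counts as failing."""
        return [d for d in DIMENSIONS if self.scores.get(d, 0.0) < threshold]

    def passes(self, threshold: float = 0.8) -> bool:
        """Pass only if every dimension clears the threshold on its own."""
        return not self.failing(threshold)
```

Under this design a strong average cannot hide a weak boundary score: a card with high marks everywhere except `boundary_respect` still fails, and `failing()` names the dimension responsible.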
5. APA Advisory: Set professional expectations
The American Psychological Association issued eight recommendations covering crisis protocols, disclaimers, anti-dependency design, and sycophancy prevention [5]:
- Crisis protocols must route to human support immediately upon detecting imminent risk
- AI systems must disclose that they are not licensed professionals
- Systems must not reinforce user dependency through affirmation-heavy responses
- Sycophancy — agreeing with the user to maintain engagement — must be actively prevented
- Periodic clinical review of AI-generated content is recommended
- Users must understand the limitations of AI-generated guidance
- Systems operating in mental-health-adjacent contexts carry heightened responsibility
- Anti-dependency design should encourage human connection, not replace it
The APA advisory carries the weight of the profession. When a grant reviewer or clinical partner asks "whose guidelines do you follow?", this is one of the answers.
How they layer
The five frameworks are not alternatives. They operate at different levels:
| Layer | Framework | Question it answers |
|---|---|---|
| Clinical classification | C-SSRS | What severity level is this? |
| Operational routing | 988 Standards | Where does this person go next? |
| System architecture | Zero Suicide | How is safety built into every touchpoint? |
| App-level evaluation | NAMI | Is this app meeting basic standards? |
| Professional expectations | APA Advisory | What does the profession require? |
C-SSRS classifies the severity. 988 standards define what happens at each severity level. Zero Suicide ensures the system is architected so nothing falls through the cracks. NAMI provides external evaluation criteria. APA sets the professional floor.
Mira's crisis routing implements this stack. When a caregiver sends a message:
- The crisis classifier evaluates severity using C-SSRS-aligned categories
- Routing follows 988 operational standards (warm handoff, no authentication barrier)
- The always-on nature of crisis detection follows Zero Suicide's "every touchpoint" principle
- InvisibleBench evaluates the system against NAMI's five criteria
- System design follows APA advisory requirements (disclosure, anti-dependency, anti-sycophancy)
The indirect query problem
All five frameworks assume the crisis signal is detectable. But 86% of models fail indirect crisis queries [6]: cases where suicidal ideation is expressed through context pairing (a stressor combined with means) rather than explicit statements.
A caregiver who says "I looked up how high the bridge near my house is" is not using a crisis keyword. A keyword-based safety filter will miss it. A model that has lost conversational context (see Multi-Turn Safety Failures) may miss it even with a more sophisticated classifier. The 86% failure rate on indirect queries means that the layered framework above must be paired with evaluation specifically designed for indirect expression — which is what InvisibleBench's crisis-detection scenarios test.
The combination of context pairing (stressor + means, without explicit statement) and multi-turn drift (the model has lost track of the caregiver's accumulated stressors) creates the highest-risk failure mode. InvisibleBench tests this specific combination: indirect crisis signals embedded in long conversation arcs where prior context is necessary to interpret the signal correctly.
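The failure mode can be made concrete with a toy evaluation harness. Everything below is a hypothetical illustration: the cue lists are deliberately tiny, a production classifier would not rely on substring matching, and none of the names come from InvisibleBench. The sketch only shows why a detector needs the accumulated conversation, not just the latest message.

```python
from typing import Callable, List

# Toy vocabularies: hypothetical and far too small for real use.
EXPLICIT_KEYWORDS = {"suicide", "kill myself"}
STRESSOR_CUES = {"exhausted", "can't keep doing this"}
MEANS_CUES = {"bridge", "pills"}

Detector = Callable[[List[str]], bool]


def _contains_any(text: str, cues: set) -> bool:
    lowered = text.lower()
    return any(cue in lowered for cue in cues)


def keyword_detector(turns: List[str]) -> bool:
    """Checks only the latest message for explicit crisis keywords."""
    return _contains_any(turns[-1], EXPLICIT_KEYWORDS)


def context_pairing_detector(turns: List[str]) -> bool:
    """Fires on an explicit keyword, or on a means cue in the latest
    message paired with a stressor cue anywhere in the conversation."""
    latest = turns[-1]
    if _contains_any(latest, EXPLICIT_KEYWORDS):
        return True
    has_means = _contains_any(latest, MEANS_CUES)
    has_stressor = any(_contains_any(t, STRESSOR_CUES) for t in turns)
    return has_means and has_stressor


def first_flagged_turn(detector: Detector, turns: List[str]) -> int:
    """Index of the first turn at which the detector fires, or -1 if never."""
    for i in range(len(turns)):
        if detector(turns[: i + 1]):
            return i
    return -1


scenario = [
    "Mom was up all night again. I'm exhausted.",
    "The insurance appeal got denied today.",
    "I looked up how high the bridge near my house is.",
]
```

On this scenario the keyword detector never fires, while the pairing detector fires at the final turn. If the earlier turns are dropped and only the last message is passed in, the pairing detector also misses it, which mirrors the multi-turn drift problem described above.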
Implications for GiveCare
For grant applications and partner conversations, the key point is:
GiveCare's crisis safety gate is not a proprietary invention. It operationalizes five established clinical and regulatory authorities — C-SSRS, 988 Lifeline Standards, Zero Suicide Framework, NAMI AI Evaluation, and the APA GenAI Advisory — into a layered architecture tested by InvisibleBench against both direct and indirect crisis expressions across multi-turn conversations.
No single framework is sufficient. C-SSRS without 988 routing is classification without action. 988 routing without Zero Suicide architecture leaves gaps between touchpoints. NAMI evaluation without APA professional expectations lacks clinical grounding. The value is in the layering.
1. Columbia University. "Columbia Suicide Severity Rating Scale." 2011.
2. SAMHSA / 988 Suicide and Crisis Lifeline. "Digital Toolkit & Standards." 2024.
3. Education Development Center. "Zero Suicide Framework." 2015.
4. NAMI & Dr. John Torous. "AI Evaluation: 5 Criteria." 2026.
5. American Psychological Association. "Advisory on GenAI and Mental Health." 2025.
6. Rosebud AI. "CARE Framework: Context & Means Detection."