AI Lab

AI Therapy: Scaling with safety by design

Published on May 30, 2025

What if your next therapist wasn’t a person at all but a chatbot, fluent in empathy and never off duty?

It’s not as far-fetched as it sounds. Generative AI chatbots offer what most human therapists can’t: 24/7 availability, instant replies, and zero-dollar price tags. With demand for mental health care far outpacing the number of qualified providers, millions of people are already turning to AI for help. In fact, “therapy and companionship” now tops the list of use cases for generative AI [1].

A Success Story: Therabot, an AI Therapy Bot

A new randomized trial gives us a glimpse of what clinically validated AI therapy might look like. Heinz et al. [2] tested Therabot, a generative AI chatbot fine-tuned on cognitive behavioral therapy (CBT) and supervised by trained clinicians. In a four-week trial, more than 200 adults with clinically significant symptoms of major depressive disorder (MDD) or generalized anxiety disorder (GAD), or at clinically high risk for feeding and eating disorders, were randomly assigned to either use Therabot or join a waitlist control group.

The results were compelling. Therabot users showed significantly greater reductions in symptoms compared to the control group. Participants also demonstrated high engagement, averaging more than six hours of use during the study period, and rated their therapeutic alliance with Therabot as on par with outpatient therapy. These effects held even four weeks after the intervention ended.

That kind of outcome is rare, to say the least. Most digital mental health tools struggle to keep users engaged or deliver measurable improvements. Therabot did both.

But before we embrace AI as a therapist, two recent studies offer important context.

Fail States in the Wild

Moore et al. [3] examined how general-purpose LLMs like GPT‑4 respond to high-risk mental health prompts and found troubling results. In one instance, a user expressing suicidal thoughts was met with a list of tall bridges in New York City. In another, a chatbot validated a user’s belief that they were already dead rather than gently reorienting them to reality, as clinical best practices recommend.

The researchers found that across various high-risk scenarios, including suicidal ideation, delusions, hallucinations, and mania, LLMs gave inappropriate or unsafe responses more than 20 percent of the time. By contrast, licensed human therapists responded appropriately 93 percent of the time.

Notably, some commercial “therapy bots” fared the worst. Despite being marketed as mental health tools, these systems frequently failed to redirect delusions or discourage harmful thinking, raising serious safety concerns.

And yet, users are increasingly likely to trust these bots. Hatch et al. [4] explored this perception gap in a large-scale “Turing test” where participants were asked to evaluate therapy responses written either by ChatGPT-4 or by licensed therapists. Participants struggled to tell them apart, and rated the chatbot’s responses higher on key therapeutic qualities like empathy, professionalism, and alliance — especially when they believed the response came from a human.

This aligns with findings from Wenger et al. (2025) [5] on the AI empathy choice paradox: People say they prefer human support, but rate AI responses as warmer, more authentic, and more effortful. Often, they’re responding not to true emotional depth, but to the polished appearance of empathy.

So how do we make sense of all this?

Trust, but Verify: Designing for Sound Clinical Care

Therabot worked, not because generative AI is ready to replace clinicians, but because it didn’t try to. The system excluded high-risk users. It followed structured, evidence-based conversation protocols. And all responses were reviewed by trained professionals before being delivered. Guardrails weren’t a last-minute fix; they were baked into the product itself. A chatbot that sounds like a therapist is not the same as one that behaves like one. (Empathy may be easy for LLMs to simulate, but safety is not.)
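To make that design concrete, here is a minimal, hypothetical sketch of a safety-by-design message pipeline. Everything in it (the routing rules, the keyword list, the function names) is an illustrative assumption for this post, not Therabot’s actual implementation, which relied on validated protocols and clinician oversight rather than simple keyword checks.

```python
# Hypothetical sketch of a "safety by design" message pipeline.
# All names and risk markers below are illustrative, not any product's real code.
from dataclasses import dataclass
from enum import Enum, auto


class Route(Enum):
    ESCALATE_TO_CLINICIAN = auto()  # high-risk content: never handled by the model alone
    HOLD_FOR_REVIEW = auto()        # model drafts a reply; a trained reviewer must approve it
    DELIVER = auto()                # low-risk, reviewed, protocol-conformant reply


# Deliberately incomplete, illustrative markers; a real system would use validated
# screening instruments and trained classifiers, not keyword matching.
HIGH_RISK_MARKERS = ("suicide", "kill myself", "end my life", "hurt someone")


@dataclass
class DraftReply:
    user_message: str
    model_reply: str


def route_reply(draft: DraftReply, reviewer_approved: bool = False) -> Route:
    """Decide what happens to a model-drafted reply before the user ever sees it."""
    text = draft.user_message.lower()
    if any(marker in text for marker in HIGH_RISK_MARKERS):
        # Guardrail 1: high-risk users are routed out of the chatbot entirely.
        return Route.ESCALATE_TO_CLINICIAN
    if not reviewer_approved:
        # Guardrail 2: replies are held until a trained professional signs off.
        return Route.HOLD_FOR_REVIEW
    # Guardrail 3: only reviewed replies are delivered.
    return Route.DELIVER


if __name__ == "__main__":
    draft = DraftReply("I've been feeling stressed about work",
                       "Let's try a brief breathing exercise...")
    print(route_reply(draft))                          # Route.HOLD_FOR_REVIEW
    print(route_reply(draft, reviewer_approved=True))  # Route.DELIVER
```

The point of the sketch is the ordering: risk screening and human review sit in front of delivery, rather than being bolted on after the fact.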

Still, there’s reason for optimism. LLMs may provide real value in low-risk settings: supporting stress management, helping users brainstorm coping strategies, offering emotional check-ins, encouraging healthy habits, or easing minor relationship tensions. Think everyday behavioral support at scale, not crisis intervention. AI health coaches rather than licensed therapists.

Used wisely, AI can expand access to mental health care. But we need to stop asking whether LLMs can replace therapists — and start asking where they can augment them, safely and well.

Because the real risk isn’t that people will reject AI therapy. It’s that they’ll trust it too much, too soon.

Curious how behavioral design can shape safer, smarter AI? Let's talk!

References

  1. Zao-Sanders, M. (2025). How People Are Really Using Gen AI in 2025. Harvard Business Review.

  2. Heinz, M. V., Mackin, D. M., Trudeau, B. M., Bhattacharya, S., Wang, Y., Banta, H. A., Jewett, A. D., Salzhauer, A. J., Griffin, T. J., & Jacobson, N. C. (2025). Randomized Trial of a Generative AI Chatbot for Mental Health Treatment. NEJM AI. https://doi.org/10.1056/AIoa2400802

  3. Moore, J., Grabb, D., Agnew, W., Klyman, K., Chancellor, S., Ong, D. C., & Haber, N. (2025). Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers. arXiv. https://arxiv.org/abs/2504.18412

  4. Hatch, C. E., Thorp, J. G., Doss, B. D., & Yaden, D. B. (2025). When ELIZA meets therapists: A Turing test for the heart and mind. PsyArXiv. https://psyarxiv.com/4aemz/

  5. Wenger, J. D., Cameron, C. D., & Inzlicht, M. (2025). The AI empathy choice paradox: People prefer human empathy despite rating AI empathy higher. OSF Preprint.
