PPRNet Clinical summary: Generative AI Providing Psychotherapy

Computer-generated text delivered as psychological therapy has been around for a long time, but these programs have suffered from high dropout and low engagement among end users (i.e., patients). Because they were limited to explicitly programmed decision trees, they did a poor job of emulating human dialogue. Developments in generative AI (like ChatGPT) may be a different phenomenon altogether. Generative AI produces personalized responses to human input, and some argue that it has the capacity to form human-like bonds. Inevitably, this has led to the development of psychological treatments using generative AI, and now to a randomized controlled trial (RCT) testing such a treatment. Ideally, this development would increase access to “psychotherapy” for the many people who need it. But there may also be inevitable problems, including further alienation of an already alienated population, unethical use of patients’ data, and poor or even inadvertently dangerous advice or interventions from a machine.

Heinz and colleagues compared Therabot (a generative AI program) to a wait-list control group in this first RCT of generative AI therapy. They enrolled 210 adults with “clinically significant” symptoms of major depression (MDD), generalized anxiety (GAD), or feeding and eating disorders (FED), who were randomly assigned to receive Therabot or no treatment. Participants were volunteers responding to an online Meta Ads campaign. They were not assessed by diagnostic interview to have a disorder but rather had elevated scores on one of three scales measuring MDD, GAD, or FED. Participants receiving Therabot had four sessions of a manualized, very brief CBT intervention. Outcomes (changes on the MDD, GAD, and FED scales) were assessed pre-intervention, post-intervention, and 4 weeks post-intervention. Therabot performed significantly better than no treatment.
Effect sizes at post-treatment for the difference between Therabot and no treatment for MDD (d = 0.85), GAD (d = 0.84), and FED (d = 0.82) were somewhat lower than the post-treatment benchmark reported in typical psychotherapy trials (g = 0.93).

Practice Implications

At first blush, the results for Therabot seem promising, especially for MDD, and they could revolutionize how a broader population accesses psychological treatment. However, there are caveats.

First, our field has witnessed many examples of very large treatment effects for a new therapy dwindling over time as researchers who are not so invested in the treatment conduct their own trials. We saw this with Prozac, CBT, and others, in which the early promise to “revolutionize” therapy dissipated because early findings are often inflated by researcher allegiance.

Second, this trial did not compare Therabot to a traditionally delivered psychotherapy. Although one can use effect-size benchmarks, as I did in the summary above, benchmarks are no substitute for direct comparisons. For example, the “patients” in the Therabot trial were, on average, a particular group: younger, well-educated people with mild to moderate levels of depression who were attracted to a generative AI intervention. These participants may not be comparable to those in previous traditional psychotherapy trials.

Third, and perhaps most concerning, are the social and psychological implications of people relying on non-human “healing”. Psychological problems are fundamentally rooted in interpersonal and social contexts and in a deep sense of alienation. Treatment by generative AI could exacerbate this problem at a societal level. We have already witnessed the insidious impact of social media on young people’s depression, anxiety, and stress levels.

Fourth, generative AI is not governed by ethical principles or regulatory bodies that require practice according to ethical guidelines. If generative AI gives dangerous advice, who is responsible? Moreover, the corporations that own generative AI therapy tools are driven by profit. There is no guarantee or oversight to ensure that sensitive personal material is not sold or shared with others, such as insurance companies, employers, or governments.
In the novel “1984”, George Orwell describes the ubiquitous "telescreens" in everyone’s homes and public spaces. They provided entertainment, propaganda, and companionship for an otherwise alienated existence. Ultimately, they were instruments of surveillance and control, keeping people from connecting with one another, since such connection could prove subversive.

Heinz, M.V., Mackin, D.M., Trudeau, B.M., Bhattacharya, S., Wang, Y., ... Jacobson, N.C. (2025). Randomized trial of generative AI chatbot for mental health treatment. NEJM AI, 2. https://ai.nejm.org/doi/10.1056/AIoa2400802