By Jacqueline Lutz & Lindsay Ayearst
In applied digital health, questions about evidence generation are inseparable from how products are built and whether they ever reach patients. Many scientists working on digital therapeutics are academically trained to run controlled clinical studies and behavioral experiments, or to study specific behavioral intervention mechanisms. That rigor does not disappear in industry settings. However, when evidence requirements shape which products are built, which features are deprioritized, and which interventions ultimately reach patients, a rebalance often occurs: engagement, feasibility, and the likelihood that an intervention will function in real-world contexts gain ethical weight alongside internal validity.
That rebalance also becomes visible in discussions about the appropriate placebo (or “sham”) control in the evaluation of digital therapeutics (DTx). The authors of this commentary have spent years in meetings where digital shams were debated, designed, abandoned, and worried over. We have advised multiple companies on how to “de-risk” sham designs and have been involved in close to a dozen sham implementations between us.
These experiences have made us increasingly aware of the possible, and often unpredictable, failure modes of digital shams. What appears “inactive” in one context can become meaningfully therapeutic in another, or can fail to function as a credible control at all. This reflects a broader reality: there is no simple playbook for shams in digital therapeutics, and the field has not converged on reusable control conditions comparable to placebo pills in pharmacological trials.
In practice, sham designs often fail in one of two directions.
One failure mode occurs when controls are inadvertently active. An example comes from a digital intervention for attention in ADHD, where the control condition was a structured, engaging word game designed to match expectancy and time-on-task while targeting different cognitive domains. Parents and participants were explicitly told the study compared two investigational interventions. While the primary outcome separated statistically, secondary symptom outcomes did not. The control, though scientifically well motivated, plausibly still delivered some therapeutic benefit: a repeated, structured, goal-directed, distraction-free activity is, in effect, an intervention in itself for ADHD.
The opposite failure mode arises when a sham is so minimal that it cannot maintain adequate blinding. In a schizophrenia digital intervention trial, a highly stripped-down control app was used, displaying little more than a timer when opened. Post-study assessments revealed substantial unblinding and uneven expectations across arms: 38% of those randomized to sham were unblinded (i.e., correctly guessed their group assignment), compared with 78% in the treatment group.
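To make the asymmetry concrete, one common summary is the Bang blinding index. As a rough illustration only (the trial’s own blinding analysis may have been computed differently, and we assume forced-choice guesses with no “don’t know” responses), the index for an arm is twice the proportion of correct guesses minus one, so that 0 indicates chance-level guessing and 1 complete unblinding:

\[
BI_{\text{arm}} = 2\,\hat{p}_{\text{correct}} - 1
\]
\[
BI_{\text{sham}} = 2(0.38) - 1 = -0.24, \qquad BI_{\text{treatment}} = 2(0.78) - 1 = 0.56
\]

The negative value in the sham arm (guesses skewed toward the opposite assignment) alongside the strongly positive value in the treatment arm captures exactly the uneven-expectation pattern described above.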
Interestingly, in this case the sham also suffered from the first failure mode. Qualitative feedback suggested that even this minimal experience may have functioned therapeutically for some participants, for example by redirecting attention away from hallucinations or serving as a cue to practice mindfulness skills. While the true impact of this sham choice is difficult to quantify, it illustrates how minimal shams can introduce unpredictable effects and noise.
These examples demonstrate that the space between the two failure modes is narrow, and in some instances non-existent. They also serve as a reminder that engagement features, design choices, behavioral structure, and the establishment of a digital working alliance are not peripheral to digital therapeutics. They are often core mechanisms of action, and difficult to subtract without dismantling the intervention itself.
However scientifically interesting sham design may be, the cost of developing and de-risking shams in preparation for efficacy studies is not trivial. Months of interdisciplinary work, and often separate feasibility studies, are devoted to developing product-specific shams. These resources could otherwise be spent on features that improve accessibility, usability, or inclusivity. In an industry dominated by small innovators, this can create a paradox in which the more scientifically grounded companies struggle to reach patients while unregulated alternatives proliferate.
This is not a general argument against sham-controlled studies. Rather, it is an argument for rebalancing how rigor is expressed across intervention types and product life cycles. In pharmacological trials, placebo-controlled designs rightly carry substantial ethical weight because the interventions themselves carry higher risk. In digital therapeutics, particularly for low-risk, behavioral, or adjunctive tools, we argue that greater ethical weight may need to be placed on engagement, real-world effectiveness, and sustained patient benefit.
Ultimately, our ethical compass is patient impact. But patients do not experience placebo-adjusted effect sizes; they experience whether their sleep, functioning, or symptoms meaningfully improve. Meaningful and reliable improvements in real-world contexts can matter greatly at scale, especially if the tools are accessible, inclusive, engaging, and fit into patients’ lives.
Paper title: Beyond Placebo Purism: Rebalancing Evidence Standards for Digital Therapeutics
Authors: Jacqueline Lutz1,2 & Lindsay Ayearst3
Affiliations: 1 Boston University, Chobanian & Avedisian School of Medicine, Department of Psychiatry, Boston, MA, USA; 2 Luja Digital Health Consulting LLC, Pittsfield, MA, USA; 3 Independent Consultant, Toronto, Canada
Competing interests: None declared
Social Media accounts of post authors:
LinkedIn: linkedin.com/in/jacqueline-lutz-phd; linkedin.com/in/lindsayayearst
Bluesky: @linzayearst.bsky.social