John D. Loeser Award Lecture: Size does matter, but it isn't everything: the challenge of modest treatment effects in chronic pain clinical trials


Abstract

1. Introduction

Randomized clinical trials (RCTs) of treatments for pain have a long and distinguished history. The earliest clinical trials not only identified analgesic medications and their efficacious dosages but also contributed to the development of clinical trial research designs and methods that came to be used throughout medicine. The ground-breaking investigators who designed and conducted these early studies recognized that various sources of bias must be addressed, 68,69,78,105 and appreciation of the fundamental roles of study design and statistical principles became widespread as experience conducting RCTs grew. In this article, we first present analyses of a sample of chronic pain trials that show a decline in treatment effect estimates over the past few decades and discuss the implications of these results for determining sample sizes for future chronic pain trials. We then review explanations for the failure of RCTs to demonstrate the efficacy of truly efficacious treatments and address the role of excessive placebo group improvement. Finally, we consider various approaches that have the potential to improve the informativeness of clinical trials and their assay sensitivity, that is, their ability to distinguish an effective treatment from a less effective or ineffective treatment.

2. “The greatest teacher, failure is”: falsely negative and inconclusive clinical trial results

It has been recognized for at least 2 decades that clinical trials of psychiatric medications often fail to show a statistically significant difference between an active medication and placebo. 29,53,63,74,82 Although some of these RCTs might have investigated treatments that truly lack efficacy, many were for medications that had demonstrated efficacy in multiple previous RCTs and had been approved by regulatory agencies around the world (eg, selective serotonin reuptake inhibitors for depression).
Similarly, many RCTs of treatments for chronic pain have failed to demonstrate efficacy. 19,22,31 Some of these results also might reflect a true lack of efficacy—either in general or for the specific dosage studied—but some RCTs have failed to show efficacy of medications at dosages that had demonstrated efficacy in previous trials, had been approved by multiple regulatory agencies, and are generally considered first-line treatments. 19,22,31 It is common to refer to clinical trial results that fail to show the efficacy of truly efficacious treatments as “false negatives.” However, the failure of a clinical trial to reject the null hypothesis of no difference between an active treatment and placebo at a prespecified level of statistical significance does not necessarily indicate that the active treatment lacks efficacy. 86 Such nonsignificant study results can be accompanied by confidence intervals that are consistent with the possibility of a clinically meaningful treatment effect. When there is such an outcome, the results of the trial should be considered “inconclusive” rather than “negative.” 39 A failure to reject the null hypothesis can also be a result of chance, reflected in the type II error probability of failing to reject the null hypothesis of no difference between treatment groups when one truly exists. Table 1 presents a list of potential explanations for the failure of clinical trials of truly efficacious treatments to show their efficacy (see also Ref. 86). We focus on the roles of statistical power, excessive improvement in placebo groups, and various study methods and patient characteristics in contributing to falsely negative and inconclusive clinical trial outcomes. An additional explanation for such clinical trial results is the possibility that existing outcome measures have limited responsiveness to detect treatment effects. 
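The distinction drawn above between “negative” and “inconclusive” results can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_trial_result`, the normal-approximation confidence interval, and the example numbers (a minimal clinically important difference of 1.0 point on a 0 to 10 scale, and the estimates and standard errors) are hypothetical and are not taken from any trial discussed in this article.

```python
from statistics import NormalDist


def classify_trial_result(estimate, std_error, mcid, alpha=0.05):
    """Label a treatment-vs-placebo estimate (positive = benefit).

    Hypothetical helper: uses a normal-approximation confidence interval
    and an assumed minimal clinically important difference (mcid).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    if lower > 0:
        # The interval excludes zero: the null hypothesis is rejected.
        return "positive"
    if upper >= mcid:
        # Not significant, but a clinically meaningful effect is not ruled out.
        return "inconclusive"
    # Not significant, and the interval excludes clinically meaningful benefit.
    return "negative"


# Illustrative (made-up) results on a 0 to 10 pain scale.
print(classify_trial_result(0.4, 0.35, mcid=1.0))  # wide interval
print(classify_trial_result(0.1, 0.20, mcid=1.0))  # narrow interval near zero
```

In this sketch, a nonsignificant result is called “negative” only when the whole confidence interval lies below the assumed threshold of clinical importance; otherwise it is labeled “inconclusive.”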
Most chronic pain RCTs have used numerical or visual analogue scales of pain intensity as primary outcome measures, 101 but other measures that could serve as primary outcomes (for example, ratings of pain relief, global improvement, or disease-specific pain-related symptoms) might have greater responsiveness. 18,44,97,98,102,109 Furthermore, chronic pain RCTs have typically not been designed to study patients selected on the basis of genotypes or phenotypes targeted by “precision” or “personalized” pain treatments. Although we believe that the development of improved clinical outcome assessments and of mechanism-based treatments 16,25,100 may make important contributions to the identification of pain treatments with greater efficacy or safety, further discussion of these issues is beyond the scope of this article.

Table 1. Why can clinical trials of truly efficacious treatments fail to show their efficacy?
1. Chance
2. Placebo group patients improved “too much”
3. The optimal patients and phenotypes were not studied
4. Existing outcome measures have limited responsiveness to treatment effects
5. Temporal changes in characteristics of patients enrolling in trials
6. Temporal changes in types of clinical sites conducting trials
7. Research subject misbehaviour
8. Research site unintentional bias and misconduct
9. Inadequate sample sizes

3. Treatment effects and sample size determination

Twenty years ago, Moore et al. 79 concluded on the basis of a series of simulations that “size is everything” if the samples of patients enrolled in RCTs are to have adequate statistical power to provide credible estimates of the efficacy of acute pain treatments. The results of recent meta-analyses of assay sensitivity and placebo group changes in RCTs of chronic neuropathic pain have found that treatment effects have decreased and placebo group changes have increased over the past several decades, perhaps especially in the United States. Tuttle et al.
110 concluded that from 1990 to 2013, placebo group changes increased while active treatment group changes remained relatively stable; as a consequence, “treatment advantage” vs placebo decreased substantially. Figure 1 presents the results of a second recent meta-analysis, on the basis of which Finnerup et al. 32 concluded that, from 1982 to 2017, there was an increase in mean numbers-needed-to-treat (NNTs) that was associated with increases in placebo group change, study duration, and sample size (note that we refer to active and placebo group “changes” rather than “responses” because the term “responses” fails to encompass regression to the mean, spontaneous improvement, and other nonspecific sources of improvement or worsening that are not actual responses to active or placebo treatments). Figure 1. Combined number-needed-to-treat (NNT) per year from a meta-analysis of randomized clinical trials of pharmacologic treatments for chronic neuropathic pain. 32 Although the results of these meta-analyses are generally consistent with what has been observed for RCTs in major depression 13,53,106,119 and other therapeutic areas, 5,52 only treatments for chronic neuropathic pain were examined and few such analyses have examined other chronic pain conditions. 24 Nevertheless, the results suggest that factors such as increasing placebo group change and changes in study methods may be limiting or reducing estimates of the effects of chronic pain treatments, which would necessitate larger sample sizes for adequate statistical power to detect minimally clinically important effects. When planning a clinical trial, appropriate sample size determination is necessary to avoid exposing more patients than necessary to a potentially nonefficacious or harmful treatment, while also including a sufficient number of participants to demonstrate a true treatment effect, if one exists. 26,77 Tuttle et al. 
110 presented differences between medications and placebo in the percentage decrease in pain intensity from baseline, and Finnerup et al. 32 presented NNTs. Such data, however, are of limited value for determining sample sizes for analyses of continuous pain outcomes, for example, analysis of covariance adjusting for baseline pain, which is a common primary efficacy analysis used in confirmatory RCTs of chronic pain treatments. 21 In addition to type I and type II error probabilities—typically prespecified as 5% and 10% to 20%, respectively—sample size calculations for continuous variables require specification of the magnitude of the treatment effect and the variability of the outcome measure. A well-accepted approach to sample size determination for such a primary efficacy analysis involves the standardized effect size (SES), 26 which for a parallel group RCT is the mean change from baseline in the active group minus that in the placebo group divided by the pooled SD. 3.1. Methods and results We examined whether SESs of published neuropathic and non-neuropathic chronic pain trials have decreased over the past several decades by performing a secondary analysis of data from a recent meta-analysis of RCTs of efficacious medications conducted from 1980 to 2016 for low back pain, fibromyalgia, osteoarthritis pain, painful diabetic peripheral neuropathy, and postherpetic neuralgia. 102 The purpose of the initial meta-analysis was to compare the responsiveness of ratings of average pain intensity (API) and worst pain intensity (WPI), and in the current analysis, we explored the trajectories of API and WPI SESs over time. Twenty-three articles were identified for inclusion, with publication dates from 1999 to 2013. SESs were extracted or calculated using other reported data, and positive values indicate that the treatment reduced API or WPI more than placebo. 
102 Mixed-effects meta-regression was used to test the significance of the relationship between time and both API SES and WPI SES. Preliminary analysis suggested that the relationships between time and both API SES and WPI SES were not linear. We therefore fit quadratic models regressing API SES and WPI SES on time and the square of time, where time is the number of years from 1999. Four articles included 2 active treatments compared with the same placebo arm. A robust variance estimator was used to account for correlations among the dependent effect size estimates in these 4 articles. All analyses were conducted using R version 3.5.1 with the robust.se function for robust variance estimation. 46,47 Table 2 presents the parameter estimates for time and the square of time for the API SES and WPI SES models. Figure 2 shows that API SES and WPI SES both increased slightly for a short time, but on average, the slopes decreased for every additional year after 1999. These results are consistent with the results of the meta-analyses of neuropathic pain trials 32,110 and demonstrate that the average benefit of efficacious analgesic medications shown in recent RCTs is modest. It is unknown whether the SESs for API and WPI will level off at approximately 0.30 or whether there will be a continued downward slope that will result in even lower SESs.

Table 2. Parameter estimates for the pain intensity models.
Term | Average pain intensity SES: Estimate (95% CI) | P | Worst pain intensity SES: Estimate (95% CI) | P
Intercept | 0.379 (0.308 to 0.451) | <0.0001 | 0.401 (0.344 to 0.457) | <0.0001
Time | 0.036 (−0.002 to 0.074) | 0.07 | 0.027 (−0.007 to 0.060) | 0.13
Time² | −0.003 (−0.007 to −0.0003) | 0.04 | −0.003 (−0.006 to −0.000) | 0.06
Time is the number of years since 1999. CI, confidence interval; SES, standardized effect size.

Figure 2. Standardized effect sizes for average and worst pain intensity in randomized clinical trials of chronic pain treatments from 1999 to 2013.

3.2.
Implications The results of our analysis do not address the causes of the decline in the SESs found in RCTs of efficacious medications for chronic pain. It is possible to speculate that this decline is due to efforts by the scientific community and government regulators to increase the rigor of clinical trial design, execution, and analysis through methods such as comprehensive prespecification of study methodology and analysis, limiting multiple hypothesis testing unless proper statistical adjustments are used, and principled methods to accommodate missing data. 41,42,50,51,99,103 Declines in SESs may also result from greater availability of pain treatments over time, which could reduce the pool of eligible patients and increase the percentage of study participants who have refractory pain. 22 Given the evidence that expectations are a major source of placebo effects, it is also possible that placebo group changes increase as evidence for a treatment's efficacy accumulates and becomes publicly available. 2 One important limitation of the present analyses is that they are based on published trials of 5 chronic conditions that reported both API and WPI. Although our results provide some information about the temporal trajectories of SESs from chronic pain trials, analyses that examine SESs for different chronic pain conditions or that include a larger sample of RCTs might produce different results; indeed, because clinical trials with nonsignificant results are less likely to be published, meta-analyses that include unpublished studies might show even greater declines. In addition, because the clinical trials we examined were limited to studies of efficacious medications for chronic pain, analyses of clinical trials of devices (eg, spinal cord stimulators) or of other nonpharmacologic treatments (eg, cognitive-behavior therapy and physical therapy) might also produce different results. 
For example, it has been observed that treatment effect estimates from RCTs of psychosocial treatments for depression are generally greater than those from trials of antidepressant medications; this observation may be explained by attenuation of the antidepressant treatment effect in trials in which a medication is compared with placebo and both groups are receiving intensive clinical management, which can be “substantially more therapeutic for patients with depression than doing nothing.” 90 The mean SES of approximately 0.30 for the most recent published chronic pain trials mirrors the mean SESs reported in meta-analyses of efficacious antidepressants for major depression. 43,61,62 Antidepressant trials share with analgesic RCTs several methodologic characteristics that might contribute to decreased assay sensitivity, including subjective outcomes, considerable placebo group improvements, and appreciable missing data. 41,61,110 Given the consistent meta-analysis results, it is crucial that analgesic and antidepressant RCTs be designed with realistic treatment effect estimates. To detect an SES of 0.30 with 80% power (α = 0.05, 2 tailed) in a parallel group trial, at least 175 patients per group would need to be randomized. An SES of 0.30 can be considered a modest treatment effect, and its clinical importance will depend on the risks and benefits of the treatment and its clinical context. 15,20 Such SESs reflect not only the specific effects of the treatments (eg, the pharmacologic activity of a medication) but also any methodologic characteristics of the clinical trials that decrease their assay sensitivity. 19,22 In designing chronic pain RCTs, an SES of 0.30 can serve as a benchmark that could be considered when performing sample size determinations. This approach addresses both the modest apparent efficacy of existing treatments and any limitations of the clinical trial methods that have been used to study them. 
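The sample size figure quoted above can be reproduced with the standard normal-approximation formula for comparing two means in a parallel group trial, n per group = 2(z_{1−α/2} + z_{1−β})² / SES². The following is a minimal sketch under that assumption; the function name `n_per_group` is ours, and an exact calculation based on the t distribution or an analysis of covariance would give slightly different numbers.

```python
import math
from statistics import NormalDist


def n_per_group(ses, power=0.80, alpha=0.05):
    """Patients per group for a two-arm parallel trial (two-sided test).

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / ses^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / ses ** 2)


# A modest effect (SES 0.30) versus a moderate one (SES 0.50).
for ses in (0.30, 0.50):
    print(ses, n_per_group(ses))
```

Because the required number of patients scales with 1/SES², moving from an assumed SES of 0.50 to the benchmark of 0.30 nearly triples the sample size under this approximation.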
It is important to acknowledge, however, that it is usually recommended that sample size determination be based on specifying an effect size that would be of minimal clinical importance to patients, clinicians, and other stakeholders. Given the often poor tolerability and risks of many existing treatments, doing so might be challenging because even a minimal treatment effect could be considered meaningful for a novel treatment that is well tolerated and safe. 15,20 4. Three eras of analgesic clinical trials The observation that clinical trials of medications with well-established efficacy are sometimes unable to demonstrate that efficacy provided the impetus for ongoing efforts to explain such results by examining associations between the research methods and patient characteristics of RCTs and their assay sensitivity. As can be seen from Figure 1, 3 eras of analgesic clinical trials can be identified from the NNTs associated with pharmacologic treatments for neuropathic pain. 32 The first era—from the early 1980s through the early 1990s—has the lowest NNTs (ie, greatest treatment vs placebo differences) and consists primarily of relatively small cross-over trials conducted by investigators such as Mitchell Max, Michael Rowbotham and Howard Fields, Søren Sindrup, and Peter Watson. These studies were typically conducted at a single clinical site with patients who were either personally known by the researchers or carefully assessed by clinician investigators with substantial expertise. The second era—from the mid-1990s to the mid-2000s—reflects the involvement of pharmaceutical companies in developing drugs for chronic pain. The early clinical trials of gabapentin, duloxetine, and pregabalin were conducted at multiple sites but often included investigators at academic medical centers with experience treating or researching the specific pain condition being studied. 
The third era—from the late 2000s to the present—has the highest NNTs and includes multinational RCTs with large sample sizes using primarily for-profit clinical research centers that conduct clinical trials across a wide range of therapeutic areas. The decrease in treatment effects reflected in these increasing NNTs could be a result of changes over time in research methods, study sites, and/or the patients enrolled in the trials. 32 Meta-analyses of RCTs of chronic neuropathic 23,32 and musculoskeletal pain 24 have found that greater trial assay sensitivity was associated with shorter trial durations and also smaller sample sizes. It is possible, however, that smaller trials that are negative or inconclusive are less likely to be published, and such publication bias might contribute to the results of these meta-analyses. Nevertheless, on the basis of data such as these, it has been suggested that larger and longer trials are not necessarily better at demonstrating whether a treatment is truly efficacious. 72,88 The decreased treatment effects observed over the past several decades could be a result of the pharmaceutical industry conducting an increasing number of appropriately powered RCTs intended to fulfill regulatory requirements for study durations that can examine durability of treatment effects. In addition, analyses of RCTs of depression 72 and Parkinson disease 45 have suggested that effect sizes might be smaller for patients who are enrolled later in the trial than for those enrolled earlier, perhaps due to the enrollment of patients who do not fulfill eligibility criteria because of pressure on sites to complete enrollment requirements. 
Also, with longer trials—for example, durations of 12 weeks or more rather than 5 to 8 weeks—there may be greater placebo vs active group improvement resulting from, as discussed in the next section, a greater number of study visits 90 and an increased opportunity for patients to develop supportive relationships with study staff. 87,91 It is also possible that over the course of these 3 eras of analgesic trials, the quality of RCT procedures and data, including patient clinical evaluations and outcome assessments, became more variable as greater numbers of study sites participated. 74 In addition, there has been increasing recognition of the potential roles of unintentional and intentional investigator bias 64,67,81 and frank research misconduct 27 in contributing to negative, inconclusive, and invalid study results. It has also become apparent that surprisingly large percentages of the participants enrolled in clinical trials are either professional subjects who are fabricating a clinical condition—and may be participating in more than one clinical trial at different sites, so-called “duplicate patients”—or are patients who intentionally falsify key eligibility criteria to be randomized. 10,11,76,96 Information provided on social media 71 and clinical trial websites can facilitate enrollment of such unqualified participants, and methods to identify professional subjects and mitigate patient misbehavior are now being developed, including the creation of research subject registries. 76,96 5. Placebo group changes and their interpretation The results of meta-analyses of RCTs have found meaningful relationships between placebo group changes and study methods and patient characteristics. Paralleling the results discussed above for treatment effects, greater placebo group changes in neuropathic pain trials were associated with longer trial durations and larger sample sizes. 
19,32,110 In a larger number of meta-analyses of major depression trials, greater placebo group changes were associated with larger numbers of study sites, larger samples, greater frequency of study visits, longer trials, lower probability of receiving placebo, and higher patient expectations for improvement. 29,35,84,87,92,111,118 A robust finding that has emerged from multiple analyses of both pain and psychiatric treatments is the association between greater magnitudes of placebo group change and negative or inconclusive clinical trial outcomes, as evaluated, for example, by statistical significance, risk ratios, and NNTs. 32,52,53,59,110 In considering such relationships, it is important to recognize that random variation in the magnitudes of placebo group change across a set of RCTs will by itself produce an association between placebo group changes and treatment effect estimates, because each estimate is the difference between an active treatment group and that same placebo group. As Senn 93 observed many years ago, a “negative correlation between odds ratios and placebo rates in clinical trials does not of itself indicate the presence of a phenomenon of interest. Such an effect is to be expected on statistical grounds alone and there is thus no need to search for medical explanations.” Despite the statistical basis of associations between placebo group changes and treatment effect estimates, these associations can also reflect characteristics of the clinical trials that potentially reduce assay sensitivity. For example, it is uncommon for the mean pain intensity to fall below 3 or 4 on a 0 to 10 numerical rating scale. Such a “floor” of symptom reduction may represent an unresponsive core of refractory pain that if reached by patients in the placebo group would make it difficult to show any further pain reduction from an efficacious treatment.
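Returning briefly to the purely statistical association noted by Senn, a small simulation can make the point concrete. The sketch below assumes a set of hypothetical trials that all share the same true placebo improvement and the same true treatment benefit, so that any variation in observed group means is sampling error alone; all parameter values (1.5 and 2.1 points of mean improvement, an SD of 2.5, 100 patients per group) are illustrative assumptions, not estimates from the meta-analyses cited here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_correlation(n_trials=5000, n_per_group=100, sd=2.5,
                         placebo_mean=1.5, active_mean=2.1, seed=1):
    """Correlate observed placebo improvement with the observed effect.

    Every simulated trial has identical true means (hypothetical values),
    so observed group means differ between trials only by sampling error.
    """
    rng = random.Random(seed)
    se = sd / math.sqrt(n_per_group)  # standard error of a group mean
    placebo_changes, effects = [], []
    for _ in range(n_trials):
        placebo = rng.gauss(placebo_mean, se)
        active = rng.gauss(active_mean, se)
        placebo_changes.append(placebo)
        effects.append(active - placebo)  # treatment effect estimate
    return pearson(placebo_changes, effects)


# With equal group sizes and SDs, theory gives -1/sqrt(2), about -0.71.
print(round(simulate_correlation(), 2))
```

Even though no trial characteristic influences the true effect in this simulation, trials with larger observed placebo improvement tend to show smaller treatment effect estimates, which is the expected statistical artifact; it does not rule out the substantive explanations discussed in this section.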
If this floor effect occurs, it could account, at least in part, for the associations between greater magnitudes of placebo group change and decreased treatment effects that have been reported. Assuming that there is such a floor effect, the separation between an efficacious treatment and placebo in an RCT might be greater if nonspecific sources of improvement in both treatment groups—such as placebo effects and regression to the mean—could be reduced, which could make it less likely that the placebo group would reach the floor. Another explanation for associations between placebo group changes and treatment effect estimates involves the presumption of additivity in placebo-controlled clinical trials. It is generally assumed that the specific effects of an active treatment provide an additive benefit to the nonspecific effects associated with treatment in the placebo group, which include placebo effects and regression to the mean. As noted by Kaptchuk, 58 this premise takes “for granted that the active drug response results partly from a placebo effect and that the placebo effect buried in the active arm is identical to the placebo effect of the dummy treatment.” But it is possible that response to the active treatment supplants at least part of the placebo group response, in which case the specific effects of the active treatment and the non-specific effects of trial participation, including placebo treatment, would be subadditive, that is, some of the nonspecific effects that occur in the placebo group would not occur in the active treatment group. 6,66,73 An example of such subadditivity is provided by Roose et al., 90 who noted that therapeutic contact with study staff—who have been reported to vary greatly in what they consider appropriate interactions with study participants 14 —may be “a potent contributor to symptomatic improvement in patients with depression, particularly patients in the placebo arm” of antidepressant RCTs. 
In several trials, number of study visits was more strongly associated with improvement in the placebo groups than in the antidepressant groups. It was concluded that “increasing the number of study visits significantly increases placebo response while leaving medication response generally unaffected,” for example, having only 6 rather than 10 visits over the course of a 12-week trial was associated with a difference in response rates between an antidepressant and placebo of 12.2% vs 0.4%. 90 Such differential effects on active and placebo group changes, if indeed causal, could reduce the apparent benefit of an efficacious treatment when compared with placebo. Although this subadditivity would decrease the assay sensitivity of any trials in which it occurs, it does provide a basis for hypothesizing that assay sensitivity can be increased if study procedures such as excluding certain patients 28 or training study participants 98,108 have differential effects on active and placebo group changes. 6. “Always in motion is the future”: emerging evidence-based approaches to the design of pain clinical trials Size does matter when determining the number of participants needed for an RCT to provide adequate statistical power to identify minimally clinically important effects 26 and to estimate their magnitude. 79 Nevertheless, it is important to recognize that various strategies for increasing the assay sensitivity of RCTs and decreasing the probability of inconclusive results should also be considered. 22 6.1. 
General methodologic considerations

As recently emphasized in the International Council for Harmonisation E9 (R1) addendum on estimands in clinical trials, the preeminent consideration in designing clinical trials is to identify the scientific question of interest and the estimand, “a precise description of the treatment effect reflecting the clinical question posed by the trial objective.” 49 The choice of estimand determines the clinical trial design and the statistical analysis plan, including methods for accommodating intercurrent events and missing data and the selection and interpretation of sensitivity analyses. 4,49,85 Discussion of the complex conceptual and statistical issues involved in determining estimands and prespecifying principled statistical analyses for their estimation is beyond the scope of this article; however, we believe it is important to emphasize that biostatisticians with expertise in clinical trials should be involved from the earliest consideration of conducting a clinical trial and continuing through its design, execution, analysis, interpretation, and reporting. The evidence that knowledge of clinical trial eligibility criteria can lead to intentional and unintentional biases among study staff and potential participants has provided a basis for recommending that key aspects of the protocol that do not involve safety should be concealed from all study staff and patients. 21,48,89 Blinding staff and patients to eligibility criteria could reduce the numbers of patients who are randomized but who do not actually fulfill these criteria because of inflated or falsified baseline assessments; use of electronic diaries and case report forms has made implementation of such blinding relatively straightforward.
In addition, blinding study staff and patients to allocation ratios when patients are more likely to be randomized to active vs placebo treatment (eg, dose finding and active comparator trials) could also prevent the increases in placebo group improvements that have been found in trials in which patients know that their chance of receiving placebo is less than their chance of receiving an active treatment and, presumably as a result, have greater expectations for improvement. 83 Blinding patients and staff to the allocation ratio requires considerable attention to the language used in consent forms and patient materials and also involves explaining to ethics committees the anticipated benefits on assay sensitivity that might result. An important feature of clinical trials that is receiving increased attention as a source of poor data quality and of failures to demonstrate the efficacy of truly efficacious treatments is poor treatment adherence. Poor medication adherence can decrease estimates of efficacy and confound assessments of safety, 3,9 but it has typically been assessed using pill counts, which are known to be inaccurate. Although there are now a variety of more sophisticated methods for assessing medication adherence that have greater validity, 3,96 they have rarely been used in chronic pain RCTs. Minimizing placebo group changes also has the potential to enhance assay sensitivity. For example, in neuropathic pain RCTs, the time to onset of pain reduction in placebo groups has been shown to be longer than that associated with analgesic medications. 110 This is consistent with the observation that longer trials tend to have a progressive increase in placebo group changes 19,88 and suggests that shorter treatment durations may be preferable for proof-of-concept trials; of course, RCTs with longer durations would still be necessary to evaluate the durability of any benefits. 
In addition, when recruiting potential participants for a clinical trial evaluating a new treatment, placebo effects should be minimized by neutrally describing the treatment rather than enhancing participant expectations about its efficacy. 92,115,120 Placebo group changes might also be reduced by limiting the number of study visits and standardizing interactions between study staff and participants. 14 Importantly, whether such techniques reduce retention and thereby increase the amount of missing data should also be considered. Developing methods to mitigate unrealistic patient expectations is consistent with the obligation to ensure that patients understand the difference between participating in a clinical trial and receiving clinical care; any such standardized protocols intended to diminish placebo group improvement would ideally be evaluated in RCTs designed to examine their effectiveness and any unintended negative consequences. 6.2. Patient characteristics Various inclusion and exclusion criteria seem to be associated with increased assay sensitivity; for example, greater baseline pain intensity and prohibition of concomitant analgesic medications were found to be associated with greater assay sensitivity in clinical trials of chronic neuropathic 23,32 and musculoskeletal pain. 24 In addition, analyses of individual patient data showed that the subgroup of patients with excessive variability of pain ratings at baseline had reduced separation between the active treatment and placebo. 28,107 One approach to preventing the randomization of patients who do not fulfill eligibility criteria is to implement a central adjudication process, in which trial eligibility criteria are reviewed for each potential study patient. 33,75 This approach has the potential to increase the response to efficacious treatments by eliminating individuals who are unlikely to respond because they do not have the condition for which the treatment is indicated. 
Independent adjudication of eligibility criteria may also decrease placebo group changes by eliminating professional subjects and others who might be more likely to report improvement. 33,76,96

6.3. Research designs

There are several clinical trial designs that have the potential to increase assay sensitivity and the efficiency of identifying efficacious pain treatments (Table 3). One relatively straightforward approach is to conduct an interim blinded sample size re-estimation to ensure that the variability of the primary outcome measure was not underestimated in the initial sample size determination. 26 Interim futility analyses can also increase the efficiency of identifying efficacious treatments by determining whether a treatment is very unlikely to be statistically significantly different from the control treatment at the scheduled end of the trial. 56,104 Although use of such interim analyses in chronic pain RCTs has rarely been reported, they are routinely implemented in other therapeutic areas, and it has been recommended that they be considered in the design of clinical trials of pain treatments. 21,22,37

Table 3. Clinical trial designs that can improve the efficiency and informativeness of clinical trials of pain treatments.
1. Interim blinded sample size re-estimation
2. Interim futility analyses
3. Cross-over and multiple N-of-1 designs
4. Designs that might have greater assay sensitivity (eg, EERW, SPCD, and TED)
5. Adaptive designs
6. Master protocols, including basket, umbrella, and platform designs
EERW, enriched enrollment randomized withdrawal; SPCD, sequential parallel comparison design; TED, two-way enriched design.

Cross-over designs can be used to reduce sample size requirements when studying pain conditions that are expected to remain stable throughout the trial duration and treatments that have relatively fast onset and offset of their pharmacodynamic effects. 
26,40,94 However, cross-over trials also have several potential limitations, including carry-over effects, in which the effect of an active treatment in the first period may carry over to a placebo condition in the next period and reduce the second period treatment-placebo difference. Various methods for addressing these effects have been proposed, but the best approach is to design the trial to minimize potential carry-over effects and any other causes of treatment-by-period interaction. 26,94 When a cross-over trial randomizes patients to at least 2 periods with an active treatment and 2 periods with placebo—also referred to as an N-of-1 design when used in clinical practice 65 —it becomes possible to examine whether there is evidence of treatment-by-patient interaction. 17,38 Significant treatment-by-patient interaction indicates that there is heterogeneity of treatment effects among patients, that is, different patients truly respond differently to the treatment. Multiperiod cross-over trials, therefore, have the potential to identify those pain conditions and treatments for which efforts to determine genotypic and phenotypic predictors of treatment response could be worthwhile. 95 Enrichment designs may increase clinical trial assay sensitivity by randomizing those patients who are expected to be more likely to respond to treatment and not withdraw because of adverse events. 114 The most common type of enrichment design used in studying chronic pain treatments has been termed “enriched enrollment randomized withdrawal.” 60,80 In this design, an initial enrichment phase in which patients receive the active treatment is followed by a double-blind phase in which patients who have tolerated the treatment and reported an improvement in pain intensity are randomized to continued active treatment or to placebo. 
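The earlier point that cross-over designs can reduce sample size requirements can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formulas for a two-arm parallel trial and a two-period cross-over trial, and it assumes no carry-over or period effects, equal variances, and a known within-patient correlation `rho`. The function names and all numerical inputs are hypothetical and are not taken from any trial discussed in this article.

```python
import math
from statistics import NormalDist


def _z_sum(alpha: float, power: float) -> float:
    """Sum of the two-sided critical value and the power quantile."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def parallel_total_n(delta: float, sd: float,
                     alpha: float = 0.05, power: float = 0.9) -> int:
    """Total patients for a 1:1 parallel group trial (normal approximation)."""
    n_per_group = 2 * (sd * _z_sum(alpha, power) / delta) ** 2
    return 2 * math.ceil(n_per_group)


def crossover_total_n(delta: float, sd: float, rho: float,
                      alpha: float = 0.05, power: float = 0.9) -> int:
    """Total patients for a two-period cross-over trial.

    Assumes no carry-over; each patient contributes a within-patient
    difference whose variance is 2 * sd^2 * (1 - rho).
    """
    var_diff = 2 * sd ** 2 * (1 - rho)
    return math.ceil(var_diff * (_z_sum(alpha, power) / delta) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 1-point difference on a 0-10 scale, SD of 2.5,
    # within-patient correlation of 0.5.
    print(parallel_total_n(delta=1.0, sd=2.5))
    print(crossover_total_n(delta=1.0, sd=2.5, rho=0.5))
```

Under these assumptions the cross-over total is roughly (1 - rho) / 2 times the parallel total, so the saving grows as within-patient correlation increases; any carry-over would undermine this advantage.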
The results of published trials suggest that the assay sensitivity of these trials may be greater than the assay sensitivity of standard parallel group trials, but the evidence is not conclusive. 34,60,80 The sequential parallel-comparison design (SPCD) was developed to reduce placebo group improvements and thereby increase assay sensitivity in RCTs of antidepressant medications. 29,30 In the most common version, patients are first randomized to active treatment and placebo groups, typically with more participants allocated to placebo. Patients in the placebo group who do not improve in this phase are then rerandomized to either the active treatment or placebo. The efficacy analysis typically includes all first-phase data and second-phase data only from the placebo group patients who did not improve in the first phase. Because some patients contribute outcome data from both phases and there is typically a reduced magnitude of change in the placebo group in the second phase, SPCD trials can reduce required sample sizes. 12,29,54 The potential of this design for increasing the assay sensitivity of RCTs of chronic pain treatments has been discussed. 37 A two-way enriched design that is an extension of SPCD has also been described. 55 In this design, after randomization to either active or placebo treatment, patients in the active treatment group who improved and patients in the placebo group who did not improve are rerandomized to active or placebo treatment. 
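As an illustration of how first-phase data and second-phase data from placebo nonresponders might be combined in an SPCD efficacy analysis, the sketch below computes a weighted combination of the two phase-specific treatment differences. This is only one simplified possibility: it assumes the two phase estimates are approximately independent and normally distributed, and the weight `w`, the function name, and the example numbers are hypothetical rather than taken from the cited trials.

```python
import math
from statistics import NormalDist


def spcd_weighted_z(diff1: float, se1: float,
                    diff2: float, se2: float,
                    w: float = 0.5) -> tuple[float, float, float]:
    """Combine phase 1 and phase 2 treatment-placebo differences.

    diff1, se1: estimate and standard error from all phase 1 patients.
    diff2, se2: estimate and standard error from phase 1 placebo
                nonresponders who were rerandomized in phase 2.
    w:          prespecified weight given to phase 1 (assumed value).
    Returns the weighted estimate, its z statistic, and a two-sided p value.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    estimate = w * diff1 + (1 - w) * diff2
    # Variance of a weighted sum, treating the phase estimates as independent.
    se = math.sqrt((w * se1) ** 2 + ((1 - w) * se2) ** 2)
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return estimate, z, p


if __name__ == "__main__":
    # Hypothetical phase-specific differences on a 0-10 pain scale.
    est, z, p = spcd_weighted_z(diff1=0.6, se1=0.3, diff2=0.8, se2=0.4, w=0.5)
    print(f"estimate={est:.2f}, z={z:.2f}, p={p:.4f}")
```

In practice the weight would be prespecified in the protocol, and the choice of estimator and the handling of correlation between phases would follow the statistical analysis plan rather than this simplified form.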
The data from this second phase make it possible to test whether a treatment “that is significantly superior to placebo in achieving short-term efficacy will also be superior to placebo in the maintenance of efficacy.” 55 Adaptive clinical trial designs can be used for exploratory studies as well as for confirmatory trials, and their objectives have included (1) dose finding; (2) bridging phases 1 and 2 or phases 2 and 3 with seamless designs (eg, using dose-finding data to transition to a confirmatory trial); (3) response adaptive randomization to increase the percentage of patients randomized to treatments with promising interim data; and (4) interim sample-size re-estimation and futility analyses, as discussed above. 7,26,112 The benefits of adaptive designs can include smaller sample sizes, shorter durations, and an increased likelihood of achieving trial objectives. However, operational challenges include extensive simulation studies often required for study planning, medication supply, and monitoring of sites, data, and analyses. 36 Although it has been suggested that adaptive dose-finding designs can play an important role in early analgesic drug development, 57 there have been very few published RCTs of pain treatments that have used adaptive designs. 
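The interim sample size re-estimation and futility analyses mentioned above can be sketched in a few lines. The first function recomputes the per-group sample size from the standard deviation of pooled, blinded interim data while keeping the originally planned treatment difference; the second computes conditional power under the current trend from an interim z statistic. Both rely on normal approximations, the function names are hypothetical, and futility thresholds or type I error adjustments that a real trial would prespecify are not shown.

```python
import math
import statistics
from statistics import NormalDist

_ND = NormalDist()


def blinded_reestimated_n_per_group(blinded_outcomes: list[float],
                                    delta: float,
                                    alpha: float = 0.05,
                                    power: float = 0.9) -> int:
    """Recompute n per group from the SD of pooled (blinded) interim data.

    The pooled SD ignores treatment assignment, so it can slightly
    overestimate the within-group SD if a treatment effect is present.
    """
    sd = statistics.stdev(blinded_outcomes)
    z = _ND.inv_cdf(1 - alpha / 2) + _ND.inv_cdf(power)
    return math.ceil(2 * (sd * z / delta) ** 2)


def conditional_power_current_trend(z_interim: float,
                                    info_fraction: float,
                                    alpha: float = 0.05) -> float:
    """Probability of final significance if the interim trend continues.

    Uses the Brownian-motion (B-value) approximation with the drift
    estimated from the interim data; info_fraction is the proportion
    of planned information observed so far (strictly between 0 and 1).
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must be strictly between 0 and 1")
    z_crit = _ND.inv_cdf(1 - alpha / 2)
    projected = z_interim / math.sqrt(info_fraction)
    return 1 - _ND.cdf((z_crit - projected) / math.sqrt(1 - info_fraction))


if __name__ == "__main__":
    # Hypothetical blinded change scores and planned difference.
    print(blinded_reestimated_n_per_group([0, 2, 4, 6, 8], delta=2.0))
    # Hypothetical interim result: no observed difference at the halfway point.
    print(round(conditional_power_current_trend(0.0, 0.5), 4))
```

A very low conditional power at an interim look is the kind of evidence that a futility rule would act on, whereas a larger-than-planned blinded SD would argue for increasing enrollment.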
There has recently been considerable attention to the potential of master protocols to increase the efficiency of drug development by using “a single infrastructure, trial design, and protocol to simultaneously evaluate multiple drugs and/or disease populations in multiple substudies.” 113 There are 3 different types of master protocols: (1) umbrella trials, in which multiple treatments are studied for a single disease; (2) basket trials, in which a single treatment is studied in multiple diseases or multiple subtypes of a single disease; and (3) platform trials, in which multiple treatments are studied for a single disease, as in umbrella trials, but in a perpetually continuing manner, and often with sharing of common control patients, treatments entering and exiting the platform on the basis of prespecified decision algorithms, and early stopping for success or failure. 116 The most frequent use of master protocols has been in oncology, in which different designs have been used to study novel drugs and drug combinations, often in biomarker-defined subgroups of patients. Master protocols could have particular value for novel treatments that potentially have efficacy in one or more different pain conditions given the prevailing expectation that predictive biomarkers will be developed that can identify subgroups of patients who respond more robustly to treatment. 1,117

7. Discussion

Sample size does matter for ensuring that clinical trials of pain treatments have adequate assay sensitivity, but it is not everything. In designing, conducting, and analyzing RCTs, a large number of additional methodologic issues and advances should also be considered. Unfortunately, very few studies have formally examined whether modifying study methods increases assay sensitivity or decreases placebo group changes in RCTs of efficacious pain treatments. 
Providing preliminary support for the value of patient training, the results of recent studies in which patients were randomized to training or no training showed that training can improve the accuracy of pain ratings 98,108; in addition, placebo group changes were reduced and there were numerically greater effect sizes in trained vs untrained patients in one of these studies, a clinical trial in painful diabetic peripheral neuropathy. 108 The ultimate objective of the research discussed in this article is to develop an evidence-based approach to the design of clinical trials, 19 and prospective RCTs must be conducted to test methods that are hypothesized to increase assay sensitivity. Nevertheless, on the basis of available evidence as well as general considerations involving study execution and data quality, recommendations have been presented for improving the design of acute 8 and chronic 21,37 pain trials and for increasing their assay sensitivity. 22 Adopting such recommendations and giving careful consideration to optimizing study design has the potential to increase the assay sensitivity and informativeness of RCTs of pain treatments. The results of the clinical trials conducted over the next decade will hopefully demonstrate whether these approaches give rise to a fourth era of analgesic clinical trials, one in which meaningful increases in treatment effects will occur.

8. Summary

There is no better summary of our perspective on the current state of pain treatment than one provided by Paul Leber 70 for psychiatric medications. Based on his wide-ranging experiences as director of the U.S. Food and Drug Administration's Division of Neuropharmacologic Drug Products, Leber maintained that “given how little we actually understand about the behaviors and affects we seek to manage through pharmacological interventions…we are exceedingly fortunate to possess the number of modestly effective drugs that we do.”

Conflict of interest statement

S.M. Smith has received in the past 36 months a research grant from the Richard W. and Mae Stone Goode Foundation. For a complete list of lifetime disclosures for M. Fava, please see: https://mghcme.org/faculty/faculty-detail/maurizio_fava. M.P. Jensen has received in the past 36 months research grants from the U.S. National Institutes of Health, the U.S. Department of Education, the Administration of Community Living, the Patient-Centered Outcomes Institute, and National Multiple Sclerosis Society, the International Association for the Study of Pain, and the Washington State Spinal Injury Consortium, and compensation for consulting from Goalistics. O. Mbowe has no disclosures for the past 36 months. M.P. McDermott has been supported in the past 36 months by research grants from the U.S. National Institutes of Health, U.S. Food and Drug Administration, NYSTEM, SMA Foundation, Cure SMA, and PTC Therapeutics, has received compensation for consulting from Neuropore Therapies and Voyager Therapeutics, and has served on Data and Safety Monitoring Boards for U.S. National Institutes of Health, Novartis Pharmaceuticals Corporation, AstraZeneca, Eli Lilly, aTyr Pharma, Catabasis Pharmaceuticals, Vaccinex, Cynapsus Therapeutics, and Voyager Therapeutics. In the past 36 months, D.C. Turk has received research grants and contracts from U.S. Food and Drug Administration and U.S. National Institutes of Health, and compensation for consulting on clinical trial and patient preferences from AccelRx, Eli Lilly, GlaxoSmithKline, Nektar, Novartis, and Pfizer. R.H. Dworkin has received in the past 36 months research grants and contracts from U.S. Food and Drug Administration and U.S. National Institutes of Health, and compensation for serving on advisory boards or consulting on clinical trial methods from Abide, Acadia, Adynxx, Analgesic Solutions, Aptinyx, Aquinox, Asahi Kasei, Astellas, AstraZeneca, Biogen, Biohaven, Boston Scientific, Braeburn, Celgene, Centrexion, Chromocell, Clexio, Concert, Decibel, Dong-A, Editas, Eli Lilly, Eupraxia, Glenmark, Grace, Hope, Immune, Lotus Clinical Research, Mainstay, Merck, Neumentum, Neurana, NeuroBo, Novaremed, Novartis, Olatec, Pfizer, Phosphagenics, Quark, Reckitt Benckiser, Regenacy (also equity), Relmada, Sanifit, Scilex, Semnur, Sollis, Teva, Theranexus, Trevena, Vertex, and Vizuri.


Author and article information

Journal

Journal ID (nlm-ta): Pain

Journal ID (iso-abbrev): Pain

Journal ID (coden): JPAIN

Journal ID (pmc): Pain

Journal ID (publisher-id): JOP

Title: Pain

Publisher: Wolters Kluwer (Philadelphia, PA )

ISSN (Print): 0304-3959

ISSN (Electronic): 1872-6623

Publication date (Print): September 2020

Publication date (Electronic): 19 August 2020

Volume: 161

Issue: 1

Pages: S3-S13

Affiliations

Departments of [a] Anesthesiology and Perioperative Medicine

[b] Obstetrics and Gynecology and

[c] Psychiatry, University of Rochester, Rochester, NY, United States

[d] Department of Psychiatry, Massachusetts General Hospital, Boston, MA, United States

[e] Department of Rehabilitation Medicine, University of Washington, Seattle, WA, United States

[f] Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, United States

[g] Department of Neurology, University of Rochester, Rochester, NY, United States

[h] Center for Health + Technology, University of Rochester, Rochester, NY, United States

[i] Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA, United States

Author notes

[*] Corresponding author. Address: Department of Anesthesiology and Perioperative Medicine, University of Rochester School of Medicine and Dentistry, 601 Elmwood Ave, Box 604, Rochester, NY 14642, United States. Tel.: +1 585-275-8214; fax: +1 585-244-7271. E-mail address: robert_dworkin@urmc.rochester.edu (R.H. Dworkin).

Article

Publisher ID: PAIN-D-20-00036 Accession ID: 00002

DOI: 10.1097/j.pain.0000000000001849

PMC ID: 7434212

PubMed ID: 33090735

SO-VID: 3df15a26-6445-451e-b54b-3cb0d2cef8e2

License:

This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.

History

Date received : 14 January 2020

Date revision received : 18 February 2020

Date accepted : 20 February 2020

Custom metadata

OPEN-ACCESS TRUE

ScienceOpen disciplines: Anesthesiology & Pain management
