Assessing the heterogeneity of treatment effects within subgroups

Subgroup analyses are frequently conducted, with approximately 70% of clinical trials reporting at least some results within subgroups. Subgroup analyses are important in that they provide clinicians with information on the potential for differential treatment response within important demographic, genetic, disease, environmental, behavioral, or regional characteristics. From a regulatory perspective, subgroup analyses often illustrate that the estimated overall effect is broadly applicable to patients, particularly when the study population is heterogeneous. Finally, subgroup analyses are useful for generating hypotheses for future research. Even if subgroup analyses are not performed within individual clinical trials (often due to limited sample size), there is the expectation from regulators that primary and key secondary efficacy and safety endpoints will be examined within the integrated efficacy and safety analyses performed across all trials within a clinical development program.

For subgroup analyses, transparency is key. It is generally recommended that analysts communicate the following details:

  • The subgroup size.
  • The number of subgroups assessed compared to subgroups reported.
  • Whether subgroups were determined pre- or post-hoc.
  • Whether multiplicity adjustments were applied.
  • Whether stratified randomization was used.
  • Whether heterogeneity was formally assessed.

The forthcoming Subgroup Screening report in JMP Clinical 19 offers a straightforward way to assess how the treatment effect (or the treatment response, in the case of single-arm studies) for a particular endpoint varies across subgroups and which factors might contribute to this heterogeneity. Unlike in many experimental settings, the size and availability of subgroups in clinical trials are often outside the sponsor's control, so Subgroup Screening also makes this information easy to access. This report was inspired by an article in Therapeutic Innovation and Regulatory Science and an R package that was produced to visualize subgroups in an exploratory setting.

For the purposes of our discussion, the term factor refers to a variable that will be used either alone or in conjunction with other factors to determine subgroups, which we define as the individual levels of a factor or a combination of two or more factor levels. For example, the variable Sex would be considered a factor, while male and female would be considered subgroups. A second factor, Age, might contain two subgroups: < 65 years and ≥ 65 years. Combining Sex and Age produces the following four subgroups:

  • Males, < 65 years
  • Males, ≥ 65 years
  • Females, < 65 years
  • Females, ≥ 65 years
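To make the factor and subgroup terminology concrete, here is a minimal Python sketch that enumerates subgroups from the Sex and Age example above. The function name `enumerate_subgroups` and the `pairwise` flag are hypothetical names chosen for illustration; this is not how JMP Clinical builds its subgroups internally.

```python
from itertools import combinations, product

# Factors and levels from the Sex/Age example above.
factors = {
    "Sex": ["Male", "Female"],
    "Age": ["< 65 years", ">= 65 years"],
}

def enumerate_subgroups(factors, pairwise=True):
    """Return subgroups as tuples of (factor, level) pairs."""
    # One-factor subgroups: each level of each factor on its own.
    subgroups = [((name, level),)
                 for name, levels in factors.items()
                 for level in levels]
    if pairwise:
        # Two-factor subgroups: pair every level of one factor
        # with every level of a different factor.
        for (f1, l1), (f2, l2) in combinations(factors.items(), 2):
            subgroups += [((f1, a), (f2, b)) for a, b in product(l1, l2)]
    return subgroups

subgroups = enumerate_subgroups(factors)
for sg in subgroups:
    print(", ".join(f"{name}: {level}" for name, level in sg))
print(len(subgroups))  # 8: four single-level subgroups plus four Sex-by-Age subgroups
```

With more factors, the number of pairwise subgroups grows quickly, which is why reporting how many subgroups were assessed matters.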

Throughout this blog post, we illustrate methodologies using data from patients with probable mild-to-moderate Alzheimer’s disease. The CDISC Pilot study includes data from 254 patients randomized to one of three treatments (Xanomeline high dose, Xanomeline low dose, or Placebo) for a 26-week treatment period. The primary endpoint was change from baseline in the cognitive subscale of the Alzheimer’s Disease Assessment Scale (ADAS-Cog Subscore), an 11-item scale with scores ranging from 0 to 70, with higher scores indicating greater impairment.

Figure 1 contains the dialog for Subgroup Screening for the CDISC Pilot study.

Figure 1. Subgroup Screening dialog

Users select a findings Domain and a single Findings Test. These selections represent the single endpoint of interest that will be explored within subgroups defined by one or more factors.

The Subgroup Options panel is where the user selects variables from ADSL that will be used as factors for the analysis. Some demographic variables are selected by default, but any number of factors can be accommodated. Each level of each factor will be analyzed as a separate subgroup. If Subgroup Twoway is selected, the pairwise combinations of all individual factor levels will be analyzed as subgroups as well.

Running the analysis produces the report shown in Figure 2.

Figure 2. Subgroup Screening results

The Report Summary gives an overview of the number of subgroups. For this analysis, there were eight factors that had a total of 29 levels. With Subgroup Twoway selected, there are 365 possible subgroups, 317 of which have at least one patient, and 48 of which are not observed. Note that not all available subgroups will produce an observable treatment effect – in multi-arm studies, at least one patient is needed in a subgroup for each treatment for a given treatment pair.
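The distinction between possible subgroups, observed subgroups, and subgroups where a treatment effect can actually be computed can be sketched in a few lines of Python. The patient records and the `classify` helper below are hypothetical and exist only for illustration. As a simplification, the sketch counts only patients in the two arms of the selected treatment pair.

```python
from collections import Counter

# Hypothetical (subgroup, treatment arm) records; not CDISC Pilot data.
patients = [
    ("Male, < 65 years", "Xanomeline High Dose"),
    ("Male, < 65 years", "Placebo"),
    ("Male, >= 65 years", "Placebo"),
    ("Female, >= 65 years", "Xanomeline High Dose"),
    ("Female, >= 65 years", "Xanomeline High Dose"),
]
possible = [
    "Male, < 65 years", "Male, >= 65 years",
    "Female, < 65 years", "Female, >= 65 years",
]

def classify(possible, patients, arm_a, arm_b):
    """Label each possible subgroup for the treatment pair arm_a vs. arm_b."""
    counts = Counter(patients)  # missing keys count as zero
    status = {}
    for sg in possible:
        n_a, n_b = counts[(sg, arm_a)], counts[(sg, arm_b)]
        if n_a + n_b == 0:
            status[sg] = "not observed"
        elif n_a > 0 and n_b > 0:
            # At least one patient per arm: a difference can be computed.
            status[sg] = "effect can be computed"
        else:
            status[sg] = "observed, but one arm is empty"
    return status

status = classify(possible, patients, "Xanomeline High Dose", "Placebo")
for sg, label in status.items():
    print(f"{sg}: {label}")
```

In this toy data set, only one of the four possible subgroups has patients in both arms, so only that subgroup would contribute a point for this treatment pair.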

The funnel plot in Figure 2 displays the difference in means for Xanomeline High Dose-Placebo (shown in the Section Filter of the output) within each subgroup against the Frequency of Patients. The X axis can be changed to the Percentage of Patients, which divides the observed frequency for each subgroup by the total number of patients in the analysis population. Note that as the Frequency of Patients gets smaller, there is greater variability in the observed treatment effects; as the Frequency of Patients gets larger, this variability shrinks and the effects converge to the treatment effect for the entire analysis population, which is represented by the reference line and the right-most point (labeled “All”). Here, the overall difference in the ADAS-Cog Subscore is -1.04218, implying that Placebo has a slightly larger subscore than Xanomeline High Dose. If other treatment effects are of interest, the presented treatment effect can be changed using the Section Filter.

The red highlighted point is for Males with BMI < 25 kg/m². The parent subgroups, Males and BMI < 25 kg/m², are highlighted in yellow. Details on these subgroups are summarized in the tabbed tables below the funnel plot. These tables summarize Highlighted Subgroups, Parent Subgroups, Filtered Subgroups, and Memorized Subgroups. The Highlight Subgroups feature of the Display Options makes it possible to highlight subgroups based on a particular factor level or the magnitude or rank of the variability of the Variable Importance from a bootstrap forest.
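The quantities behind each point of a funnel plot (frequency, percentage, and difference in means) can be sketched as follows. The records and the `funnel_point` helper are hypothetical; the values are made up for illustration and are not taken from the CDISC Pilot study.

```python
from statistics import mean

# Hypothetical (sex, treatment, change from baseline) records.
records = [
    ("M", "High Dose", 1.0), ("M", "High Dose", 3.0),
    ("M", "Placebo", 4.0), ("M", "Placebo", 2.0),
    ("F", "High Dose", 0.0), ("F", "Placebo", 2.0),
]

def funnel_point(records, keep, total_n):
    """Return frequency, percentage, and difference in means for one subgroup."""
    high = [y for sex, arm, y in records if keep(sex) and arm == "High Dose"]
    placebo = [y for sex, arm, y in records if keep(sex) and arm == "Placebo"]
    n = len(high) + len(placebo)
    return {
        "n": n,                                  # Frequency of Patients
        "percent": 100 * n / total_n,            # Percentage of Patients
        "effect": mean(high) - mean(placebo),    # High Dose minus Placebo
    }

total = len(records)
points = {
    "Sex: M": funnel_point(records, lambda s: s == "M", total),
    "Sex: F": funnel_point(records, lambda s: s == "F", total),
    "All": funnel_point(records, lambda s: True, total),
}
for label, point in points.items():
    print(label, point)
```

The "All" entry plays the role of the right-most point and reference line: it uses every patient in the two arms, so its percentage is 100.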

An UpSet plot is displayed below the funnel plot and subgroup tables (Figure 3). The Local Data Filter was adjusted to display only subgroups with at least 55 patients to make the plot more readable for this discussion.

Figure 3. UpSet plot for CDISC Pilot

UpSet plots are used to communicate the frequency of subgroups produced according to one or more factors compared to the marginal frequencies of considering each factor alone. The vertical and horizontal axes are ordered according to these frequencies so that the further right (or down) in the plot, the smaller the frequency.

  1. The line and dots area of the figure illustrates the connections between factor levels along the Factor Level vertical axis and the Subgroup horizontal axis.
  2. The bar chart shown in the Overall Frequency of Patients in Subgroup vertical axis and Subgroup horizontal axis presents the frequency of each Subgroup, which will show individual and pairs (if Subgroup Twoway is selected) of factor levels. This bar chart is presented in decreasing frequency, left to right.
  3. The bar chart shown in the Factor Level vertical axis and the Overall Frequency of Patients in Factor Level horizontal axis presents the frequency of each factor level individually. This bar chart is presented in decreasing frequency, top to bottom.
  4. The top area of the figure, within the Proportion of Treatment vertical axis and the Subgroup horizontal axis, is a stacked bar chart that shows the proportion of patients in each subgroup who received each treatment in the selected treatment effect. This last point is important. For example, a three-arm study with 1:1:1 randomization to treatments A, B, and C (as in the CDISC Pilot study) will show treatments A and B at approximately 50% each for the treatment effects A-B or B-A. This part of the graph has an additional option that is discussed further below.
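The point about proportions in item 4 is easy to check with a little arithmetic. The sketch below uses hypothetical arm counts for a single subgroup and a hypothetical helper, `pair_proportions`, to show why a 1:1:1 study displays roughly 50% per arm rather than one third.

```python
# Hypothetical arm counts within one subgroup of a 1:1:1 three-arm study.
arm_counts = {"A": 30, "B": 30, "C": 30}

def pair_proportions(arm_counts, arm_a, arm_b):
    """Proportion of patients per arm, restricted to the two arms being compared."""
    n_a, n_b = arm_counts[arm_a], arm_counts[arm_b]
    total = n_a + n_b  # patients in arm C are excluded from the denominator
    return {arm_a: n_a / total, arm_b: n_b / total}

props = pair_proportions(arm_counts, "A", "B")
print(props)  # {'A': 0.5, 'B': 0.5}

# For contrast: the share of arm A among all three arms is one third.
share_of_all = arm_counts["A"] / sum(arm_counts.values())
print(round(share_of_all, 3))  # 0.333
```

Because the denominator includes only the two arms in the selected treatment effect, a stacked bar far from 50/50 signals an imbalance between those two arms within the subgroup.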

For example, the left-most “subgroup” with no factors selected and the greatest frequency (n = 168) represents the analysis population for these two treatments. The next largest subgroup (or the largest actual subgroup) is for Pooled Disease Duration >= 12, with 161 patients. The largest subgroup based on two factors is Pooled Disease Duration >= 12 and Ethnicity: Not Hispanic or Latino (n = 152). This can be compared to the factor level sample sizes of 161 and 159 for Pooled Disease Duration >= 12 and Ethnicity: Not Hispanic or Latino, respectively. Of the 152 patients in this subgroup, 51% and 49% received Xanomeline High Dose and Placebo, respectively. Marginal counts for each treatment are visible in the data table, or the user can click Frequencies by Treatment to change the bar charts to side-by-side bar charts with sample sizes displayed for each treatment.

As mentioned above, the upper part of the UpSet plot (the stacked bar chart summarizing the proportion of treatments) can be changed to show other details. As seen in Figure 3, the UpSet Format is set to Stacked Bar, which is the default view. One other view is available, Confidence Interval, which displays 95% confidence intervals for the treatment effect (Figure 4).

Figure 4. UpSet plot with confidence intervals
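For readers who want a feel for what a 95% confidence interval for a difference in means involves, here is a minimal sketch. The method is an assumption: it uses a normal approximation with an unpooled standard error, and the interval method used by JMP Clinical is not stated here and may differ (for example, a t-based interval). The data and the `diff_in_means_ci` helper are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def diff_in_means_ci(a, b, level=0.95):
    """Normal-approximation CI for mean(a) - mean(b).

    Assumption: unpooled standard error and a z quantile, chosen for
    simplicity. Each arm needs at least two values for a sample variance.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return diff, diff - z * se, diff + z * se

# Hypothetical change-from-baseline values within one subgroup.
high_dose = [1.0, 3.0, 2.0, 4.0]
placebo = [2.0, 4.0, 3.0, 5.0]

diff, lower, upper = diff_in_means_ci(high_dose, placebo)
print(f"difference = {diff:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Small subgroups have large standard errors, so their intervals are wide; this is the same behavior that produces the funnel shape when effects are plotted against subgroup size.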

Figure 5. Interaction plot

The final plot in the Subgroup Screening output is an interaction plot that displays the treatment effects within the levels of one factor with an overlay for the levels of a second factor. These variables can be modified using the Interaction Plot area of the Display Options (Figure 6).

Figure 6. Interaction plot display options

Points are sized according to the number of patients within each subgroup for the selected treatment effect.

While JMP Clinical 19 makes it easy to explore endpoints within subgroups, care must be taken to not overinterpret any findings, and users are encouraged to assess the totality of evidence for any notable findings within a subgroup:

  1. Consistency of results across multiple time points for a given endpoint.
  2. Consistency of results across increasing doses of treatment.
  3. Consistency of results across related endpoints.
  4. The presence of a reasonable scientific explanation.
Last Modified: Jul 3, 2025 9:00 AM