By: Jon Armstrong, Chief Scientific Officer, Cofactor Genomics
Advances in new immunotherapies and novel combinations of therapies have resulted in an explosion of oncology clinical trials. While many of these therapies show promising results in early-stage development, it’s often unclear why certain patients, usually the majority of those treated, do not respond.
Patient selection is critical to the success of a clinical trial, especially when the therapy will have a companion diagnostic and corresponding biomarker. Early biomarker development leveraged mutations in genes known to be involved in cancer biology, such as NRAS, BRAF, and EGFR. Later biomarkers included copy number variants (CNVs) and gene fusions, and began to make use of the dynamic nature of RNA. The most promising marker, now the on-label diagnostic for many of the leading PD-1/PD-L1 inhibitors, is the detection of PD-L1 protein by immunohistochemistry (IHC). However, all of these approaches, especially the current practice of PD-L1 testing, are considered imperfect. Each analyte is evaluated in isolation rather than in concert with other biological analytes, so none of these “isolated” markers can capture the dynamic, cascaded biology of disease. Arguably, if we could do better at defining and utilizing biomarkers that more comprehensively characterize patients and disease, we’d see much higher success rates in drug development and clinical trials.
The European Society for Medical Oncology noted in a recent Biomarker FactSheet, “it is likely that a more complex, multicomponent predictive biomarker system will be required” for successful patient selection for checkpoint inhibitor therapies. Two key points are identified here: first, the need for increased predictive power, and second, the notion that this might be accomplished through a multicomponent marker. To date, several innovative approaches to multicomponent markers have shown promise. One example is an international collaboration focused on transcriptomic-based subtyping of colorectal cancer (CRC). This team has taken a forward-thinking approach to characterizing CRC patients by looking at mRNA expression to define genes, pathways, and immune signatures that might better define the disease. Other teams have taken a similar multigenic expression approach, and some of the resulting tests have found clinical utility and commercialization, such as the Afirma Gene Expression Classifier from Veracyte and Genomic Health’s Oncotype DX.
While these approaches have shown early promise compared with isolated, single-analyte biomarkers, they do not make use of the advanced data analysis capabilities that we have at our disposal, notably machine learning. We see two major opportunities to utilize machine learning to address this challenge:
The first opportunity is in modeling disease itself. Considering the improvements observed with multi-gene expression classifiers, more advanced models of disease should be considered. We call these multidimensional models Health Expression Models: models built with machine learning that can describe many different health states, including cells, patients, and drug response. These models measure the dynamic, interconnected expression levels of many genes, going beyond the presence or absence of a single, isolated transcript. The genes included in these models may themselves be selected using machine learning, which mitigates bias while capturing more nuanced signals not previously recognized as important. By creating Health Expression Models of an important component of the immune response - immune cells and subtypes - we have shown increased sensitivity and specificity in detecting these cells in the tumor microenvironment of solid tumors. However, this increased sensitivity and specificity alone may not deliver the improvement in predictive power required to address the challenge of patient selection.
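To make the contrast between a single-analyte call and a multigene model concrete, here is a minimal sketch in Python. It is not the method behind Health Expression Models, which is not described in this article. It assumes, purely for illustration, that a cell type is represented by a reference expression profile and that a sample is scored by its Pearson correlation with that profile. The gene names (`GENE_A` to `GENE_D`), the threshold, and all expression values are made-up placeholders.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def signature_score(sample, reference):
    """Score a sample by how closely its expression pattern tracks a
    reference profile across all signature genes (illustrative stand-in
    for a multigene model, not a published method)."""
    genes = sorted(reference)
    return pearson([sample[g] for g in genes], [reference[g] for g in genes])

# Hypothetical reference profile for one immune cell type (made-up values).
reference = {"GENE_A": 8.0, "GENE_B": 6.5, "GENE_C": 2.0, "GENE_D": 1.0}

# Two made-up samples. Both express GENE_A strongly, but only the first
# follows the reference pattern across the whole signature.
sample_1 = {"GENE_A": 7.5, "GENE_B": 6.0, "GENE_C": 2.5, "GENE_D": 1.5}
sample_2 = {"GENE_A": 7.5, "GENE_B": 1.0, "GENE_C": 6.0, "GENE_D": 5.5}

# Single-analyte call: is one marker above an arbitrary threshold?
single_marker_call = [s["GENE_A"] > 5.0 for s in (sample_1, sample_2)]

# Multigene call: does the whole pattern match the reference?
scores = [signature_score(s, reference) for s in (sample_1, sample_2)]

print(single_marker_call)
print([round(s, 2) for s in scores])
```

In this toy data the single marker cannot tell the two samples apart, while the multigene score separates them clearly. A real model would cover far more genes and would be validated against known controls.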
The second opportunity, and the solution to this potentially limited predictive power, is the new discipline of Predictive Immune Modeling. In this approach, patient cohorts from a study with robust clinical outcomes data are first characterized using Health Expression Models. The multigenic Health Expression Models detected in each patient sample are then benchmarked against the cohort groups that define response. The machine-learning algorithm considers all available Health Expression Models in combination and determines the most predictive multidimensional biomarker possible. The predictive power and statistical significance of this Predictive Immune Model determine the next steps: one can either apply the model to future studies and diagnostic development, or add more data to improve its predictive power. These improvements have great promise for increasing the success of clinical trials, for both patients and drug developers.
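The workflow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm used in Predictive Immune Modeling, which this article does not specify. It assumes each patient has one score per model, uses a simple nearest-centroid classifier, and uses leave-one-out accuracy as a stand-in for predictive power. The model names (`cd8_t_cell`, `m2_macrophage`), the function names, and all scores are hypothetical toy values.

```python
from itertools import combinations
from statistics import mean

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the class whose mean feature vector (centroid) is closest to x."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab]))

def loo_accuracy(X, y):
    """Leave-one-out accuracy: hold out each patient once and predict them."""
    hits = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)

def best_model_combination(scores, responded, max_size=2):
    """Try every combination of up to max_size models and keep the one
    with the highest leave-one-out accuracy."""
    best_combo, best_acc = None, -1.0
    for k in range(1, max_size + 1):
        for combo in combinations(sorted(scores), k):
            X = [[scores[name][i] for name in combo]
                 for i in range(len(responded))]
            acc = loo_accuracy(X, responded)
            if acc > best_acc:
                best_combo, best_acc = combo, acc
    return best_combo, best_acc

# Toy cohort of 8 patients: 4 responders followed by 4 non-responders.
# Scores are made up so that neither model separates the groups alone.
responded = [True, True, True, True, False, False, False, False]
scores = {
    "cd8_t_cell":    [0.9, 0.6, 0.8, 0.5, 0.7, 0.4, 0.6, 0.3],
    "m2_macrophage": [0.5, 0.2, 0.4, 0.1, 0.8, 0.5, 0.7, 0.4],
}

combo, acc = best_model_combination(scores, responded)
print(combo, acc)
```

On this toy cohort the two-model combination classifies every held-out patient correctly, while each model alone does not. Note that picking the best combination and reporting its accuracy on the same cohort gives an optimistic estimate, which is one reason an independent study or additional data is needed before the model is used for patient selection.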
The development of Health Expression Models is not far removed from where we started with gene expression classifiers and immune profiling. However, early results indicate that by making use of machine learning, both for model generation and for combining models in Predictive Immune Modeling, we significantly increase the predictive power of the resulting biomarker. While machine learning and advanced assays often seem difficult to adopt, the team at Cofactor has worked hard to develop standards and controls for assay validation. Learn more about how we're taking this approach forward in a CAP-validated workflow in our laboratory, including study design and results.