Comparative molecular cell-of-origin classification of diffuse large B-cell lymphoma based on liquid and tissue biopsies

Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous blood cancer, but can be broadly classified into two main subtypes, germinal center B-cell-like (GCB) and activated B-cell-like (ABC). GCB and ABC subtypes have very different clinical courses, with ABC having a much worse survival prognosis. Patients with different subtypes also respond differently to therapeutic intervention; in fact, some have argued that ABC and GCB can be thought of as separate diseases altogether. Due to this variability in response to therapy, an assay to determine DLBCL subtypes has important implications for guiding the clinical use of existing therapies, as well as for the development of new drugs. The current gold standard assay for subtyping DLBCL uses gene expression profiling on formalin-fixed, paraffin-embedded (FFPE) tissue to determine the “cell of origin” and thus disease subtype. However, this approach has significant clinical limitations in that it 1) requires a biopsy, 2) requires a complex, expensive and time-consuming analytical approach, and 3) does not classify all DLBCL patients. Here, we took an epigenomic approach and developed a blood-based chromosome conformation signature (CCS) for identifying DLBCL subtypes. An iterative approach using clinical samples from 118 DLBCL patients was taken to define a panel of six markers (DLBCL-CCS) to subtype the disease. The performance of the DLBCL-CCS was then compared to conventional gene expression profiling (GEX) from FFPE tissue. The DLBCL-CCS was accurate in classifying ABC and GCB in samples of known status, providing an identical call in 100% (60/60) of samples in the discovery cohort used to develop the classifier. In addition, in the assessment cohort the DLBCL-CCS was able to make a DLBCL subtype call in 100% (58/58) of samples with intermediate subtypes (Type III) as defined by GEX analysis. Most importantly, when these patients were followed longitudinally throughout the course of their disease, the EpiSwitch™ associated calls tracked better with the known patterns of survival rates for ABC and GCB subtypes. This proof-of-concept study provides an initial indication that a simple, accurate, cost-effective and clinically adoptable blood-based diagnostic for identifying DLBCL subtypes is possible.


Background
Diffuse large B-cell lymphoma (DLBCL) is the most common type of blood cancer and numerous studies using different methodologies have demonstrated it to be genetically and biologically heterogeneous [1,2]. The two principal DLBCL molecular subtypes are germinal center B-cell-like (GCB) and activated B-cell-like (ABC), although more granular definitions of molecular subtypes have also been proposed. These two primary subtypes have a high degree of clinical relevance, as they have dramatically different disease courses, with the ABC subtype having a far worse survival prognosis. Perhaps more importantly, as novel investigational agents to treat GCB and ABC (or non-GCB) subtypes are evaluated in clinical settings, and given the historical observation that overall response rates in unselected patients are low, there is a pressing need to identify patient subtypes prior to the initiation of therapy. Historically, DLBCL subtypes are determined by identifying the "cell of origin" (COO). The original COO classification was based on the observed similarity of DLBCL gene expression to activated peripheral blood B cells or normal germinal center B-cells by hierarchical clustering analysis [3]. This COO-classification by whole-genome expression profiling (GEP) classifies DLBCL into activated B-cell like (ABC), germinal center B-cell like (GCB), and Type-III (unclassified) subtypes, with the ABC-DLBCL characterized by a poor prognosis and constitutive NF-kB activation [3][4][5][6][7]. In their seminal work, Wright et al. identified 27 genes that were most discriminative in their expression between ABC and GCB-DLBCL, and developed a linear predictor score (LPS) algorithm for COO-classification [5]. These original studies were based entirely on retrospective investigations of fresh-frozen (FF) lymphoma tissues.
A major challenge for the application of this COO-classification in clinical practice has been the establishment of a robust clinical assay amenable to routine formalin-fixed paraffin-embedded (FFPE) diagnostic biopsies. Several studies have investigated the possibility of COO classification of DLBCL using FFPE tissues by quantitative measurement of mRNA expression, including quantitative nuclease protection assay [8], GEP with the Affymetrix HG U133 Plus 2.0 platform or the Illumina whole-genome DASL assay [9][10][11], and NanoString Lymphoma Subtyping Test (LST) technology [12]. Several immunohistochemistry (IHC)-based algorithms have also been investigated to recapitulate the COO-classification by GEP. In general, these studies demonstrated high confidence of COO-classification of DLBCL using FFPE tissues and a robust separation in overall survival between ABC and GCB subtypes, but they suffer from reproducibility issues, particularly a lack of concordance between assays [13][14][15][16]. In addition, any IHC-based measure requires baseline tissue, which is not always available, and current turnaround times from sample collection to assay readout are long, making implementation in clinical practice a challenge.
Among the approaches that have been used historically to subtype DLBCL, one method for COO assessment uses an assay that measures the expression of 27 genes from FFPE tissue by quantitative reverse transcription PCR (qRT-PCR) using the Fluidigm BioMark HD system [17]. While there are some advantages to this methodology over existing techniques, the approach still faces major obstacles that limit its clinical application in that it 1) requires a tissue biopsy and 2) relies on expensive, nonstandard and time-consuming laboratory procedures. As such, a blood-based assay would advance the field by providing a simple, reliable and cost-effective method for DLBCL subtyping with enhanced clinical applicability.
In this study, we used a novel blood-based assay to determine COO classification in DLBCL patients by focusing on detecting changes in genomic architecture. As part of the epigenetic regulatory framework, genomic regions can alter their 3-dimensional structure as a way of functionally regulating gene expression [18]. A result of this regulatory mechanism is the formation of chromatin loops at distinct genomic loci. The absence or presence of these loops can be empirically measured using chromosome conformation capture (3C), a measurement technology originally developed in 2002 [19]. Multiple genomic regions contribute to epistatic modulation through the formation of stable, conditional long-range chromosome interactions. The collective measurement of chromosome conformations at multiple genomic loci results in a chromosome conformation signature (CCS), or a molecular barcode that reflects the genome's response to its external environment [18,20]. For detection, screening and monitoring of CCSs we utilized the EpiSwitch platform, an established, high-resolution and high-throughput methodology for detecting CCSs. Based on 3C, the EpiSwitch platform has been developed to assess changes in chromatin structure at defined genetic loci as well as long-range non-coding cis- and trans-regulatory interactions [20]. Among the advantages of using EpiSwitch for patient stratification are its binary nature, reproducibility, relatively low cost, rapid turnaround time (samples can be processed in under 24 h), the requirement of only a small amount of blood (~50 μL) and compliance with FDA standards of PCR-based detection methodologies. Previous studies developing CCS-based biomarkers using EpiSwitch have provided valuable blood-based stratifications in a variety of oncological, immunological and neurodegenerative conditions [21][22][23][24][25]. For example, in a recently published study, a stepwise discovery approach was used to develop a 5-marker blood-based CCS that could identify patients with rheumatoid arthritis (RA) who were predicted to be likely non-responders to the first-line therapy, methotrexate (MTX). When the CCS was applied to samples from a blinded test cohort of RA patients prior to the initiation of MTX therapy, the panel was able to accurately identify likely non-responders with a true negative response rate of 86% and 90% sensitivity [25]. Thus, chromosome conformations offer a stable, binary readout of cellular states and represent an emerging class of biomarkers [26].
Here, we used an approach based on the assessment of changes in chromosomal architecture to develop a blood-based diagnostic test for DLBCL COO subtyping. We hypothesized that interrogation of genomic architecture changes in blood samples from DLBCL patients could offer an alternative method to tissue-based COO classification approaches and provide a novel, noninvasive, and more clinically applicable methodology to guide clinical decision making and trial design.

Patient characteristics
A total of 118 DLBCL patients with a known COO subtype and 10 healthy controls (HC) were used in this study. The samples were a subset of those collected in the MAIN study, a phase III, randomized, placebo-controlled trial of rituximab plus bevacizumab in aggressive Non-Hodgkin lymphoma (registered at clinicaltrials.gov, NCT identifier: 00486759). Detailed methods for the MAIN study have been described previously [27]. Briefly, adult patients aged ≥18 years with newly diagnosed CD20-positive DLBCL were randomized to R-CHOP or R-CHOP plus bevacizumab (RA-CHOP). Informed consent was obtained from all patients contributing tumor specimens for biomarker analysis at the time of data cut-off (November 30, 2011). Blood samples collected from 60 DLBCL patients were used as a development cohort to identify, evaluate, and refine the CCS biomarker leads. The patients from this cohort were all typed as high/strong GCB [28] or ABC [28] with a high subtype-specific linear predictor score (LPS). The remaining 58 DLBCL samples had intermediate LPS and were determined as ABC, GCB or Unclassified by Fluidigm testing (Supplemental Figure 1). These patient samples were not used for CCS biomarker discovery and development, but were used at a later stage to assess the resultant classifier. The Fluidigm testing was done using tissue obtained from lymph nodes (either as punch biopsies or removed during surgery), and the EpiSwitch analysis was done using matched peripheral whole blood collected from the patients prior to receiving any therapy.

Cell lines
In addition to patient samples, 12 cell lines (six ABC and six GCB) were also used in the initial stage of the biomarker screening to identify the set of chromosome conformations that could best discriminate between ABC and GCB disease subtypes (Supplemental Table 1). Cell lines were obtained from the American Type Culture Collection (ATCC), the German Collection of Microorganisms and Cell Cultures (DSMZ), and the Japan Health Sciences Foundation Resource Bank (JHSF).

COO assay by GEP
RNA was isolated and purified from pre-treatment FFPE biopsies. DLBCL subtypes were determined by adaptation of the Wright et al. algorithm [5] to expression data from a custom Fluidigm gene expression panel containing the 27 genes of the DLBCL subtype predictor [29]. Validation of the COO assay by comparing Fluidigm qRT-PCR to Affymetrix data in a cohort of 15 non-trial subjects revealed a high correlation between qRT-PCR measurements from matched fresh frozen (FF) and FFPE samples across the 19 classifier genes used. We also found a high correlation between Affymetrix microarray and Fluidigm qRT-PCR measurements from the same FF samples. Classifier gene weights calculated from qRT-PCR data from the Fluidigm COO assay were highly concordant with weights obtained from previously published microarray data in an independent patient cohort [28]. Applying the previously described gene expression signature [5], we observed high correlation (76% concordance) between the LPS derived from Fluidigm assay data in FFPE tumor and the LPS derived from Affymetrix microarray data in matched FF tissue in the technical registry cohort.
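As an illustration of how a linear predictor score of this kind can be turned into a subtype call, the following minimal Python sketch computes a weighted sum of expression values and converts it to a probability using two normal distributions, with a 0.9 cutoff leaving intermediate cases unclassified, as the Wright et al. approach is commonly described. The gene names, weights, distribution parameters and expression values are hypothetical placeholders and are not taken from this study.

```python
import math

# Hypothetical gene names and weights, for illustration only; the real
# predictor uses 27 genes with weights estimated from training data.
weights = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 1.0}

# Hypothetical (mean, sd) of the LPS in ABC and GCB training samples.
ABC_DIST = (10.0, 3.0)
GCB_DIST = (-10.0, 3.0)


def linear_predictor_score(expression, gene_weights):
    """Weighted sum of expression values over the classifier genes."""
    return sum(w * expression[g] for g, w in gene_weights.items())


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def call_subtype(lps, abc_dist, gcb_dist, cutoff=0.9):
    """Return (call, P(ABC)); scores between the cutoffs stay unclassified."""
    like_abc = normal_pdf(lps, *abc_dist)
    like_gcb = normal_pdf(lps, *gcb_dist)
    p_abc = like_abc / (like_abc + like_gcb)
    if p_abc >= cutoff:
        return "ABC", p_abc
    if p_abc <= 1.0 - cutoff:
        return "GCB", p_abc
    return "Unclassified", p_abc


# Hypothetical normalized expression values for one sample.
sample = {"GENE_A": 3.0, "GENE_B": -2.0, "GENE_C": 1.0}
lps = linear_predictor_score(sample, weights)
subtype, p_abc = call_subtype(lps, ABC_DIST, GCB_DIST)
print(lps, subtype)
```

Samples whose score falls between the two distributions receive a probability near 0.5 and remain unclassified, which corresponds to the intermediate (Type III) group discussed in this study.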

Identification of EpiSwitch markers
A pattern recognition algorithm was used to annotate the human genome for sites with the potential to form long-range chromosome conformations. The proprietary EpiSwitch pattern recognition software [22,23] is based on Bayesian modelling and provides a probabilistic score that a region is involved in long-range chromatin interactions. Sequences from 97 gene loci (Supplemental Table 2), selected based on a systematic literature review for genes that have been associated with DLBCL, were processed through the pattern recognition software to generate a list of the 13,322 chromosomal interactions most likely to be able to discriminate between DLBCL subtypes. For the initial screening, array-based comparisons were performed as described previously [25,30]. 60-mer oligonucleotide probes were designed to interrogate these potential interactions and uploaded as a custom array to the Agilent SureDesign website. Each probe was present in quadruplicate on the EpiSwitch microarray. To subsequently evaluate a potential CCS, nested PCR (EpiSwitch PCR) was performed using sequence-specific oligonucleotides designed using Primer3. Oligonucleotides were tested for specificity using oligonucleotide-specific BLAST.

Preparation of genomic templates
Chromatin with intact chromosome conformations from 50 μl of each blood sample was extracted using the EpiSwitch assay following the manufacturer's instructions (Oxford BioDynamics Plc) [21,23,24]. The EpiSwitch microarray and EpiSwitch PCR detection methods were performed as published previously [25,30,31].

Network analysis
The top ten genomic loci that were identified as being dysregulated in DLBCL were uploaded as a protein list to the Reactome Functional Interaction Network plugin in Cytoscape to generate a network of epigenetic dysregulation in DLBCL. The ten loci were also uploaded to STRING (Search Tool for the Retrieval of Interacting Genes/Proteins DB) (https://string-db.org/), a database containing over 9 million known and predicted protein-protein interactions [32]. Restricting to only human interactions, the main network (i.e. non-connected nodes were excluded) was generated. The top false discovery rate (FDR)-corrected functional enrichments were identified by Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) databases [33][34][35]. The top ten genomic loci were also uploaded to the KEGG Pathway Database (https://www.genome.jp/kegg/pathway.html) to identify specific biological pathways that exhibit dysregulation in DLBCL.

Statistical analysis
Exact and Fisher's exact test (for categorical variables) were used to identify discerning markers. The level of statistical significance was set at p ≤ 0.05, and all tests were 2-sided. The Random Forest classifier was used to assess the ability of the EpiSwitch markers to identify DLBCL subtypes. Long-term survival analysis was done by Kaplan-Meier analysis using the survival and survminer packages in R [36]. Mean survival times were compared using a two-tailed t-test.
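To make the marker-level test concrete, the sketch below gives a two-sided Fisher's exact test for a 2x2 table of marker presence by subtype, written with the Python standard library only. The counts in the example (a marker detected in 10 of 12 ABC and 2 of 12 GCB samples) are hypothetical and do not come from the study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows are the two groups (e.g. ABC, GCB); columns are marker
    present / marker absent. The p-value sums the probabilities of all
    tables with the same margins that are no more likely than the
    observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x "present" calls in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: marker present in 10/12 ABC and 2/12 GCB samples.
p_value = fisher_exact_two_sided(10, 2, 2, 10)
print(f"p = {p_value:.4f}")  # about 0.0033, below the 0.05 threshold
```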

Initial screening and definition of chromosome conformations in DLBCL
We employed a step-wise approach to discover and validate a CCS biomarker panel that could differentiate between DLBCL subtypes (Fig. 1). As a first step in the discovery of the EpiSwitch classifier, 97 genetic loci (Supplemental Table 2) previously associated with DLBCL [1,[37][38][39][40][41] were selected and annotated for the predicted presence of chromosome conformation interaction sites and screened for their empirical presence using the EpiSwitch CGH Agilent array. The annotated array design represented 13,322 chromosome interaction candidates, with an average of 99 distinct cis-interactions tested at each locus (99 ± 64, mean ± SD). This discovery array was used to screen and identify a smaller pool of chromosome conformations that could differentiate between the two main DLBCL subtypes. The samples used for this step were from GCB and ABC cell lines (Supplemental Table 1) as well as whole blood from four typed DLBCL patients (two GCB and two ABC) and four HCs. The cell lines were grouped into high ABC and GCB and low ABC and GCB based on gene expression analysis. The comparisons used on the array were: 1) individual comparisons of DLBCL patients to pooled HCs, 2) pooled DLBCL samples to pooled HC samples, 3) pooled high ABC compared to pooled high GCB cell lines, and 4) pooled low ABC versus pooled low GCB cell lines.
From the array analysis, we identified 1095 statistically significant chromosomal interactions that differentiated between high ABC and GCB cell lines and were present in blood samples from DLBCL patients, but absent in HCs. These were further reduced to the top 293 interactions using a set of statistical filters, 151 of which were associated with the ABC subtype and 143 of which were associated with the GCB subtype. The top 72 interactions (36 interactions for ABC and 36 interactions for GCB) were selected for further refinement using the EpiSwitch PCR platform on 60 typed DLBCL patient samples. For all 118 DLBCL samples, initial subtype classification was assigned based on the Wright algorithm, which calculates a linear predictor score (LPS) from the expression of a panel of 27 genes [5]. Sixty samples were classified as either ABC or GCB and used to develop the EpiSwitch classifier (the "Discovery Cohort"), and 58 samples had intermediate LPS values and were used to evaluate the performance of the EpiSwitch classifier (the "Assessment Cohort") (Fig. 1).

Refinement of DLBCL-specific chromosome conformation interactions and definition of the DLBCL-CCS
The 72 interactions identified in the initial screen were narrowed to a smaller pool using both the DLBCL patient samples during the discovery step and a second cohort of 60 typed DLBCL patient samples (30 ABC and 30 GCB) along with 12 HC (Fig. 1). The DLBCL subtype calls made by the EpiSwitch assay were confirmed using the Fluidigm platform. The Fluidigm gene expression analysis was performed on tissue biopsy samples, whereas whole blood from the same patients was used for the EpiSwitch PCR assay. The initial step in refinement was to confirm by PCR that the 72 chromosomal interactions identified in the initial screen were specific to DLBCL and were absent in the HC samples. This was first tested on six untyped DLBCL samples and two HCs and resulted in the identification of 21 interactions that were specific for DLBCL. Next, we used EpiSwitch PCR to test 24 blood samples from typed DLBCL patients (12 ABC and 12 GCB) to identify subtype-discriminating chromosome interactions using Fisher's exact test. This resulted in a set of 10 chromosome conformation interactions that could accurately discriminate between ABC and GCB subtypes; these were further evaluated on blood samples from an additional set of 36 DLBCL samples (18 ABC and 18 GCB) (Fig. 1).
To test the accuracy, performance and robustness of the 10-marker panel, we used the Exact test for feature selection on 80% of the complete sample cohort (48 samples in total: 24 ABC and 24 GCB), with the remaining 20% (12 samples: 6 ABC and 6 GCB) used for later testing of the final selected CCS markers. The data were split 10 times and the Exact test was run on the 80% training set of each split. The composite p-value for each of the 10 markers over the 10 splits was then used to rank the markers. This analysis identified six chromosome conformations in the IFNAR1, MAP3K7, STAT3, TNFRSF13B, MEF2B, and ANXA11 genetic loci. Collectively, these six interactions formed the DLBCL chromosome conformation signature (DLBCL-CCS) (Fig. 2).
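The repeated-split ranking described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: Fisher's exact test stands in for the Exact test, the composite p-value is taken to be the geometric mean of the per-split p-values (the text does not specify how p-values were combined), and the marker names and binary calls are synthetic.

```python
import math
import random
from math import comb


def fisher_p(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    probs = [comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(max(0, col1 - row2), min(row1, col1) + 1)]
    p_obs = comb(row1, a) * comb(row2, c) / total
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


def stratified_split(labels, train_frac, rng):
    """Indices of a training set that keeps the class proportions."""
    train = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        train.extend(idx[:round(train_frac * len(idx))])
    return sorted(train)


def rank_markers(calls, labels, n_splits=10, train_frac=0.8, seed=0):
    """Rank markers by composite p-value over repeated 80/20 splits.

    calls maps marker name -> list of 0/1 calls, one per sample.
    Composite = geometric mean of per-split p-values (an assumption).
    """
    rng = random.Random(seed)
    log_sum = {m: 0.0 for m in calls}
    for _ in range(n_splits):
        train = stratified_split(labels, train_frac, rng)
        for m, v in calls.items():
            a = sum(1 for i in train if labels[i] == "ABC" and v[i] == 1)
            b = sum(1 for i in train if labels[i] == "ABC" and v[i] == 0)
            c = sum(1 for i in train if labels[i] == "GCB" and v[i] == 1)
            d = sum(1 for i in train if labels[i] == "GCB" and v[i] == 0)
            log_sum[m] += math.log(fisher_p(a, b, c, d))
    composite = {m: math.exp(s / n_splits) for m, s in log_sum.items()}
    return sorted(composite.items(), key=lambda kv: kv[1])


# Synthetic data: 30 ABC then 30 GCB samples, two hypothetical markers.
labels = ["ABC"] * 30 + ["GCB"] * 30
calls = {
    "marker_strong": [1] * 30 + [0] * 30,         # perfectly subtype-specific
    "marker_noise": [i % 2 for i in range(60)],   # unrelated to subtype
}
ranking = rank_markers(calls, labels)
print(ranking[0][0])  # the subtype-specific marker ranks first
```

Each training set here contains 48 samples (24 ABC and 24 GCB), matching the split sizes reported in the text.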

Testing the performance of DLBCL-CCS and assessing the classifier model
The six markers in the DLBCL-CCS were used to generate a Random Forest classifier model, which was applied to classify the test sets for each of the data splits (12 samples: 6 ABC and 6 GCB) in the Discovery Cohort of known disease subtypes. By principal component analysis (PCA), the DLBCL-CCS classifier was able to separate ABC and GCB patients from healthy controls (Supplemental Figure 2). The composite prediction probabilities for the DLBCL-CCS are shown in Supplemental Table 3, along with the odds ratio for each marker and the odds ratio for the model generated using logistic regression. The model provided a prediction probability score for ABC and GCB, ranging from 0.186 to 0.81 (0 = ABC, 1 = GCB; Table 3). The area under the receiver operating characteristic (ROC) curve (AUC) for the DLBCL-CCS classifier on this sample cohort was 1 (Fig. 3b). Last, we compared the DLBCL subtype calls made by the DLBCL-CCS to the long-term survival curves of the patients with known disease subtype. The patients called as ABC showed significantly worse survival than those patients called as GCB (Fig. 3c).
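For readers less familiar with the AUC, the short sketch below computes it directly from prediction scores as the fraction of (GCB, ABC) sample pairs in which the GCB sample receives the higher score, counting ties as one half. The six scores are illustrative values chosen to span the reported 0.186 to 0.81 range and are not the study data.

```python
def roc_auc(scores, labels, positive="GCB"):
    """AUC as the probability that a positive sample outscores a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative scores on the 0 = ABC, 1 = GCB scale (hypothetical values).
example_scores = [0.186, 0.25, 0.31, 0.70, 0.76, 0.81]
example_labels = ["ABC", "ABC", "ABC", "GCB", "GCB", "GCB"]
print(roc_auc(example_scores, example_labels))  # 1.0: no overlap between groups
```

An AUC of 1 therefore means that every GCB sample scored higher than every ABC sample, regardless of where a decision threshold is placed.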

Comparative analysis of EpiSwitch DLBCL-CCS and Fluidigm to classify type III DLBCL patients
Next, we evaluated the performance of the DLBCL-CCS on the Assessment Cohort of 58 DLBCL patients with a more intermediate LPS value. We applied the DLBCL-CCS to assign these patients into DLBCL subtypes and compared the readouts to those made by Fluidigm. The DLBCL-CCS made subtyping calls for all 58 samples, whereas the Fluidigm assay made subtyping calls for 37 of the samples, leaving 21 as "unclassified" (Fig. 4). Of the 37 samples where subtype calls from both assays were available, 15 samples (40%) were called similarly by both assays (8 ABC and 7 GCB) (Fig. 4). Next, we evaluated the performance of the DLBCL subtype calls made by the DLBCL-CCS and Fluidigm by comparing the subtype calls made at diagnosis with the long-term survival curves of the Type III patients. As shown in the Kaplan-Meier survival curves in Fig. 5, the ABC/GCB calls made by the DLBCL-CCS were able to separate the two populations based on the known survival trends in DLBCL, with the ABC subtype having a worse prognosis [42]. In contrast, the ABC and GCB populations as defined by Fluidigm showed the opposite of what has been observed clinically, with samples classified as ABC having longer survival times than those classified as GCB. Though not statistically significant, the subtype calls made by the DLBCL-CCS matched historical clinical observations of survival differences between the subtypes by hazard ratio analysis (Supplemental Figure 3). We did find a difference between the two methods in mean survival time. The mean survival of patients classified as ABC and GCB by Fluidigm was 651 and 626 days, respectively (p = 0.854), while the mean survival of patients classified as ABC and GCB by the DLBCL-CCS assay was 550 and 801 days, respectively (p = 0.017) (Fig. 6).
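The survival curves referred to above are Kaplan-Meier (product-limit) estimates, which in this study were generated with the survival and survminer packages in R. As a language-neutral illustration of the estimator only, the following minimal Python sketch computes the curve for a small hypothetical group; the follow-up times and censoring flags are invented for the example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient (e.g. days).
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up (days) for five patients; 0 marks censoring.
km_times = [100, 200, 200, 300, 400]
km_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(km_times, km_events):
    print(t, round(s, 3))
```

Computing one such curve per called subtype (ABC versus GCB) and comparing them is the basis of the comparison shown in Fig. 5.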

Biological relevance of deregulated DLBCL-CCS loci to disease
In order to explore the relationship between the loci that were observed to be epigenetically dysregulated in this study and biological mechanisms that have previously been reported to be linked to DLBCL, we performed a series of network and pathway analyses using the top 10 dysregulated loci as inputs. First, we explored how these loci were biologically related by building a Reactome Functional Interaction Network in Cytoscape, which revealed a network centred on NFKB1, STAT3 and NFATC1 (Fig. 7a). A similar picture emerged when the 10 loci were used to build a network using STRING DB, with the most connected hubs centring on NFKB1, STAT3, MAP3K7 and CD40 (Fig. 7b). The top enriched GO term for biological process was "positive regulation of transcription, DNA-templated", the top enriched GO term for molecular function was "transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific binding" and the "Toll-like receptor signalling pathway" was the most enriched KEGG pathway (Supplemental Table 3).

Discussion
Due to the observed differences in disease progression for the different DLBCL subtypes, there is a pressing clinical need for a simple and reliable test that can differentiate between ABC and GCB disease subtypes. Given the aggressive nature of the disease, DLBCL requires immediate treatment. The two main subtypes have different clinical management paradigms and, with several therapeutic modalities in development that target specific subtypes, a rapid and accurate disease diagnostic is critical when clinical management depends on knowing the disease subtype. The field of COO-classification in DLBCL has expanded from IHC-based methodologies [13][14][15][16]43] to DNA microarrays, parallel quantitative reverse transcription PCR (qRT-PCR) and digital gene expression [3][4][5]17]. A currently favoured method is based on identification of the COO by GEP on FFPE tissue and suffers from technical and logistical limitations that restrict its broad adoption in the clinical setting. In addition, there are many factors that affect the performance and reliability of COO-classification by GEP on FFPE tissue, including the nature and quality of the lymphoma specimen, the experimental methods for data collection, data normalization and transformation, the type of classifier used, and the probability cut-offs used for subtype assignment. Last, going from sample collection to an end readout using the Fluidigm approach is a complex and time-consuming process, with many intermediate steps having the potential to introduce performance variability. All of these factors have an impact on the overall turnaround time of the assay and limit how it can be used clinically to diagnose the disease and inform its treatment with existing medications, as well as to select patients for late-stage trials of novel DLBCL therapeutics. Thus, a simple, minimally invasive and reliable assay to differentiate DLBCL subtypes is needed.
Using a stepwise discovery approach, we identified a 6-marker epigenetic biomarker panel, the DLBCL-CCS, that could accurately discriminate between DLBCL subtypes. When compared to the subtype results derived from the gene expression signature, there was perfect concordance, which was expected as these were samples that were used to develop the classifier. The concordance between the two assays when applied to samples with an intermediate LPS was lower (just over 40%). This is perhaps expected, as it has been noted that there is a lack of overall concordance in DLBCL subtype calls between different methods of classification, and the Type III samples are perhaps a more heterogeneous population reflecting a more intermediate biology to begin with [44,45]. However, when we evaluated the predictive classification ability of the EpiSwitch assay in the Type III DLBCL patients followed longitudinally as their disease progressed, baseline predictions of disease subtype using the EpiSwitch assay were better at predicting actual disease subtype based on observed survival curves in patients with unclassified disease. The observation that the epigenetic readout based on regulatory 3D genomics used here is more consistent with actual clinical outcomes than the transcription-based gold-standard molecular approaches represents an actionable advance in the management of DLBCL. It is also consistent with the latest systems biology evaluations of regulatory 3D genomics as a molecular modality closely linked to phenotypical differences in oncological conditions [20].
We do note that DLBCL operates on a biological continuum, with significant heterogeneity in disease biology between subtypes. By design, the DLBCL-CCS was set up to classify Type III samples into either ABC or GCB subtypes. By GEX analysis, the Type III samples were identified as having intermediate subtype biology, so they may represent a more heterogeneous population of patients. However, the overall observation that the DLBCL-CCS was a better predictor of disease subtype, as measured by clinical progression, than a GEX-based approach, together with the fact that the EpiSwitch assay was able to make subtype calls in all samples, provides an initial indication that this approach can be applied in a clinical setting to inform on prognostic outlook, potentially guide treatment decisions, and provide predictions for response to novel therapeutic agents currently in development.
In the network analysis, the NF-kB and STAT3 signalling cascades emerged as putative mediators that differentiate between DLBCL subtypes. The role of NF-kB signalling in DLBCL has been studied before; in fact, one of the discriminating features of the ABC subtype is constitutive expression of NF-kB target genes, a mechanism which has been hypothesized to underlie the poor prognosis in these patients [7,46]. In addition, mutations causing constitutive signalling activation have been observed predominantly in the ABC subtype for several NF-kB pathway genes, including TNFAIP3 and MYD88 [47,48]. In a study published early this year, Liu et al. used a bioinformatics approach to analyse three sets of previously published GEP studies (including the study that was used to develop the current gold standard subtyping assay) performed on FFPE DLBCL samples to identify key hub genes and pathways that were associated with DLBCL subtypes. In addition to validating the expression of STAT3 as a key gene, this meta-analysis of 500 DLBCL samples identified JAK-STAT and NF-kB as key pathways associated with the two subtypes of DLBCL [49]. Specifically, gene set enrichment analysis (GSEA) revealed that genes involved in the JAK-STAT signalling pathway were upregulated in ABC and downregulated in GCB. As both STAT3 and NF-kB have been identified as therapeutic targets for DLBCL, the network analysis here confirms the biological link between the genomic loci in the DLBCL-CCS and mechanisms of disease progression and supports the prognostic and monitoring capabilities of the EpiSwitch CCSs identified here [50][51][52][53]. In addition to validating known mechanisms of DLBCL, the network analysis here identified a novel potential target for therapeutic intervention in DLBCL: ANXA11, a calcium-regulated phospholipid-binding protein, has been implicated in other oncological conditions such as colorectal cancer, gastric cancer and ovarian cancer and could be a novel therapeutic intervention point in DLBCL [54][55][56].
One of the major clinical advantages of the approach to DLBCL subtyping described here lies in the simplified laboratory methodology and workflow. Conventional, gold-standard subtyping by GEP can be done using a variety of commercial platforms, but all generally follow (and require) a four-step approach: 1) acquisition of a tissue biopsy, 2) preparation of FFPE tissue sections, 3) gene expression analysis and 4) algorithmic classification of subtype. Obtaining a fine needle tissue biopsy of an enlarged, peripheral lymph node requires an inpatient visit to a clinical site and an invasive medical procedure requiring anaesthetic. Once obtained, the fresh biopsy needs to be prepared for paraffin embedding. This is a multi-step process, but generally involves immersion in a liquid fixing agent (such as formalin) long enough for it to penetrate through the entire specimen, sequential dehydration through an ethanol gradient, followed by clearing in xylene, a toxic chemical. Last, the biospecimen needs to be infiltrated with paraffin wax and left to cool so that it solidifies and can be cut into micrometer sections using a microtome and mounted onto laboratory slides. The entire process of going from fresh tissue to FFPE sections on a slide can take several days. Next, in order to perform gene expression analysis, inherently unstable RNA is extracted from slide-mounted tissue sections and prepared for hybridization to microarrays according to the array manufacturer's specifications, a process that can take over a day. Following microarray hybridization, digital readouts of relative gene expression levels are obtained and fed into a classification algorithm to determine DLBCL subtype. All told, the process of going from a patient with suspected DLBCL to a subtype readout can take a week or longer and involves many different experimental steps using expensive technologies, each of which has the potential to introduce experimental variability along the way.
In the approach described here, the time and the number of steps from biofluid collection to subtype readout are dramatically decreased. A patient with suspected DLBCL can present to an outpatient clinic for a routine, small volume (~1 mL) blood draw. Fresh frozen blood can then be shipped to a central, accredited reference lab for analysis of the absence/presence of the chromosome conformations identified in this study. This analysis uses an even smaller volume (~50 μL) of whole blood as input, along with specific PCR primer sets and reaction conditions, to detect the chromosome conformations using simple and routine PCR instrumentation in less than 24 h from sample receipt. The approach to DLBCL subtyping described here offers an additional advantage in that the methodology leaves room for further refinement. In this study, final readout of the DLBCL-CCS was done using a set of nested PCR reactions to detect the chromosome conformations making up the classifier. This PCR-based output can be further refined to utilize quantitative PCR as a readout and to operate under the minimum information for publication of quantitative real-time PCR experiments (MIQE) guidelines, which are designed to enhance experimental reproducibility and reliability across reference labs and testing sites.
Last, the approach described here is adaptable to the evolving understanding of the disease itself. Recent studies have suggested that DLBCL is more physiologically heterogeneous than initially appreciated and that, rather than simply two subtypes, a spectrum of different genetic subtypes exists [57]. While the more detailed clinical annotations described in that work were not available at the time of the study, the general discovery approach taken here can easily be applied to additional sample cohorts. Although the current study focused on assessing chromosome conformation changes in a specific disease context, the general approach described has broader applications. The results presented here are consistent with the recent evidence of epigenetic markers acting as strong, systemic, surrogate signatures across oncological conditions [58]. While DLBCL serves as a notable example of the need to assess the molecular characteristics of a disease in order to guide treatment strategies, there are many other oncological conditions where objective, reliable, and easily measured molecular biomarkers are needed, especially in cases where patient prognosis is poor and/or there is an increasing availability of molecularly targeted therapeutic options. A notable example in oncology is the development of checkpoint inhibitor therapies, where a multitude of drugs targeting the PD-L1 pathway have been approved or are in development, but where biomarkers for prediction of response are lacking [59,60]. Recently, profiling of chromosome conformation signatures in peripheral blood has shown utility in the prediction of response to checkpoint inhibitor therapy, prior to the start of treatment, in patients with non-small cell lung cancer and other cancers, with better clinical performance characteristics than current molecular approaches [61][62][63].
In sum, our results suggest that systemic alterations in genome topology, as assessed by 3D genomics technologies, represent a novel and informative class of emerging molecular biomarkers for the assessment of oncological disease and response to therapeutic intervention.

Conclusion
In conclusion, we developed a robust complementary method for non-invasive COO assignment from whole blood samples using EpiSwitch CCS readouts. We demonstrated the clinical validity of this classification approach on a large cohort of DLBCL patients. Chromosome conformations have emerged as a promising new class of biomarkers in oncology. In fact, a recent study performed Hi-C on primary B-cells of a DLBCL patient and detected significant structural variation between the patient's B-cells and healthy B-cells [64,65].
The EpiSwitch platform has several attractive features as a biomarker modality with clinical utility. CCSs have very high biochemical stability, can be detected using very small amounts of blood (typically around 50 μl), and their detection utilizes established laboratory methodologies and standard PCR readouts (including MIQE-compliant qPCR) [26]. Finally, the rapid turnaround time (~8-16 h) of the EpiSwitch assay compares favourably with the more than 48 h required for the Fluidigm platform [66]. The application of this complementary assay can enable prospective selection of patients for therapeutic clinical trials and, ultimately, can be used to guide appropriate patient management in clinical practice.