Medical Policy
Policy Num: 11.003.001
Policy Name: Laboratory Tests Post Transplant and for Heart Failure
Policy ID: [11.003.001] [Ac / B / M- / P-] [2.01.68]
Last Review: November 15, 2024
Next Review: November 20, 2025
Related Policies:
07.003.009 - Heart/Lung Transplant
07.003.007 - Heart Transplant
References No. | Populations | Interventions | Comparators | Outcomes |
1 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
2 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
3 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
4 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
5 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
6 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
7 | Individuals: | Interventions of interest are: | Comparators of interest are: | Relevant outcomes include: |
Clinical assessment and noninvasive imaging of chronic heart failure can be limited in accurately diagnosing patients with heart failure because symptoms and signs can poorly correlate with objective methods of assessing cardiac dysfunction. For management of heart failure, clinical signs and symptoms (eg, shortness of breath) are relatively crude markers of decompensation and occur late in the course of an exacerbation. Thus, circulating biomarkers have potential benefit in heart failure diagnosis and management.
In transplant recipients, despite the progress in immunosuppressant therapy, the risk of rejection remains. Diagnosis of allograft rejection continues to rely on clinical monitoring and histologic confirmation by tissue biopsy. However, due to limitations of tissue biopsy, including a high degree of interobserver variability in the grading of results and its potential complications, less invasive alternatives have been investigated. Several laboratory-tested biomarkers of transplant rejection have been evaluated and are commercially available for use. The laboratory tests for heart transplant rejection currently evaluated in this policy include the Presage® ST2 Assay kit, which measures the soluble suppression of tumorigenicity-2 (sST2) protein biomarker; the Heartsbreath test, which measures breath markers of oxidative stress; the AlloSure, Prospera Heart, and myTAIHEART tests for assessment of donor-derived cell-free DNA (dd-cfDNA); the AlloMap test, which uses gene expression profiling (GEP); and the HeartCare test, which combines AlloMap GEP testing with the AlloSure test. Also included in this policy are the AlloSure and Prospera dd-cfDNA tests for assessment of renal and lung transplant rejection.
For individuals who have chronic heart failure who receive the sST2 assay to determine prognosis and/or to guide management, the evidence includes correlational studies and 2 meta-analyses. Relevant outcomes are overall survival (OS), quality of life, and hospitalization. Most of the evidence is from reanalysis of existing randomized controlled trials (RCTs) and not from studies specifically designed to evaluate the predictive accuracy of sST2, and prospective and retrospective cross-sectional studies made up a large part of 1 meta-analysis. Studies have mainly found that elevated sST2 levels are statistically associated with an elevated risk of mortality. A pooled analysis of study results found that sST2 significantly predicted overall mortality and cardiovascular mortality. Several studies, however, found that sST2 test results did not provide additional prognostic information compared with N-terminal pro B-type natriuretic peptide levels. Moreover, no comparative studies were identified on the use of the sST2 assay to guide the management of patients diagnosed with chronic heart failure. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive the sST2 assay to determine prognosis and/or to predict acute cellular rejection, the evidence includes a small number of retrospective studies on the Presage ST2 Assay. Relevant outcomes are OS, morbid events, and hospitalization. No prospective studies were identified that provide high-quality evidence on the ability of sST2 to predict transplant outcomes. One retrospective study (n = 241) found that sST2 levels were associated with acute cellular rejection and mortality; another study (n = 26) found that sST2 levels were higher during an acute rejection episode than before rejection. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive a measurement of volatile organic compounds to assess cardiac allograft rejection, the evidence includes a diagnostic accuracy study. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The published study found that, for identifying grade 3 (now grade 2R) rejection, the negative predictive value (NPV) of the breath test (97.2%) was similar to that of endomyocardial biopsy (96.7%) and the sensitivity of the breath test (78.6%) was better than that of biopsy (42.4%). However, the breath test had a lower specificity (62.4%) and a lower positive predictive value (PPV) (5.6%) in assessing grade 3 rejection than a biopsy (specificity, 97%; PPV, 45.2%). The breath test was also not evaluated for grade 4 rejection. This single study is not sufficient to determine the clinical validity of the test measuring volatile organic compounds, and no studies on clinical utility were identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
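The relationship between the accuracy measures cited above can be illustrated with a short calculation. The sketch below applies Bayes' rule to the reported breath test sensitivity and specificity; the 3% prevalence of grade 3 rejection is an assumption chosen only for illustration, and the function name is hypothetical. It is a simplified model and is not expected to reproduce the study's published PPV and NPV exactly, since those were derived from the study's own data and reference standard.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity (78.6%) and specificity (62.4%) as reported for the breath test.
# ASSUMPTION: 3% prevalence of grade 3 rejection, used only for illustration.
ppv, npv = predictive_values(0.786, 0.624, 0.03)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under these assumptions, a low prevalence of rejection keeps the PPV low even when sensitivity is moderate, while the NPV stays high. This is one reason a high NPV alone does not establish clinical validity.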
For individuals who have a heart transplant who receive dd-cfDNA testing to determine acute rejection, the evidence includes diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. Evidence from 3 studies suggests that the dd-cfDNA fraction is elevated in acute rejection, but optimal fraction cut-offs for detection of acute rejection have not been established. Using dd-cfDNA thresholds ranging from 0.12% to 0.32% resulted in NPVs ranging from 82% to 98% and AUCs ranging from 0.61 to 0.86 in 3 studies. At present, no studies evaluating the clinical utility for the measurement of dd-cfDNA for heart transplant rejection have been identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive GEP to assess cardiac allograft rejection, the evidence includes 2 diagnostic accuracy studies and several RCTs evaluating clinical utility. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The 2 studies examining the diagnostic performance of GEP for detecting moderate-to-severe rejection, Cardiac Allograft Rejection Gene Expression Observation (CARGO) and CARGO II, lacked a consistent threshold for defining a positive GEP test (ie, 20, 30, or 34) and reported a low number of positive cases. In the available studies, although the NPVs were relatively high (ie, at least 88%), the performance characteristics were only calculated based on 10 or fewer cases of rejection; therefore, performance data may be imprecise. Moreover, the PPV in CARGO II was only 4.0% for patients who were at least 2 to 6 months post transplant and 4.3% for patients more than 6 months post transplant. The threshold indicating a positive test that seems to be currently accepted (a score of 34) was not prespecified; rather it evolved partway through the data collection period in the Invasive Monitoring Attenuation through Gene Expression (IMAGE) study. In addition, the IMAGE study had several methodologic limitations (eg, lack of blinding); further, the IMAGE study failed to provide evidence that GEP offers an incremental benefit over biopsy performed on the basis of clinical exam or echocardiography. Patients at the highest risk of transplant rejection are patients within 1 year of the transplant, and, for that subset, there remain insufficient data on which to evaluate the clinical utility of GEP. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive GEP with testing of dd-cfDNA to assess cardiac allograft rejection, the evidence includes 1 retrospective analysis of the HeartCare test and 1 diagnostic accuracy study of the AlloSure dd-cfDNA component of the HeartCare test. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The HeartCare analysis reported a 12.7% reduction in endomyocardial biopsy volume among patients undergoing routine surveillance. However, this observation is limited by lack of reporting on long-term health outcomes and incomplete assessment of diagnostic performance for combined testing, as patients with negative dd-cfDNA scores did not undergo biopsy regardless of GEP score per study protocol. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals with a renal transplant who are undergoing surveillance or have clinical suspicion of allograft rejection who receive testing of dd-cfDNA to assess renal allograft rejection, the evidence includes diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. One study examined the diagnostic performance of dd-cfDNA for detecting moderate-to-severe rejection; the NPV was moderately high (84%), and performance characteristics were calculated on 27 cases of active transplant rejection. The threshold indicating a positive test was not prespecified. A subsequent smaller single-center study that explored variation in clinical validity based on different rejection mechanisms found the strongest performance characteristics for AlloSure with antibody-mediated rejection. A retrospective single-center study of the Prospera dd-cfDNA test reported a PPV and NPV of 52% and 95%, respectively, for detection of active rejection among a combined cohort of patients undergoing surveillance or for-cause biopsies, using the 1% dd-cfDNA threshold previously proposed for the AlloSure test. A second, prospective Prospera study reported PPVs of 68% and 71% and NPVs of 91% and 83% using combined dd-cfDNA fraction and absolute quantity compared with 2 different reference standards. Larger prospective studies validating the dd-cfDNA thresholds for active rejection are needed to develop conclusions for each test. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals with a lung transplant who receive testing of dd-cfDNA to assess lung allograft rejection, the evidence includes 4 small diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. One study examined the diagnostic performance of dd-cfDNA testing at a threshold of 0.87% for detecting acute cellular rejection (ACR), yielding a PPV of 34.1% and an NPV of 85.5%. A second study reported a PPV of 43.3% and an NPV of 83.6% for an aggregate rejection cohort composed of patients with ACR, antibody-mediated rejection (AMR), and chronic lung allograft dysfunction (CLAD). In the third study, using a dd-cfDNA cut-off of 1.0%, PPV was 51.9% and NPV was 97.3% for acute rejection, and 43.6% and 91.0%, respectively, for acute rejection, CLAD/NRAD, or infection. A fourth study that used dd-cfDNA testing as part of a home surveillance program found a PPV of 43.4% and an NPV of 96.5% for detection of ACR, AMR, or infection, but when limited to patients with a contemporaneous reference standard surveillance bronchoscopy independent of dd-cfDNA level, PPV was 66.7% and NPV was 79.2%. All 4 studies were limited by small sample sizes, and no clinical utility studies were identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Not applicable.
The objective of this evidence review is to determine whether the measurement of various selected biomarkers improves the detection of allograft rejection in transplant patients or in the diagnosis and management of heart failure, thus improving net health outcomes.
The use of the Presage ST2 Assay to evaluate the prognosis of individuals diagnosed with chronic heart failure is considered investigational.
The use of the Presage ST2 Assay to guide management (eg, pharmacologic, device-based, exercise) of individuals diagnosed with chronic heart failure is considered investigational.
The use of the Presage ST2 Assay in the post cardiac transplantation period, including but not limited to predicting prognosis and predicting acute cellular rejection, is considered investigational.
The measurement of volatile organic compounds to assist in the detection of moderate grade 2R (formerly grade 3) heart transplant rejection is considered investigational.
The use of peripheral blood measurement of dd-cfDNA in the post cardiac transplantation period, including but not limited to predicting prognosis and predicting acute cellular rejection, is considered investigational.
The use of peripheral blood gene expression profile tests alone or in combination with peripheral blood measurement of donor-derived cell-free DNA (dd-cfDNA) in the management of individuals after heart transplantation, including but not limited to the detection of acute heart transplant rejection or heart transplant graft dysfunction, is considered investigational.
The use of peripheral blood measurement of dd-cfDNA in the management of individuals after renal transplantation, including but not limited to the detection of acute renal transplant rejection or renal transplant graft dysfunction, is considered investigational.
The use of peripheral blood measurement of dd-cfDNA in the management of individuals after lung transplantation, including but not limited to the detection of acute lung transplant rejection or lung transplant graft dysfunction, is considered investigational.
The U.S. Food and Drug Administration has indicated that the Heartsbreath (Menssana Research) test is only for use as an aid in the diagnosis of grade 3 (now known as grade 2R) heart transplant rejection in patients who have received heart transplants within the preceding year and who have had endomyocardial biopsy within the previous month.
See the Codes table for details.
BlueCard/National Account Issues
State or federal mandates (eg, Federal Employee Program) may dictate that certain U.S. Food and Drug Administration-approved devices, drugs, or biologics may not be considered investigational, and thus these devices may be assessed only by their medical necessity.
Monitoring of heart transplantation rejection is a specialized procedure that may require an out-of-network referral. Specifically, the AlloMap test is only performed in the company’s laboratory.
Benefits are determined by the group contract, member benefit booklet, and/or individual subscriber certificate in effect at the time services were rendered. Benefit products or negotiated coverages may have all or some of the services discussed in this medical policy excluded from their coverage.
Heart failure is a major cause of morbidity and mortality worldwide. The term heart failure refers to a complex clinical syndrome that impairs the heart's ability to move blood through the circulatory system.1, The prevalence of heart failure in the U.S. between 2013 and 2016 was an estimated 6.2 million for Americans ≥20 years old, up from 5.7 million between 2009 and 2012.2,3, Heart failure is the leading cause of hospitalization among people older than age 65 years, with direct and indirect costs estimated at $37 billion annually in the U.S.2, Although survival has improved with treatment advances, absolute mortality rates of heart failure remain near 50% within 5 years of diagnosis.
Heart failure can be caused by disorders of the pericardium, myocardium, endocardium, heart valves or great vessels, or metabolic abnormalities. Individuals with heart failure may present with a wide range of left ventricular (LV) anatomy and function. Some have normal LV size and preserved ejection fraction; others have severe LV dilatation and depressed ejection fraction. However, most patients present with key signs and symptoms secondary to congestion in the lungs from impaired LV myocardial function.1, They include dyspnea, orthopnea, and paroxysmal dyspnea. Other symptoms include weight gain due to fluid retention, fatigue, weakness, and exercise intolerance secondary to diminished cardiac output.
Initial evaluation of a patient with suspected heart failure is typically based on clinical history, physical examination, and chest radiograph. Because people with heart failure may present with nonspecific signs and symptoms (eg, dyspnea), accurate diagnosis can be challenging. Therefore, noninvasive imaging procedures (eg, echocardiography, radionuclide angiography) are used to quantify pump function of the heart, thus identifying or excluding heart failure in patients with characteristic signs and symptoms. These tests can also be used to assess prognosis by determining the severity of the underlying cardiac dysfunction.1, However, clinical assessment and noninvasive imaging can be limited in accurately evaluating patients with heart failure because symptoms and signs can poorly correlate with objective methods of assessing cardiac dysfunction.4,5,6, Thus, invasive procedures (eg, cardiac angiography, catheterization) are used in select patients with presumed heart failure symptoms to determine the etiology (ie, ischemic vs. nonischemic) and physiologic characteristics of the condition.
Patients with heart failure may be treated using a number of interventions. Lifestyle factors such as the restriction of salt and fluid intake, monitoring for increased weight, and structured exercise programs are beneficial components of self-management. A variety of medications are available to treat heart failure. They include diuretics (eg, furosemide, hydrochlorothiazide, spironolactone), angiotensin-converting enzyme inhibitors (eg, captopril, enalapril, lisinopril), angiotensin receptor blockers (eg, losartan, valsartan, candesartan), β-blockers (eg, carvedilol, metoprolol succinate), and vasodilators (eg, hydralazine, isosorbide dinitrate). Numerous device-based therapies are also available. Implantable cardioverter defibrillators reduce mortality in patients with an increased risk of sudden cardiac death. Cardiac resynchronization therapy improves symptoms and reduces mortality for patients who have disordered LV conduction evidenced by a wide QRS complex on electrocardiogram. Ventricular assist devices are indicated for patients with end-stage heart failure who have failed all other therapies and are also used as a bridge to cardiac transplantation in select patients.1,
Because of limitations inherent in standard clinical assessments of patients with heart failure, a number of objective disease biomarkers have been investigated to diagnose and assess heart failure patient prognosis, with the additional goal of using biomarkers to guide therapy.7, They include a number of proteins, peptides, or other small molecules whose production and release into circulation reflect the activation of remodeling and neurohormonal pathways that lead to LV impairment. Examples include B-type natriuretic peptide (BNP), its analogue N-terminal pro B-type natriuretic peptide (NT-proBNP), troponin T and I, renin, angiotensin, arginine vasopressin, C-reactive protein, and norepinephrine.1,7,
BNP and NT-proBNP are considered the reference standards for biomarkers in assessing heart failure patients. They have had a substantial impact on the standard of care for diagnosis of heart failure and are included in the recommendations of all major medical societies, including the American College of Cardiology Foundation and American Heart Association, 8, European Society of Cardiology,9, and the Heart Failure Society of America.10, Although natriuretic peptide levels are not 100% specific for the clinical diagnosis of heart failure, elevated BNP or NT-proBNP levels in the presence of clinical signs and symptoms reliably identify the presence of structural heart disease due to remodeling and heightened risk for adverse events. Natriuretic peptides also can help in determining the prognosis of heart failure patients, with elevated blood levels portending a poorer prognosis.
In addition to diagnosing and assessing the prognosis of heart failure patients, blood levels of BNP or NT-proBNP have been proposed as an aid for managing patients diagnosed with chronic heart failure.8,11,12, Levels of either biomarker rise in response to myocardial damage and LV remodeling, whereas they tend to fall as drug therapy ameliorates symptoms of heart failure. Evidence from a large number of randomized controlled trials (RCTs) that have compared BNP- or NT-proBNP-guided therapy with clinically guided adjustment of pharmacologic treatment of patients who had chronic heart failure has been assessed in recent systematic reviews and meta-analyses. However, these analyses have not consistently reported a benefit for BNP-guided management. Savarese et al (2013) published the largest meta-analysis to date, a patient-level meta-analysis that evaluated 2686 patients from 12 RCTs.11, This meta-analysis showed that NT-proBNP-guided management was associated with significant reductions in all-cause mortality and heart failure-related hospitalization compared with clinically guided treatment. Although BNP-guided management in this meta-analysis was not associated with significant reductions in these parameters, differences in patient numbers and characteristics may explain the discrepancy. Troughton et al (2014) conducted a second patient-level meta-analysis that included 11 RCTs with 2000 patients randomized to natriuretic peptide-guided pharmacologic therapy or usual care.12, The results showed that, among patients 75 years of age or younger with chronic heart failure, most of whom had impaired left ventricular ejection fraction, natriuretic peptide-guided therapy was associated with significant reductions in all-cause mortality compared with clinically guided therapy. Natriuretic peptide-guided therapy also was associated with significant reductions in hospitalization due to heart failure or cardiovascular disease.
A protein biomarker, ST2, has elicited interest as a potential aid to predict prognosis and manage therapy of heart failure.13,14,15,16,17,18,19, This protein is a member of the interleukin-1 (IL-1) receptor family. It is found as a transmembrane isoform (ST2L) and a soluble isoform (sST2), both of which have circulating IL-33 as their primary ligand. ST2 is a unique biomarker that has pluripotent effects in vivo. Thus, binding between IL-33 and ST2L is believed to have an immunomodulatory function via T-helper type 2 lymphocytes and was initially described in the context of cell proliferation, inflammatory states, and autoimmune diseases.20, However, the IL-33/ST2L signaling cascade is also strongly induced through the mechanical strain of cardiac fibroblasts or cardiomyocytes. The net result is mitigation of adverse cardiac remodeling and myocardial fibrosis, which are key processes in the development of heart failure.21, The soluble isoform of ST2 is produced by lung epithelial cells and cardiomyocytes and is secreted into circulation in response to exogenous stimuli, mechanical stress, and cellular stretch. This form of ST2 binds to circulating IL-33, acting as a "decoy," thus inhibiting the IL-33-associated antiremodeling effects of the IL-33/ST2L signaling pathway. Thus, on a biologic level, IL-33/ST2L signaling plays a role in modulating the balance of inflammation and neurohormonal activation and is viewed as pivotal for protection from myocardial remodeling, whereas sST2 is viewed as attenuating this protection. In the clinic, blood concentrations of sST2 appear to correlate closely with adverse cardiac structure and functional changes consistent with remodeling in patients with heart failure, including abnormalities in filling pressures, chamber size, and systolic and diastolic function.7,15,17,
An enzyme-linked immunosorbent-based assay is commercially available for determining sST2 blood levels (Presage ST2 Assay).18, The manufacturer claims a limit of detection of 1.8 ng/mL for sST2, and a limit of quantification of 2.4 ng/mL, as determined according to Clinical and Laboratory Standards Institute guideline EP-17-A. Mueller and Dieplinger (2013) reported a limit of detection of 2.0 ng/mL for sST2 in their study.18, In the same study, the assay had a within-run coefficient of variation of 2.5% and a total coefficient of variation less than 4.0%, demonstrated linearity within the dynamic range of the assay calibration curve, and exhibited no relevant interference or cross-reactivity.
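As a brief illustration of the precision measure cited above, the coefficient of variation (CV) is the standard deviation of replicate measurements divided by their mean, expressed as a percentage. The sketch below is a minimal example; the function name and the replicate values are hypothetical and are not data from the Mueller and Dieplinger study or the manufacturer.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent CV: sample standard deviation divided by the mean, times 100."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    return stdev(values) / mean(values) * 100.0


# HYPOTHETICAL replicate sST2 readings (ng/mL) from a single run; not study data.
replicates = [34.1, 35.0, 33.6, 34.8, 34.4]
cv = coefficient_of_variation(replicates)
print(f"within-run CV = {cv:.1f}%")
```

A within-run CV is computed from replicates measured in the same run, whereas a total CV also reflects variation between runs and days.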
The ST2 biomarker is not intended to diagnose heart failure because it is a relatively nonspecific marker that is increased in many other disparate conditions that may be associated with acute or chronic manifestations of heart failure.17,18, Although the natriuretic peptides (BNP, NT-proBNP) reflect different physiologic aspects of heart failure compared with sST2, they are considered the reference standard biomarkers when used with clinical findings to diagnose, prognosticate, and manage heart failure and as such are the comparator to sST2.
Most cardiac transplant recipients experience at least a single episode of rejection in the first year after transplantation. The International Society for Heart and Lung Transplantation (2005) modified its grading scheme for categorizing cardiac allograft rejection.22, The revised (R) categories are listed in Table 1.
New Grade | Definition | Old Grade |
0R | No rejection | |
1R | Mild rejection | 1A, 1B, and 2 |
2R | Moderate rejection | 3A |
3R | Severe rejection | 3B and 4 |
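For readers who need to reconcile older study reports with the revised categories, the mapping in Table 1 can be expressed as a simple lookup. This is a minimal sketch; the names OLD_TO_REVISED and revised_grade are hypothetical, and the mapping covers only the old grades that Table 1 lists.

```python
# Old-to-revised grade mapping transcribed from Table 1.
# Table 1 lists no old grade for 0R, so none is mapped here.
OLD_TO_REVISED = {
    "1A": "1R", "1B": "1R", "2": "1R",  # mild rejection
    "3A": "2R",                         # moderate rejection
    "3B": "3R", "4": "3R",              # severe rejection
}


def revised_grade(old_grade):
    """Return the revised (R) grade for an old grade, or None if not in Table 1."""
    return OLD_TO_REVISED.get(old_grade.strip().upper())


print(revised_grade("3A"))
```

Returning None for an unlisted grade, rather than guessing, keeps the lookup faithful to the table as written.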
Acute cellular rejection is most likely to occur in the first 6 months after transplantation, with a significant decline in the incidence of rejection after this time. Although immunosuppressants are required on a life-long basis, dosing is adjusted based on graft function and the grade of acute cellular rejection determined by histopathology. Endomyocardial biopsies are typically taken from the right ventricle via the jugular vein periodically during the first 6 to 12 months post transplant. The interval between biopsies varies among clinical centers. A typical schedule is weekly for the first month, once or twice monthly for the following 6 months, and several times (monthly to quarterly) between 6 months and 1-year post transplant. Surveillance biopsies may also be performed after the first postoperative year (eg, on a quarterly or semiannual basis). This practice, although common, has not been demonstrated to improve transplant outcomes. Some centers no longer routinely perform endomyocardial biopsies after 1 year in patients who are clinically stable.
While the endomyocardial biopsy is the criterion standard for assessing heart transplant rejection, it is limited by a high degree of interobserver variability in the grading of results and potential morbidity that can occur with the biopsy procedure. Also, the severity of rejection may not always coincide with the grading of the rejection by biopsy. Finally, a biopsy cannot be used to identify patients at risk of rejection, limiting the ability to initiate therapy to interrupt the development of rejection. For these reasons, an endomyocardial biopsy is considered a flawed criterion standard by many. Therefore, noninvasive methods of detecting cellular rejection have been explored. It is hoped that noninvasive tests will assist in determining appropriate patient management and avoid overuse or underuse of treatment with steroids and other immunosuppressants that can occur with false-negative and false-positive biopsy reports. Two techniques are commercially available for the detection of heart transplant rejection.
In addition to the proposed use of ST2 as an aid to predict prognosis and manage therapy of heart failure, elevated serum ST2 levels have been associated with an increased risk of antibody-mediated rejection following a heart transplant. For this reason, ST2 has also been proposed as a prognostic marker post heart transplantation and as a test to predict acute cellular rejection (graft-versus-host disease). The Presage ST2 Assay, described above, is a commercially available sST2 test that has been investigated as a biomarker of heart transplant rejection.
The Heartsbreath test, a noninvasive test that measures breath markers of oxidative stress, has been developed to assist in the detection of heart transplant rejection. In heart transplant recipients, oxidative stress appears to accompany allograft rejection, degrading membrane polyunsaturated fatty acids and evolving alkanes and methylalkanes that are, in turn, excreted as volatile organic compounds in breath. The Heartsbreath test analyzes the breath methylated alkane contour, which is derived from the abundance of C4 to C20 alkanes and monomethylalkanes and has been identified as a marker to detect grade 3 (clinically significant) heart transplant rejection.
Cell-free DNA (cfDNA), released by damaged cells, is normally present in healthy individuals.23, In patients who have received transplants, dd-cfDNA may also be present. It is proposed that allograft rejection, which is associated with damage to transplanted cells, may result in an increase in dd-cfDNA. HeartCare (CareDx) is a commercially available test that combines AlloMap gene expression profiling with a next-generation sequencing assay that quantifies the fraction of dd-cfDNA in cardiac transplant recipients relative to total cfDNA. The AlloMap score, AlloMap score variability, and AlloSure % dd-cfDNA are reported.
Prospera Heart (Natera) is a commercially available assay that uses massively multiplexed PCR (mmPCR) followed by next-generation sequencing (NGS) to quantify the fraction of dd-cfDNA in transplant recipients. Donor versus recipient cfDNA is differentiated via an advanced bioinformatics analysis of >13,000 single-nucleotide polymorphisms (SNPs) without the need for prior recipient or donor genotyping or computational adjustments for related donors.24, The Prospera Heart test reports the dd-cfDNA fraction in the patient’s blood as a predictor of acute rejection, although the optimal dd-cfDNA cut-point is not described by the manufacturer.
Using proprietary myTAIHEART software (TAI Diagnostics), the myTAIHEART test uses multiplexed, high-fidelity amplification followed by allele-specific qPCR of a panel of 94 highly informative bi-allelic single nucleotide polymorphisms (SNPs) and two controls to quantitatively genotype cfDNA in the patient’s plasma after cardiac transplant, and accurately distinguish dd-cfDNA originating from the engrafted heart from cfDNA originating from the recipient’s native cells.25, The ratio of dd-cfDNA to total cfDNA is reported as the donor fraction (%) and categorizes the patient as at low or increased risk of moderate (grade 2R) to severe (grade 3R) acute cellular rejection: low donor fractions indicate less damage to the transplanted heart and a lower risk for rejection, while increased donor fractions indicate more damage to the transplanted heart and an increased risk for rejection. Testing with myTAIHEART does not require a donor specimen. TAI Diagnostics suspended production of the myTAIHEART test in 2020. As of September 2022, TAI Diagnostics appears to no longer be operational and it is unclear if myTAIHEART will be available through another company in the future.
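The donor fraction described above is a simple ratio, which the following sketch illustrates. It is not the myTAIHEART algorithm, which is proprietary. The function names, the input quantities, and the 0.25% cutoff are all hypothetical and chosen only for illustration; no validated cutoff is stated in this policy, and treating a value equal to the cutoff as increased risk is likewise an assumption.

```python
def donor_fraction_percent(dd_cfdna, total_cfdna):
    """Donor fraction (%) = donor-derived cfDNA / total cfDNA * 100."""
    if total_cfdna <= 0:
        raise ValueError("total cfDNA must be positive")
    if not 0 <= dd_cfdna <= total_cfdna:
        raise ValueError("dd-cfDNA must be between 0 and total cfDNA")
    return dd_cfdna / total_cfdna * 100.0


def risk_category(donor_fraction, cutoff):
    """Label a donor fraction as 'low' or 'increased' risk relative to a cutoff.

    ASSUMPTION: values at or above the cutoff are labeled 'increased'.
    """
    return "increased" if donor_fraction >= cutoff else "low"


# HYPOTHETICAL quantities (same arbitrary units) and a HYPOTHETICAL 0.25% cutoff.
fraction = donor_fraction_percent(3.0, 2000.0)
print(f"donor fraction = {fraction:.2f}% -> {risk_category(fraction, 0.25)} risk")
```

The cutoff is passed as an argument rather than fixed in the code because optimal dd-cfDNA thresholds for detecting acute rejection have not been established.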
Another approach has focused on patterns of gene expression of immunomodulatory cells, as detected in the peripheral blood. For example, microarray technology permits the analysis of the expression of thousands of genes, including those with functions known or unknown. Patterns of gene expression can then be correlated with known clinical conditions, permitting a selection of a finite number of genes to compose a custom multigene test panel, which then can be evaluated using polymerase chain reaction techniques. AlloMap (CareDx) is a commercially available molecular expression test that has been developed to detect acute heart transplant rejection or the development of graft dysfunction. The test involves polymerase chain reaction-expression measurement of a panel of genes derived from peripheral blood cells and applies an algorithm to the results. The proprietary algorithm produces a single score that considers the contribution of each gene in the panel. The score ranges from 0 to 40. The AlloMap website states that a lower score indicates a lower risk of graft rejection; the website does not cite a specific cutoff for a positive test.26, All AlloMap testing is performed at the CareDx reference laboratory in California.
Other laboratory-tested biomarkers of heart transplant rejection have been evaluated. They include brain natriuretic peptide, troponin, and soluble inflammatory cytokines. Most have had low accuracy in diagnosing rejection. Preliminary studies have evaluated the association between heart transplant rejection and micro-RNAs or high-sensitivity cardiac troponin in cross-sectional analyses but the clinical use has not been evaluated.27,28,
Allograft dysfunction is typically asymptomatic and has a broad differential, including graft rejection. Diagnosis and rapid treatment are recommended to preserve graft function and prevent loss of the transplanted organ. For a primary kidney transplant from a deceased donor (accounting for about 70% of kidney donors), graft survival at 1 year is 95%; at 5 years, graft survival is 78%.29,30,
Surveillance of transplant kidney function relies on routine monitoring of serum creatinine, urine protein levels, and urinalysis.31, Allograft dysfunction may also be demonstrated by a drop in urine output or, rarely, as pain over the transplant site. With clinical suspicion of allograft dysfunction, additional noninvasive workup including ultrasonography or radionuclide imaging may be used. A renal biopsy allows a definitive assessment of graft dysfunction and is typically a percutaneous procedure performed with ultrasonography or computed tomography guidance. Biopsy of a transplanted kidney is associated with fewer complications than biopsy of a native kidney because the allograft is typically transplanted more superficially than a native kidney. Renal biopsy is a low-risk invasive procedure that may result in bleeding complications; loss of a renal transplant, as a complication of renal biopsy, is rare.32,
Kidney biopsies allow for diagnosis of acute and chronic graft rejection, which may be graded using the Banff Classification.33,34, Pathologic assessment of biopsies demonstrating acute rejection allows clinicians to further distinguish between acute cellular rejection and antibody-mediated rejection, which are treated differently.
AlloSure Kidney (CareDx) is a commercially available, next-generation sequencing assay that quantifies the fraction of dd-cfDNA in renal transplant recipients relative to total cfDNA by measuring 266 single nucleotide variants. Separate genotyping of the donor or recipient is not required but patients who receive a kidney transplant from a monozygotic (identical) twin are not eligible for this test. The fraction of dd-cfDNA relative to total cfDNA present in the peripheral blood sample is cited in the report. For patients undergoing surveillance, a routine testing schedule is recommended for longitudinal monitoring.
Prospera Kidney (Natera) is a commercially available assay that quantifies the fraction of dd-cfDNA in renal transplant recipients. The manufacturer recommends use of the Prospera test when there is clinical suspicion of active rejection and for regular surveillance of subclinical rejection in renal transplant recipients.35, In a surveillance scenario, regular testing is recommended at 1, 2, 3, 4, 6, 9 and 12 months after renal transplant or most recent rejection.36, Thereafter, the test should be repeated quarterly. The proportion of dd-cfDNA relative to total cfDNA is reported, with detection of ≥1% dd-cfDNA indicating increased risk for active rejection. The percent dd-cfDNA change between tests is also reported.
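The reporting logic described above can be illustrated with a minimal sketch. It is not the manufacturer's algorithm: the function names and the idea of passing donor and total cfDNA quantities directly are hypothetical (the commercial assay estimates the fraction from SNP data), and the sketch assumes that "quarterly" means every 3 months starting 3 months after the month-12 test. Only the ≥1% threshold and the first-year schedule come from the text.

```python
from __future__ import annotations

# From the text: detection of >=1% dd-cfDNA indicates increased risk for active rejection.
INCREASED_RISK_THRESHOLD_PCT = 1.0

# From the text: surveillance months after transplant or most recent rejection.
FIRST_YEAR_MONTHS = [1, 2, 3, 4, 6, 9, 12]


def dd_cfdna_fraction_pct(donor_cfdna: float, total_cfdna: float) -> float:
    """Return donor-derived cfDNA as a percentage of total cfDNA.

    Inputs are hypothetical quantities in the same (arbitrary) unit.
    """
    if total_cfdna <= 0:
        raise ValueError("total_cfdna must be positive")
    if donor_cfdna < 0 or donor_cfdna > total_cfdna:
        raise ValueError("donor_cfdna must be between 0 and total_cfdna")
    return 100.0 * donor_cfdna / total_cfdna


def is_increased_risk(fraction_pct: float) -> bool:
    """Apply the >=1% cut-point quoted in the text."""
    return fraction_pct >= INCREASED_RISK_THRESHOLD_PCT


def surveillance_months(through_month: int) -> list[int]:
    """List scheduled test months up to and including through_month.

    Assumption: quarterly testing after month 12 falls on months 15, 18, 21, ...
    """
    months = [m for m in FIRST_YEAR_MONTHS if m <= through_month]
    months.extend(range(15, through_month + 1, 3))
    return months


if __name__ == "__main__":
    pct = dd_cfdna_fraction_pct(donor_cfdna=1.2, total_cfdna=100.0)
    print(f"dd-cfDNA fraction: {pct:.2f}%  increased risk: {is_increased_risk(pct)}")
    print("Schedule through month 24:", surveillance_months(24))
```

The threshold is kept as a named constant because other dd-cfDNA tests discussed in this policy use different cut-points or do not specify one.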
Despite advances in induction and maintenance immunosuppressive regimens, lung transplant recipients have a median overall survival of 6 years, with more than a third of patients receiving treatment for acute rejection in the first year after transplant.37,38, Acute cellular rejection, lymphocytic bronchiolitis, and antibody-mediated rejection are all risk factors for subsequent development of chronic lung allograft dysfunction (CLAD). Pathologic grading of acute cellular rejection is based on the histological assessment of perivascular and interstitial mononuclear cell infiltrates. Antibody-mediated rejection may be clinical (symptomatic or asymptomatic allograft dysfunction) or subclinical (normal allograft function). Key diagnostic criteria established via consensus by the International Society for Heart and Lung Transplantation include the presence of antibodies directed toward donor human leukocyte antigens and characteristic lung histology with or without evidence of complement 4d within the graft.39, The most common phenotype of CLAD is a persistent obstructive decline in lung function known as bronchiolitis obliterans syndrome (BOS), which is graded based on the degree of decrease in FEV1. Approximately 50% of patients develop BOS within 5 years post-transplant. Median survival following a diagnosis of BOS is 3-5 years. Acute rejection may present with non-specific physical symptoms or be asymptomatic. However, the role of surveillance bronchoscopy for screening asymptomatic patients for acute rejection is controversial, and performance of surveillance bronchoscopies varies across transplant centers.
AlloSure Lung (CareDx) is a commercially available, NGS assay that quantifies the fraction of dd-cfDNA in lung transplant patients relative to total cfDNA by measuring single nucleotide polymorphisms. The test is intended to provide a direct, noninvasive measure of organ injury in lung transplant patients who are undergoing surveillance. Suggested thresholds for severe injury, injury, and quiescence are 1%, 0.85%, and <0.5%, respectively.40,
Prospera Lung (Natera) is a commercially available assay that uses the same methodology as Prospera Heart and Prospera Kidney to quantify the fraction of dd-cfDNA in transplant recipients. The Prospera Lung test reports the dd-cfDNA fraction in the patient’s blood as a predictor of acute rejection, chronic rejection, or infection, although the optimal dd-cfDNA cut-point for each outcome is not described by the manufacturer.41,
The U.S. Food and Drug Administration (FDA) has cleared multiple biomarker tests for the detection of heart and renal allograft rejection. Table 2 provides a summary of the biomarker tests currently included in this policy that have FDA clearance.
Test | Manufacturer | FDA Clearance Type, Product Number | FDA Clearance Date | Indicated Use |
Heartsbreath™ | Menssana Research | Humanitarian device exemption, H030004 | 2004 | To aid in diagnosing grade 3 heart transplant rejection in patients who have received heart transplants within the preceding year. The device is intended as an adjunct to, and not as a substitute for, endomyocardial biopsy and is also limited to patients who have had endomyocardial biopsy within the previous month. |
AlloMap® Molecular Expression Testing | CareDx, formerly XDx | 510(k), k073482 | 2008 | The test is to be used in conjunction with clinical assessment, for aiding in the identification of heart transplant recipients with stable allograft function and a low probability of moderate-to-severe transplant rejection. It is intended for patients at least 15 years old who are at least 2 months post transplant. |
Presage® ST2 Assay kit | Critical Diagnostics | 510(k), k093758 | 2011 | For use with clinical evaluation as an aid in assessing the prognosis of patients diagnosed with chronic heart failure |
FDA: Food and Drug Administration.
There are also commercially available laboratory-developed biomarker tests for the detection of heart and renal allograft rejection. Clinical laboratories may develop and validate tests in-house and market them as a laboratory service; laboratory-developed tests must meet the general regulatory standards of the Clinical Laboratory Improvement Amendments. To date, AlloSure (CareDx) renal and lung and Prospera (Natera) renal dd-cfDNA tests are regulated under the Clinical Laboratory Improvement Amendments standards.
myTAIHEART is also a laboratory-developed test (LDT) developed for clinical diagnostic performance exclusively in the College of American Pathologists (CAP)- and Clinical Laboratory Improvement Amendments (CLIA)-accredited TAI Diagnostics Clinical Reference Laboratory.24, This test was developed and its performance characteristics were determined by TAI Diagnostics.
These LDTs have not been cleared or approved by the FDA.
Other commercially available LDTs without FDA clearance or approval for use have been excluded from this evidence review when studies reporting on the clinical validity of the marketed version of the test could not be identified and/or where the test is marketed for research use only. Excluded tests and their descriptions are summarized for reference purposes in Table 3.
Test | Manufacturer | Technology | Indications for Use |
KidneyCare® | CareDx | dd-cfDNA and GEP | Available as a research tool through the OKRA Registry. |
AlloSeq® HCT | CareDx | NGS | To aid in the assessment of engraftment following HCT via NGS analysis of 202 biallelic SNPs. The fraction of recipient and donor genomic DNA is reported. The test is marketed for research use only. |
AlloSeq® Tx17 | CareDx | NGS | An NGS test utilizing Hybrid Capture Technology conducted pre-transplant to identify optimal transplant matches. The test sequences full HLA genes and other transplant-associated genes (KIR, MICA/B, C4, HPA, ABO). This test is marketed for research use only. |
Viracor TRAC® | Eurofins | dd-cfDNA | To aid in the diagnosis of solid organ transplant rejection via NGS analysis. The fraction of dd-cfDNA is reported.1 |
MMDx® Heart | Kashi Clinical Laboratories | Tissue-based microarray | Tissue-based microarray mRNA gene expression test of 1283 genes post-transplant to provide a probability score of rejection as a complement to conventional biopsy processing. The test is not marketed to provide information for the diagnosis, prevention, or treatment of disease or to aid in the clinical decision-making process. |
MMDx® Kidney | Kashi Clinical Laboratories | Tissue-based microarray | Tissue-based microarray mRNA gene expression test of 1494 genes post-transplant to provide a probability score of rejection as a complement to conventional biopsy processing. The test is not marketed to provide information for the diagnosis, prevention, or treatment of disease or to aid in the clinical decision-making process. |
dd-cfDNA: donor-derived cell-free DNA; GEP: gene expression profiling; HCT: hematopoietic cell transplantation; HLA: human leukocyte antigen; MMDx: molecular microscope diagnostic system; NGS: next-generation sequencing; OKRA: Outcomes in KidneyCare in Renal Allografts; SNP: single-nucleotide polymorphism; TRAC: transplant rejection allograft check. 1 Published studies reporting on the clinical validity of the marketed version of the test were not identified.
This evidence review was created in November 2004 and has been updated regularly with searches of the PubMed database. The most recent literature update was performed through August 21, 2023.
Evidence reviews assess whether a medical test is clinically useful. A useful test provides information to make a clinical management decision that improves the net health outcome. That is, the balance of benefits and harms is better when the test is used to manage the condition than when another test or no test is used to manage the condition.
The first step in assessing a medical test is to formulate the clinical context and purpose of the test. The test must be technically reliable, clinically valid, and clinically useful for that purpose. Evidence reviews assess the evidence on whether a test is clinically valid and clinically useful. Technical reliability is outside the scope of these reviews, and credible information on technical reliability is available from other sources.
Promotion of greater diversity and inclusion in clinical research of historically marginalized groups (e.g., People of Color [African-American, Asian, Black, Latino and Native American]; LGBTQIA (Lesbian, Gay, Bisexual, Transgender, Queer, Intersex, Asexual); Women; and People with Disabilities [Physical and Invisible]) allows policy populations to be more reflective of and findings more applicable to our diverse members. While we also strive to use inclusive language related to these groups in our policies, use of gender-specific nouns (e.g., women, men, sisters, etc.) will continue when reflective of language used in publications describing study populations.
The purpose of the Soluble Suppression of Tumorigenicity-2 (sST2) assay is to determine prognosis and/or to guide management in patients with chronic heart failure as an alternative to or an improvement on existing tests and clinical assessment.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with chronic heart failure.
The test being considered is sST2 assay to determine prognosis and/or to guide management. Elevated sST2 levels are purported to predict a higher risk of poor outcomes.
Comparators of interest include standard prognostic markers, including B-type natriuretic peptide levels and clinical assessment.
The general outcomes of interest are overall survival (OS), quality of life, and hospitalizations. Follow-up of 6-12 months would be appropriate to assess quality of life outcomes.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of clinical validity of sST2 testing, methodologically credible studies were selected using the following principles:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described
Included a validation cohort separate from the development cohort
A number of clinical studies in which sST2 blood levels were determined using the Presage ST2 Assay have reported an association between sST2 levels and adverse outcomes in patients diagnosed with chronic heart failure. A substantial body of biomarker evidence has been reported retrospectively from subsets of patients enrolled in randomized controlled trials (RCTs) of heart failure interventions. These RCTs include the Valsartan Heart Failure Trial (Val-HeFT)42,; Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION)43,; Controlled Rosuvastatin Multinational Trial in Heart Failure (CORONA)44,; and ProBNP Outpatient Tailored Chronic Heart Failure study (PROTECT).45, Although patients in these RCTs were well-characterized and generally well-matched between study arms, the trials were neither intended nor designed specifically to evaluate biomarkers as risk predictors. At present, no prospectively gathered evidence is available from an RCT in which sST2 levels were compared with levels of a B-type natriuretic peptide (BNP or N-terminal pro B-type natriuretic peptide [NT-proBNP]) to predict risk for adverse outcomes among well-defined cohorts of patients with diagnosed chronic heart failure. Key results of larger individual studies are summarized in Table 4.
Study | Population | Mean Age, y | Study Description and Biomarkers | Primary Endpoints | Mean FU | Synopsis of Findings |
Ky et al (2011)46, | Ambulatory CHF (N = 1,141, 75% of Penn HF Study population) | 56 | Retrospective analysis of sST2 and NT-proBNP levels and their incremental usefulness over clinical SHFM | Mortality or cardiac transplant | 2.8 y | |
Bayes-Genis et al (2012)47, | Ambulatory decompensated HF (N = 891) | 70 | Retrospective analysis of sST2 and NT-proBNP levels from consecutive series | Mortality | 2.8 y | |
Broch et al (2012)48, | Ischemic CHF (N = 1,149, 30% of CORONA RCT) | 72 | Retrospective analysis of sST2, NT-proBNP, and CRP levels | CV mortality, nonfatal myocardial infarction or stroke | 2.6 y | |
Felker et al (2013)49, | Ambulatory HF (N = 910, 39% of HF-ACTION RCT) | 59 | Retrospective analysis of sST2 and NT-proBNP levels | Mortality, hospitalization, functional capacity | 2.5 y | |
Gaggin et al (2013)50, | Recently decompensated CHF (n=151, 100% of PROTECT RCT) | 63 | Retrospective analysis of sST2 and NT-proBNP levels | Composite outcome (worsening HF, hospitalization for HF, clinically significant CV events) | 0.8 y | |
Anand et al (2014)51, | CHF (n=1,650, 33% of Val-HeFT RCT) | 63 | Retrospective analysis of sST2, NT-proBNP, and other biomarker levels | All-cause mortality and composite outcome (mortality, SCD with resuscitation, hospitalization for HF, or administration of IV inotropic or vasodilator drug for ≥4 h without hospitalization) | | |
Zhang et al (2015)52, | De novo HF or decompensated CHF (N = 1161) | 58 | Prospective analysis of sST2 in a hospitalized sample at 1 center in China | All-cause mortality | 1 y | |
Dupuy et al (2016)53, | HF for ≥6 mo (N = 178) | 75 | Prospective analysis of sST2, NT-proBNP, and other biomarker levels in a sample from 1 center in France | All-cause mortality and CV mortality | 42 moa | |
CHF: chronic heart failure; CRP: C-reactive protein; CV: cardiovascular; FU: follow-up; HF: heart failure; HF-ACTION: Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training; IV: intravenous; NT-proBNP: N-terminal pro B-type natriuretic peptide; PROTECT: ProBNP Outpatient Tailored Chronic Heart Failure study; RCT: randomized controlled trial; SCD: sudden cardiac death; SHFM: Seattle Heart Failure Model; sST2: soluble suppression of tumorigenicity-2; Val-HeFT: Valsartan Heart Failure Trial. a Median.
Aimo et al (2017) pooled findings of studies on the prognostic value of sST2 for chronic heart failure in a meta-analysis.54, The meta-analysis selected 7 studies, including post hoc analyses of RCTs, and calculated the association between the Presage ST2 Assay and health outcomes. A pooled analysis of 7 studies found that sST2 was a statistically significant predictor of overall mortality (hazard ratio [HR] = 1.75; 95% confidence interval [CI], 1.37 to 2.22). Moreover, a pooled analysis of 5 studies found that sST2 was a significant predictor of cardiovascular mortality (HR = 1.79; 95% CI, 1.22 to 2.63).
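As a worked illustration of what the pooled estimates above imply, the following sketch back-calculates the standard error of the log hazard ratio from each reported 95% CI and derives an approximate z statistic. It assumes the intervals are Wald-type (symmetric on the log scale) with a critical value of 1.96; that is an assumption of this sketch, not a description of the meta-analysis methods, and the function names are hypothetical.

```python
import math


def log_hr_standard_error(ci_low: float, ci_high: float, z_crit: float = 1.96) -> float:
    """Approximate SE of log(HR), assuming a symmetric CI on the log scale."""
    return (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)


def z_statistic(hr: float, ci_low: float, ci_high: float) -> float:
    """Approximate Wald z statistic for the null hypothesis HR = 1."""
    return math.log(hr) / log_hr_standard_error(ci_low, ci_high)


def excludes_null(ci_low: float, ci_high: float) -> bool:
    """True when the confidence interval does not contain HR = 1."""
    return ci_low > 1.0 or ci_high < 1.0


if __name__ == "__main__":
    # Pooled estimates quoted in the text: (HR, lower 95% limit, upper 95% limit).
    pooled = {
        "overall mortality": (1.75, 1.37, 2.22),
        "cardiovascular mortality": (1.79, 1.22, 2.63),
    }
    for outcome, (hr, low, high) in pooled.items():
        z = z_statistic(hr, low, high)
        print(f"{outcome}: z = {z:.2f}, CI excludes 1: {excludes_null(low, high)}")
```

Both intervals exclude 1, which is consistent with the description of sST2 as a statistically significant predictor; the calculation says nothing about whether sST2 adds information beyond BNP or NT-proBNP.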
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No evidence is available from randomized or nonrandomized controlled studies in which outcomes from groups of well-matched patients managed using serial changes in sST2 blood levels were compared with those managed using the reference standard of BNP or NT-proBNP levels.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
No inferences can be drawn about the clinical utility of sST2 levels for chronic heart failure.
Several analyses, mostly retrospective, have evaluated whether sST2 levels are associated with disease prognosis, especially mortality outcomes. Studies mainly found that elevated sST2 levels were statistically associated with an elevated risk of mortality. A pooled analysis of study results found that sST2 levels significantly predicted overall mortality and cardiovascular mortality. Several studies, however, found that sST2 test results did not provide additional prognostic information compared with BNP or NT-proBNP levels. In general, it appears that elevated sST2 levels predict higher risk of poor outcomes better than lower levels. The available evidence is limited by interstudy inconsistency and differences in patient characteristics, particularly the severity of heart failure, its etiology, duration, and treatment. Furthermore, most of the evidence was obtained from retrospective analyses of sST2 levels in subsets of larger patient cohorts within RCTs, potentially biasing the findings. The evidence primarily shows associations between elevated sST2 levels and poor outcomes, but does not go beyond that in demonstrating a clinical connection among biomarker status, treatment received, and clinical outcomes.
For individuals who have chronic heart failure who receive the sST2 assay to determine prognosis and/or to guide management, the evidence includes correlational studies and 2 meta-analyses. Relevant outcomes are overall survival (OS), quality of life, and hospitalization. Most of the evidence is from reanalysis of existing randomized controlled trials (RCTs) and not from studies specifically designed to evaluate the predictive accuracy of sST2, and prospective and retrospective cross-sectional studies made up a large part of 1 meta-analysis. Studies have mainly found that elevated sST2 levels are statistically associated with an elevated risk of mortality. A pooled analysis of study results found that sST2 significantly predicted overall mortality and cardiovascular mortality. Several studies, however, found that sST2 test results did not provide additional prognostic information compared with N-terminal pro B-type natriuretic peptide levels. Moreover, no comparative studies were identified on the use of the sST2 assay to guide the management of patients diagnosed with chronic heart failure. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 1 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of the sST2 assay is to determine prognosis and/or to predict acute cellular rejection in patients with heart transplantation as an alternative to or an improvement on existing tests.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with heart transplantation.
The test being considered is sST2 assay to determine prognosis and/or to predict acute cellular rejection.
Comparators of interest include endomyocardial biopsy for predicting acute cellular rejection.
The general outcomes of interest are OS, quality of life, and hospitalizations.
Outcomes | Details | Timing |
Morbid events | Short-term and long-term events, such as acute cellular rejection, myocardial infarction, and stroke | 30 days, 6 months, 1-5 years |
Hospitalizations | Inpatient hospital admissions | 30 days, 6 months, 1-5 years |
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of clinical validity of sST2 testing, methodologically credible studies were selected using the following principles:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described
Included a validation cohort separate from the development cohort
Serum ST2 levels have been proposed as a prognostic marker post heart transplantation and as a test to predict acute cellular rejection. There is very little evidence available for these indications. Januzzi et al (2013) retrospectively assessed sST2 levels in 241 patients post–heart transplant.55, Over follow-up of up to 7 years, sST2 levels were predictive of total mortality (HR = 2.01; 95% CI, 1.15 to 3.51; p = .01). Soluble ST2 levels were also associated with risk of acute cellular rejection, with a significant difference between the top and bottom quartiles of sST2 levels in the risk of rejection (p = .003).
Pascual-Figal et al (2011) reported on 26 patients post cardiac transplantation with an acute rejection episode.56, Soluble ST2 levels were measured during the acute rejection episode and compared with levels measured when acute rejection was not present. Soluble ST2 levels were higher during the acute rejection episode (130 ng/mL) than during the nonrejection period (50 ng/mL; p = .002). Elevated sST2 levels greater than 68 ng/mL had a positive predictive value (PPV) of 53% and a negative predictive value (NPV) of 83% for the presence of acute cellular rejection. The addition of sST2 levels to serum BNP resulted in incremental improvement in identifying rejection episodes.
Study | Study Type | Country | Dates | Participants | Treatment | Follow-Up |
Januzzi (2013)55, | Retrospective | United States | NR | Post–cardiac transplantation | sST2 levels assessment (n=241) | Median 7.1 years |
Pascual-Figal (2011)56, | Retrospective | Spain | 2002-2007 | Post–cardiac transplantation with acute rejection | sST2 levels assessment (n=26) | Median 3 months |
NR: not reported; sST2: soluble suppression of tumorigenicity-2.
Study | Total Mortality | ST2 Levels | PPV | NPV |
Januzzi (2013)55, | | | NR | NR |
HR (95% CI) | 2.02 (1.16-3.52) | ≥ 30 ng/mL at 7-year follow-up | NA | NA |
P-value | .01 | NR | NA | NA |
Pascual-Figal (2011)56, | | | 53% | 83% |
Rejection Episode | NR | 130 ng/mL (IQR 60-238 ng/mL) | NA | NA |
Nonrejection Period | NR | 50 ng/mL (IQR 28-80 ng/mL) | NA | NA |
P-value | NR | .002 | NA | NA |
HR: hazard ratio; IQR: interquartile range; NA: not applicable; NPV: negative predictive value; NR: not reported; PPV: positive predictive value.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No RCTs were identified using sST2 levels that directed patient management in heart transplantation patients and which assessed patient outcomes.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
No inferences can be drawn about the clinical utility of sST2 levels for patients with heart transplantation.
Few studies are available, and they are observational and retrospective. No prospective studies were identified that provide high-quality evidence on the ability of sST2 levels to predict transplant outcomes. One retrospective study (N = 241) found that sST2 levels were associated with acute cellular rejection and mortality; another study (N = 26) found that sST2 levels were higher during an acute rejection episode than before rejection.
For individuals who have heart transplantation who receive sST2 assay to determine prognosis and/or to predict acute cellular rejection, the evidence includes a small number of retrospective studies on the Presage ST2 Assay. Relevant outcomes are OS, morbid events, and hospitalization. No prospective studies were identified that provide high-quality evidence on the ability of sST2 to predict transplant outcomes. One retrospective study (n = 241) found that sST2 levels were associated with acute cellular rejection and mortality; another study (n = 26) found that sST2 levels were higher during an acute rejection episode than before rejection. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 2 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of measuring volatile organic compounds in patients with a heart transplant is to assess for heart allograft rejection in a noninvasive manner.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with a heart transplant.
The test being considered measures volatile organic compounds to assess for allograft rejection.
The following test is currently being used to diagnose heart allograft rejection: routine endomyocardial biopsy.
The general outcomes of interest are OS, test validity, morbid events, and hospitalizations. Follow-up over months to years is necessary to monitor for signs of allograft rejection.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of the clinical validity of measuring volatile organic compounds, studies that met the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
The U.S. Food and Drug Administration approval of the Heartsbreath test was based on the results of the Heart Allograft Rejection: Detection with Breath Alkanes in Low Levels (HARDBALL) study sponsored by the National Heart, Lung, and Blood Institute.57, The HARDBALL study was a 3-year, multicenter study of 1061 breath samples in 539 heart transplant patients. Before the scheduled endomyocardial biopsy, patient breath was analyzed by gas chromatography and mass spectroscopy for volatile organic compounds. The amount of C4 to C20 alkanes and monomethylalkanes was used to derive the marker for rejection, known as the breath methylated alkane contour. The breath methylated alkane contour results were compared with subsequent biopsy results, as interpreted by 2 readers using the International Society for Heart and Lung Transplantation biopsy grading system as the criterion standard for rejection.22,
The authors of the HARDBALL study reported that the abundance of breath markers that measure oxidative stress was significantly greater in grade 0, 1, or 2 rejection than in healthy normal persons. In contrast, in grade 3 rejection, the abundance of breath markers that measure oxidative stress was found to be reduced, most likely due to accelerated catabolism of alkanes and methylalkanes that make up the breath methylated alkane contour. The authors also reported that in identifying grade 3 rejection, the NPV of the breath test (97.2%) was similar to endomyocardial biopsy (96.7%) and that the breath test could potentially reduce the total number of biopsies performed to assess for rejection in patients at low risk for grade 3 rejection. The sensitivity of the breath test was 78.6% vs 42.4% with biopsy. However, the breath test had a lower specificity (62.4%) and a lower PPV (5.6%) in assessing grade 3 rejection than a biopsy (specificity, 97%; PPV = 45.2%). In addition, the breath test was not evaluated in grade 4 rejection.
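The combination of a high NPV and a very low PPV reported above is typical of a test applied to an uncommon condition. The sketch below shows the standard relationship between sensitivity, specificity, prevalence, and predictive values. The sensitivity and specificity are the breath test values quoted in the text; the prevalence values are hypothetical inputs chosen for illustration, and the output is not intended to reproduce the predictive values the study calculated from its own data.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple:
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All inputs are proportions strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Expected proportions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Breath test sensitivity and specificity as quoted in the text.
    sens, spec = 0.786, 0.624
    # Hypothetical prevalences of grade 3 rejection (assumptions, not study data).
    for prev in (0.03, 0.10, 0.30):
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"prevalence {prev:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At low prevalence, most positive results are false positives even when sensitivity is reasonable, which is why a high NPV alone does not establish clinical validity.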
Findings from the HARDBALL study were published by Phillips et al (2004). No subsequent studies evaluating the use of the Heartsbreath test to assess for graft rejection were identified in literature updates.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No RCTs assessing the measurement of volatile organic compounds to diagnose cardiac allograft rejection were identified.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of measuring volatile organic compounds to assess for cardiac allograft rejection has not been established, a chain of evidence to support clinical utility cannot be constructed.
A published study found that for identifying grade 3 (now grade 2R) rejection, the NPV of the breath test the study evaluated (97.2%) was similar to endomyocardial biopsy (96.7%), and the sensitivity of the breath test (78.6%) was better than that for biopsy (42.4%). However, the breath test had a lower specificity (62.4%) and a lower PPV (5.6%) in assessing grade 3 rejection than a biopsy (specificity, 97%; PPV = 45.2%). The breath test was also not evaluated for grade 4 rejection. At present, no studies evaluating the clinical utility for the measurement of volatile organic compound testing for heart transplant have been identified.
For individuals who have a heart transplant who receive a measurement of volatile organic compounds to assess cardiac allograft rejection, the evidence includes a diagnostic accuracy study. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The published study found that, for identifying grade 3 (now grade 2R) rejection, the NPV of the breath test the study evaluated (97.2%) was similar to endomyocardial biopsy (96.7%) and the sensitivity of the breath test (78.6%) was better than that for biopsy (42.4%). However, the breath test had a lower specificity (62.4%) and a lower positive predictive value (PPV) (5.6%) in assessing grade 3 rejection than a biopsy (specificity, 97%; PPV, 45.2%). The breath test was also not evaluated for grade 4 rejection. This single study is not sufficient to determine the clinical validity of the test measuring volatile organic compounds and no studies on clinical utility were identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 3 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of donor-derived cell-free DNA (dd-cfDNA) testing in patients with a heart transplant is to assess for allograft rejection.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with heart transplants.
The test being considered is dd-cfDNA testing to assess for allograft rejection (ie, AlloSure, Prospera, myTAIHEART).
The AlloSure and Prospera tests report the fraction of dd-cfDNA, with both tests using a proposed cutoff of ≥0.15% to indicate a high risk of active transplant rejection. Clinical interpretations of alternate thresholds for quiescence (<0.12%), injury (0.20%), and severe injury (0.35%) have also been proposed.
The myTAIHEART test uses proprietary software to quantitatively genotype cfDNA in the patient’s plasma after cardiac transplant, and distinguish dd-cfDNA originating from the engrafted heart from cfDNA originating from the recipient’s native cells. Production of the myTAIHEART test was halted in 2020.
The following test is currently being used to diagnose cardiac allograft rejection: routine endomyocardial biopsy.
The general outcomes of interest are OS, test validity, morbid events, and hospitalizations. Follow-up over months to years is needed to monitor for signs of allograft rejection.
Beneficial outcomes resulting from a true-negative test result are avoiding unnecessary subsequent biopsy. Harmful outcomes resulting from a false-positive result may include unnecessary biopsy or unnecessary treatment. Harmful outcomes from a false-negative result are increased risk of adverse transplant outcomes.
In a triage scenario, the test would need to identify precisely a group of patients that could safely forgo biopsy; therefore, the sensitivity, NPV, and negative likelihood ratio are key test performance characteristics.
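The triage reasoning above can be made concrete with a minimal Python sketch. The function names and the input values (sensitivity 0.90, specificity 0.80, pre-test probability 0.05) are hypothetical and are not taken from any study in this review. The sketch shows how a negative likelihood ratio converts a pre-test probability of rejection into a post-test probability after a negative result, which is the quantity that determines whether biopsy could be safely forgone.

```python
def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity; smaller values rule out disease better."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if not 0.0 < specificity <= 1.0:
        raise ValueError("specificity must be greater than 0 and at most 1")
    return (1.0 - sensitivity) / specificity


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    if not 0.0 <= pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be in [0, 1)")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the arithmetic.
    lr_neg = negative_likelihood_ratio(sensitivity=0.90, specificity=0.80)
    p_after_negative = post_test_probability(0.05, lr_neg)
    print(f"LR- = {lr_neg:.3f}")
    print(f"Probability of rejection after a negative test = {p_after_negative:.4f}")
    print(f"NPV = {1.0 - p_after_negative:.4f}")
```

The complement of the post-test probability after a negative result equals the NPV at that pre-test probability, which is why NPV depends on how common rejection is in the tested population.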
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of the clinical validity of dd-cfDNA testing, studies that met the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard (describe the reference standard)
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
Khush et al (2019) published performance characteristics for the AlloSure Heart dd-cfDNA test as assessed in the D-OAR prospective, multicenter registry study.58, Patients already undergoing AlloMap testing for surveillance were eligible for inclusion; however, following a protocol amendment, dd-cfDNA specimens were only obtained in patients with clinical suspicion of rejection and a planned for-cause biopsy after 2016 through 2018. The majority of dd-cfDNA samples (81%) were drawn in the first year post transplant. The D-OAR cohort included 841 biopsy-paired dd-cfDNA results, of which 587 were performed for routine surveillance of rejection. Overall, cell-mediated rejection and antibody-mediated rejection were biopsy-confirmed in 17 and 18 cases, respectively. The AUC for detecting acute rejection was 0.64 (95% CI, 0.52 to 0.75). At a 0.2% cutoff for dd-cfDNA, the sensitivity, specificity, PPV, and NPV for detection of acute rejection were 44%, 80%, 8.9%, and 97.1%, respectively. For the subgroup of patients undergoing surveillance, the sensitivity, specificity, PPV, and NPV were 38.1%, 84.0%, 8.1%, and 97.3%, respectively, with a corresponding AUC of 0.61 (95% CI, 0.46 to 0.74). Among for-cause samples, the sensitivity, specificity, PPV, and NPV were 53.8%, 76.1%, 11.6%, and 96.6%, respectively. The study is limited by the protocol changes designed to increase the number of observed rejection events overall and by the low availability of concurrent dd-cfDNA results with respect to biopsy specimens (58%).
Kim et al (2022) assessed the clinical validity of the Prospera Heart dd-cfDNA test versus endomyocardial biopsy for prediction of acute heart transplant rejection.59, The study included 811 samples (703 prospectively collected and 108 retrospectively collected) from 223 heart transplant patients with a planned biopsy from 2 U.S. centers. The median patient age was 54 years and 27% were female. Race/ethnicity of the study population was: 54% White, 21% Hispanic, 12% Black, 6% Asian and 5% other race/ethnicity. The majority (91% [737/811]) of reference standard biopsies were conducted for surveillance, and median dd-cfDNA was lower in the surveillance samples (0.04%) than the for-cause samples (0.22%). The time from transplant to biopsy was 10 weeks, and the total prevalence of acute rejection was 9.0%. Median dd-cfDNA was 0.58% in patients with acute rejection, although fractions varied according to rejection type/grade and were higher in those with antibody-mediated rejection (median range 0.44% to 3.43%) than those with acute cellular rejection (median range 0.045% to 0.13%). In patients without acute rejection, median dd-cfDNA was 0.04%. Diagnostic accuracy was explored at 3 dd-cfDNA cut-offs: 0.12%, 0.15% and 0.20%. At a cut-off of 0.12%, sensitivity was 86.6%, specificity was 72.0%, PPV was 23.4%, and NPV was 98.2%. Corresponding values were 78.6%, 76.9%, 25.1% and 97.3% at a dd-cfDNA cut-off of 0.15%, and 78.6%, 82.1%, 30.3% and 97.5% at a dd-cfDNA cut-off of 0.20%. The AUC for detection of acute rejection was 0.86 (95% CI 0.77 to 0.96). The optimal dd-cfDNA fraction for detection of heart transplant rejection has yet to be established. Limitations of the study include potential selection bias, as only patients with a scheduled biopsy were included in the study, and study authors noted that the prevalence of acute rejection in the study cohort was higher than in other cohorts.
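Because PPV and NPV depend on prevalence as well as on sensitivity and specificity, the predictive values reported by Kim et al can be checked for internal consistency. The sketch below is an illustration rather than the authors' method: it applies Bayes' theorem to the reported sensitivity, specificity, and 9.0% prevalence at each cut-off, and the function name `predictive_values` is hypothetical. Under these inputs the recomputed values fall within about 0.1 percentage point of the reported figures, with the residual difference plausibly due to rounding.

```python
from typing import Dict, Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported in the text, keyed by dd-cfDNA cut-off (%).
REPORTED: Dict[float, Tuple[float, float]] = {
    0.12: (0.866, 0.720),
    0.15: (0.786, 0.769),
    0.20: (0.786, 0.821),
}
PREVALENCE = 0.090  # total prevalence of acute rejection reported in the text

if __name__ == "__main__":
    for cutoff, (sens, spec) in REPORTED.items():
        ppv, npv = predictive_values(sens, spec, PREVALENCE)
        print(f"cut-off {cutoff:.2f}%: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The same function shows why a high NPV is expected whenever rejection is uncommon: at low prevalence, most negative results are true negatives even for a test with modest sensitivity.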
In a study funded by TAI Diagnostics, Inc., North et al (2020) performed a blinded clinical validation study on 158 matched pairs of endomyocardial biopsy-plasma samples collected from 76 volunteer adult and pediatric heart transplant recipients (ages 2 months or older, and 8 days or more post transplant) between June 2010 and August 2016 from 2 Milwaukee transplant centers.25, Based on acute cellular rejection grade as defined by the 2004 International Society for Heart and Lung Transplantation (ISHLT) classification, ROC analysis was performed to evaluate diagnostic accuracy across all possible dd-cfDNA % cutoffs. To maximize diagnostic accuracy, Youden’s Index was used to select the optimal cutoff, found to correspond to a donor fraction value of 0.32%. Using this cutoff, clinical performance characteristics of the assay included an NPV of 100.00% for grade 2R or higher acute cellular rejection, with 100.00% sensitivity and 75.48% specificity; the AUC for this analysis was 0.842, which the authors interpreted as indicating a robust ability of the donor fraction assay to rule out 2R or greater acute cellular rejection for donor fraction values less than 0.32%. There was no statistically significant correlation of donor fraction with age. Donor fraction elevation can also be caused by other forms of injury to the donor heart such as acute cellular rejection 1R, acute antibody-mediated rejection (AMR), and presence of coronary artery vasculopathy (CAV), thereby requiring correlation of myTAIHEART results with other clinical indicators.
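Youden's Index (J = sensitivity + specificity − 1) selects the cutoff that maximizes the combined sensitivity and specificity along the ROC curve. The following sketch illustrates the general technique only; the donor fraction values and rejection labels are invented for demonstration and do not come from North et al, and the function name `youden_optimal_cutoff` is hypothetical.

```python
from typing import Sequence, Tuple


def youden_optimal_cutoff(values: Sequence[float],
                          labels: Sequence[int]) -> Tuple[float, float, float, float]:
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when value >= cutoff. Each observed value is
    tried as a candidate cutoff; ties keep the lowest cutoff.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    best = None
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
        true_neg = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


if __name__ == "__main__":
    # Invented donor fractions (%) with biopsy labels (1 = rejection, 0 = none).
    donor_fraction = [0.05, 0.10, 0.12, 0.20, 0.30, 0.35, 0.40, 0.90]
    rejection = [0, 0, 0, 0, 0, 1, 0, 1]
    cutoff, j, sens, spec = youden_optimal_cutoff(donor_fraction, rejection)
    print(f"cutoff {cutoff}%: J={j:.3f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A cutoff chosen this way is fitted to the sample at hand, so its performance is generally optimistic until it is confirmed in an independent cohort.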
In a study funded by a grant from the National Institutes of Health and TAI Diagnostics, Inc., Richmond et al (2019) assessed 174 post-cardiac transplant patients from 7 centers (ages 2.4 months-73.4 years) with myTAIHEART testing (before transplant; 1, 4, and 7 days following transplant; and at discharge from transplant hospitalization) using blinded analysis of biopsy-paired samples.60, All the patients were followed for at least 1 year. dd-cfDNA was higher in acute cellular rejection 1R/2R (n = 15) than acute cellular rejection 0R (healthy) (n = 42) (p =.02); an optimal donor fraction threshold (0.3%) was determined by the use of ROC analysis, revealing an AUC of 0.81 with a sensitivity of 0.65, specificity of 0.93, and an NPV of 81.8% for the absence of any allograft rejection.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No RCTs assessing the measurement of dd-cfDNA to diagnose cardiac allograft rejection were identified.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of measuring dd-cfDNA to assess for cardiac allograft rejection has not been established, a chain of evidence to support clinical utility cannot be constructed.
Studies measuring dd-cfDNA suggest that the dd-cfDNA fraction is elevated in acute rejection, but optimal fraction cut-offs for detection of acute rejection have not been established. Using dd-cfDNA thresholds ranging from 0.12% to 0.32% resulted in NPVs ranging from 82% to 98% and AUCs ranging from 0.61 to 0.86 in 3 studies. At present, no studies evaluating the clinical utility for the measurement of dd-cfDNA for heart transplant rejection have been identified.
For individuals who have a heart transplant who receive dd-cfDNA testing to determine acute rejection, the evidence includes diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. Evidence from 3 studies suggests that the dd-cfDNA fraction is elevated in acute rejection, but optimal fraction cut-offs for detection of acute rejection have not been established. Using dd-cfDNA thresholds ranging from 0.12% to 0.32% resulted in NPVs ranging from 82% to 98% and AUCs ranging from 0.61 to 0.86 in 3 studies. At present, no studies evaluating the clinical utility for the measurement of dd-cfDNA for heart transplant rejection have been identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 4 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of gene expression profiling (GEP) and donor-derived cell-free DNA (dd-cfDNA) testing in patients with a heart transplant is to assess for allograft rejection.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with heart transplants.
The test being considered is GEP to assess for allograft rejection (ie, AlloMap), used alone or in combination with AlloSure Heart dd-cfDNA testing. The combination of these tests is commercially marketed as HeartCare (CareDx).
AlloMap test results are reported on a scale from 0 to 40, with a proposed high-risk cutoff of ≥30 for patients <6 months post transplant and ≥34 for patients ≥6 months post transplant. The HeartCare report provides the AlloMap score, AlloMap score variability, and AlloSure percent dd-cfDNA. Direct guidance for the combined interpretation of results is not provided in the HeartCare report, but potential clinical implications of concordant and discordant test scenarios have been proposed.61,
The following test is currently being used to diagnose cardiac allograft rejection: routine endomyocardial biopsy.
The general outcomes of interest are OS, test validity, morbid events, and hospitalizations. Follow-up over months to years is needed to monitor for signs of allograft rejection.
Beneficial outcomes resulting from a true-negative test result are avoiding unnecessary subsequent biopsy. Harmful outcomes resulting from a false-positive result may include unnecessary biopsy or unnecessary treatment. Harmful outcomes from a false-negative result are increased risk of adverse transplant outcomes.
In a triage scenario, the test would need to identify precisely a group of patients that could safely forgo biopsy; therefore, the sensitivity, NPV, and negative likelihood ratio are key test performance characteristics.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of the clinical validity of GEP testing, studies that met the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard (describe the reference standard)
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
A TEC Assessment (2011) reviewed the evidence on the use of GEP using the AlloMap test.62, The Assessment concluded that the evidence was insufficient to permit conclusions about the effect of the AlloMap test on health outcomes. Key evidence in the TEC Assessment is described below.
Patterns of gene expression for the development of the AlloMap test were studied in the Cardiac Allograft Rejection Gene Expression Observation (CARGO) study, which included 8 U.S. cardiac transplant centers enrolling 629 cardiac transplant recipients.63, The study included the discovery and validation phases. In the discovery phase, patient blood samples were obtained during the endomyocardial biopsy, and the expression levels of more than 7000 genes involved in immune responses were assayed and compared with the biopsy results. A subset of 252 candidate genes was identified, from which a panel of 11 genes was selected for evaluation. A proprietary algorithm was applied to the results, producing a single score that considers the contribution of each gene in the panel.
The validation phase of the CARGO study, published by Deng et al (2006), was prospective, blinded, and enrolled 270 patients.63, Primary validation was conducted using samples from 63 patients independent from discovery phases of the study and enriched for biopsy-proven evidence of rejection. A prospectively defined test cutoff value of 20 resulted in a sensitivity of 84% of patients with moderate/severe rejection but a specificity of 38%. Of note, in the “training set” used in the study, these rates were 80% and 59%, respectively. The authors evaluated the 11-gene expression profile on 281 samples collected at 1 year or more from 166 patients who were representative of the expected distribution of rejection in the target population (and not involved in discovery or validation phases of the study). When a test cutoff of 30 was used, the NPV (no moderate/severe rejection) was 99.6%; however, only 3.2% of specimens had grade 3 or higher rejection. In this population, grade 1B scores were found to be significantly higher than grade 0, 1A, and 2 scores but were similar to grade 3 scores.
A second prospective multicenter study evaluating the clinical validity of GEP with the AlloMap test (CARGO II) was published by Crespo-Leiro et al (2016).64, The study enrolled 499 heart transplant recipients undergoing surveillance for allograft rejection. The reference standard for rejection status was histologic grade from an endomyocardial biopsy performed on the same day as blood samples were collected. Blood samples had to be collected 55 days or more post transplant, more than 30 days after blood transfusion, more than 21 days after administration of prednisone 20 mg/day or more, and more than 60 days after treatment for a prior rejection. Patients had a total of 1579 eligible blood samples for which paired GEP scores and endomyocardial biopsy rejection grades were available.
As in the original CARGO study, the proportion of cases of rejection was small. The prevalence of moderate-to-severe rejection (grade 2R/≥3A) reported by local pathologists was 3.2%, which was reduced to 2.0% when confirmation from 1 or more other independent pathologists was required. At a GEP cutoff of 34, for patients who were at least 2 to 6 months post transplant, the sensitivity of GEP for detecting grade 2R/≥3A was 25.0%, and the specificity was 88.7%. The PPV and NPV were 4.0% and 98.4%, respectively. Using the same cutoff of 34, for patients more than 6 months post transplant, the sensitivity of GEP was 25.0%, the specificity was 88.8%, the PPV was 4.3%, and the NPV was 98.3%. The number of true-positives used in the above calculations was 5 (9.1%) of 55 for patients at least 2 to 6 months post transplant and 6 (10.2%) of 59 for patients more than 6 months post transplant.
Kanwar et al (2021) published data from the Outcomes AlloMap Registry (OAR) indicating that asymptomatic or active cytomegalovirus infection is associated with significantly higher AlloMap scores among heart transplant recipients compared to those without infection, even in the absence of acute rejection, potentially resulting in unnecessary biopsies among surveillance patients.65, Donor-derived cell-free DNA levels measured by the AlloSure Heart test available for a small subset of samples (5.3%) were not significantly different between groups. The authors conclude that further assessment of the combined use of AlloMap and AlloSure scores is required to determine if this will improve differentiating infection-related from rejection-related immune activation. The combined use of these tests, commercially available as HeartCare (CareDx), is addressed in the following section.
The commercially available HeartCare (CareDx) test combines AlloMap GEP testing with AlloSure Heart measurement of percent dd-cfDNA. The combined use of GEP and dd-cfDNA testing for surveillance of acute rejection was assessed in a single-center, retrospective study conducted by Gondi et al (2021) between February 2019 and March 2020.66, Patients (N=153) were required to be ≥55 days post transplant, hemodynamically stable, ≥15 years of age, and single-organ recipients. The majority of patients were male (74.5%) and White (78.4%) with an average age of 54.5 years. Patients were assessed once monthly between 2 and 12 months, every 3 months between 12 and 24 months, and every 6 months between 24 and 36 months post transplant. Pre-specified thresholds for GEP scores were ≥30 for patients <6 months post transplant and ≥34 for patients ≥6 months post transplant. The pre-specified threshold for percent dd-cfDNA was ≥0.20% based on a prior study of the AlloSure test by Khush et al (2019),58, described above. In patients <6 months post transplant, endomyocardial biopsy was performed regardless of test results. For patients ≥6 months post transplant who received both GEP and dd-cfDNA testing, endomyocardial biopsy was canceled in patients with dd-cfDNA <0.20% regardless of AlloMap score. In patients with positive AlloMap scores but negative dd-cfDNA, endomyocardial biopsy could be performed or deferred in favor of repeat dd-cfDNA testing. Among 495 samples, overall test result distributions were 59.6% for patients negative on both tests, 12.3% for patients positive by dd-cfDNA only, 22.6% for patients positive by GEP only, and 5.5% positive by both GEP and dd-cfDNA. The combined testing approach resulted in a 12.7% reduction (48 biopsies) in endomyocardial biopsy volume compared to GEP testing alone. Among the 172 biopsies performed, 2 patients with cell-mediated rejection were identified, with corresponding dual-positive tests. Two patients with antibody-mediated rejection were identified, with corresponding tests that were only positive by dd-cfDNA. The study is limited by its retrospective design, incomplete evaluation of performance characteristics, and lack of reporting on health outcomes.
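The four result categories reported by Gondi et al follow from applying the two pre-specified thresholds to each paired sample. The sketch below is a simplified illustration of that classification only, using the thresholds stated in the text; the function and constant names are hypothetical, and the code does not represent the study's software, the HeartCare report logic, or a clinical decision rule.

```python
DD_CFDNA_THRESHOLD_PCT = 0.20  # pre-specified dd-cfDNA threshold (%) stated in the text


def gep_threshold(months_post_transplant: float) -> int:
    """GEP positivity threshold: 30 before 6 months post transplant, 34 from 6 months on."""
    return 30 if months_post_transplant < 6 else 34


def classify_paired_result(gep_score: float, dd_cfdna_pct: float,
                           months_post_transplant: float) -> str:
    """Place a paired GEP / dd-cfDNA sample into one of four concordance categories."""
    gep_positive = gep_score >= gep_threshold(months_post_transplant)
    dd_positive = dd_cfdna_pct >= DD_CFDNA_THRESHOLD_PCT
    if gep_positive and dd_positive:
        return "positive by both GEP and dd-cfDNA"
    if gep_positive:
        return "positive by GEP only"
    if dd_positive:
        return "positive by dd-cfDNA only"
    return "negative on both tests"


if __name__ == "__main__":
    # Invented example samples: (GEP score, dd-cfDNA %, months post transplant).
    samples = [(32, 0.10, 4), (32, 0.10, 8), (35, 0.25, 8), (20, 0.20, 8)]
    for gep, dd, months in samples:
        print(gep, dd, months, "->", classify_paired_result(gep, dd, months))
```

The same GEP score of 32 is classified differently at 4 and 8 months post transplant, which shows how the time-dependent threshold affects the proportion of discordant results.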
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
Kobashigawa et al (2015) published the results of a pilot RCT evaluating the use of the AlloMap test in patients who were 55 days to 6 months post transplant.67, The trial design was similar to that of the Invasive Monitoring Attenuation through Gene Expression (IMAGE) RCT, discussed next. Sixty subjects were randomized to rejection monitoring with AlloMap or with endomyocardial biopsy at prespecified intervals of 55 days and 3, 4, 5, 6, 8, 10, and 12 months post transplant. The threshold for a positive AlloMap test was set at 30 for patients 2 to 6 months post transplant and 34 for patients after 6 months post transplant, based on data from the CARGO study. Endomyocardial biopsy outside of the scheduled visits was obtained in either group if there was clinical or echocardiographic evidence of graft dysfunction and for the AlloMap group if the score was above the specified threshold. The incidence of the primary outcome at 18 months post transplant (a composite outcome of the first occurrence of any of the following: death or retransplant, rejection with hemodynamic compromise, or allograft dysfunction due to other causes) did not differ significantly between the AlloMap and biopsy groups (10% vs 17%; p =.44). The number of biopsy-proven rejection episodes (International Society for Heart and Lung Transplantation grading system ≥ 2R) within the first 18 months did not differ significantly between groups (3 in the AlloMap group vs 1 in the biopsy group; p =.31). Of the rejections in the AlloMap group, 1 was detected after an elevated routine AlloMap test, while 2 were detected after patients presented with hemodynamic compromise. As in the IMAGE study, a high proportion of rejection episodes were detected by clinical signs or symptoms (however, this study had only 3 rejection episodes in the AlloMap group).
In 2010, the results of the IMAGE study were published.68,69, This was an industry-sponsored, nonblinded, noninferiority RCT that compared outcomes in 602 patients managed with the AlloMap test (n=297) or with routine endomyocardial biopsies (n=305). The trial included adults from 13 centers who underwent cardiac transplantation between 1 and 5 years prior to participating, were clinically stable and had a left ventricular ejection fraction of at least 45%. To increase enrollment, the trial protocol was later amended to include patients who had undergone transplantation between 6 months and 1 year prior to participating; this subgroup ultimately comprised only 15% of the final sample (n=87). Each transplant center used its own protocol for determining the intervals for routine testing. At all sites, patients in both groups underwent clinical and echocardiographic assessments in addition to the assigned surveillance strategy. According to the study protocol, patients underwent biopsy if they had signs or symptoms of rejection or allograft dysfunction at clinic visits (or between visits) or if the echocardiogram showed a left ventricular ejection fraction decrease of at least 25% compared with the initial visit. Additionally, patients in the AlloMap group underwent biopsy if their test score was above a specified threshold; however, if they had 2 elevated scores with no evidence of rejection found on 2 previous biopsies, no additional biopsies were required. The AlloMap test score varied from 0 to 40, with higher scores indicating a higher risk of transplant rejection. The investigators initially used 30 as the cutoff for a positive score; the protocol was amended to use a cutoff of 34 to minimize the number of biopsies needed. Fifteen patients in the AlloMap group and 26 in the biopsy group did not complete the trial.
The primary outcome was a composite variable: (1) the first occurrence of rejection with hemodynamic compromise; (2) graft dysfunction due to other causes; (3) death; or (4) retransplantation. Use of the AlloMap test was considered noninferior to the biopsy strategy if the 1-sided upper boundary of the 95% CI for the hazard ratio comparing the 2 strategies was less than the prespecified margin of 2.054. The margin was derived using the estimate of a 5% event rate per year in the biopsy group, taken from published observational studies, and allowing for an event rate of up to 10% per year in the AlloMap group.
According to Kaplan-Meier analysis, the 2-year event rate was 14.5% in the AlloMap group and 15.3% in the biopsy group. The corresponding hazard ratio was 1.04 (95% CI, 0.67 to 1.68). The upper boundary of the CI of the hazard ratio (1.68) fell within the prespecified noninferiority margin (2.054); thus, GEP was considered noninferior to endomyocardial biopsy. Death from all causes, a secondary outcome, did not differ significantly between groups. There were 13 (6.3%) deaths in the AlloMap group and 12 (5.5%) in the biopsy group (p =.82). During follow-up, there were 34 treated episodes of graft rejection in the AlloMap group. Only 6 (18%) of the 34 patients with graft rejection presented solely with elevated AlloMap scores. Twenty (59%) patients presented with clinical signs/symptoms and/or graft dysfunction on echocardiogram and 7 patients had an elevated AlloMap score plus clinical signs/symptoms with or without graft dysfunction on echocardiogram. In the biopsy group, 22 patients were detected solely due to an abnormal biopsy.
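The noninferiority margin and decision described above can be illustrated with a short calculation. The source states only that the margin was derived from a 5% annual event rate in the biopsy group and an allowable 10% annual rate in the AlloMap group; the specific formula below, which converts each annual rate to a constant hazard and takes the ratio, is an assumption made for this sketch. It happens to reproduce the stated margin of 2.054, but it should not be read as the trial's documented method. All function names are hypothetical.

```python
import math


def hazard_from_annual_event_rate(annual_rate: float) -> float:
    """Constant hazard implied by an annual event probability (exponential model)."""
    if not 0.0 < annual_rate < 1.0:
        raise ValueError("annual_rate must be strictly between 0 and 1")
    return -math.log(1.0 - annual_rate)


def noninferiority_margin(control_rate: float, max_test_rate: float) -> float:
    """Hazard ratio corresponding to the largest tolerated event rate in the test arm."""
    return (hazard_from_annual_event_rate(max_test_rate)
            / hazard_from_annual_event_rate(control_rate))


def is_noninferior(ci_upper_bound: float, margin: float) -> bool:
    """Noninferiority is concluded when the CI upper bound lies below the margin."""
    return ci_upper_bound < margin


if __name__ == "__main__":
    margin = noninferiority_margin(control_rate=0.05, max_test_rate=0.10)
    print(f"Margin under the constant-hazard assumption: {margin:.3f}")
    # Upper bound of the hazard ratio CI as reported in the text.
    print("Noninferior:", is_noninferior(1.68, margin))
```

A margin slightly above 2 means the trial could exclude only a rough doubling of the event rate, so a noninferior result leaves room for a clinically meaningful excess of events in the GEP arm.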
A total of 409 biopsies were performed in the AlloMap group and 1249 in the biopsy group. Most biopsies in the AlloMap group (67%) were performed because of elevated gene profiling scores. Another 17% were performed due to clinical or echocardiographic manifestations of graft dysfunction, and 13% were performed as part of routine follow-up after treatment for rejection. There was 1 (0.3%) adverse event associated with biopsy in the AlloMap group and 4 (1.4%) in the biopsy group. In terms of quality of life, the physical health and mental health summary scores of the 12-Item Short-Form Health Survey were similar in the 2 groups at baseline and did not differ significantly between groups at 2 years.
A limitation of the trial was that the threshold for a positive AlloMap test was changed partway through the study; thus, the optimal test cutoff remains unclear. Moreover, the trial was not blinded, which could have affected treatment decisions based on clinical findings, such as whether to recommend a biopsy. In addition, the study did not include a group that only received clinical and echocardiographic assessment, so the value of AlloMap testing beyond that of clinical management alone cannot be determined. The uncertain incremental benefit of the AlloMap test is highlighted by the finding that only 6 of the 34 treated episodes of graft rejection detected during follow-up in the AlloMap group were initially identified solely due to an elevated GEP score. Since 22 episodes of asymptomatic rejection were detected in the biopsy group, the AlloMap test does not appear to be a sensitive test, possibly missing more than half of the episodes of asymptomatic rejection. Because clinical outcomes were similar in the 2 groups, there are at least 2 possible explanations: the clinical outcome of the study may not be sensitive to missed episodes of rejection, or it is not necessary to treat asymptomatic rejection. In addition, the trial was only statistically powered to rule out more than a doubling of the rate of the clinical outcome, which some may believe is an insufficient margin of noninferiority. Finally, only 15% of the final study sample had undergone transplantation less than 1 year before study participation; therefore, findings might not be generalizable to the population of patients 6 to 12 months post transplant.
Direct evidence of clinical utility was not identified for the HeartCare test.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of GEP testing, alone or in combination with dd-cfDNA testing, to assess for cardiac allograft rejection has not been established, a chain of evidence to support clinical utility cannot be constructed.
The 2 studies (CARGO, CARGO II) examining the diagnostic performance of GEP using the AlloMap test for detecting moderate or severe rejection were flawed by lack of a consistent threshold (ie, 20, 30, or 34) for determining positivity and by a small number of positive cases. In the available studies, although the NPVs were relatively high (ie, at least 88%), the performance characteristics were calculated based on the detection of 10 or fewer cases of rejection each. Moreover, the PPV in the CARGO II study was only 4.0% for patients who were at least 2 to 6 months post transplant and 4.3% for patients more than 6 months post transplant. The ability of the AlloMap test to differentiate between infection-related and rejection-related graft injury has also been called into question.
The most direct evidence on the clinical utility of GEP using the AlloMap test comes from a large RCT comparing a GEP-directed strategy with an endomyocardial biopsy-directed strategy for detecting rejection; it found that the GEP-directed strategy was noninferior. However, given the high proportion of rejection episodes in the GEP-directed strategy group detected by clinical signs/symptoms, the evidence is insufficient to determine that health outcomes are improved because of the uncertain incremental benefit of GEP. In addition, a minority of subjects assessed were in the first year post transplant. Results from a pilot RCT would suggest that GEP may have a role in evaluating for heart transplant rejection beginning at 55 days post transplant, but the trial was insufficiently powered to permit firm conclusions about the noninferiority of early GEP use.
One retrospective study assessing the combined use of GEP testing with AlloMap and dd-cfDNA testing with AlloSure Heart reported a 12.7% reduction in endomyocardial biopsy volume when combined testing was used compared to AlloMap alone. However, this observation is limited by a lack of reporting on long-term health outcomes and incomplete diagnostic performance assessment for combined testing.
For individuals who have a heart transplant who receive GEP to assess cardiac allograft rejection, the evidence includes 2 diagnostic accuracy studies and several RCTs evaluating clinical utility. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The 2 Cardiac Allograft Rejection Gene Expression Observation studies (CARGO, CARGO II) examining the diagnostic performance of GEP for detecting moderate-to-severe rejection lacked a consistent threshold for defining a positive GEP test (ie, 20, 30, or 34) and reported a low number of positive cases. In the available studies, although the NPVs were relatively high (ie, at least 88%), the performance characteristics were calculated based on only 10 or fewer cases of rejection; therefore, performance data may be imprecise. Moreover, the PPV in CARGO II was only 4.0% for patients who were at least 2 to 6 months post transplant and 4.3% for patients more than 6 months post transplant. The threshold indicating a positive test that seems to be currently accepted (a score of 34) was not prespecified; rather, it evolved partway through the data collection period in the Invasive Monitoring Attenuation through Gene Expression (IMAGE) study. In addition, the IMAGE study had several methodologic limitations (eg, lack of blinding); further, the IMAGE study failed to provide evidence that GEP offers an incremental benefit over biopsy performed on the basis of clinical exam or echocardiography. Patients at the highest risk of transplant rejection are those within 1 year of the transplant, and, for that subset, there remain insufficient data with which to evaluate the clinical utility of GEP. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive GEP with testing of dd-cfDNA to assess cardiac allograft rejection, the evidence includes 1 retrospective analysis of the HeartCare test and 1 diagnostic accuracy study of the AlloSure dd-cfDNA component of the HeartCare test. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The HeartCare analysis reported a 12.7% reduction in endomyocardial biopsy volume among patients undergoing routine surveillance. However, this observation is limited by lack of reporting on long-term health outcomes and incomplete assessment of diagnostic performance for combined testing, as patients with negative dd-cfDNA scores did not undergo biopsy regardless of GEP score per study protocol. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 5 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of dd-cfDNA testing in patients with renal transplant who are undergoing surveillance or have clinical suspicion of allograft rejection is to detect allograft rejection.
The question addressed in this evidence review is: Does testing for dd-cfDNA improve outcomes in renal transplant patients who are undergoing surveillance or have clinical suspicion of allograft rejection?
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with renal transplants who are undergoing surveillance or who have a clinical suspicion of allograft rejection.
Clinical suspicion of allograft rejection may be indicated by clinical symptoms (eg, pain) or dynamic changes in laboratory parameters.
Allograft dysfunction is typically asymptomatic and has a broad differential, including graft rejection. Diagnosis and rapid treatment are recommended to preserve graft function and prevent loss of the transplanted organ.
The test being considered is dd-cfDNA testing to assess for renal allograft rejection (ie, AlloSure or Prospera).
Various clinical pathways have been proposed for these tests. Use of the Prospera test is recommended when there is clinical suspicion of active rejection and for regular surveillance of subclinical rejection.35, In a surveillance scenario, regular testing is recommended at 1, 2, 3, 4, 6, 9, and 12 months after renal transplant or most recent rejection. Thereafter, the test should be repeated quarterly. The proportion of dd-cfDNA relative to total cfDNA is reported, with detection of ≥ 1% dd-cfDNA indicating increased risk for active rejection. The percent dd-cfDNA change between tests is also reported.24, In the surveillance scenario, patients with a negative result may avoid biopsy and it is recommended that a positive test result is incorporated with clinical findings to determine whether a biopsy is indicated. When there is clinical suspicion of rejection, testing is recommended as an adjunct to biopsy for treatment response monitoring, or as a rule-out test for biopsy.
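The surveillance schedule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, "quarterly" is assumed to mean every 3 months counted from month 12, and the ≥ 1% flag simply restates the reporting threshold quoted above rather than any clinical decision rule.

```python
def surveillance_months(horizon_months):
    """Months after transplant (or most recent rejection) at which a test falls due.

    Assumption: after month 12, 'quarterly' means every 3 months (15, 18, 21, ...).
    """
    first_year = [1, 2, 3, 4, 6, 9, 12]
    schedule = [m for m in first_year if m <= horizon_months]
    month = 15
    while month <= horizon_months:
        schedule.append(month)
        month += 3
    return schedule


def flags_increased_risk(dd_cfdna_percent, threshold=1.0):
    """True when the reported dd-cfDNA fraction meets the >= 1% threshold."""
    return dd_cfdna_percent >= threshold


print(surveillance_months(24))     # [1, 2, 3, 4, 6, 9, 12, 15, 18, 21, 24]
print(flags_increased_risk(0.47))  # False
```

Over a 2-year horizon, this schedule yields 11 tests per patient.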
For the AlloSure test, various dd-cfDNA thresholds are suggested depending on the clinical scenario and include the detection of antibody-mediated rejection (ABMR) in patients with donor-specific antibodies (DSA), the detection of "likely" active rejection, the prediction of adverse outcomes as an adjunct to biopsy-confirmed T cell-mediated (TCMR) 1A/borderline rejections, and for the exclusion of active rejection.70, A routine testing schedule is also recommended, and details regarding its clinical rationale have been published.71,
The following test is currently being used to confirm a clinical suspicion of allograft rejection: renal biopsy. The adoption of protocol (ie, surveillance) biopsies varies across transplant centers and its use is not standardized.
Clinical suspicion of allograft rejection may be indicated by physical symptoms and/or dynamic changes in laboratory parameters (eg, serum creatinine, estimated glomerular filtration rate [eGFR], DSA).
The general outcomes of interest are OS, test validity, morbid events, and hospitalizations. Follow-up over months to years is needed to monitor for signs of allograft rejection.
For a primary kidney transplant, graft survival at 1 year is 94.7%; at 5 years, graft survival is 78.6%.29,
Beneficial outcomes resulting from a true-negative test result are avoiding unnecessary subsequent biopsy. Harmful outcomes resulting from a false-positive result may include an unnecessary biopsy or unnecessary treatment. Harmful outcomes from a false-negative result are increased risk of adverse transplant outcomes.
In a triage scenario, the test would need to identify precisely a group of patients that could safely forgo biopsy; therefore, the sensitivity, NPV, and negative likelihood ratio are key test performance characteristics.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of the clinical validity of dd-cfDNA testing, studies that met the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard (describe the reference standard)
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
Major study results are summarized in Table 8.
Development of the AlloSure test was conducted in the multicenter, prospective Circulating Donor-Derived Cell-Free DNA in Blood for Diagnosing Acute Rejection in Kidney Transplant Recipients (DART) study by Bloom et al (2017), which recruited both patients who were less than 3 months post renal transplant (n=245) and renal transplant patients requiring a biopsy for suspicion of graft rejection (n=139).72, For the primary analysis, active rejection was defined as the combined categories of T cell-mediated rejection, acute/active antibody-mediated rejection, and chronic/active antibody-mediated rejection as defined by the Banff working groups. Only patients undergoing biopsy were considered; biopsies were further excluded if they were not for cause, if collection of the biopsy or the corresponding blood sample was inadequate or incomplete, or if a prior allograft was in situ. These exclusions resulted in the main study cohort of 102 patients (107 biopsies). Within this population, acute rejection was noted in 27 patients (27 biopsies). After statistical analysis accounting for multiple biopsies from the same patient, the threshold dd-cfDNA fraction corresponding to acute rejection was set to 1.0% or higher. In the main study group, this resulted in a sensitivity of 59% (95% CI, 44% to 74%) and specificity of 85% (95% CI, 79% to 91%) for detecting active rejection versus no rejection. Using the original data set including all biopsies performed for clinical suspicion of rejection, 58 cases of acute rejection were diagnosed in 204 biopsies (170 patients). The PPV was 61% and the NPV was 84%. Biopsies performed for surveillance (n=34 biopsies) were excluded from analysis in this study, as only 1 biopsy for surveillance demonstrated acute rejection. Study limitations included the absence of a validation data set.
Huang et al (2019) conducted a smaller single-center study that recruited 63 renal transplant patients with suspicion of rejection who had AlloSure assessment of dd-cfDNA within 30 days of an allograft biopsy.73, Median time from transplant to dd-cfDNA measurement was 2.0 years (interquartile range, 0.3 to 6.5). Within this population, biopsy found acute rejection in 34 (54%) patients; 10 (15.9%) were cell-mediated only, 22 (34.9%) were antibody-mediated only, and 2 (3.2%) were mixed cell-mediated and antibody-mediated. In contrast to the study by Bloom et al (2017), the optimal threshold for a positive dd-cfDNA result was identified as ≥ 0.74%. For the outcome of any rejection (ie, cell-mediated, antibody-mediated, or mixed), use of this threshold was associated with an overall sensitivity of 79.4%, specificity of 72.4%, PPV of 77.1%, and NPV of 75.0%. Discrimination of rejection differed by biopsy findings, however. For the subgroup of patients with antibody-mediated rejection, the sensitivity was 100%, specificity was 71.8%, PPV was 68.6%, and NPV was 100%. The dd-cfDNA test did not discriminate rejection in patients with cell-mediated rejection, as evidenced by an AUC of 0.42 (95% CI, 0.17 to 0.66). The major limitations of this study are its small sample size and single-center setting.
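As a worked check of how the 4 overall performance figures from Huang et al (2019) relate to one another, the sketch below recomputes them from a 2x2 table. The cell counts are an assumption: they are not reported in this review and were back-calculated here as the only whole-number split of 63 patients (34 with rejection, 29 without) that reproduces the reported percentages.

```python
# Reconstructed 2x2 counts (assumption: inferred from the reported
# percentages, not taken from the study itself).
tp, fn = 27, 7   # biopsy-proven rejection, 34 patients
fp, tn = 8, 21   # no rejection on biopsy, 29 patients


def pct(numerator, denominator):
    """Percentage rounded to 1 decimal place."""
    return round(100 * numerator / denominator, 1)


metrics = {
    "sensitivity": pct(tp, tp + fn),  # true positives among all with rejection
    "specificity": pct(tn, tn + fp),  # true negatives among all without rejection
    "ppv": pct(tp, tp + fp),          # rejection among all positive tests
    "npv": pct(tn, tn + fn),          # no rejection among all negative tests
}
print(metrics)
# {'sensitivity': 79.4, 'specificity': 72.4, 'ppv': 77.1, 'npv': 75.0}
```

Under these reconstructed counts, 7 of 34 rejection episodes would fall below the 0.74% threshold, which is the figure that matters most when the test is used to rule out biopsy.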
Stites et al (2020) assessed clinical outcomes in 79 patients diagnosed with TCMR 1A/borderline rejection with simultaneous AlloSure assessment of dd-cfDNA across 11 centers between June 2017 and May 2019.74, Timing of testing with respect to the date of transplantation was not reported. Elevated levels of dd-cfDNA (≥ 0.5%) were detected in 42 (53.2%) patients. No statistically significant difference in dd-cfDNA distributions was detected when stratified by protocol versus for-cause biopsies (p =.7307). Elevated levels of dd-cfDNA were associated with adverse clinical outcomes compared to low levels (< 0.5%), including decline in eGFR (8.5% versus 0%; p =.004), de novo DSA formation (40% versus 2.7%; p <.0001), and future or persistent rejection (21.4% versus 0%; p =.003). The authors hypothesize that the use of dd-cfDNA may complement histological evaluation and risk stratify patients with TCMR 1A or borderline rejection identified on biopsy, and they propose the use of reference ranges as opposed to absolute dd-cfDNA cutoff thresholds.
Additional analyses of the DART study have reported on associations between first-year AlloSure dd-cfDNA fraction or serial variability and subsequent eGFR decline (Sawinski et al [2021]),75, and combined use of dd-cfDNA and DSA testing to diagnose active antibody-mediated rejection (Jordan et al [2018], Mayer et al [2021]).76,77,
Puliyanda et al (2021) conducted a prospective study of 67 pediatric renal transplant recipients enrolled across 2 medical centers between 2017 and 2019.78, Patients had a median age of 11 years (interquartile range [IQR], 4 to 13) and median time post-transplant to first AlloSure dd-cfDNA measurement was 55.6 months. Nineteen patients (28.4%) received dd-cfDNA testing in the absence of clinical suspicion of rejection. Median dd-cfDNA scores in the surveillance versus for-cause cohorts were 0.37% (IQR, 0.19% to 1.10%) and 0.47% (IQR, 0.24% to 2.15%), respectively. Among patients undergoing surveillance, 26.3% (5/19 patients) had a dd-cfDNA score >1% with biopsies indicating 4 cases of antibody-mediated rejection and 1 case of mixed rejection. Among patients with clinical suspicion of rejection, 43.8% (21/48 patients) had dd-cfDNA scores >1%. All for-cause biopsies showed evidence of rejection, including 11 cases of antibody-mediated rejection, 2 cases of T cell-mediated rejection, and 8 cases of mixed rejection. An additional 7 patients with clinical suspicion of rejection underwent biopsy despite dd-cfDNA scores < 1%, revealing 4 cases without rejection, 1 case with antibody-mediated rejection, 1 case with cell-mediated rejection, and 1 case of mixed rejection. Among all patients with biopsy-matched results (33/67), dd-cfDNA >1% was associated with a sensitivity of 86% and specificity of 100%, with a corresponding AUC of 0.996 (p =.002). No significant difference in serum creatinine change from baseline to testing was identified for those with rejection compared to those without. The study is limited by the small sample size and lack of biopsy-matched data for a complete assessment of false negatives. The authors also note that the 1% dd-cfDNA cutoff threshold was used based on prior studies in adults and it is unclear if this is appropriate for the pediatric population. 
Additionally, the authors suggest that relative increases in dd-cfDNA, as opposed to absolute values, may be more valuable in the pediatric population, given that appropriate cutoff thresholds may depend on child age and size.
Sigdel et al (2019) evaluated the diagnostic accuracy of the Prospera Kidney dd-cfDNA test in a retrospective analysis of 300 biorepository plasma samples from kidney transplant recipients at a single academic medical center.79, Of the 300 samples (193 patients), 217 were biopsy-matched with 38 cases of active rejection, 72 cases of borderline rejection, 82 with stable allografts, and 25 cases of other kidney injuries. The sample cohort was demographically diverse, including women (42.5%), Hispanic and Latino patients (34.6%), Black or African American patients (14%), and pediatric patients (20%). Indication for renal transplantation was unknown in 45.6% of samples. The majority of samples (72.3%) were drawn on the day of surveillance (n=114 [52.5%] patients) or clinically indicated biopsy (n=103 [47.5%] patients). Timing of tests with respect to the date of transplantation was not reported. Biopsies were evaluated by a single pathologist according to 2017 Banff criteria and classified as active rejection or non-rejection (ie, borderline rejection, other injury, or stable allograft status). Median dd-cfDNA levels were significantly higher in biopsy-proven active rejection (2.32%) versus non-rejection subgroups (0.47%; p <.0001). All subtypes of active rejection could be detected, and median dd-cfDNA did not differ significantly between antibody-mediated (2.2%), T cell-mediated (2.7%), and combined subtypes (2.6%).
Sigdel et al (2019) also assessed the performance characteristics of eGFR, which was calculated as a function of serum creatinine with adjustments for age, sex, and race based on the Modification of Diet in Renal Disease (MDRD) Study equation.79, At a cutoff threshold of < 60 mL/min/1.73 m², the sensitivity and specificity for eGFR were lower compared to dd-cfDNA, at 67.8% (95% CI, 51.3% to 84.2%) and 65.3% (95% CI, 57.6% to 73.0%), respectively, with a corresponding AUC of 0.74 (95% CI, 0.66 to 0.83). However, the relevance of absolute eGFR measurements is limited, because in clinical practice dynamic changes in laboratory parameters (eg, serum creatinine elevation, eGFR decline) are used to flag impaired kidney function in the transplant population. Separate eGFR estimates in the for-cause subgroup were not reported. Major limitations of this study include its retrospective design and single-center setting. While the dd-cfDNA cutoff was prespecified, it was based on prior studies of the AlloSure test and may not be optimized for Prospera.
Bunnapradist et al (2021) noted that while % dd-cfDNA is a promising noninvasive biomarker for detecting renal allograft rejection, levels can be artificially depressed by high levels of circulating cfDNA, which may be observed in patients who are obese, have recently undergone surgery, have medical complications, or receive certain medications, potentially leading to false-negative results.80, The authors suggested that a combination of dd-cfDNA fraction and absolute quantity thresholds may improve the sensitivity of allograft rejection while maintaining high specificity.
Preliminary results from the ongoing Trifecta study (NCT04239703) published by Halloran et al (2022) provide an assessment of combined dd-cfDNA fraction and absolute values for prediction of active kidney allograft rejection.81, The study reported data from 218 individuals included in a test set (median age 51 years) enrolled from December 2019 to July 2021. Thirty-eight patients were female and 17% were Black or African American; other race or ethnicity data were not reported. The mean post-transplant time was 1,439 days (3.9 years). The study used a training set (n=149) to identify optimal cutoffs for % dd-cfDNA (≥1%) and absolute quantity (≥78 cp/mL). Accuracy of dd-cfDNA testing was compared with the Molecular Microscope Diagnostic System (MMDx) and histological analysis using Banff criteria as reference standards. The use of 2 reference standards in this study is based on a previous Trifecta analysis that suggested a strong correlation between dd-cfDNA fraction and molecular changes due to rejection assessed using MMDx.82,
Study; dd-cfDNA threshold | Biopsy-Matched Samples | Prevalence, n (%)a | Sensitivity, % (95% CI) | Specificity, % (95% CI) | AUC (95% CI) | PPV, % (95% CI) | NPV, % (95% CI) |
AlloSure | |||||||
Bloom et al (2017) (≥1%)72, | |||||||
For-cause, dd-cfDNA | 107 | 27 (25.2) | 59 (44 to 74) | 85 (79 to 91) | 0.74 (0.61 to 0.86) | 61 (NR) | 84 (NR) |
For-cause, SCr | 204 | 58 (28.4) | NR | NR | 0.54 (0.43 to 0.66) | NR | NR |
Huang et al (2019) (≥0.74%)73, | |||||||
For-cause, any rejection | 63 (patients) | 34 (54) | 79.4 (NR) | 72.4 (NR) | 0.71 (0.58 to 0.85) | 77.1 (NR) | 75 (NR) |
For-cause, CMR | 63 (patients) | 10 (16) | NR | NR | 0.42 (0.17 to 0.66) | NR | NR |
Prospera | |||||||
Sigdel et al (2019) (≥1%)79, | |||||||
Overall, dd-cfDNA | 217 | 38 (17.5) | 88.7 (77.7 to 99.8) | 72.6 (65.4 to 79.8) | 0.87 (0.80 to 0.95) | 52.0 (44.7 to 59.2)c | 95.1 (90.5 to 99.7)c |
Overall, eGFR | 217 | 38 (17.5) | 67.8 (51.3 to 84.2) | 65.3 (57.6 to 73.0) | 0.74 (0.66 to 0.83) | 39.4 (31.6 to 47.3)c | 85.9 (75.9 to 92.2)c |
Surveillance, dd-cfDNA | 114 | 13 (11.4) | 92.3 (64.0 to 99.8) | 75.2 (65.7 to 83.3) | 0.89 (0.79 to 0.99) | 55.4 (46.2 to 64.7)c; 32.4 (24.8 to 41.1)d | 96.7 (90.6 to 99.9)c; 98.7 (92.0 to 99.8)d |
For-cause, dd-cfDNA | 103 | 25 (24.3) | 84.0 (63.9 to 95.5)b | 68.0 (56.4 to 78.1)b | NR | 45.7 (36.8 to 54.8)d | 93.0 (84.2 to 97.1)d |
Halloran et al (2022)81, ≥1%; ≥78 cp/mL | |||||||
dd-cfDNA % + absolute quantity; MMDx criteria | 218 | 71 (32.6) | 83.1% (95% CI NR) | 81.0% (95% CI NR) | 0.88 (95% CI NR) | 67.8% (95% CI NR) | 90.8% (95% CI NR) |
dd-cfDNA % + absolute quantity; Banff criteria | 213 | 83 (39.0) | 73.5% (95% CI NR) | 80.8% (95% CI NR) | 0.82 (95% CI NR) | 70.9% (95% CI NR) | 82.7% (95% CI NR) |
AUC: area under the receiver operating characteristic curve; CI: confidence interval; CMR: cell-mediated rejection; dd-cfDNA: donor-derived cell-free DNA; eGFR: estimated glomerular filtration rate; MMDx: Molecular Microscope Diagnostic System; NPV: negative predictive value; NR: not reported; PPV: positive predictive value; SCr: serum creatinine. a Study disease prevalence. b Calculated based on reported case numbers. c Projected value as reported based on assumed disease prevalence of 25% in an at-risk population. d Calculated value based on study disease prevalence.
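Footnotes c and d reflect the fact that PPV and NPV depend on disease prevalence, whereas sensitivity and specificity do not. The sketch below shows the standard calculation using the surveillance dd-cfDNA row for Sigdel et al (2019): sensitivity 92.3% and specificity 75.2%, evaluated at the assumed 25% prevalence (footnote c) and at the 11.4% study prevalence (footnote d). The function name is hypothetical and the code illustrates the arithmetic only; it is not the study authors' method.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as proportions for a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


for label, prevalence in [("assumed 25%", 0.25), ("study 11.4%", 0.114)]:
    ppv, npv = predictive_values(0.923, 0.752, prevalence)
    print(f"{label}: PPV {100 * ppv:.1f}%, NPV {100 * npv:.1f}%")
# assumed 25%: PPV 55.4%, NPV 96.7%
# study 11.4%: PPV 32.4%, NPV 98.7%
```

With the same test performance, the PPV falls from about 55% to about 32% when prevalence drops from 25% to 11.4%, which is why the 2 sets of predictive values in the table should not be compared directly.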
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No RCTs assessing the clinical utility of dd-cfDNA (ie, AlloSure, Prospera) testing to diagnose renal allograft rejection were identified.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of dd-cfDNA (ie, AlloSure, Prospera) testing to assess for renal allograft rejection has not been established, a chain of evidence to support clinical utility cannot be constructed.
A discovery-phase prospective study using the AlloSure test has been performed in a multicenter setting. A subsequent smaller single-center study that explored variation in clinical validity based on different rejection mechanisms found the strongest performance characteristics for AlloSure with antibody-mediated rejection. A retrospective study of the Prospera test reported a PPV and NPV of 52% and 95%, respectively, using a ≥1% dd-cfDNA threshold. A second, prospective Prospera study reported PPVs of 68% and 71% and NPVs of 91% and 83% using combined dd-cfDNA fraction and absolute quantity compared with 2 different reference standards. Larger prospective studies validating dd-cfDNA thresholds for active rejection are needed to develop conclusions for each test. At present, no studies evaluating the clinical utility of AlloSure or Prospera dd-cfDNA testing have been identified.
For individuals with a renal transplant who are undergoing surveillance or have clinical suspicion of allograft rejection who receive testing of dd-cfDNA to assess renal allograft rejection, the evidence includes diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. One study examined the diagnostic performance of dd-cfDNA for detecting moderate-to-severe rejection; the NPV was moderately high (84%), and performance characteristics were calculated on 27 cases of active transplant rejection. The threshold indicating a positive test was not prespecified. A subsequent smaller single-center study that explored variation in clinical validity based on different rejection mechanisms found the strongest performance characteristics for AlloSure with antibody-mediated rejection. A retrospective single-center study of the Prospera dd-cfDNA test reported a PPV and NPV of 52% and 95%, respectively, for detection of active rejection among a combined cohort of patients undergoing surveillance or for-cause biopsies, using the 1% dd-cfDNA threshold previously proposed for the AlloSure test. A second, prospective Prospera study reported PPVs of 68% and 71% and NPVs of 91% and 83% using combined dd-cfDNA fraction and absolute quantity compared with 2 different reference standards. Larger prospective studies validating the dd-cfDNA thresholds for active rejection are needed to develop conclusions for each test. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 6 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of dd-cfDNA testing in patients with lung transplant who are undergoing surveillance is to detect allograft rejection.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with lung transplants who are undergoing surveillance for allograft rejection.
The test being considered is dd-cfDNA testing to assess for lung allograft rejection.
A regular testing schedule is recommended for patients undergoing surveillance, with monthly testing in the first year post-transplant and quarterly testing in years 2 and 3. The proportion of dd-cfDNA relative to total cfDNA is reported. The report also notes that a threshold of >0.85% dd-cfDNA is associated with a higher probability of acute cellular rejection, chronic lung allograft dysfunction (CLAD), and antibody-mediated rejection, and that the NPV is maximized at a dd-cfDNA cutoff of 0.20%.
The following test is currently being used to confirm a clinical suspicion of allograft rejection: bronchoscopy with transbronchial biopsy.
The general outcomes of interest are OS, test validity, morbid events, and hospitalizations. Follow-up over months to years is needed to monitor for signs of allograft rejection.
Beneficial outcomes resulting from a true-negative test result are avoiding unnecessary subsequent biopsy. Harmful outcomes resulting from a false-positive result may include an unnecessary biopsy or unnecessary treatment. Harmful outcomes from a false-negative result are increased risk of adverse transplant outcomes.
In a triage scenario, the test would need to identify precisely a group of patients that could safely forgo biopsy; therefore, the sensitivity, NPV, and negative likelihood ratio are key test performance characteristics.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
For the evaluation of the clinical validity of dd-cfDNA testing, studies that met the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard (describe the reference standard)
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
Sayah et al (2020) conducted a pilot study investigating the ability of AlloSure dd-cfDNA testing to detect acute cellular rejection.83, Biopsy-matched biorepository samples from 69 lung transplant recipients who had previously enrolled in the multicenter Lung Allograft Gene Expression Observational (LARGO) Study were evaluated. Diagnostic cohorts included patients with respiratory allograft infection (n=26), normal histopathology without infection or rejection (n=30), and acute cellular rejection without concurrent infection (n=13). Samples were obtained more than 14 days and less than 1 year post-transplant, and samples associated with potential concurrent infection and rejection were excluded. Median dd-cfDNA levels were 0.485% (IQR, 0.220% to 0.790%) in the normal cohort, 1.52% (IQR, 0.520% to 2.550%) in the acute cellular rejection cohort, and 0.595% (IQR, 0.270% to 1.170%) in the infection cohort. While dd-cfDNA levels were significantly higher in the acute cellular rejection cohort compared to the normal cohort (p =.026), samples associated with infection were not significantly different from the normal (p =.282) or acute cellular rejection (p =.100) cohorts. The AUC for detection of acute cellular rejection was 0.717 (95% CI, 0.547 to 0.887; p =.025). At a threshold of 0.87% dd-cfDNA and an estimated prevalence rate of 25%, sensitivity for acute cellular rejection was 73.1% (95% CI, 52.2% to 88.4%), specificity was 52.9% (95% CI, 27.8% to 77.0%), positive likelihood ratio was 1.55, negative likelihood ratio was 0.51, PPV was 34.1%, and NPV was 85.5%. The study is limited by the small sample size and use of archived samples, and it raises concerns regarding the ability of AlloSure dd-cfDNA testing to detect antibody-mediated rejection and to discriminate between infection and rejection.
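The likelihood ratios and predictive values reported by Sayah et al (2020) all follow from the reported sensitivity (73.1%), specificity (52.9%), and assumed prevalence (25%). A minimal sketch of that arithmetic, using the odds form of Bayes' theorem, is shown below; the function names are hypothetical and the code is illustrative rather than a description of the authors' analysis.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR)."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert pretest probability to post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


lr_pos, lr_neg = likelihood_ratios(0.731, 0.529)
ppv = post_test_probability(0.25, lr_pos)       # P(rejection | positive test)
npv = 1 - post_test_probability(0.25, lr_neg)   # P(no rejection | negative test)
print(f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}, PPV {100 * ppv:.1f}%, NPV {100 * npv:.1f}%")
# LR+ 1.55, LR- 0.51, PPV 34.1%, NPV 85.5%
```

A negative likelihood ratio of 0.51 lowers a 25% pretest probability only to about 14.5%, which illustrates why these figures offer limited support for using the test to rule out biopsy.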
Khush et al (2021) used AlloSure dd-cfDNA testing to evaluate 107 biorepository plasma samples from 38 lung transplant recipients enrolled in the Genome Transplant Dynamics Study.84, The study cohort included 14 patients (22 samples) with acute cellular rejection confirmed by histopathology, 6 patients (7 samples) treated for acute cellular rejection without a confirmed histopathological diagnosis, 6 patients (8 samples) with obstructive CLAD, 7 patients (9 samples) with antibody-mediated rejection (AMR), 22 patients (33 samples) with infection without rejection, and 18 patients (28 samples) with stable allografts. The median dd-cfDNA levels in the acute cellular rejection (0.91%; IQR, 0.39% to 2.07%) and CLAD (2.06%; IQR, 0.97% to 3.34%) cohorts were significantly higher compared to the stable cohort (p =.02 for each comparison). However, the AMR cohort was not statistically different from the stable cohort (p =.07). The median dd-cfDNA level in an aggregated rejection cohort, composed of acute cellular rejection, AMR, and CLAD samples, was approximately 3-fold higher than in the stable cohort (1.06% versus 0.38%). At a dd-cfDNA threshold of 0.85%, the sensitivity for this spectrum of rejection was 55.6%, specificity was 75.8%, PPV was 43.3%, and NPV was 83.6%. The study is limited by the small sample size and use of archived samples. The authors suggest that AlloSure dd-cfDNA testing may have clinical utility as a plasma marker of "tissue injury" and that the 0.85% dd-cfDNA threshold requires further prospective clinical validation.
A retrospective study conducted by Keller et al (2022) included 157 patients enrolled in a post-transplant home surveillance program that included the AlloSure test for detection of acute allograft rejection.85, The study analyzed data from patients at 4 U.S. centers. Data were collected from March to September 2020, during the COVID-19 pandemic, at a time when in-office visits were limited and routine surveillance bronchoscopy was deferred. Home monitoring was intended to identify those patients most at risk for acute rejection for triage to bronchoscopy. Study inclusion was limited to adults >18 years between 30 days and 3 years post-transplant. Of the total cohort, the mean age was 59 years and the majority were male (54%) and White (64%). Eighteen percent were Black, 3% Asian, and 15% other race/ethnicity. The mean time since transplantation was 13 months, and 82% underwent bilateral transplantation. Diagnosis of acute cellular rejection (ACR), AMR, infection, or a composite of these outcomes (Acute Lung Allograft Dysfunction [ALAD]) was made based on biopsy and/or clinical diagnosis. Mean dd-cfDNA was 1.6% for acute rejection (ACR+AMR) and 1.7% for ALAD. In comparison, the mean dd-cfDNA in stable patients was 0.37%. Using a dd-cfDNA cutoff of 1.0% for detection of ALAD, the sensitivity was 73.9%, specificity 87.7%, PPV 43.4%, and NPV 96.5%. Of the 157 patients with dd-cfDNA measurement for surveillance, 52 also had a contemporaneous reference standard surveillance bronchoscopy independent of dd-cfDNA level (ie, patients who were not triaged to bronchoscopy). When analysis was limited to this subgroup, specificity and NPV were lower (70.0% and 79.2%, respectively), while sensitivity (76.2%) and PPV (66.7%) were not reduced. The study was limited by the small sample size, particularly the limited number of unselected patients who underwent both dd-cfDNA testing and bronchoscopy.
Rosenheck et al (2022) assessed the predictive ability of dd-cfDNA testing using the Prospera test for lung transplant rejection.86, The study included 195 samples from 103 patients, who were predominantly White (93%) and male (60%); mean age was 62 years. Black and Hispanic patients comprised 6% and 1% of the study population, respectively. The median time since lung transplant was 198 days, and most patients (85%) underwent lung biopsy for routine transplant surveillance. Consistent with other dd-cfDNA studies, median dd-cfDNA was higher in patients with acute rejection (AR), which included acute cellular rejection (1.43%) or antibody-mediated rejection (2.50%), than in those who were stable (0.46%). Prevalence of acute rejection was 28% (29/103), and prevalence of CLAD or neutrophilic-responsive allograft dysfunction (NRAD) was 21% (22/103); patients could be included in both diagnostic groups. Using a dd-cfDNA threshold of ≥1% for prediction of acute rejection, sensitivity was 89.1% and specificity was 82.9%, resulting in an AUC of 0.91 (95% CI 0.83 to 0.98). PPV was 51.9% and NPV was 97.3%. For a combined measure that included AR, CLAD/NRAD, and infection, sensitivity was 59.9%, specificity 83.9%, AUC 0.76, PPV 43.6%, and NPV 91.0%. As with other dd-cfDNA studies in lung transplantation, this study was limited by the small sample size, though unlike the other studies, samples were collected prospectively.
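The predictive values reported in these studies depend on how common rejection was among the tested samples as well as on sensitivity and specificity. The following is a minimal sketch of that relationship using Bayes' theorem; it is not taken from the cited studies. The function name `predictive_values` and the prevalence values are assumptions chosen only for illustration, while the sensitivity and specificity are those reported above for the ≥1% threshold.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported above for the >=1% dd-cfDNA threshold.
# The prevalence values are hypothetical and are NOT reported by the study.
for prevalence in (0.05, 0.10, 0.30):
    ppv, npv = predictive_values(0.891, 0.829, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

Under these assumptions, PPV falls and NPV rises as prevalence decreases, so predictive values observed in one cohort may not carry over to a population with a different rate of rejection.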
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No RCTs assessing the clinical utility of dd-cfDNA (ie, AlloSure or Prospera) testing to diagnose lung allograft rejection were identified.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of dd-cfDNA testing to assess for lung allograft rejection has not been established, a chain of evidence to support clinical utility cannot be constructed.
Four small diagnostic accuracy studies of dd-cfDNA testing with AlloSure or Prospera utilizing biorepository (3 studies) or prospectively collected samples (1 study) were identified. At a threshold of 0.87% dd-cfDNA, the PPV and NPV for detecting acute cellular rejection in the first study were 34.1% and 85.5%, respectively. A second study reported a PPV of 43.3% and NPV of 83.6% at a dd-cfDNA cutoff of 0.85% for an aggregate rejection cohort composed of patients with ACR, AMR, and CLAD. In the third study, using a dd-cfDNA cut-off of 1.0%, PPV was 51.9% and NPV was 97.3% for acute rejection, and 43.6% and 91.0%, respectively, for acute rejection, CLAD/NRAD, or infection. One study that used dd-cfDNA testing as part of a home surveillance program found a PPV of 43.4% and NPV of 96.5% for detection of ACR, AMR, or infection, but when the analysis was limited to patients with a contemporaneous reference standard surveillance bronchoscopy independent of dd-cfDNA level, PPV was 66.7% and NPV was 79.2%. Larger and additional prospective studies validating the dd-cfDNA threshold for active rejection are needed to develop conclusions. At present, no studies evaluating the clinical utility of AlloSure or Prospera dd-cfDNA testing were identified.
For individuals with a lung transplant who receive testing of dd-cfDNA to assess lung allograft rejection, the evidence includes 4 small diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. One study examined the diagnostic performance of dd-cfDNA testing at a threshold of 0.87% for detecting acute cellular rejection, yielding a PPV of 34.1% and an NPV of 85.5%. A second study reported a PPV of 43.3% and NPV of 83.6% for an aggregate rejection cohort composed of patients with acute cellular rejection, antibody-mediated rejection, and CLAD. In the third study, using a dd-cfDNA cut-off of 1.0%, PPV was 51.9% and NPV was 97.3% for acute rejection, and 43.6% and 91.0%, respectively, for acute rejection, CLAD/NRAD, or infection. One study that used dd-cfDNA testing as part of a home surveillance program found a PPV of 43.4% and NPV of 96.5% for detection of ACR, AMR, or infection, but when the analysis was limited to patients with a contemporaneous reference standard surveillance bronchoscopy independent of dd-cfDNA level, PPV was 66.7% and NPV was 79.2%. All 4 studies were limited by small sample sizes, and no clinical utility studies were identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 7 Policy Statement | [ ] Medically Necessary | [X] Investigational |
For individuals who have chronic heart failure who receive the sST2 assay to determine prognosis and/or to guide management, the evidence includes correlational studies and 2 meta-analyses. Relevant outcomes are OS, quality of life, and hospitalization. Most of the evidence is from reanalysis of existing RCTs and not from studies specifically designed to evaluate the predictive accuracy of sST2, and prospective and retrospective cross-sectional studies made up a large part of 1 meta-analysis. Studies have mainly found that elevated sST2 levels are statistically associated with an elevated risk of mortality. A pooled analysis of study results found that sST2 significantly predicted overall mortality and cardiovascular mortality. Several studies, however, found that sST2 test results did not provide additional prognostic information compared with N-terminal pro B-type natriuretic peptide levels. Moreover, no comparative studies were identified on the use of the sST2 assay to guide the management of patients diagnosed with chronic heart failure. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have heart transplantation who receive sST2 assay to determine prognosis and/or to predict acute cellular rejection, the evidence includes a small number of retrospective studies on the Presage ST2 Assay. Relevant outcomes are OS, morbid events, and hospitalization. No prospective studies were identified that provide high-quality evidence on the ability of sST2 to predict transplant outcomes. One retrospective study (n = 241) found that sST2 levels were associated with acute cellular rejection and mortality; another study (n = 26) found that sST2 levels were higher during an acute rejection episode than before rejection. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive a measurement of volatile organic compounds to assess cardiac allograft rejection, the evidence includes a diagnostic accuracy study. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The published study found that, for identifying grade 3 (now grade 2R) rejection, the NPV of the breath test the study evaluated (97.2%) was similar to endomyocardial biopsy (96.7%) and the sensitivity of the breath test (78.6%) was better than that for biopsy (42.4%). However, the breath test had a lower specificity (62.4%) and a lower PPV (5.6%) in assessing grade 3 rejection than a biopsy (specificity, 97%; PPV, 45.2%). The breath test was also not evaluated for grade 4 rejection. This single study is not sufficient to determine the clinical validity of the test measuring volatile organic compounds and no studies on clinical utility were identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive dd-cfDNA testing to determine acute rejection, the evidence includes diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. Evidence from 3 studies suggests that the dd-cfDNA fraction is elevated in acute rejection, but optimal fraction cut-offs for detection of acute rejection have not been established. Using dd-cfDNA thresholds ranging from 0.12% to 0.32% resulted in NPVs ranging from 82% to 98% and AUCs ranging from 0.61 to 0.86 in 3 studies. At present, no studies evaluating the clinical utility for the measurement of dd-cfDNA for heart transplant rejection have been identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive GEP to assess cardiac allograft rejection, the evidence includes 2 diagnostic accuracy studies and several RCTs evaluating clinical utility. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The 2 studies examining the diagnostic performance of GEP for detecting moderate-to-severe rejection, Cardiac Allograft Rejection Gene Expression Observation (CARGO) and CARGO II, lacked a consistent threshold for defining a positive GEP test (ie, 20, 30, or 34) and reported a low number of positive cases. In the available studies, although the NPVs were relatively high (ie, at least 88%), the performance characteristics were only calculated based on 10 or fewer cases of rejection; therefore, performance data may be imprecise. Moreover, the PPV in CARGO II was only 4.0% for patients who were at least 2 to 6 months post transplant and 4.3% for patients more than 6 months post transplant. The threshold indicating a positive test that seems to be currently accepted (a score of 34) was not prespecified; rather, it evolved partway through the data collection period in the Invasive Monitoring Attenuation through Gene Expression (IMAGE) study. In addition, the IMAGE study had several methodologic limitations (eg, lack of blinding); further, the IMAGE study failed to provide evidence that GEP offers an incremental benefit over biopsy performed on the basis of clinical exam or echocardiography. Patients at the highest risk of transplant rejection are patients within 1 year of the transplant, and, for that subset, there remain insufficient data on which to evaluate the clinical utility of GEP. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have a heart transplant who receive GEP with testing of dd-cfDNA to assess cardiac allograft rejection, the evidence includes 1 retrospective analysis of the HeartCare test and 1 diagnostic accuracy study of the AlloSure dd-cfDNA component of the HeartCare test. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. The HeartCare analysis reported a 12.7% reduction in endomyocardial biopsy volume among patients undergoing routine surveillance. However, this observation is limited by lack of reporting on long-term health outcomes and incomplete assessment of diagnostic performance for combined testing, as patients with negative dd-cfDNA scores did not undergo biopsy regardless of GEP score per study protocol. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals with a renal transplant who are undergoing surveillance or have clinical suspicion of allograft rejection who receive testing of dd-cfDNA to assess renal allograft rejection, the evidence includes diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. One study examined the diagnostic performance of dd-cfDNA for detecting moderate-to-severe rejection; the NPV was moderately high (84%), and performance characteristics were calculated on 27 cases of active transplant rejection. The threshold indicating a positive test was not prespecified. A subsequent smaller single-center study that explored variation in clinical validity based on different rejection mechanisms found the strongest performance characteristics for AlloSure with antibody-mediated rejection. A retrospective single-center study of the Prospera dd-cfDNA test reported a PPV and NPV of 52% and 95%, respectively, for detection of active rejection among a combined cohort of patients undergoing surveillance or for-cause biopsies, using the 1% dd-cfDNA threshold previously proposed for the AlloSure test. A second, prospective Prospera study reported PPVs of 68% and 71% and NPVs of 91% and 83% using combined dd-cfDNA fraction and absolute quantity compared with 2 different reference standards. Larger prospective studies validating the dd-cfDNA thresholds for active rejection are needed to develop conclusions for each test. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals with a lung transplant who receive testing of dd-cfDNA to assess lung allograft rejection, the evidence includes 4 small diagnostic accuracy studies. Relevant outcomes are OS, test validity, morbid events, and hospitalizations. One study examined the diagnostic performance of dd-cfDNA testing at a threshold of 0.87% for detecting acute cellular rejection, yielding a PPV of 34.1% and an NPV of 85.5%. A second study reported a PPV of 43.3% and NPV of 83.6% for an aggregate rejection cohort composed of patients with acute cellular rejection, antibody-mediated rejection, and CLAD. In the third study, using a dd-cfDNA cut-off of 1.0%, PPV was 51.9% and NPV was 97.3% for acute rejection, and 43.6% and 91.0%, respectively, for acute rejection, CLAD/NRAD, or infection. One study that used dd-cfDNA testing as part of a home surveillance program found a PPV of 43.4% and NPV of 96.5% for detection of ACR, AMR, or infection, but when the analysis was limited to patients with a contemporaneous reference standard surveillance bronchoscopy independent of dd-cfDNA level, PPV was 66.7% and NPV was 79.2%. All 4 studies were limited by small sample sizes, and no clinical utility studies were identified. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
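Several of the summaries above note that performance estimates rest on few rejection cases (for example, 10 or fewer in the GEP studies). The effect of a small denominator on precision can be illustrated with a confidence interval for a proportion. The following is a minimal sketch using the Wilson score interval; the function name `wilson_interval` and the counts are hypothetical and are not drawn from any study reviewed here.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z**2 / (4 * n**2)
    )
    return center - half_width, center + half_width


# Hypothetical counts: the same observed proportion (60%) at two sample sizes.
for successes, n in ((6, 10), (60, 100)):
    low, high = wilson_interval(successes, n)
    print(f"{successes}/{n}: 95% CI {low:.1%} to {high:.1%}")
```

With the same observed proportion, the interval based on 10 cases is much wider than the one based on 100, which is why sensitivity or PPV estimated from a handful of rejection episodes should be read as imprecise.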
The purpose of the following information is to provide reference material. Inclusion does not imply endorsement or alignment with the evidence review conclusions.
While the various physician specialty societies and academic medical centers may collaborate with and make recommendations during this process, through the provision of appropriate reviewers, input received does not represent an endorsement or position statement by the physician specialty societies or academic medical centers, unless otherwise noted.
In response to requests, input was received from 7 academic medical centers and 1 specialty society while this policy was under review in 2012. Input was mixed on whether AlloMap should be investigational. Four reviewers agreed with the investigational status, 1 disagreed, and 3 indicated it was a split decision/other. Reviewers generally agreed that the sensitivity and specificity have not yet been adequately defined for AlloMap and that the negative predictive value was not sufficiently high to preclude the need for biopsy. There was mixed input about the need for surveillance cardiac biopsies to be performed in the absence of clinical signs and/or symptoms of rejection.
In response to requests, input was received from 2 academic medical centers and 2 physician specialty societies while this policy was under review in 2008. Three reviewers agreed that these approaches for monitoring heart transplant rejection are considered investigational. The American College of Cardiology disagreed with the policy, stating that the College considers the available laboratory tests to have good potential to diagnose heart transplant rejection and reduce the frequency of invasive biopsies performed on heart transplant patients, although questions remained as to their role in clinical practice.
Guidelines or position statements will be considered for inclusion in 'Supplemental Information' if they were issued by, or jointly by, a US professional society, an international society with US representation, or National Institute for Health and Care Excellence (NICE). Priority will be given to guidelines that are informed by a systematic review, include strength of evidence ratings, and include a description of management of conflict of interest.
In 2022, the American College of Cardiology, American Heart Association, and Heart Failure Society of America issued an updated guideline for the management of heart failure.8, The 2022 guideline replaced a 2013 guideline1, and a 2017 focused guideline update.87, The guideline states that measurement of natriuretic peptide levels may be useful for diagnosis, risk stratification, and prognosis of heart failure. The use of soluble suppression of tumorigenicity-2 is not discussed specifically, though the guideline notes that "a widening array of biomarkers including markers of myocardial injury, inflammation, oxidative stress, vascular dysfunction, and matrix remodeling have been shown to provide incremental prognostic information over natriuretic peptides but remain without evidence of an incremental management benefit."
In 2010, the International Society of Heart and Lung Transplantation issued guidelines for the care of heart transplant recipients.88, The guidelines included the following recommendations (see Table 9).
Recommendation | COR | LOE |
“The standard of care for adult HT recipients is to perform periodic EMB during the first 6 to 12 postoperative months for surveillance of HT rejection.” | IIa | C |
“After the first post-operative year, EMB surveillance for an extended period of time (eg, every 4-6 months) is recommended in HT patients at higher risk for late acute rejection….” | IIa | C |
“Gene Expression Profiling (AlloMap) can be used to rule out the presence of ACR of grade 2R or greater in appropriate low-risk patients, between 6 months and 5 years after HT.” | IIa | B |
ACR: acute cellular rejection; COR: class of recommendation; EMB: endomyocardial biopsy; HT: heart transplant; LOE: level of evidence.
The Kidney Disease Improving Global Outcomes (2009) issued guidelines for the care of kidney transplant recipients.89, The guidelines included the following recommendations (see Table 10).
Recommendation | SOR | LOE |
“We recommend kidney allograft biopsy when there is a persistent, unexplained increase in serum creatinine.” | Level 1 | C |
“We suggest kidney allograft biopsy when serum creatinine has not returned to baseline after treatment of acute rejection.” | Level 2 | D |
“We suggest kidney allograft biopsy every 7-10 days during delayed function.” | Level 2 | C |
“We suggest kidney allograft biopsy if expected kidney function is not achieved within the first 1-2 months after transplantation.” | Level 2 | D |
“We suggest kidney allograft biopsy when there is new onset of proteinuria.” | Level 2 | C |
“We suggest kidney allograft biopsy when there is unexplained proteinuria ≥3.0 g/g creatinine or ≥3.0 g per 24 hours.” | Level 2 | C |
LOE: level of evidence; SOR: strength of recommendation.
Not applicable.
The Centers for Medicare & Medicaid Services (2008) issued a noncoverage decision for the Heartsbreath test.90, The Centers determined that the evidence did not adequately define the technical characteristics of the test; nor did it demonstrate that Heartsbreath testing could predict heart transplant rejection, and therefore the test would not improve health outcomes in Medicare beneficiaries.
For AlloMap, HeartCare, AlloSure, Prospera, myTAIHEART, and the Presage ST2 Assay there are no national coverage determinations. In the absence of a national coverage determination, coverage decisions are left to the discretion of local Medicare carriers.
Some currently ongoing and unpublished trials that might influence this review are listed in Table 11.
NCT No. | Trial Name | Planned Enrollment | Completion Date |
AlloMap | |||
NCT01833195a | Outcomes AlloMap Registry: the Long-term Management and Outcomes of Heart Transplant Recipients With AlloMap Testing (OAR) | 2444 | Feb 2020 (unknown) |
NCT02178943a | Utility of Donor-Derived Cell-free DNA in Association With Gene-Expression Profiling (AlloMap®) in Heart Transplant Recipients (D-OAR) | 100 | Feb 2020 (unknown) |
HeartCare | |||
NCT05459181a | Molecular Outcome Surveillance Using AlloSure and AlloMap Guided Immunomodulation in Cardiac Transplant (MOSAIC) | 930 | Sep 2025 |
NCT03695601a | Surveillance HeartCare Outcomes Registry (SHORE) | 3450 | Jun 2027 (active, not recruiting) |
AlloSure (Kidney) | |||
NCT04566055a | Assessing AlloSure dd-cfDNA Monitoring Insights of Renal Allografts With Longitudinal Surveillance (ADMIRAL) | 1000 | Oct 2020 (active, not recruiting) |
NCT03326076a | Evaluation of Patient Outcomes From the Kidney Allograft Outcomes AlloSure Registry (KOAR) | 4000 | Dec 2025 (recruiting) |
NCT04601155a | Transition of Renal Patients Using AlloSure Into Community Kidney Care (TRACK) | 3500 | Sep 2026 (recruiting) |
AlloSure (Lung) | |||
NCT04318587a | Assessment of Donor Derived Cell Free DNA and Utility in Lung Transplantation | 50 | Sep 2023 (active, not recruiting) |
NCT05050955a | AlloSure Lung Assessment and Metagenomics Outcomes Study (ALAMO) | 1500 | Dec 2026 (recruiting) |
Prospera (Kidney) | |||
NCT04239703a | Trifecta-Kidney cfDNA-MMDx Study: Comparing the DD-cfDNA Test to MMDx Microarray Test, Central HLA Antibody Test, and Histology | 300 | Dec 2024 (recruiting) |
NCT04091984a | The PROspera Kidney Transplant ACTIVE Rejection Assessment Registry (ProActive) | 5000 | Oct 2027 (recruiting) |
NCT03984747a | Study for the Prediction of Active Rejection in Organs Using Donor-derived Cell-free DNA Detection (SPARO) | 500 | Oct 2028 (recruiting) |
Prospera (Heart) | |||
NCT04707872a | Trifecta-Heart cfDNA-MMDx Study: Comparing the DD-cfDNA test to MMDx Microarray Test and Central HLA Antibody Test | 300 | Jul 2024 (recruiting) |
NCT05081739a | Donor-Derived Cell-free DNA to Detect Rejection in Cardiac Transplantation (DETECT) | 600 | Jan 2025 (not yet recruiting) |
Prospera (Lung) | |||
NCT05170425a | Observational Registry Study With Sub-analysis (Patients Previously Randomized to LAMBDA 001) to Assess Prospera™ Performance for Detection of CLAD After Lung Transplant (LAMBDA 002) | 1000 | Dec 2029 (not yet recruiting) |
NCT: national clinical trial.
a Denotes industry-sponsored or cosponsored trial.
Codes | Number | Description |
---|---|---|
CPT | 81595 | Cardiology (heart transplant), mRNA, gene expression profiling by real-time quantitative PCR of 20 genes (11 content and 9 housekeeping), utilizing subfraction of peripheral blood, algorithm reported as a rejection risk score |
83006 | Growth stimulation expressed gene 2 (ST2, Interleukin 1 receptor like-1) | |
0018M | Transplantation medicine (allograft rejection, renal), measurement of donor and third-party-induced CD154+T-cytotoxic memory cells, utilizing whole peripheral blood, algorithm reported as a rejection risk score | |
0055U | Cardiology (heart transplant), cell-free DNA, PCR assay of 96 DNA target sequences (94 single nucleotide polymorphism targets and two control targets), plasma | |
0087U | Cardiology (heart transplant), mRNA gene expression profiling by microarray of 1283 genes, transplant biopsy tissue, allograft rejection and injury algorithm reported as a probability score | |
0088U | Transplantation medicine (kidney allograft rejection) microarray gene expression profiling of 1494 genes, utilizing transplant biopsy tissue, algorithm reported as a probability score for rejection | |
0118U | Transplantation medicine, quantification of donor-derived cell-free DNA using whole genome next-generation sequencing, plasma, reported as percentage of donor-derived cell-free DNA in the total cell-free DNA | |
0221U | Red cell antigen (ABO blood group) genotyping (ABO), gene analysis, next generation sequencing, ABO (ABO, alpha 1-3-N-acetylgalactosaminyltransferase and alpha 1-3-galactosyltransferase) gene | |
HCPCS | No codes | |
ICD-10-CM | Investigational for all relevant diagnoses | |
T86.20-T86.298 | Complications of heart transplant code range | |
Z48.21 | Encounter for aftercare following heart transplant | |
Z94.1 | Heart transplant status | |
ICD-10-PCS | Not applicable. ICD-10-PCS codes are only used for inpatient services. There are no ICD procedure codes for laboratory tests. | |
Type of service | Laboratory | |
Place of service | Outpatient |
N/A
Date | Action | Description |
11/15/2024 | Annual Review | No Changes |
11/16/2023 | Annual Review | Policy updated with literature review through August 21, 2023; references added. Policy statements unchanged. |
11/11/2022 | Annual Review | Policy updated with literature review through August 24, 2022; references added. New investigational policy statement regarding dd-cfDNA testing in heart transplantation was added. Other changes to policy statements reflect minor editorial refinements; intent unchanged. TAI Diagnostics suspended production of the myTAIHEART test in 2020. Pico #3 myTAIHEART assay to determine prognosis and/or to predict acute cellular rejection was removed. |
11/30/2021 | Annual Review | Removed 0085T (deleted effective 12/31/2020); off-cycle review 04/28/2021 |
11/12/2020 | Replace Policy | Policy updated with literature review through August 25, 2020; references added. Content from policy 2.04.130 (Molecular Testing for Chronic Heart Failure and Heart Transplant) was merged into this policy and the title was changed to "Laboratory Tests Post Transplant and for Heart Failure". |
11/05/2019 | Policy reviewed | Policy updated with literature review through August 5, 2019; no references added. Policy statements unchanged. |
5/16/2016 | Policy reviewed | Policy unchanged |
4/23/2015 | Replace policy | Policy updated with literature review through March 5, 2015. References 2-3 and 12 added. Policy statements unchanged. |
4/10/2014 | Replace policy | Policy updated with literature review through March 4, 2014; reference 9 added; policy statements edited for clarity. |
4/17/2013 | Replace policy | Policy updated with literature review through March 12, 2013. Policy statements unchanged. ICD-10 ADDED |
3/12/2012 | Policy reviewed | ICES |
6/10/2009 | Policy reviewed | ICES |
8/21/2007 | Policy reviewed | Policy unchanged |
2/21/2005 | Policy created | New Policy |