Medical Policy
Policy Num: 11.003.052
Policy Name: Molecular Markers in Fine Needle Aspirates of the Thyroid
Policy ID: [11.003.052] [Ac / B / M+ / P+] [2.04.78]
Last Review: September 24, 2024
Next Review: September 20, 2025
Related Policies: None
Population Reference No. | Populations | Interventions | Comparators | Outcomes |
1 | Individuals: · With thyroid nodule(s) and indeterminate findings on fine needle aspirate | Interventions of interest are: · Fine needle aspirate sample testing with molecular tests to rule out malignancy and to avoid surgical biopsy or resection | Comparators of interest are: · Surgical biopsy | Relevant outcomes include: · Disease-specific survival · Test accuracy · Test validity · Morbid events · Resource utilization |
2 | Individuals: · With thyroid nodule(s) and indeterminate findings on fine needle aspirate | Interventions of interest are: · Fine needle aspirate sample testing with molecular tests to rule in malignancy and to guide surgical planning | Comparators of interest are: · Surgical management based on clinicopathologic risk factors | Relevant outcomes include: · Disease-specific survival · Test accuracy · Test validity · Morbid events · Resource utilization |
3 | Individuals: · With thyroid nodule(s) and indeterminate findings on fine needle aspirate | Interventions of interest are: · Fine needle aspirate sample testing with molecular tests to rule out or rule in malignancy for surgical planning | Comparators of interest are: · Surgical management based on clinicopathologic risk factors and/or surgical biopsy | Relevant outcomes include: · Disease-specific survival · Test accuracy · Test validity · Morbid events · Resource utilization |
To determine which patients need thyroid resection, many physicians will perform a cytologic examination of fine needle aspirate (FNA) samples from a thyroid lesion; however, this method has diagnostic limitations. As a result, assays using molecular markers have been developed to improve the accuracy of thyroid FNA biopsies.
For individuals with thyroid nodule(s) and indeterminate findings on fine needle aspiration (FNA) who receive FNA sample testing with molecular tests to rule out malignancy and to avoid surgical biopsy or resection, the evidence includes prospective clinical validity studies with the Afirma GSC, a systematic review of prospective and retrospective clinical validity studies, a meta-analysis of real-world postvalidation data for the Afirma GSC platform with comparison to the validation study, and a chain of evidence to support clinical utility. Relevant outcomes are disease-specific survival, test accuracy and validity, morbid events, and resource utilization. A systematic review of 1 prospective and 6 retrospective trials demonstrated a high negative predictive value (NPV, 96%; 95% confidence interval [CI], 94% to 98%). In a multicenter validation study, the Afirma GSC was also reported to have a high NPV (96%; 95% CI, 90% to 99%). The meta-analysis of real-world Afirma GSC data indicated significantly higher NPV (as well as specificity and positive predictive value [PPV]) than in the validation study. These results are consistent with an earlier study on the Afirma GEC in the same study population and a randomized controlled trial of Afirma GSC in a similar population. In other multicenter and single-center studies, there is suggestive evidence that rates of malignancy are low in Afirma GSC or ThyroSeq v3 patients who are classified as benign or negative, with high NPVs (>90%) in a prospective trial with 31.8 months of post-testing imaging surveillance. The available evidence suggests that the decisions a physician makes regarding surgery are altered by Afirma GSC or ThyroSeq v3 results. 
A chain of evidence can be constructed to establish the potential for clinical utility with Afirma GSC and ThyroSeq v3 testing in cytologically indeterminate lesions, but evidence of improved outcomes must be demonstrated through at least 5 years of surveillance as recommended by the American College of Radiology. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals with thyroid nodule(s) and indeterminate findings on FNA who receive FNA sample testing with molecular tests to rule in malignancy and to guide surgical planning, the evidence includes prospective and retrospective studies of clinical validity. Relevant outcomes are disease-specific survival, test accuracy and validity, morbid events, and resource utilization. Variant analysis has the potential to improve the accuracy of an equivocal FNA of the thyroid and may play a role in preoperative risk stratification and surgical planning. Single-center studies have suggested that testing for a panel of genetic variants associated with thyroid cancer may allow for the appropriate selection of patients for surgical management for the initial resection. Prospective studies in additional populations are needed to validate these results. Although the presence of certain variants may predict more aggressive malignancies, the management changes that would occur as a result of identifying higher-risk tumors are not well established. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals with thyroid nodule(s) and indeterminate findings on FNA who receive FNA sample testing with molecular tests to rule out malignancy and avoid surgical biopsy or to rule in malignancy for surgical planning, the evidence includes multiple retrospective and prospective clinical validation studies for the ThyroSeq test, a systematic review of retrospective and prospective studies, and 2 retrospective clinical validation studies that used the 17-variant panel (miRInform) test, the predicate test to the current ThyGenX, and ThyraMIR. Relevant outcomes are disease-specific survival, test accuracy and validity, morbid events, and resource utilization. In a retrospective validation study on FNA samples, the 17-variant panel (miRInform) test and ThyraMIR had a sensitivity of 89% and an NPV of 94%. A prospective clinical validation study of ThyroSeq v3 reported an NPV of 97% and PPV of 68%. Similarly, a systematic review including 3 prospective and 3 retrospective clinical validity studies reported an NPV of 92% and PPV of 70%. No studies were identified demonstrating the diagnostic characteristics of the marketed ThyGenX. No prospective studies were identified demonstrating evidence of direct outcome improvements. A chain of evidence for the ThyroSeq v3 test and combined ThyGenX and ThyraMIR testing would rely on establishing clinical validity. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Clinical input was sought to help determine whether testing for molecular markers in fine needle aspirates of the thyroid for management of individuals with thyroid nodule(s) with an indeterminate finding on the fine needle aspirates would provide a clinically meaningful improvement in net health outcome and whether the use is consistent with generally accepted medical practice. In response to requests, clinical input on 7 tests for molecular markers was received from 9 respondents, including 1 specialty society-level response, 1 physician from an academic center, and 7 physicians from 2 health systems.
Clinical input supports that the following uses provide a clinically meaningful improvement in net health outcome and indicates the uses are consistent with generally accepted medical practice:
For individuals who have fine needle aspirate (FNA) of thyroid nodules with indeterminate cytologic findings (ie, Bethesda diagnostic category III [atypia/follicular lesion of undetermined significance] or Bethesda diagnostic category IV [follicular neoplasm/suspicion for a follicular neoplasm]) who receive the following types of molecular marker testing to rule out malignancy and to avoid surgical biopsy:
Afirma Gene Expression Classifier; or
ThyroSeq v2
For individuals who have FNA of thyroid nodules with indeterminate cytologic findings or Bethesda diagnostic category V (suspicious for malignancy) who receive the following types of molecular marker testing to rule in the presence of malignancy to guide surgical planning for the initial resection rather than a 2-stage surgical biopsy followed by definitive surgery:
ThyroSeq v2;
ThyraMIR microRNA/ThyGenX;
Afirma BRAF after Afirma Gene Expression Classifier; or
Afirma MTC after Afirma Gene Expression Classifier.
Clinical input does not support whether the use of RosettaGX Reveal testing in FNA of thyroid nodules provides a clinically meaningful improvement in the net health outcome or is consistent with generally accepted medical practice.
Further details from clinical input are included in the Supplemental Information section and in the Appendix.
The objective of this evidence review is to evaluate whether testing for molecular markers in fine needle aspirates of the thyroid improves the net health outcome in individuals with thyroid nodule(s) with an indeterminate finding on the fine needle aspirate.
For individuals who have thyroid nodules without strong clinical or radiologic findings suggestive of malignancy in whom surgical decision making would be affected by test results, the use of either of the following types of molecular marker testing or gene variant analysis in fine needle aspirates of thyroid nodules with indeterminate cytologic findings (ie, Bethesda diagnostic category III [atypia/follicular lesion of undetermined significance] or Bethesda diagnostic category IV [follicular neoplasm/suspicion for a follicular neoplasm]) may be considered medically necessary:
Afirma® Genomic Sequencing Classifier; or
ThyroSeq®.
The use of any of the following types of molecular marker testing or gene variant analysis in fine needle aspirates of thyroid nodules with indeterminate findings (Bethesda diagnostic category III [atypia/follicular lesion of undetermined significance] or Bethesda diagnostic category IV [follicular neoplasm/suspicion for a follicular neoplasm]) or suspicious findings (Bethesda diagnostic category V [suspicious for malignancy]) to rule in malignancy to guide surgical planning for initial resection rather than a 2-stage surgical biopsy followed by definitive surgery may be considered medically necessary:
ThyroSeq;
ThyraMIR® microRNA/ThyGenX®;
Afirma BRAF after Afirma Genomic Sequencing Classifier; or
Afirma MTC after Afirma Genomic Sequencing Classifier.
Gene expression classifiers, genetic variant analysis, and molecular marker testing in fine needle aspirates of the thyroid not meeting criteria outlined above, including but not limited to use of RosettaGX Reveal and single-gene TERT testing, are considered investigational.
In individuals who do not undergo surgical biopsy or thyroidectomy on the basis of gene expression classifier or molecular marker results, regular active surveillance is indicated.
Use of molecular marker testing based on fine needle aspirate of a thyroid nodule to rule in malignancy prior to surgical biopsy may guide surgical planning, particularly factors such as choice of surgical facility provider to ensure that the capability is available to conduct a frozen section pathologic reading during surgical biopsy so that surgical approach may be adjusted accordingly in a single surgery.
Experts recommend formal genetic counseling for individuals who are at risk for inherited disorders and who wish to undergo genetic testing. Interpreting the results of genetic tests and understanding risk factors can be difficult for some patients; genetic counseling helps individuals understand the impact of genetic testing, including the possible effects the test results could have on the individual or their family members. It should be noted that genetic counseling may alter the utilization of genetic testing substantially and may reduce inappropriate testing; further, genetic counseling should be performed by an individual with experience and expertise in genetic medicine and genetic testing methods.
See the Codes table for details.
Some Plans may have contract or benefit exclusions for genetic testing.
Benefits are determined by the group contract, member benefit booklet, and/or individual subscriber certificate in effect at the time services were rendered. Benefit products or negotiated coverages may have all or some of the services discussed in this medical policy excluded from their coverage.
Thyroid nodules are common, present in 5% to 7% of the U.S. adult population; however, most are benign, and most cases of thyroid cancer are curable surgically when detected early.
Sampling thyroid cells by fine needle aspirate (FNA) is currently the most accurate procedure to distinguish benign thyroid lesions from malignant ones, reducing the rate of unnecessary thyroid surgery for patients with benign nodules and triaging patients with thyroid cancer to appropriate surgery.
About 60% to 70% of thyroid nodules are classified cytologically as benign, and 4% to 10% of nodules are cytologically deemed malignant.1, However, the remaining 20% to 30% have equivocal findings, usually due to overlapping cytologic features between benign and malignant nodules; these nodules usually require surgery for a final diagnosis. Thyroid FNA cytology is classified by Bethesda System criteria into the following groups: nondiagnostic; benign; follicular lesion of undetermined significance or atypia of undetermined significance; follicular neoplasm (or suspicious for follicular neoplasm); suspicious for malignancy; and malignant. Lesions with FNA cytology in the atypia of undetermined significance, follicular lesion of undetermined significance, or follicular neoplasm categories are often considered indeterminate.
There is some individualization of management for patients with FNA-indeterminate nodules, but many patients will require a surgical biopsy, typically thyroid lobectomy, with intraoperative pathology consultation as the next step in the diagnosis. Approximately 80% of patients with indeterminate cytology undergo surgical resection; postoperative evaluation has revealed a malignancy rate ranging from 6% to 30%, making this a clinical process with very low specificity.2, Thus, if an analysis of FNA samples could reliably identify the risk of malignancy as low, there is potential for patients to avoid surgical biopsy.
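The arithmetic behind this low specificity can be made concrete. The following is a minimal sketch that assumes a hypothetical cohort of 100 patients with indeterminate cytology and applies only the figures quoted above (about 80% resected, malignancy found in 6% to 30% of resected nodules). The cohort size and the function name are illustrative assumptions and do not come from the cited studies.

```python
def benign_resections(cohort_size, resection_rate, malignancy_rate):
    """Return (resected, benign) patient counts for a hypothetical cohort.

    resection_rate: fraction of the cohort that undergoes surgical resection.
    malignancy_rate: fraction of resected nodules found malignant on histology.
    """
    resected = cohort_size * resection_rate
    benign = resected * (1.0 - malignancy_rate)
    return resected, benign


# Bounds of the malignancy-rate range quoted in the text (6% to 30%).
for malignancy_rate in (0.06, 0.30):
    resected, benign = benign_resections(100, 0.80, malignancy_rate)
    print(f"malignancy rate {malignancy_rate:.0%}: "
          f"{resected:.0f} resected, about {benign:.0f} with benign histology")
```

Under these assumptions, roughly 56 to 75 of every 100 patients with indeterminate cytology would undergo resection of a nodule that proves benign.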
Preoperative planning of optimal surgical management in patients with equivocal cytologic results is challenging, because different thyroid malignancies require different surgical procedures (eg, unilateral lobectomy vs. total or subtotal thyroidectomy with or without lymph node dissection) depending on several factors, including histologic subtype and risk-stratification strategies (tumor size, patient age). If a diagnosis cannot be made intraoperatively, a lobectomy is typically performed, and, if on postoperative histology the lesion is malignant, a second surgical intervention may be necessary for completion of thyroidectomy.
Most thyroid cancers originate from thyroid follicular cells and include well-differentiated papillary thyroid carcinoma (PTC; 80% of all thyroid cancers) and follicular carcinoma (15%). Poorly differentiated and anaplastic thyroid carcinomas are uncommon and can arise de novo or from preexisting well-differentiated papillary or follicular carcinomas. Medullary thyroid carcinoma originates from parafollicular or C cells and accounts for about 3% of all thyroid cancers.
The diagnosis of malignancy in the case of PTC is primarily based on cytologic features. If FNA in a case of PTC is indeterminate, surgical biopsy with intraoperative pathology consultation is most often diagnostic, although its efficacy and therefore its use will vary across institutions, surgeons, and pathologists. In 2016, reclassification of encapsulated follicular-variant PTC as a noninvasive follicular tumor with papillary-like nuclei was proposed and largely adopted; this classification removes the word carcinoma from the diagnosis to acknowledge the indolent behavior of these tumors.3,
For follicular carcinoma, the presence of invasion of the tumor capsule or blood vessels is diagnostic, and cannot be determined by cytology, because tissue sampling is necessary to observe these histologic characteristics. Intraoperative diagnosis of follicular carcinoma is challenging and often not feasible because extensive sampling of the tumor and capsule is usually necessary and performed on postoperative, permanent sections.
New approaches for improving the diagnostic accuracy of thyroid FNA include variant analysis for somatic genetic alterations, to more accurately classify which patients need to proceed to surgery (and may include the extent of surgery necessary), and a gene expression classifier to identify patients who do not need surgery and can be safely followed.
A number of genetic variants have been discovered in thyroid cancer. The 4 most common gene variants are BRAF and RAS single nucleotide variants (SNVs) and RET/PTC and PAX8/PPARγ rearrangements.
Papillary carcinomas carry SNVs of the BRAF and RAS genes, as well as RET/PTC and TRK rearrangements, all of which can activate the mitogen-activated protein kinase pathway.4, These mutually exclusive variants are found in more than 70% of papillary carcinomas. BRAF SNVs are highly specific for PTC. Follicular carcinomas harbor either RAS SNVs or PAX8/PPARγ rearrangements. These variants have been identified in 70% to 75% of follicular carcinomas. Genetic alterations involving the PI3K/AKT signaling pathway also occur in thyroid tumors, although they are rare in well-differentiated thyroid cancers and have a higher prevalence in less differentiated thyroid carcinomas. Additional variants known to occur in poorly differentiated and anaplastic carcinomas involve the TP53 and CTNNB1 genes. Medullary carcinomas, which can be familial or sporadic, frequently possess SNVs located in the RET gene.
Studies have evaluated the association between various genes and cancer phenotype in individuals with diagnosed thyroid cancer.5,6,7,
Telomerase reverse transcriptase (TERT) promoter variants occur with varying frequency in different thyroid cancer subtypes. Overall, TERT C228T or C250T variants have been reported in approximately 15% of thyroid cancers, with higher rates in the undifferentiated and anaplastic subtypes compared with the well-differentiated subtypes.8, TERT variants are associated with several demographic and histopathologic features such as older age and advanced TNM stage. TERT promoter variants have been reported to be independent predictors of disease recurrence and cancer-related mortality in well-differentiated thyroid cancer.9,10,11, Also, the co-occurrence of BRAF or RAS variants with TERT or TP53 variants may identify a subset of thyroid cancers with unfavorable outcomes.12,13,14,
SNVs in specific genes, including BRAF, RAS, and RET, and evaluation for rearrangements associated with thyroid cancers can be accomplished with Sanger sequencing or pyrosequencing or with real-time polymerase chain reaction (PCR) of single or multiple genes or by next-generation sequencing (NGS) panels. Panel tests for genes associated with thyroid cancer, with varying compositions, are also available. For example, Quest Diagnostics offers a Thyroid Cancer Mutation Panel, which includes BRAF and RAS variant analysis and testing for RET/PTC and PAX8/PPARγ rearrangements.
The ThyroSeq v3 Next-Generation Sequencing panel (Sonic Healthcare) is an NGS panel of 112 genes. The test is indicated when FNA cytology suggests atypia of undetermined significance or follicular lesion of undetermined significance, follicular neoplasm or suspicious for follicular neoplasm, or suspicious for malignancy.15, In particular, it has been evaluated in patients with follicular neoplasm and/or suspicious for follicular neoplasm on FNA as a test to increase both sensitivity and specificity for cancer diagnosis. ThyGenX is an NGS panel that sequences 8 genes and identifies specific gene variants and translocations associated with thyroid cancer. ThyGenX is intended to be used in conjunction with the ThyraMIR microRNA expression test when the initial ThyGenX test is negative.
Genetic alterations associated with thyroid cancer can be assessed using gene expression profiling, which refers to the analysis of messenger RNA (mRNA) expression levels of many genes simultaneously. Several gene expression profiling tests are available and stratify tissue from thyroid nodules biologically.
The Afirma Gene Expression Classifier (Afirma GEC; Veracyte) analyzed the expression of 142 different genes to determine patterns associated with benign findings on surgical biopsy. It was designed to evaluate thyroid nodules that have an "indeterminate" classification on FNA as a method to select patients who are at low risk for cancer ("rule out"). In 2017, Veracyte migrated the Afirma GEC microarray analysis to a next-generation RNA sequencing platform and now markets the Afirma Genomic Sequencing Classifier (Afirma GSC), which evaluates 10,196 genes with 1115 core genes.
Other gene expression profiles have been reported in investigational settings, but have not been widely validated or used commercially (eg, Barros-Filho et al [2015],16, Zheng et al [2015]17,); they are not addressed in this review.
ThyraMIR is a microRNA expression-based classifier intended for use in thyroid nodules with indeterminate cytology on FNA following a negative result from the ThyGenX Thyroid Oncogene Panel.
Algorithmic testing involves the use of 2 or more tests in a prespecified sequence, with a subsequent test automatically obtained depending on results of an earlier test.
In addition to Afirma GSC, Veracyte also markets 2 "malignancy classifiers" that use mRNA expression-based classification to evaluate for BRAF variants (Afirma BRAF) or variants associated with medullary thyroid carcinoma (Afirma MTC). Table 1 outlines the testing algorithms for Afirma MTC and Afirma BRAF.
Test 1 | Test 1 Result | Reflex to Test 2 |
Thyroid nodule on fine needle aspirate | "Indeterminate" | Afirma MTC |
Afirma GSC | "Malignant" or "suspicious" | Afirma MTC |
Afirma GSC | "Suspicious" | Afirma BRAF |
Afirma GSC: Afirma Genomic Sequencing Classifier; Afirma MTC: Afirma medullary thyroid carcinoma.
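The reflex rules in Table 1 can be read as a simple lookup. The following minimal sketch transcribes the three table rows into a dictionary; the key strings, the function name, and the lower-casing of result labels are illustrative assumptions and not part of any vendor interface.

```python
# Reflex rules transcribed from Table 1:
# (test 1, test 1 result) -> tests automatically obtained next.
# Key strings are illustrative labels, not vendor-defined codes.
REFLEX_RULES = {
    ("FNA cytology", "indeterminate"): ["Afirma MTC"],
    ("Afirma GSC", "malignant"): ["Afirma MTC"],
    ("Afirma GSC", "suspicious"): ["Afirma MTC", "Afirma BRAF"],
}


def reflex_tests(test, result):
    """Return the reflex tests listed in Table 1 for a given test result.

    An empty list means Table 1 lists no reflex test for that combination.
    """
    return list(REFLEX_RULES.get((test, result.strip().lower()), []))


print(reflex_tests("Afirma GSC", "Suspicious"))
print(reflex_tests("Afirma GSC", "Benign"))
```

A "suspicious" Afirma GSC result maps to both malignancy classifiers because it appears in two rows of the table.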
In a description of the Afirma BRAF test, the following have been proposed as benefits of the mRNA-based expression test for BRAF variants: (1) PCR-based methods may have low sensitivity, requiring that a large proportion of the nodule have a relevant variant; (2) testing for only 1 variant may not detect patients with low-frequency variants that result in the same pattern of pathway activation; and (3) PCR-based approaches with high analytic sensitivity may require a large amount of DNA that is difficult to isolate from small FNA samples.18,
The testing strategy for both Afirma MTC and Afirma BRAF is to predict malignancy from an FNA sample with increased pretest probability for malignancy. A positive result with Afirma MTC or Afirma BRAF would inform preoperative planning such as planning for a hemi- versus a total thyroidectomy or performance of central neck dissection.
The ThyGenX Thyroid Oncogene Panel (Interpace Diagnostics; testing is done at Asuragen Clinical Laboratory) is an NGS panel designed to assess patients with indeterminate thyroid FNA results. It includes sequencing of 8 genes associated with PTC and follicular carcinomas. ThyGenX has replaced the predicate miRInform Thyroid test that assesses for 17 validated gene alterations.
ThyraMIR (Interpace Diagnostics) is a microRNA expression-based classifier intended for use in thyroid nodules with indeterminate cytology on FNA following a negative result from the ThyGenX Thyroid Oncogene Panel.
The testing strategy for combined ThyGenX and ThyraMIR testing is first to predict malignancy. A positive result on ThyGenX would "rule in" patients for surgical resection. The specific testing results from a ThyGenX positive test would be used to inform preoperative planning when positive. For a ThyGenX negative result, the reflex testing involves the ThyraMIR microRNA expression test to "rule out" for a surgical biopsy procedure given the high negative predictive value of the second test. Patients with a negative result from the ThyraMIR test would be followed with active surveillance and avoid a surgical biopsy.
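The sequence described above can be sketched as a small decision function. This is an illustration of the stated strategy only; the function name, argument names, and returned strings are hypothetical. The passage does not describe management after a positive ThyraMIR result, so the sketch marks that branch as unspecified.

```python
def combined_testing_pathway(thygenx_positive, thyramir_positive=None):
    """Map ThyGenX and (optional) ThyraMIR results to the described next step.

    thygenx_positive: True if ThyGenX detects a variant or translocation.
    thyramir_positive: None until the reflex ThyraMIR test has been run.
    """
    if thygenx_positive:
        # A positive ThyGenX result "rules in" the patient for resection.
        return "rule in: surgical resection, variant results inform planning"
    if thyramir_positive is None:
        # A negative ThyGenX result triggers the reflex microRNA test.
        return "reflex to ThyraMIR"
    if not thyramir_positive:
        # Both tests negative: "rule out" surgical biopsy.
        return "rule out: active surveillance, surgical biopsy avoided"
    # Assumption: this branch is not described in the passage.
    return "not ruled out: management not specified in this passage"


print(combined_testing_pathway(True))
print(combined_testing_pathway(False))
print(combined_testing_pathway(False, thyramir_positive=False))
```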
Clinical laboratories may develop and validate tests in-house and market them as a laboratory service; laboratory-developed tests must meet the general regulatory standards of the Clinical Laboratory Improvement Amendments. Thyroid variant testing and gene expression classifiers are available under the auspices of the Clinical Laboratory Improvement Amendments. Laboratories that offer laboratory-developed tests must be licensed under the Clinical Laboratory Improvement Amendments for high-complexity testing. To date, the U.S. Food and Drug Administration has chosen not to require any regulatory review of these tests.
In 2013, the THxID™-BRAF kit (bioMérieux), an in vitro diagnostic device, was approved by the U.S. Food and Drug Administration through the premarket approval process to assess specific BRAF variants in melanoma tissue via real-time PCR. However, there are currently no diagnostic tests for thyroid cancer mutation analysis with approval from the U.S. Food and Drug Administration. Table 2 provides a summary of commercially available molecular diagnostic tests for indeterminate thyroid pathology.
Test | Predicate | Methodology | Analyte(s) | Report |
Afirma® GSC | Afirma® GEC | mRNA gene expression | 1115 genes | Benign/suspicious |
Afirma® BRAF | | mRNA gene expression | 1 gene | Negative/positive |
Afirma® MTC | | mRNA gene expression | | Negative/positive |
ThyroSeq v3 | ThyroSeq v2 | Next-generation sequencing | 112 genes | Specific gene variant/translocation |
ThyGeNEXT® | ThyGenX®a, miRInform®a | Next-generation sequencing | 10 genes and 32 gene fusions | Specific gene variant/translocation |
ThyraMIR™ | | microRNA expression | 10 microRNAs | Negative/positive |
RosettaGX™ Reveal | | microRNA expression | 24 microRNAs | |
FNA: fine needle aspirate; GEC: Gene Expression Classifier; GSC: Genomic Sequencing Classifier; mRNA: messenger RNA; MTC: medullary thyroid carcinoma; PCR: polymerase chain reaction.
a The miRInform® test is the predicate test to ThyGenX™ and is not commercially available.
This evidence review was created in January 2012 and has been updated regularly with searches of the PubMed database. The most recent literature update was performed through June 13, 2024.
Evidence reviews assess whether a medical test is clinically useful. A useful test provides information to make a clinical management decision that improves the net health outcome. That is, the balance of benefits and harms is better when the test is used to manage the condition than when another test or no test is used to manage the condition.
The first step in assessing a medical test is to formulate the clinical context and purpose of the test. The test must be technically reliable, clinically valid, and clinically useful for that purpose. Evidence reviews assess the evidence on whether a test is clinically valid and clinically useful. Technical reliability is outside the scope of these reviews, and credible information on technical reliability is available from other sources.
Promotion of greater diversity and inclusion in clinical research of historically marginalized groups (e.g., People of Color [African-American, Asian, Black, Latino and Native American]; LGBTQIA (Lesbian, Gay, Bisexual, Transgender, Queer, Intersex, Asexual); Women; and People with Disabilities [Physical and Invisible]) allows policy populations to be more reflective of and findings more applicable to our diverse members. While we also strive to use inclusive language related to these groups in our policies, use of gender-specific nouns (e.g., women, men, sisters, etc.) will continue when reflective of language used in publications describing study populations.
One purpose of molecular testing in individuals with indeterminate findings on fine needle aspirate(s) (FNA) of thyroid nodules is to rule out malignancy and eliminate the need for surgical biopsy or resection.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with indeterminate findings on FNAs of thyroid nodules who would be willing to undergo watchful waiting, depending on the results of their molecular testing. Patients with indeterminate findings after FNA of thyroid nodule presently proceed to surgical biopsy or resection.
The test being considered is molecular testing, which includes either Afirma GSC (Genomic Sequencing Classifier) (predicate Afirma GEC [Gene Expression Classifier]) or RosettaGX Reveal.
The following practice is currently being used: standard surgical management through surgical biopsy or resection for biopsy.
The potential beneficial outcome of primary interest would be avoiding an unneeded surgical biopsy or resection (eg, lobectomy or hemithyroidectomy) in a true-negative thyroid nodule that is benign.
Potential harmful outcomes are those resulting from false-negative test results, which may delay diagnosis and surgical resection of thyroid cancer. For small, slow-growing tumors, it is uncertain that a delay in diagnosis would necessarily worsen health outcomes.
The time frame for evaluating the performance of the test is the time from the initial FNA to surgical biopsy or resection measured in weeks to months following an indeterminate result. Papillary thyroid cancer (PTC) is indolent, and a nodule could be observed for many years to ensure no clinical change. Specifically, the American College of Radiology Thyroid Imaging, Reporting and Data System (TI-RADS) recommends surveillance of suspicious nodules through 5 years.19,
For the evaluation of clinical validity of the molecular testing, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Lee et al (2022) performed a systematic review and meta-analysis on the diagnostic performance of molecular tests in the assessment of indeterminate thyroid nodules.20, Inclusion criteria for trials included indeterminate thyroid results via FNA that included Bethesda categories III and IV, conclusive histopathological results in a group of benign and suspicious changes, and the use of Afirma GSC, ThyroSeq v3, and ThyGeNEXT as index tests. Investigators identified 7 studies on Afirma GSC: 1 prospective study by Livhits et al (2021), described below, and 6 retrospective studies. Pooled data for GSC studies on 472 thyroid nodules demonstrated a sensitivity of 96.6% (95% confidence interval [CI], 89.7% to 98.9%), specificity of 52.9% (95% CI, 23.4% to 80.5%), positive predictive value (PPV) of 63% (95% CI, 51% to 74%), and negative predictive value (NPV) of 96% (95% CI, 94% to 98%). Limitations of this meta-analysis include the scarcity of available cohort analyses of the molecular tests and the lack of long-term findings.
Nasr et al (2023) performed a meta-analysis of 13 real-world postvalidation studies (N=1976 patients with indeterminate thyroid nodules) of the Afirma GSC platform and compared results to the validation study by Patel et al (2018, described below).21, Studies performed prior to publication of the validation study and commercial availability of Afirma GSC were excluded. Among 11 studies reporting histopathological results for patients who underwent surgery, sensitivity was 97.2% (95% CI, 1.7% to 99.1%; I2=0%), specificity was 87.7% (95% CI, 83.2% to 91.0%; I2=63%), PPV ranged from 49.3% (including patients with suspicious molecular testing results who did not undergo surgery; 95% CI, 41.3% to 57.4%; I2 not reported) to 64.9% (excluding patients with suspicious molecular testing results who did not undergo surgery; 95% CI, 54.4% to 74.1%; I2=79%), and NPV was 99.5% (95% CI, 98.0% to 99.9%; I2=0%). Specificity, PPV (excluding patients with suspicious results who did not undergo surgery), and NPV were significantly improved compared to the values reported in the validation study (p<.05 for each comparison).
Patel et al (2018) reported a validation study for the Afirma GSC test. The study included 210 thyroid nodules from 183 patients that had indeterminate results (Bethesda III or IV) on FNA (see Table 3).22, All FNA samples had been previously used in the validation of the Afirma GEC test as reported by Alexander et al (2012) in a 19-month, prospective, multicenter (49 academic and community sites) study.23, Patel et al (2018) used the banked samples, which were reassayed with next-generation sequencing (NGS) for the Afirma GSC validation study.22, The previous central, blinded postoperative consensus histopathological diagnosis was used as the reference standard (210 samples) and all personnel were blinded to the other outcomes. In the Afirma GSC study, sensitivity was 91.1%, specificity was 68.3%, and NPV was 96.1% (see Table 4). There were 4 false negatives in patients with malignant nodules who would have been assigned to active observation. In comparison, Afirma GEC correctly identified 78 of 85 malignant nodules as suspicious (92% sensitivity; 95% CI, 84% to 97%) with specificity of 52% (95% CI, 44% to 59%). The NPV ranged from 85% for "suspicious cytologic findings" to 95% for "atypia of undetermined clinical significance." With sensitivity similar to that of the Afirma GEC test, the Afirma GSC improved specificity. There were no notable study limitations.
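The Afirma GEC sensitivity quoted above can be reproduced from the reported counts (78 of 85 malignant nodules classified as suspicious). The Python sketch below is illustrative only. The helper name `wilson_interval` is hypothetical, and the Wilson score interval is an approximation chosen here for simplicity. The policy does not state which interval method the study used, so the computed bounds may differ slightly from the published 84% to 97%.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (hypothetical helper)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half

# Counts from the text: 78 of 85 malignant nodules were classified as suspicious.
sensitivity = 78 / 85
low, high = wilson_interval(78, 85)
print(f"sensitivity = {sensitivity:.1%}, approx. 95% CI = {low:.1%} to {high:.1%}")
```

An exact (Clopper-Pearson) interval would typically be slightly wider for a sample of this size.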
Livhits et al (2021) published a randomized, controlled study that compared the Afirma GSC test to the ThyroSeq v3 test in patients with thyroid nodules with indeterminate FNA results (Bethesda III or IV).24, The study reported clinical validity for both tests; the results of the Afirma GSC test are summarized in Table 3 and Table 4. The study used histopathologic review by expert thyroid pathologists as the reference standard. The study included 201 nodules in the Afirma GSC group. The sensitivity of Afirma GSC was 100%, specificity was 79.6%, and the NPV was 100%. A limitation of the study is that the pathologists who interpreted the histopathologic diagnosis were not blinded to the results of the molecular test. Patients in this trial who were managed nonoperatively were prospectively surveilled via ultrasound for 12 to 60 months, with results of surveillance reported with median follow-up of 31.8 months.25, Among the nodules initially managed nonoperatively, 44 patients were lost to follow-up without surveillance imaging and were excluded from the analysis, with surveillance data available for 195 nodules. Over the course of surveillance, 84% of nodules with benign or negative molecular testing remained stable. Among the 26 nodules with benign or negative molecular testing that exhibited growth on ultrasound, 12 underwent surgery, with 11 histopathologically diagnosed as benign; the 1 malignant nodule was diagnosed as a minimally invasive Hürthle cell carcinoma. Among 33 nodules with suspicious or positive molecular testing that were initially managed nonoperatively (due to patient preference or other reasons), 15 were ultimately resected, 6 of which were benign. In surgically-confirmed cases, the sensitivity of the Afirma GSC and ThyroSeq v3 tests was 100% and 97%, respectively; specificity was 40% and 38%, PPV was 57% and 64%, and NPV was 100% and 92%, respectively (p>.05 for all comparisons between test platforms).
Study | Study Population | Design | Reference Standard | Threshold for Positive Index Test | Timing of Reference and Index Tests | Blinding of Assessors | Comment |
Patel et al (2018)22, | 183 patients with 210 indeterminate thyroid nodules by FNA | Multicenter, non-concurrent prospective validation trial | Consensus histopathology diagnosis | Central, blinded histopathological review from Alexander et al (2012) | Assessors were blinded to the pathology | Samples were previously used to validate Afirma GEC | |
Livhits et al (2021)24, | 201 indeterminate thyroid nodules by FNA (Afirma GSC)* | Multicenter, randomized controlled trial | Histopathologic diagnosis | Classified as malignant or benign | Samples were tested after surgery | Assessors were unblinded to results of molecular testing |
FNA: Fine needle aspirate; Afirma GEC: gene expression classifier; Afirma GSC: gene sequencing classifier.
*Study included a comparator group assigned to ThyroSeq (reported below)
Study | Initial N | Final N | Excluded Samples | Prevalence of Condition | Clinical Validity (95% Confidence Interval) | |||
Sensitivity | Specificity | PPV | NPV | |||||
Patel et al (2018)22, | 210 nodules | 191 nodules | 19 with insufficient residual RNA | 91.1 (79 to 98) | 68.3 (60 to 76) | 47.1 (36 to 58) | 96.1 (90 to 99) | |
Livhits et al (2021)24, | 201 assigned to Afirma GSC | 180 nodules | 21 were excluded | 100 (88.8 to 100) | 79.6 (71.7 to 86.1) | 53.5 (39.9 to 66.7) | 100 (96.6 to 100) |
Afirma GSC: gene sequencing classifier; NPV: negative predictive value; PPV: positive predictive value; RNA: ribonucleic acid.
Meta-analyses have been performed with studies reporting on the performance of the predicate Afirma GEC in cytologically indeterminate nodules.26,27, Retrospective studies are subject to ascertainment bias because a large proportion of individuals with Afirma benign reports did not undergo surgery, which makes determining the sensitivity and specificity of the GEC assay impossible.
Supportive information on the accuracy of benign results can be obtained from studies that report long-term follow-up of individuals with indeterminate FNA cytology and Afirma benign results. There are several studies that reported long-term follow-up of Afirma GEC.28,29,30, Valderrabano et al (2019) used the benign call rate and PPV of post-marketing studies for a simulation study, concluding that the initial validation study cohort of Afirma GEC was not representative of the populations in whom the test has been used, raising questions regarding its diagnostic performance.31, Because the Afirma GSC used the same validation study, these findings would also apply to Afirma GSC.
Harrell et al (2019) reported a retrospective comparison of Afirma GEC (2011 to July 2017) and Afirma GSC (August 2017 through June 2018) for indeterminate FNA.32, Afirma GSC identified fewer indeterminate nodules as suspicious (54/139, 38.8%) compared to GEC (281/481, 58.4%) and led to a lower surgery rate, decreasing from 56% in the GEC group to 31% in the GSC group. A similar retrospective comparison was conducted by Polavarapu et al (2021), comparing Afirma GEC and Afirma GSC for indeterminate FNA from January 2013 through December 2019.33, Of the 468 indeterminate thyroid nodules included, no molecular testing was performed in 273, 71 had GEC, and 124 had GSC. Use of Afirma GSC led to a lower surgery rate (39.5%; p=.0001) compared to GEC (59.2%) and no molecular testing (67.8%). Additionally, malignancy rate was 20% with no molecular testing, 22% in GEC, and 39% in GSC (p=.022). Afirma GEC benign call rate was 46%; sensitivity was 100%, specificity was 61%, NPV was 100%, and PPV was 28%. With Afirma GSC, benign call rate was 60%, sensitivity was 94%, specificity was 76%, NPV was 97%, and PPV was 41%. In conclusion, Afirma GSC testing was associated with a significant reduction in surgical rates and an increase in malignancy rates. Sensitivity and NPV were high for both GEC and GSC. A 2023 retrospective analysis of 408 indeterminate thyroid nodules compared the Afirma GSC + XA (n=40), Afirma GEC + GSC (n=255), and Interpace Diagnostics ThyGeNEXT + ThyraMIR platforms (n=113).34, Patients either underwent surgery (56.4%) or were monitored for at least 6 months with ultrasound imaging. Sensitivity of the GSC + XA platform was greater than the GEC + GSC platform (80.0% vs 75.81%; p<.001) but not the ThyGeNEXT + ThyraMIR platform (47.4%; p=.08); this may be attributable to the relatively small size of the GSC + XA group. Specificity of the Afirma GSC + XA (91.4%) and ThyGeNEXT + ThyraMIR platforms (88.3%) was greater than the GEC + GSC platform (45.1%; p<.001 for both comparisons). NPV was >85% for all cohorts and was highest with the GSC + XA platform (97.0%).
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials.
No evidence directly demonstrating improved outcomes in patients managed with the Afirma GEC was identified.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because no direct evidence of utility was identified, a chain of evidence was developed, which addresses 2 key questions:
Does use of the Afirma GEC in individuals with cytologically indeterminate thyroid nodules change clinical management (in this case, reduced thyroid resections)?
Do those management changes improve outcomes?
The clinical setting in which the Afirma GEC is meant to be used is well-defined: individuals with atypia of undetermined significance (AUS) or follicular lesion of undetermined significance (FLUS) or follicular neoplasm or who are suspicious for follicular neoplasm (SFN) on FNA, who do not have other indications for thyroid resection (ie, in whom the GEC results would play a role in surgical decision making). Decision impact studies, most often reporting on clinical management changes but not on outcomes after surgical decisions were made, have suggested that, in at least some cases, surgical decision making changed.35,36,37,38,39, It cannot be determined from these studies whether the changes in management improved health outcomes.
A simplified decision model was developed for use with Afirma GEC (which can also be applied to use of the Afirma GSC) in individuals with cytologically indeterminate FNA samples. It is shown in Appendix Figure 1. It is assumed that when Afirma GEC/GSC is not used, patients with cytologically indeterminate FNA results undergo thyroid resection. When Afirma GEC/GSC is used, those with Afirma suspicious lesions undergo resection, while those who have Afirma benign lesions do not. In this case, compared with the standard care plan, some patients without cancer will have avoided a biopsy, which is weighed against the small increase in missed cancers, in patients who had cancer but tested as Afirma benign.
Assuming that the rate of cancer in cytologically indeterminate thyroid nodules is approximately 20%,40, in the standard care plan, 80% of patients with cytologically indeterminate FNA samples will undergo an unnecessary biopsy. Applying the test characteristic values from Alexander et al (2012),23, it is estimated that approximately 1.6% of individuals with true cancer would be missed, but approximately 38%, instead of 80%, would undergo unneeded surgery. The study by Kim et al (2023), described above, reported only 1 false-negative case among 15 patients with nodules demonstrating growth on surveillance imaging over 3 years who underwent delayed surgery, suggesting that the rate of false-negative results and avoided unnecessary surgeries may be further improved with the Afirma GSC and ThyroSeq v3 platforms.25,
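The estimates in this paragraph follow from simple arithmetic on the simplified decision model. The Python sketch below is a minimal illustration, not the model used by the policy authors. It assumes that every Afirma suspicious result leads to resection and every Afirma benign result does not, and it takes the Afirma GEC sensitivity (92%) and specificity (52%) reported for Alexander et al (2012) earlier in this policy. The function name `decision_model` is hypothetical.

```python
def decision_model(prevalence, sensitivity, specificity):
    """Return (missed_cancers, unneeded_surgery) as fractions of all tested patients."""
    missed_cancers = prevalence * (1 - sensitivity)          # cancer present, test benign
    unneeded_surgery = (1 - prevalence) * (1 - specificity)  # no cancer, test suspicious
    return missed_cancers, unneeded_surgery

# Inputs taken from the text: ~20% cancer rate; Afirma GEC sensitivity 92%, specificity 52%.
missed, unneeded = decision_model(prevalence=0.20, sensitivity=0.92, specificity=0.52)
print(f"missed cancers: {missed:.1%}")
print(f"unneeded surgery with testing: {unneeded:.1%} (vs 80.0% without testing)")
```

Under these assumptions the sketch gives about 1.6% missed cancers and about 38% unneeded surgeries, in line with the figures quoted above.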
Whether the tradeoff between avoiding unneeded surgeries and the potential for missed cancer is worthwhile depends, in part, on patient and physician preferences. However, some general statements may be made by considering the consequences of a missed malignancy and the consequences of unnecessary surgery. Most missed malignancies will be PTCs, which have an indolent course. Thyroid nodules are amenable to ongoing surveillance (clinical, ultrasound, and with repeat FNAs), with minimal morbidity.
Thyroid resection is a relatively low-risk surgery. However, the consequences of surgery can be profound. Patients who undergo a hemi- or subtotal thyroidectomy have a risk of recurrent laryngeal nerve damage and parathyroid gland loss. The standard of care for thyroid nodules is based on an intervention that is stratified by FNA cytology results, which are grouped into categories with differing prognosis. Avoiding invasive surgery in situations where patients are at very low likelihood of having an invasive tumor is likely beneficial. Among the low-risk population, the alternative to surgical biopsy is ongoing active surveillance.
While the Kim et al (2023) study is encouraging, evidence of improved outcomes through 5 years of surveillance is needed as recommended by the American College of Radiology.19,
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Lithwick-Yanai et al (2017) described the development and initial clinical validation of the RosettaGX Reveal quantitative real-time polymerase chain reaction assay for 24 microRNA samples in a multicenter, retrospective cohort study using 201 FNA smears.41, The results of the clinical validation study are reported in Table 5.
Study | Initial N | Final N | Excluded Samples | Prevalence of Condition | Clinical Validity (95% Confidence Interval) | |||
Sensitivity | Specificity | PPV | NPV | |||||
Lithwick-Yanai et al (2017) 41, | 201 FNA smears | 189 passing QC | 12 | 85 (74 to 93) | 72 (63 to 79) | NR | 91 (84 to 96) | |
150 with consensus agreement | 98 (87 to 100) | 78 (69 to 85) | NR | 99 (94 to 100) |
FNA: fine needle aspirate; NPV: negative predictive value; NR: not reported; PPV: positive predictive value; QC: quality control.
Walts et al (2018) reported a blinded evaluation of RosettaGX Reveal in 81 archived FNA smears that had Afirma GEC results and histopathology.42, Afirma GEC had been requested following indeterminate FNA and had classified 74 nodules as suspicious and 7 as benign. The 81 patients underwent surgery based on Afirma GEC results or clinical factors. The final diagnosis from histopathology was 63 benign and 18 malignant thyroid nodules. Reveal classified 14 of the 18 malignant nodules as suspicious for a sensitivity of 77.8% and specificity of 60.3%.
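The Reveal accuracy figures above can be checked against the reported counts. The Python sketch below is illustrative only. The sensitivity is computed directly from the 14 of 18 malignant nodules classified as suspicious. The number of benign nodules classified as benign is not given in the text, so the sketch back-calculates which count is consistent with the reported 60.3% specificity; that count is a derived assumption, not a reported value.

```python
# Counts from the text: 18 malignant and 63 benign nodules on histopathology;
# Reveal classified 14 of the 18 malignant nodules as suspicious.
malignant, benign = 18, 63
true_positives = 14

sensitivity = true_positives / malignant

# The text gives specificity (60.3%) but not the underlying count, so find the
# number of benign nodules called benign that reproduces it (a derived value).
reported_specificity = 60.3
consistent_counts = [tn for tn in range(benign + 1)
                     if round(100 * tn / benign, 1) == reported_specificity]

print(f"sensitivity = {sensitivity:.1%}")
print(f"true-negative counts consistent with 60.3%: {consistent_counts}")
```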
No prospective clinical studies for RosettaGX Reveal were identified.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials. No evidence directly demonstrating improved outcomes in patients managed with the RosettaGX Reveal was identified.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A systematic review of 1 prospective and 6 retrospective trials demonstrated a high NPV (96%; 95% CI, 94% to 98%), with a recent meta-analysis of real-world postvalidation data indicating significantly better diagnostic performance of the Afirma GSC platform than in its validation study. In a multicenter validation study, Afirma GSC was also reported to have a high NPV (96%; 95% CI, 90% to 99%). These results are consistent with an earlier study on the Afirma GEC in the same study population and with a randomized controlled trial of Afirma GSC in a similar study population. In other multicenter and single-center studies, there is suggestive evidence that rates of malignancy are low in Afirma patients who are classified as benign. One prospective study with long-term imaging surveillance of 195 nodules initially managed nonoperatively based on negative/benign Afirma GSC or ThyroSeq v3 testing indicated only 1 false-negative case over 31.8 months of follow-up. The available evidence suggests that physician decision making about surgery is altered by Afirma GSC or ThyroSeq v3 results. A chain of evidence can be constructed to establish the potential for clinical utility with Afirma GSC and ThyroSeq v3 testing in cytologically indeterminate lesions, but evidence of improved outcomes must be demonstrated through at least 5 years of surveillance as recommended by the American College of Radiology.
For the RosettaGX Reveal test, 2 retrospective clinical validation studies have been reported. No prospective studies for patients managed with the RosettaGX Reveal were identified, so the clinical validity remains uncertain.
For individuals with thyroid nodule(s) and indeterminate findings on fine needle aspiration (FNA) who receive FNA sample testing with molecular tests to rule out malignancy and to avoid surgical biopsy or resection, the evidence includes prospective clinical validity studies with the Afirma GSC, a systematic review of prospective and retrospective clinical validity studies, a meta-analysis of real-world postvalidation data for the Afirma GSC platform with comparison to the validation study, and a chain of evidence to support clinical utility. Relevant outcomes are disease-specific survival, test accuracy and validity, morbid events, and resource utilization. A systematic review of 1 prospective and 6 retrospective trials demonstrated a high negative predictive value (NPV, 96%; 95% confidence interval [CI], 94% to 98%). In a multicenter validation study, the Afirma GSC was also reported to have a high NPV (96%; 95% CI, 90% to 99%). The meta-analysis of real-world Afirma GSC data indicated significantly higher NPV (as well as specificity and positive predictive value [PPV]) than in the validation study. These results are consistent with an earlier study on the Afirma GEC in the same study population and a randomized controlled trial of Afirma GSC in a similar population. In other multicenter and single-center studies, there is suggestive evidence that rates of malignancy are low in Afirma GSC or ThyroSeq v3 patients who are classified as benign or negative, with high NPVs (>90%) in a prospective trial with 31.8 months of post-testing imaging surveillance. The available evidence suggests that the decisions a physician makes regarding surgery are altered by Afirma GSC or ThyroSeq v3 results. 
A chain of evidence can be constructed to establish the potential for clinical utility with Afirma GSC and ThyroSeq v3 testing in cytologically indeterminate lesions, but evidence of improved outcomes must be demonstrated through at least 5 years of surveillance as recommended by the American College of Radiology. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
**Clinical input supports that the following uses provide a clinically meaningful improvement in net health outcome and indicates the uses are consistent with generally accepted medical practice:
For individuals who have fine needle aspirate (FNA) of thyroid nodules with indeterminate cytologic findings (ie, Bethesda diagnostic category III [atypia/follicular lesion of undetermined significance] or Bethesda diagnostic category IV [follicular neoplasm/suspicion for a follicular neoplasm]) who receive the following types of molecular marker testing to rule out malignancy and to avoid surgical biopsy:
*Afirma Gene Expression Classifier; or
*ThyroSeq v2
[X] Medically Necessary as per Clinical Input | [ ] Investigational |
The purpose of testing for molecular markers (eg, single nucleotide variants and gene rearrangements) in individuals with indeterminate findings on FNA of thyroid nodules is to rule in malignancy and to guide surgical approach or management.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with indeterminate findings on FNA(s) of thyroid nodules. Patients with indeterminate findings would presently proceed to surgical biopsy perhaps with intraoperative pathology consultation (ie, intraoperative frozen section) if available.
The test being considered is testing for molecular markers (eg, single nucleotide variants and gene rearrangements) with Afirma BRAF and Afirma MTC (medullary thyroid carcinoma) to guide surgical planning to ensure the capability for intraoperative pathologic confirmation of malignancy to adjust to definitive surgery for initial resection if appropriate.
The following practices are currently being used: standard surgical management through surgical resection, including a 2-stage surgical biopsy (ie, lobectomy) followed by definitive surgery (ie, hemithyroidectomy or thyroidectomy).
The potential beneficial outcome of primary interest is appropriate surgical planning in the preoperative period (eg, hemithyroidectomy or thyroidectomy when malignancy is predicted). This has the potential benefit of reducing the likelihood that the patient will need repeat surgery when lobectomy is done as a first procedure and a diagnosis is not made on frozen pathology section during that initial surgery.
Potential harmful outcomes are those resulting from false-positive results. However, the use of intraoperative confirmation of malignancy through frozen pathology section in patients with positive molecular marker testing would mitigate any risk of inappropriately performing more extensive thyroidectomy in the absence of malignancy.
The time frame for evaluating the performance of the test varies from the initial FNA to surgical resection to weeks to months following an indeterminate result.
For the evaluation of clinical validity of the molecular testing, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Less evidence exists on the validity of gene expression profiling to rule in malignancy (specifically, the Afirma BRAF and Afirma MTC tests). Genetic variants can be used to improve the sensitivity and specificity for diagnosing indeterminate FNA of the thyroid, with the goal of identifying variants that predict malignancy in FNA samples.
Fnais et al (2015) conducted a systematic review and meta-analysis of studies reporting on the test accuracy of BRAF variant testing in the diagnosis of PTC.43, Reviewers included 47 studies with 9924 FNA samples. For all cytologically indeterminate nodules, the pooled sensitivity estimate for BRAF variant testing was 31% (95% CI, 6% to 56%). Among nodules suspicious for malignancy on FNA, the pooled sensitivity estimate for BRAF variant testing was 52% (95% CI, 39% to 64%; I2=77%).
Diggans et al (2015) described the development and validation of the Afirma BRAF test. For a subset of 213 thyroid nodule FNA samples for which histopathology was available, Afirma BRAF test results were compared with pathologic findings.18, Afirma BRAF classified all histopathologically benign samples as BRAF V600E-negative (specificity, 100%; 95% CI, 97.4% to 100%). Of the 73 histopathologically malignant samples, the Afirma BRAF test identified 32 as BRAF-positive (sensitivity, 43.8%; 95% CI, 32.2% to 55.9%).
In a study describing the development and validation of the Afirma MTC classifier, Kloos et al (2016) evaluated the MTC classifier in a sample of 10,488 thyroid nodule FNA samples referred for GEC testing.44, In this sample, 43 cases were Afirma MTC-positive, of which 42 were considered to be clinically consistent with MTC on pathology or biochemical testing, for a PPV of 97.7% (95% CI, 86.2% to 99.9%).
The presence of BRAF or telomerase reverse transcriptase (TERT) variants is strongly associated with malignancy in thyroid nodule FNA samples. BRAF or TERT variants have also been associated with more aggressive clinicopathologic features in individuals diagnosed with PTC.
Adeniran et al (2011) assessed 157 cases with equivocal thyroid FNA readings (indeterminate and suspicious for PTC) or with a positive diagnosis for PTC and concomitant BRAF variant analysis.1, The results of histopathologic follow-up correlated with the cytologic interpretations and BRAF status. Based on the follow-up diagnosis after surgical resection, the sensitivity for diagnosing PTC was 63.3% with cytology alone and 80.0% with the combination of cytology and BRAF testing. No false-positives were noted with either cytology or BRAF variant analysis. All PTCs with an extrathyroidal extension or aggressive histologic features were positive for a BRAF variant. The authors concluded that patients with an equivocal cytologic diagnosis and a BRAF V600E variant could be candidates for total thyroidectomy and central lymph node dissection.
Xing et al (2009) investigated the utility of BRAF variant testing of thyroid FNA specimens for preoperative risk stratification of PTC in 190 patients.45, A BRAF variant in preoperative FNA specimens was associated with poorer clinicopathologic outcomes for PTC. Compared with the wild-type allele, a BRAF variant strongly predicted extrathyroidal extension (23% vs 11%; p=.039), thyroid capsular invasion (29% vs 16%; p=.045), and lymph node metastasis (38% vs 18%; p=.002). During a median follow-up of 3 years (range, 0.6 to 10 years), PTC persistence or recurrence was seen in 36% of BRAF variant-positive patients and 12% of BRAF variant-negative patients, with an odds ratio (OR) of 4.16 (95% CI, 1.70 to 10.17; p=.002). The PPV and NPV for preoperative FNA-detected BRAF variant to predict PTC persistence or recurrence were 36% and 88%, respectively, for all histologic subtypes of PTC. The authors concluded that preoperative BRAF variant testing of FNA specimens might provide a novel tool to preoperatively identify PTC patients at higher risk for extensive disease (extrathyroidal extension and lymph node metastases) and those more likely to manifest disease persistence or recurrence.
Yin et al (2016) reported on a systematic review and meta-analysis evaluating TERT promoter variants and aggressive clinical behaviors in PTC.46, Eight eligible studies (N=2035 patients; range, 30 to 507) were included. Compared with wild-type, TERT promoter variant status was associated with lymph node metastasis (OR, 1.8; 95% CI, 1.3 to 2.5; p=.001), extrathyroidal extension (OR, 2.6; 95% CI, 1.1 to 5.9; p=.03), distant metastasis (OR, 6.1; 95% CI, 3.6 to 10.3; p<.001), advanced TNM stages III or IV (OR, 3.2; 95% CI, 2.3 to 4.5; p<.001), poor clinical outcome (persistence or recurrence; OR, 5.7; 95% CI, 3.6 to 9.3; p<.001), and mortality (OR, 8.3; 95% CI, 3.8 to 18.2; p<.001).
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials.
Testing for specific variants associated with thyroid cancer (eg, BRAF V600E, TERT, and RET variants, RET/PTC and PAX8/PPARγ rearrangements) is generally designed to "rule in" cancer in nodules with indeterminate cytology on FNA.47, (Of note, some gene panels, such as the ThyroSeq panel, may have a high enough NPV that their clinical use could also be considered as a molecular marker to predict benignancy; see next section.) A potential area for clinical utility for this type of variant testing would be in informing preoperative planning for thyroid surgery following initial thyroid FNA, such as planning for a hemi- versus a total thyroidectomy or performance of central neck dissection.
In a retrospective analysis, Yip et al (2014) reported on outcomes after implementation of an algorithm incorporating molecular testing of thyroid FNA samples to guide the extent of initial thyroid resection.48, The study included a cohort of patients treated at a single academic center at which molecular testing (BRAF V600E, BRAF K601E, NRAS codon 61, HRAS codon 61, and KRAS codon 12 and 13 single nucleotide variants; RET/PTC1, RET/PTC3, and PAX8/PPARγ rearrangements) was prospectively obtained for all FNAs with indeterminate cytology (FLUS, follicular neoplasm, suspicious for malignancy), and for selective FNAs at the request of the managing physician for selected nodules with benign or nondiagnostic cytology. The study also included a second cohort of patients who did not have molecular testing results available. For patients treated with a molecular diagnosis, a positive molecular diagnostic test was considered an indication for an initial total thyroidectomy. Patients with FLUS and negative molecular diagnostic results were followed with repeat FNA, followed by lobectomy or total thyroidectomy if indeterminate pathology persisted. Patients with a follicular neoplasm or suspicious for malignancy results on cytology and a negative molecular diagnostic result were managed with lobectomy or total thyroidectomy.
The sample included 671 patients, 322 managed with and 349 without molecular diagnostics. Positive molecular testing results were obtained in 56 (17% of those managed with molecular diagnostics) patients, most commonly RAS variants (42/56 [75%]), followed by BRAF V600E (10/56 [18%]) and BRAF K601E (2/56 [4%]) variants, and PAX8/PPARγ rearrangements (2/56 [4%]). Compared with those managed without molecular diagnostics (63%), patients managed with molecular diagnostics (69%) were nonsignificantly less likely to undergo total thyroidectomy as an initial procedure (p=.08). However, they had nonsignificantly higher rates of central compartment lymph node dissection (21% vs. 15%, p=.06). Across both cohorts, 25% (170/671) of patients had clinically significant thyroid cancer, with no difference in thyroid cancer rates based on the type of initial surgery (26% for total thyroidectomy vs. 22% for lobectomy, p=.3). The incidence of clinically significant thyroid cancer after initial lobectomy (ie, requiring a 2-stage surgery) was significantly lower for patients managed with molecular diagnostics (17% vs. 43%, p<.001). An indeterminate FNA result had a sensitivity and specificity for the diagnosis of thyroid cancer of 89% and 27%, respectively, with a PPV of 29% and an NPV of 88%. The addition of molecular diagnostics to FNA results increased the specificity for a cancer diagnosis to 95% and the PPV to 82%.
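The relationship between the sensitivity, specificity, and predictive values reported for an indeterminate FNA result can be illustrated with Bayes' rule. The Python sketch below is a minimal illustration under a stated assumption: that the overall 25% rate of clinically significant thyroid cancer (170/671) is the prevalence in the group for which FNA accuracy was calculated. The policy does not describe how the study derived its predictive values, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and disease prevalence."""
    tp = sensitivity * prevalence              # true positives per patient tested
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    fn = (1 - sensitivity) * prevalence        # false negatives
    return tp / (tp + fp), tn / (tn + fn)

# Values from the text: sensitivity 89%, specificity 27%, cancer rate 25% (170/671).
ppv, npv = predictive_values(sensitivity=0.89, specificity=0.27, prevalence=0.25)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

With these inputs the sketch gives a PPV of about 29% and an NPV of about 88%, matching the values reported above. At a fixed prevalence, raising specificity is what drives the PPV upward.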
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A task force from the American Thyroid Association (2015) published a review with recommendations for the surgical management of FNA-indeterminate nodules using various molecular genetic tests.49, This review reported on the estimated likelihood of malignancy in an FNA-indeterminate nodule depending on results of the Afirma GEC test (described above) and other panels designed to rule in malignancy. Depending on the estimated prebiopsy likelihood of malignancy, recommendations for surgery included observation, active surveillance, repeat FNA, diagnostic lobectomy, or oncologic thyroidectomy.
The available evidence has suggested that the use of variant testing in thyroid FNA samples is generally associated with high specificity and PPV for clinically significant thyroid cancer. The most direct evidence related to the clinical utility of variant testing for genes associated with malignancy in thyroid cancer comes from a single-center retrospective study that reported surgical decisions and pathology findings in patients managed with and without molecular diagnostics. There is a potential clinical utility for identifying malignancy with higher certainty on FNA if such testing permits better preoperative planning at the time of thyroid biopsy, potentially avoiding the need for a separate surgery. A statement from the American Thyroid Association provides some guidelines for surgeons managing patients with indeterminate nodules. However, adoption of these guidelines in practice and the outcomes associated with them are uncertain.
For individuals with thyroid nodule(s) and indeterminate findings on FNA who receive FNA sample testing with molecular tests to rule in malignancy and to guide surgical planning, the evidence includes prospective and retrospective studies of clinical validity. Relevant outcomes are disease-specific survival, test accuracy and validity, morbid events, and resource utilization. Variant analysis has the potential to improve the accuracy of an equivocal FNA of the thyroid and may play a role in preoperative risk stratification and surgical planning. Single-center studies have suggested that testing for a panel of genetic variants associated with thyroid cancer may allow for the appropriate selection of patients for surgical management for the initial resection. Prospective studies in additional populations are needed to validate these results. Although the presence of certain variants may predict more aggressive malignancies, the management changes that would occur as a result of identifying higher-risk tumors are not well-established. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Clinical input supports that the following uses provide a clinically meaningful improvement in net health outcome and indicates the uses are consistent with generally accepted medical practice:
For individuals who have fine needle aspirate (FNA) of thyroid nodules with indeterminate cytologic findings or Bethesda diagnostic category V (suspicious for malignancy) who receive the following types of molecular marker testing to rule in the presence of malignancy to guide surgical planning for the initial resection rather than a 2-stage surgical biopsy followed by definitive surgery:
o ThyroSeq v2;
o ThyraMIR microRNA/ThyGenX;
o Afirma BRAF after Afirma Gene Expression Classifier; or
o Afirma MTC after Afirma Gene Expression Classifier.
[X] Medically Necessary as per Clinical Input | [ ] Investigational |
The purpose of the ThyroSeq v3 test and the combined ThyGeNEXT Thyroid Oncogene Panel plus ThyraMIR microRNA classifier in individuals with indeterminate findings on FNA(s) of thyroid nodules is twofold. A positive result on ThyroSeq v3 or ThyGeNEXT is used to predict malignancy and inform surgical planning decisions. After a negative result, the ThyraMIR microRNA classifier is used to predict benignancy, which determines whether surgical biopsy is needed and guides surgical planning.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with indeterminate findings on FNA(s) of thyroid nodules. Patients with indeterminate findings presently proceed to surgical resection.
The tests being considered are either: (a) the ThyroSeq v3 test or (b) the combined ThyGeNEXT Thyroid Oncogene Panel and ThyraMIR microRNA classifier testing.
The following practices are currently being used: surgical biopsy and/or standard surgical management through surgical resection.
The potential beneficial outcomes of primary interest are using a true-negative result to avoid an unneeded surgical biopsy or using a true-positive result to guide surgical resection (eg, hemithyroidectomy or thyroidectomy).
Potential harmful outcomes are those resulting from false-positive or false-negative test results. False-positive test results can lead to unnecessary surgical biopsy or resection and procedure-related complications. False-negative test results can lead to lack of surgical biopsy or resection for thyroid cancer and delay in diagnosis.
The time frame for evaluating the performance of the test ranges from the interval between the initial FNA and surgical resection to the weeks to months that follow an indeterminate result.
For the evaluation of clinical validity of the molecular testing, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores)
Included a suitable reference standard
Patient/sample clinical characteristics were described
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Lee et al (2022) performed a systematic review and meta-analysis on the diagnostic performance of molecular tests in the assessment of indeterminate thyroid nodules (described above).20, Inclusion criteria for trials included indeterminate thyroid results via FNA that included Bethesda categories III and IV, conclusive histopathological results in a group of benign and suspicious changes, and the use of Afirma GSC, ThyroSeq v3, and ThyGeNEXT as index tests. Investigators identified 6 studies on ThyroSeq v3: 3 prospective, including Livhits et al (2021) and Steward et al (2019), described below, and 3 retrospective. Only 2 studies on ThyGeNEXT were identified, and these were excluded from the meta-analysis due to the small sample size. Pooled data for ThyroSeq studies on 560 thyroid nodules demonstrated a sensitivity of 95.1% (95% CI, 91.1% to 97.4%), specificity of 49.6% (95% CI, 29.3% to 70.1%), PPV of 70% (95% CI, 55% to 83%), and NPV of 92% (95% CI, 86% to 97%). Limitations of this meta-analysis include the scarcity of available cohort analyses of the molecular tests and the lack of long-term findings.
Nikiforova et al (2018) reported on the performance of ThyroSeq v3 with 112 genes.50, The training sample included 238 surgically removed tissue samples consisting of 205 thyroid tissue samples representing all main types of benign and malignant tumors and nontumoral conditions. The validation sample included an independent set of 175 FNA samples of indeterminate cytology (see Table 6). Using the cutoff identified in the training set, the ThyroSeq v3 sensitivity was 98% (95% CI, 93% to 99%), specificity was 82% (95% CI, 72% to 89%), with accuracy of 91% (95% CI, 86% to 94%) (see Table 7).
Steward et al (2019) conducted a multicenter validation study of ThyroSeq v3 in 256 patients with an indeterminate FNA who had surgery with histopathology (see Table 6).51, Histopathology was reviewed by a central pathology panel, and both cytologists and pathologists were blinded to the molecular results. For a benign result, ThyroSeq v3 had a sensitivity of 93%, a specificity of 81%, PPV of 68%, and NPV of 97% (see Table 7). Out of 152 test-negative samples, 5 (3%) were false-negatives. There were 105 cases with positive results, defined as cancer or noninvasive follicular thyroid neoplasm with papillary-like nuclear features. Two nodules had high-risk TERT or TP53 variants (both positive for cancer), 13 had BRAF V600E variants or NTRK3, BRAF, or RET fusions (all positive for cancer), and 60 nodules were positive for variants in RAS, BRAF K601E, PTEN, IDH2, or DICER1 or PPARF-THADA fusion (37 [62%] positive for cancer). No major limitations in study design and conduct of this validation study were identified. Because the nodules with low cancer probability genetic alterations were removed for histological analysis, the long-term clinical impact of the genetic alterations could not be determined.
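As an illustration of how these accuracy measures follow from a 2x2 table, the sketch below recomputes them in Python. The 152 test-negative samples with 5 false-negatives and the 105 test-positive cases come from the text; the split of positives into 71 true-positives and 34 false-positives is an assumption back-calculated from the reported 68% PPV, and `accuracy_measures` is an illustrative name.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Compute standard diagnostic accuracy measures from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }

# From the text: 152 test-negative (5 false-negative), 105 test-positive.
# Assumed (back-calculated from the 68% PPV, not reported above): 71 TP, 34 FP.
measures = accuracy_measures(tp=71, fp=34, fn=5, tn=152 - 5)
for name, value in measures.items():
    print(f"{name}: {value:.0%}")
```

With these counts the sketch gives a sensitivity of 93%, specificity of 81%, PPV of 68%, and NPV of 97%, which agrees with the reported figures; other splits of the positive cases would not necessarily do so.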
Livhits et al (2021) published a randomized controlled study that compared the ThyroSeq v3 test to the Afirma GSC test in patients with thyroid nodules with indeterminate results (Bethesda III or IV) (as described above).24, The study reported clinical validity for both tests; the results of the ThyroSeq v3 test are summarized in Tables 6 and 7. The study included 171 nodules in the ThyroSeq v3 group. The sensitivity of ThyroSeq v3 was 96.9%, specificity was 84.8%, and the NPV was 99%. Long-term surveillance follow-up of nonoperatively managed nodules in this trial, described in the section above, continued to support high NPV.25, A limitation of the study is that the pathologists who interpreted the histopathologic diagnosis were unblinded to the molecular test results. Additionally, the median length of surveillance did not reach 5 years as recommended by the American College of Radiology.
Study | Study Population | Design | Reference Standard | Threshold for Positive Index Test | Timing of Reference and Index Tests | Blinding of Assessors |
Nikiforova et al (2018)50, | 175 samples with indeterminate cytology and known surgical follow-up | Retrospective | Histopathologic diagnosis | Cutoffs determined in the training sample | Samples were tested after surgical outcome was known | Unclear |
Steward et al (2019)51, | 256 patients (286 nodules) with an indeterminate FNA (Bethesda III, IV, or V) who underwent thyroid surgery | Multicenter (10 sites) prospective validation study | Central pathology review | Classified as malignant or NIFTP or benign | Cross-sectional | Yes |
Livhits et al (2021)24, | 171 nodules with indeterminate FNA (Bethesda III, IV) assigned to ThyroSeq v3* | Multicenter, randomized controlled trial | Histopathologic diagnosis | Classified as malignant or benign | Samples were tested after surgery | Assessors were unblinded to results of molecular testing |
FNA: fine needle aspirate; Afirma GSC: Gene Sequencing Classifier; NIFTP: noninvasive follicular thyroid neoplasm with papillary-like nuclear features.
*Study included a comparator group assigned to Afirma GSC (reported previously)
Study | Initial N | Final N | Excluded Samples | Prevalence of Condition | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
Nikiforova et al (2018)50, | 175 | | | | 98 (93 to 100) | 81 (72 to 89) | | |
Steward et al (2019)51, | 286 | 257 | 29 (10%) | 30% | 93 (86 to 97) | 81 (75 to 86) | 68 (58 to 76) | 97 (93 to 99) |
Livhits et al (2021)24, | 171 | 163 | 8 | | 96.9 (83.8 to 100) | 84.8 (77 to 90.7) | 63.3 (48.3 to 76.6) | 99 (94.6 to 100) |
CI: confidence interval; NPV: negative predictive value; PPV: positive predictive value.
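The confidence intervals in the table above can be approximated from counts. The Python sketch below uses the Wilson score interval, which is an assumption: this review does not state which interval method the studies used. The 152 test-negative samples with 5 false-negatives are taken from the description of Steward et al (2019); `wilson_interval` is an illustrative name.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# NPV for Steward et al (2019): 152 test-negative samples, 5 false-negatives.
low, high = wilson_interval(successes=152 - 5, n=152)
print(f"NPV = {147 / 152:.0%} (95% CI, {low:.0%} to {high:.0%})")
```

For this row the Wilson interval rounds to 93% to 99%, which agrees with the tabulated NPV interval; an exact (Clopper-Pearson) interval would differ slightly.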
Additional studies have reported on the clinical validity of the ThyroSeq v2 panel in external settings (outside of the institution where it was developed), describing its diagnostic performance in predicting malignancy in thyroid nodules that are indeterminate on FNA (see Table 8). These studies differed from the previous studies in that noninvasive follicular thyroid neoplasm with papillary-like nuclear features was classified as not malignant for calculation of performance characteristics.
Study | Population | Genes and Rearrangements Tested | Insufficient or Inadequate for Analysis | Sen (95% CI), % | Spec (95% CI), % | PPV (95% CI), % | NPV (95% CI), % |
Valderrabano et al (2017)52, | 190 indeterminate thyroid nodules | ThyroSeq v2 (60+ genes) | 2 | 70 (46 to 88) | 77 (66 to 85) | 42 (25 to 61) | 91 (82 to 97) |
Taye et al (2018)53, | 156 indeterminate thyroid nodules | ThyroSeq v2 (60+ genes) | 3 | 89 (52 to 100) | 43 (29 to 58) | 22 (10 to 38) | 96 (78 to 99) |
CI: confidence interval; FNA: fine needle aspiration; NPV: negative predictive value; PPV: positive predictive value; Sen: sensitivity; Spec: specificity.
Additional studies describing the clinical validity of the genes that comprise the ThyroSeq panel or other individual variants and combinations of variants to predict malignancy in thyroid nodules that are indeterminate on FNA have been reported. The results that pertain to the use of gene testing in indeterminate thyroid nodules are summarized in Table 9.
Study | Population | Genes and Rearrangements Tested | Insufficient or Inadequate for Analysis | Sen, % | Spec, % | PPV, % | NPV, % | Acc, % |
Moses et al (2010)54, | 110 indeterminate thyroid nodules | BRAF, KRAS, NRAS, RET/PTC1, RET/PTC3, NTRK1 | 2 | 38 | 95 | 67 | 79 | 77 |
Ohori et al (2010)55, | 100 patients with 117 atypia or follicular lesions of uncertain significance | BRAF, NRAS, HRAS, KRAS, RET/PTC1, RET/PTC3, PAX8/PPARγ | NR | 60 | 100 | 100 | 92 | 93 |
Beaudenon-Huibregtse et al (2014)56, | 53 nodules with indeterminate or nondiagnostic FNA | BRAF, HRAS, KRAS, NRAS, PAX8-PPARγ, RET-PTC1, RET-PTC3 | | 48 | 89 | 81 | 64 | |
Acc: accuracy; FNA: fine needle aspiration; NPV: negative predictive value; NR: not reported; PPV: positive predictive value; PTC: papillary thyroid carcinoma; Sen: sensitivity; Spec: specificity.
a FNA-indeterminate nodules.
b FNA suspicious nodules.
c Atypia of indeterminate significance.
d Follicular neoplasm or suspicious for follicular neoplasm.
e Suspicious for malignancy.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials. Randomized controlled studies were not identified; however, a retrospective, single-center study found that use of ThyroSeq v3 in a cohort of patients with indeterminate thyroid nodules reduced the surgical resection rate compared with a cohort of patients without molecular testing.57, In addition, the risk of malignancy in thyroid nodules with a positive molecular test was higher than in those without molecular testing.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Labourier et al (2015) evaluated the diagnostic algorithm combining a 17-variant panel with ThyraMIR on a cross-sectional cohort of thyroid nodules comprising 109 FNA samples with AUS/FLUS or FN/SFN cytology from 12 endocrinology centers.58, The sensitivity and specificity of the combined test are summarized in Table 10.
Groups | No. of Cases | Sensitivity | Specificity | PPV | NPV | Odds Ratio |
Cohort (95% CI), % | 109 | 89 (73 to 97) | 85 (75 to 92) | 74 (58 to 86) | 94 (85 to 98) | 44 (13 to 151) |
AUS/FLUS (95% CI), % | 58 | 94 (73 to 100) | 80 (64 to 91) | 68 (46 to 85) | 97 (84 to 100) | 68 (8 to 590) |
FN/SFN (95% CI), % | 51 | 82 (57 to 96) | 91 (76 to 98) | 82 (57 to 96) | 91 (76 to 98) | 48 (9 to 269) |
Adapted from Labourier et al (2015).58,
AUS: atypia of undetermined significance; CI: confidence interval; FLUS: follicular lesion of undetermined significance; FN: follicular neoplasm; FNA: fine needle aspiration; NPV: negative predictive value; PPV: positive predictive value; SFN: suspicious for a follicular neoplasm.
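The odds ratio column in the table above appears to be a diagnostic odds ratio, which can be sketched from a 2x2 table. The counts used below (31 true-positives, 11 false-positives, 4 false-negatives, 63 true-negatives) are hypothetical: they were chosen to be consistent with the reported percentages for the full cohort of 109 and are not stated in this review. The interval uses the standard log-odds (Woolf) method, which may differ from the method the authors used; `diagnostic_odds_ratio` is an illustrative name.

```python
from math import exp, log, sqrt

def diagnostic_odds_ratio(tp, fp, fn, tn, z=1.96):
    """Diagnostic odds ratio with a log-scale (Woolf) confidence interval."""
    odds_ratio = (tp * tn) / (fp * fn)
    se_log = sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    low = exp(log(odds_ratio) - z * se_log)
    high = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Hypothetical counts consistent with the full-cohort row (n = 109);
# they are an assumption of this sketch, not data from the study report.
dor, low, high = diagnostic_odds_ratio(tp=31, fp=11, fn=4, tn=63)
print(f"Odds ratio = {dor:.0f} (95% CI, {low:.0f} to {high:.0f})")
```

With these assumed counts the sketch gives an odds ratio of 44 with an interval of 13 to 151, matching the full-cohort row.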
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, or more effective therapy, or avoid unnecessary therapy, or avoid unnecessary testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials.
Direct evidence for the clinical utility for the ThyroSeq v2 test and the combined ThyGenX and ThyraMIR diagnostic testing algorithm is lacking.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A chain of evidence may be constructed to infer the potential clinical utility of the combined diagnostic testing algorithm. No studies using the ThyGenX NGS panel in FNA samples were identified. However, available evidence has suggested that the use of variant testing using NGS in thyroid FNA samples is generally associated with high specificity and PPV for clinically significant thyroid cancer. There is potential clinical utility for identifying malignancy with higher certainty on FNA if such testing permits better preoperative planning at the time of thyroid biopsy, potentially avoiding the need for a separate surgery. However, the variant analysis does not achieve an NPV high enough to identify which patients can undergo active surveillance over thyroid surgery. In the diagnostic algorithm that reflexes to ThyraMIR after a negative ThyGenX result, reflex testing could identify patients who may undergo active surveillance over thyroid surgery. A single study using a 17-variant panel with ThyraMIR showed an NPV of 94%. Therefore, the high NPV of ThyraMIR has the potential to accurately predict benignancy and triage patients to active surveillance.
Evidence for the clinical validity of the ThyroSeq v3 NGS panel comes from a systematic review of prospective and retrospective studies and a major prospective clinical validity study. In a systematic review including 3 prospective and 3 retrospective clinical validity studies, sensitivity of ThyroSeq v3 was 95.1%, specificity was 49.6%, PPV was 70%, and NPV was 92%. In the prospective clinical validity study, the performance characteristics were sensitivity, 93%; specificity, 81%; PPV, 68%; NPV, 97%. A randomized controlled trial found similar results with ThyroSeq v3. In 2 independent validation studies with a predicate test (ThyroSeq v2) in which noninvasive follicular thyroid neoplasm with papillary-like nuclear features was categorized as not malignant, performance characteristics were lower and variable (sensitivity, 70% to 89%; specificity, 43% to 77%; PPV, 22% to 42%; NPV, 91% to 96%).
Evidence for the clinical validity of combined testing for miRNA gene expression using ThyraMIR and a targeted 17-variant panel comes from 2 retrospective studies using archived surgical specimens and FNA samples. One study combined a 17-variant panel with ThyraMIR testing on archived surgical specimens and resulted in a sensitivity of 85% and specificity of 95%. The second study combined a 17-variant panel (miRInform) with ThyraMIR testing on FNA samples and resulted in a sensitivity of 89%, a specificity of 85%, PPV of 74%, and NPV of 94%. No studies were identified that demonstrated the clinical validity of a combined ThyGenX and ThyraMIR test on FNA samples.
Direct evidence for the clinical utility of the ThyroSeq v2 test and the combined ThyGenX and ThyraMIR reflex testing is lacking. However, available evidence has suggested that testing for gene variants and rearrangements can predict malignancy and inform surgical planning decisions when the test is positive. Pooled retrospective and prospective clinical validation studies of ThyroSeq v2 have reported a combined NPV of 96% (95% CI, 92% to 95%) and PPV of 83% (95% CI, 72% to 95%) and might potentially assist in selecting patients to avoid surgical biopsy if negative and guide surgical planning if positive. The NPV of ThyGenX to identify patients who should undergo active surveillance over thyroid surgery is unknown. In a reflex testing setting, the high NPV for a microRNA gene expression test used on the subset of patients with a negative result from variant and gene rearrangement testing may provide incremental information in identifying patients appropriately for active surveillance, but improvements in health outcomes are still uncertain.
For individuals with thyroid nodule(s) and indeterminate findings on FNA who receive FNA sample testing with molecular tests to rule out malignancy and avoid surgical biopsy or to rule in malignancy for surgical planning, the evidence includes multiple retrospective and prospective clinical validation studies for the ThyroSeq test, a systematic review of retrospective and prospective studies, and 2 retrospective clinical validation studies that used a 17-variant panel (miRInform), a predicate test to the current ThyGenX, together with ThyraMIR. Relevant outcomes are disease-specific survival, test accuracy and validity, morbid events, and resource utilization. In a retrospective validation study on FNA samples, the 17-variant panel (miRInform) test and ThyraMIR had a sensitivity of 89% and an NPV of 94%. A prospective clinical validation study of ThyroSeq v3 reported an NPV of 97% and PPV of 68%. Similarly, a systematic review including 3 prospective and 3 retrospective clinical validity studies reported an NPV of 92% and PPV of 70%. No studies were identified demonstrating the diagnostic characteristics of the marketed ThyGenX. No prospective studies were identified demonstrating evidence of direct outcome improvements. A chain of evidence for the ThyroSeq v3 test and combined ThyGenX and ThyraMIR testing would rely on establishing clinical validity. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
[ ] Medically Necessary | [X] Investigational |
The purpose of the following information is to provide reference material. Inclusion does not imply endorsement or alignment with the evidence review conclusions.
While the various physician specialty societies and academic medical centers may collaborate with and make recommendations during this process, through the provision of appropriate reviewers, input received does not represent an endorsement or position statement by the physician specialty societies or academic medical centers, unless otherwise noted.
Clinical input was sought to help determine whether testing for molecular markers in fine needle aspirates (FNAs) of the thyroid for management of individuals with thyroid nodule(s) with an indeterminate finding on the FNAs would provide a clinically meaningful improvement in net health outcome and whether the use is consistent with generally accepted medical practice. In response to requests, clinical input on 7 tests for molecular markers was received from 9 respondents, including 1 specialty society-level response, 1 physician from an academic center, and 7 physicians from 2 health systems.
Clinical input supports that the following uses provide a clinically meaningful improvement in net health outcome and indicates the uses are consistent with generally accepted medical practice:
For individuals who have FNA of thyroid nodules with indeterminate cytologic findings (ie, Bethesda diagnostic category III [atypia/follicular lesion of undetermined significance] or Bethesda diagnostic category IV [follicular neoplasm/suspicion for a follicular neoplasm]) who receive the following types of molecular marker testing to rule out malignancy and to avoid surgical biopsy:
Afirma Gene Expression Classifier; or
ThyroSeq v2
For individuals who have FNA of thyroid nodules with indeterminate cytologic findings or Bethesda diagnostic category V (suspicious for malignancy) who receive the following types of molecular marker testing to rule in the presence of malignancy to guide surgical planning for the initial resection rather than a 2-stage surgical biopsy followed by definitive surgery:
ThyroSeq v2;
ThyraMIR microRNA/ThyGenX;
Afirma BRAF after Afirma Gene Expression Classifier; or
Afirma MTC after Afirma Gene Expression Classifier.
Clinical input does not support whether the use of RosettaGX Reveal testing in FNA of thyroid nodules provides a clinically meaningful improvement in the net health outcome or is consistent with generally accepted medical practice.
Further details from clinical input are included in the Appendix.
Guidelines or position statements will be considered for inclusion in ‘Supplemental Information’ if they were issued by, or jointly by, a US professional society, an international society with US representation, or National Institute for Health and Care Excellence (NICE). Priority will be given to guidelines that are informed by a systematic review, include strength of evidence ratings, and include a description of management of conflict of interest.
The American Association of Clinical Endocrinologists, American College of Endocrinology, and Associazione Medici Endocrinologi (2016) updated their joint guidelines on molecular testing for cytologically indeterminate thyroid nodules, stating59,:
"Cytopathology expertise, patient characteristics, and prevalence of malignancy within the population being tested impact the negative predictive values (NPVs) and positive predictive values (PPVs) for molecular testing."
"Consider the detection of BRAF and RET/PTC and, possibly, PAX8/PPARG and RAS mutations if such detection is available."
"TERT [Telomerase reverse transcriptase] mutational analysis on FNA, when available, may improve the diagnostic sensitivity of molecular testing on cytologic samples."
"Because of the insufficient evidence and the limited follow-up, we do not recommend either in favor of or against the use of gene expression classifiers (GECs) for cytologically indeterminate nodules."
For the role of molecular testing for deciding the extent of surgery the following recommendations were made:
"Currently, with the exception of mutations such as BRAFV600E that have a PPV approaching 100% for papillary thyroid carcinoma (PTC), evidence is insufficient to recommend in favor of or against the use of mutation testing as a guide to determine the extent of surgery."
The American College of Radiology (ACR; 2017) Thyroid Imaging, Reporting, and Data System (TI-RADS) Committee published a white paper with expert consensus recommendations for FNA biopsy thresholds and imaging surveillance.19, Regarding timing of follow-up sonograms, the publication states: "We advocate timing on the basis of a nodule’s ACR TI-RADS level, with additional sonograms for lesions that are more suspicious. For a TR5 lesion, we recommend scans every year for up to 5 years. For a TR4 lesion, scans should be done at 1, 2, 3, and 5 years. For a TR3 lesion, follow-up imaging may be performed at 1, 3, and 5 years. Imaging can stop at 5 years if there is no change in size, as stability over that time span reliably indicates that a nodule has a benign behavior. There is no published evidence to guide management of nodules that enlarge significantly but remain below the FNA size threshold for their ACR TI-RADS level at 5 years, but continued follow-up is probably warranted. If a nodule’s ACR TI-RADS level increases on follow-up, the next sonogram should be done in 1 year, regardless of its initial level."
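The follow-up schedule quoted above can be expressed as a simple lookup. This Python sketch only illustrates the quoted intervals; the names `FOLLOW_UP_YEARS` and `next_scan_year` are hypothetical, and returning no scheduled scan for TI-RADS levels that the quoted passage does not mention is an assumption of the sketch.

```python
# Follow-up sonogram timing (years after baseline) by ACR TI-RADS level,
# transcribed from the white paper passage quoted above.
FOLLOW_UP_YEARS = {
    "TR5": [1, 2, 3, 4, 5],  # every year for up to 5 years
    "TR4": [1, 2, 3, 5],
    "TR3": [1, 3, 5],
}

def next_scan_year(level, years_since_baseline):
    """Return the next scheduled scan year, or None if none is scheduled.

    Levels absent from the quoted passage return None (an assumption).
    """
    for year in FOLLOW_UP_YEARS.get(level, []):
        if year > years_since_baseline:
            return year
    return None

print(next_scan_year("TR4", 3))  # a TR4 nodule scanned at year 3 is next due at year 5
```

The sketch does not model the additional rule in the quote that a nodule whose TI-RADS level increases on follow-up should be rescanned in 1 year.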
The American Thyroid Association (ATA; 2016) updated its guidelines on the management of thyroid nodules and differentiated thyroid cancer in adults.60, These guidelines made the following statements on molecular diagnostics in thyroid nodules that are atypia of undetermined significance or follicular lesion of undetermined significance on cytology and follicular neoplasm or suspicious for follicular neoplasm on cytology (see Table 11).
Recommendation | SOR | QOE |
AUS or FLUS | ||
"For nodules with AUS/FLUS cytology, after consideration of worrisome clinical and sonographic features, investigations such as repeat FNA or molecular testing may be used to supplement malignancy risk assessment in lieu of proceeding directly with a strategy of either surveillance or diagnostic surgery. Informed patient preference and feasibility should be considered in clinical decision-making." | Weak | Moderate |
"If repeat FNA cytology, molecular testing, or both are not performed or inconclusive, either surveillance or diagnostic surgical excision may be performed for an AUS/FLUS thyroid nodule, depending on clinical risk factors, sonographic pattern, and patient preference." | Strong | Low |
FN or SFN | ||
"Diagnostic surgical excision is the long-established standard of care for the management of FN/SFN cytology nodules. However, after consideration of clinical and sonographic features, molecular testing may be used to supplement malignancy risk assessment data in lieu of proceeding directly with surgery. Informed patient preference and feasibility should be considered in clinical decision-making." | Weak | Moderate |
AUS: atypia of undetermined significance; FLUS: follicular lesion of undetermined significance; FN: follicular neoplasm; FNA: fine needle aspirate; QOE: quality of evidence; SFN: suspicious for follicular neoplasm; SOR: strength of recommendation.
The guidelines also stated: "there is currently no single optimal molecular test that can definitively rule in or rule out malignancy in all cases of indeterminate cytology, and long-term outcome data proving clinical utility are needed."
National Comprehensive Cancer Network (v2.2024) guidelines on the treatment of thyroid cancer comment on the use of molecular diagnostics in thyroid cancer.61, For thyroid nodules evaluated with FNA, molecular diagnostics may be employed when lesions are suspicious for:
Follicular or oncocytic neoplasms.
Atypia of undetermined significance or follicular lesions of undetermined significance.
The guidelines state that molecular diagnostics have not performed well historically for oncocytic carcinoma. The guidelines also endorse the ATA and ACR recommendations for nodule surveillance, described above.
Not applicable.
There is no national coverage determination. In the absence of a national coverage determination, coverage decisions are left to the discretion of local Medicare carriers.
Some currently ongoing and unpublished trials that might influence this review are listed in Table 12.
NCT No. | Trial Name | Planned Enrollment | Completion Date |
Ongoing | |||
NCT05025046a | Prospective, Blinded, Multi-center Clinical Study of NGS-based Thyroscan Genomic Classifier in the Diagnosis of Thyroid Nodules | 400 | Jun 2022 (unknown) |
NCT02681328 | Randomized Trial Comparing Performance of Molecular Markers for Indeterminate Thyroid Nodules | 328 | Dec 2025 |
Unpublished | |||
NCT03170804 | Registry for Genomic Profiling of Nodular Thyroid Disease and Thyroid Cancer | 200 | Jan 2020 (unknown) |
NCT02947035 | Molecular Testing to Direct Extent of Initial Thyroid Surgery | 100 | Jun 2023 (completed) |
NCT: national clinical trial.
a Denotes industry-sponsored or cosponsored trial.
Codes | Number | Description |
---|---|---|
CPT | 81445 | Targeted genomic sequence analysis panel, solid organ neoplasm, 5-50 genes (eg, ALK, BRAF, CDKN2A, EGFR, ERBB2, KIT, KRAS, MET, NRAS, PDGFRA, PDGFRB, PGR, PIK3CA, PTEN, RET), interrogation for sequence variants and copy number variants or rearrangements, if performed; DNA analysis or combined DNA and RNA analysis |
CPT | 81345 | TERT (telomerase reverse transcriptase) (eg, thyroid carcinoma, glioblastoma multiforme) gene analysis, targeted sequence analysis (eg, promoter region) |
CPT | 81546 | Oncology (thyroid), mRNA, gene expression analysis of 10,196 genes, utilizing fine needle aspirate, algorithm reported as a categorical result (eg, benign or suspicious) |
CPT | 0018U | Oncology (thyroid), microRNA profiling by RT-PCR of 10 microRNA sequences, utilizing fine needle aspirate, algorithm reported as a positive or negative result for moderate to high risk of malignancy |
CPT | 0026U | Oncology (thyroid), DNA and mRNA of 112 genes, next-generation sequencing, fine needle aspirate of thyroid nodule, algorithmic analysis reported as a categorical result ("Positive, high probability of malignancy" or "Negative, low probability of malignancy") |
CPT | 0204U | Oncology (thyroid), mRNA, gene expression analysis of 593 genes (including BRAF, RAS, RET, PAX8, and NTRK) for sequence variants and rearrangements, utilizing fine needle aspirate, reported as detected or not detected (deleted eff 06/30/2024) |
CPT | 0245U | Oncology (thyroid), mutation analysis of 10 genes and 37 RNA fusions and expression of 4 mRNA markers using next-generation sequencing, fine needle aspirate, report includes associated risk of malignancy expressed as a percentage |
CPT | 0287U | Oncology (thyroid), DNA and mRNA, next-generation sequencing analysis of 112 genes, fine needle aspirate or formalin-fixed paraffin-embedded (FFPE) tissue, algorithmic prediction of cancer recurrence, reported as a categorical risk result (low, intermediate, high) |
ICD-10-CM | C73 | Malignant neoplasm of thyroid gland |
ICD-10-CM | D44.0 | Neoplasm of uncertain behavior of thyroid gland |
ICD-10-PCS | Not applicable. ICD-10-PCS codes are only used for inpatient services. There are no ICD procedure codes for laboratory tests. | |
Type of service | Pathology | |
Place of service | Laboratory/Physician’s Office |
Date | Action | Description |
---|---|---|
09/24/2024 | Annual Review | Policy updated with literature review through June 13, 2024; no references added. Minor editorial refinements to policy statements; intent unchanged. |
07/19/2024 | Code Revision | Code changes effective 07/01/2024: deleted 0204U, Oncology (thyroid), mRNA, gene expression analysis of 593 genes (including BRAF, RAS, RET, PAX8, and NTRK) for sequence variants and rearrangements, utilizing fine needle aspirate, reported as detected or not detected (Afirma Xpression Atlas by Veracyte). |
09/11/2023 | Annual Review | Policy updated with literature review through June 14, 2023; references added. Policy statements unchanged. |
09/07/2022 | Annual Review | Policy updated with literature review through June 14, 2022; references added. Minor editorial refinements to policy statements; intent unchanged. Added 0287U; added deletion date for 0208U. |
09/17/2021 | Annual Review | Policy updated with literature review through June 17, 2021; references added. Policy statements unchanged. |
08/03/2021 | Annual Review | Added 0204U, 0208U, 0245U, 81546. Added descriptors and eff/deleted dates |
08/02/2020 | Annual Review | Policy updated with literature review through June 12, 2020; references added. Edits made to the first policy statement; intent of statements unchanged. |
06/15/2020 | Annual Review | No changes. |
06/14/2019 | Annual Review | No changes. |
06/14/2018 | ||
06/14/2016 |