Medical Policy
Policy Num: 06.001.064
Policy Name: Thermography
Policy ID: [06.001.064] [Ar / B / M- / P-] [6.01.12]
Last Review: December 03, 2024
Next Review: Policy Archived
Related Policies:
13.009.004 - Temporomandibular Joint Disorder
06.001.011 - Miscellaneous (Noncardiac, Nononcologic) Applications of Fluorine 18 Fluorodeoxyglucose Positron Emission Tomography
08.001.021 - Scintimammography and Gamma Imaging of the Breast and Axilla
06.001.043 - Cardiac Applications of Positron Emission Tomography Scanning
06.001.014 - Oncologic Applications of Positron Emission Tomography Scanning
06.001.010 - Magnetic Resonance Imaging for Detection and Diagnosis of Breast Cancer
06.001.041 - Interim Positron Emission Tomography Scanning in Oncology to Detect Early Response During Treatment
Population Reference No. | Populations | Interventions | Comparators | Outcomes |
1 | Individuals: · With an indication for breast cancer screening or diagnosis | Interventions of interest are: · Thermography | Comparators of interest are: · Mammography | Relevant outcomes include: · Overall survival · Disease-specific survival · Test validity |
2 | Individuals: · With musculoskeletal injuries | Interventions of interest are: · Thermography | Comparators of interest are: · Radiography · Magnetic resonance imaging · Standard care without imaging | Relevant outcomes include: · Test validity · Symptoms · Functional outcomes |
3 | Individuals: · With temporomandibular joint disorder | Interventions of interest are: · Thermography | Comparators of interest are: · Radiography · Magnetic resonance imaging · Diagnostic scales · Standard care without imaging | Relevant outcomes include: · Test validity · Symptoms · Functional outcomes |
4 | Individuals: · With miscellaneous conditions (eg, herpes zoster, pressure ulcers, diabetic foot) | Interventions of interest are: · Thermography | Comparators of interest are: · Radiography · Magnetic resonance imaging · Standard care without imaging | Relevant outcomes include: · Test validity · Symptoms · Functional outcomes |
Thermography is a noninvasive imaging technique that measures temperature distribution in organs and tissues. The visual display of this temperature information is known as a thermogram. Thermography has been proposed as a diagnostic tool, as an aid to treatment planning, and as a means of evaluating treatment effects for a variety of conditions.
For individuals who have an indication for breast cancer screening or diagnosis who receive thermography, the evidence includes diagnostic accuracy studies and systematic reviews. Relevant outcomes are overall survival, disease-specific survival, and test validity. Using histopathologic findings as the reference standard, a series of systematic reviews have evaluated the accuracy of thermography to screen for and/or diagnose breast cancer and reported wide ranges of sensitivities and specificities. To date, no study has demonstrated that thermography is sufficiently accurate to replace or supplement mammography for breast cancer diagnosis. Moreover, there are no studies on the impact of thermography on patient management or health outcomes for patients with breast cancer. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have musculoskeletal injuries who receive thermography, the evidence includes diagnostic accuracy studies, a longitudinal prospective study, and a systematic review. Relevant outcomes are test validity, symptoms, and functional outcomes. A systematic review of studies on thermography for diagnosing musculoskeletal injuries found moderate levels of accuracy compared with other diagnostic imaging tests. There is no consistent reference standard. This evidence does not permit conclusions as to whether thermography is sufficiently accurate to replace or supplement standard testing. Moreover, there are no high-quality or randomized studies on the impact of thermography on patient management or health outcomes for patients with musculoskeletal injuries. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have temporomandibular joint (TMJ) disorder who receive thermography, the evidence includes a systematic review. Relevant outcomes are test validity, symptoms, and functional outcomes. A systematic review of studies on thermography for diagnosing TMJ disorder found wide variation in accuracy compared with other diagnostic methods. There is no consistent reference standard. The evidence does not permit conclusions as to whether thermography is sufficiently accurate to replace or supplement standard testing. Moreover, there are no studies on the impact of thermography on patient management or health outcomes for patients with TMJ disorder. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have miscellaneous conditions (eg, herpes zoster, pressure ulcers, diabetic foot) who receive thermography, the evidence primarily includes diagnostic accuracy studies. Outcomes in these studies are test validity, symptoms, and functional outcomes. Most studies assessed temperature gradients or the association between temperature differences and the clinical condition. Due to the small number of studies for each indication, diagnostic accuracy could not adequately be evaluated. The clinical utility of thermography has only been considered in a single study of diabetic foot ulcers. For other miscellaneous conditions, the clinical utility of thermography has not been investigated. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Not applicable.
The objective of this evidence review is to determine whether thermography improves the net health outcome for a variety of indications including but not limited to the diagnosis of breast cancer, musculoskeletal injuries, and temporomandibular joint disorder.
The use of all forms of thermography is considered investigational.
See the Codes table for details.
State or federal mandates (eg, Federal Employee Program) may dictate that certain U.S. Food and Drug Administration approved devices, drugs, or biologics may not be considered investigational, and thus these devices may be assessed only by their medical necessity.
Benefits are determined by the group contract, member benefit booklet, and/or individual subscriber certificate in effect at the time services were rendered. Benefit products or negotiated coverages may have all or some of the services discussed in this medical policy excluded from their coverage.
Infrared radiation from the skin or organ tissue reveals temperature variations by producing brightly colored patterns on a liquid crystal display. Thermography involves the use of an infrared scanning device and can include various types of telethermographic infrared detector images and heat-sensitive cholesteric liquid crystal systems.
Interpretation of the color patterns is thought to assist in the diagnosis of many disorders such as complex regional pain syndrome (previously known as reflex sympathetic dystrophy), breast cancer, Raynaud phenomenon, digital artery vasospasm in hand-arm vibration syndrome, peripheral nerve damage following trauma, impaired spermatogenesis in infertile men, degree of burns, deep vein thrombosis, gastric cancer, tear-film layer stability in dry-eye syndrome, Frey syndrome, headaches, lower back pain, and vertebral subluxation.
Thermography may also assist in treatment planning and procedure guidance by accomplishing the following tasks: identifying restricted areas of perfusion in coronary artery bypass grafting, identifying unstable atherosclerotic plaques, assessing response to methylprednisolone in rheumatoid arthritis, and locating high undescended testicles.
A number of thermographic devices have been cleared for marketing by the U.S. Food and Drug Administration (FDA) through the 510(k) process. FDA product codes: LHQ, FXN. Devices with product code LHQ may only be marketed for adjunct use. Devices with product code FXN do not provide a diagnosis or therapy. Examples of these devices are shown in Table 1.
Device Name | Manufacturer | Clearance Date | 510(K) No. |
Infrared Sciences Breastscan IR System | Infrared Sciences | Feb 2004 | K032350 |
Telethermographic Camera, Series A, E, S, and P | FLIR Systems | Mar 2004 | K033967 |
Notouch Breastscan | UE Lifesciences | Feb 2012 | K113259 |
WoundVision Scout™ | WoundVision | Dec 2013 | K131596 |
AlfaSight 9000 Thermographic System™ | Alfa Thermodiagnostics | Apr 2015 | K150457 |
FirstSense Breast Exam® | First Sense Medical | Jun 2016 | K160573 |
Sentinel BreastScan II System | First Sense Medical | Jan 2017 | K162767 |
InTouchThermal Camera | InTouch Technologies | Feb 2019 | K181716 |
Smile-100 System | Niramai Health Analytix Private Limited | Mar 2022 | K212965 |
ThermPix™ Thermovisual Camera | USA Therm | Apr 2022 | K213650 |
This evidence review was created in March 1996 and has been updated regularly with searches of the PubMed database. The most recent literature update was performed through July 26, 2024.
Evidence reviews assess whether a medical test is clinically useful. A useful test provides information to make a clinical management decision that improves the net health outcome. That is, the balance of benefits and harms is better when the test is used to manage the condition than when another test or no test is used to manage the condition.
The first step in assessing a medical test is to formulate the clinical context and purpose of the test. The test must be technically reliable, clinically valid, and clinically useful for that purpose. Evidence reviews assess the evidence on whether a test is clinically valid and clinically useful. Technical reliability is outside the scope of these reviews, and credible information on technical reliability is available from other sources.
Promotion of greater diversity and inclusion in clinical research of historically marginalized groups (e.g., People of Color [African-American, Asian, Black, Latino and Native American]; LGBTQIA (Lesbian, Gay, Bisexual, Transgender, Queer, Intersex, Asexual); Women; and People with Disabilities [Physical and Invisible]) allows policy populations to be more reflective of and findings more applicable to our diverse members. While we also strive to use inclusive language related to these groups in our policies, use of gender-specific nouns (e.g., women, men, sisters, etc.) will continue when reflective of language used in publications describing study populations.
The purpose of using thermography in individuals undergoing breast cancer screening or diagnosis is to inform decisions on diagnosis and treatment.
The following PICO was used to select literature to inform this review.
The relevant populations of interest are asymptomatic individuals being screened for breast cancer or individuals undergoing testing to diagnose breast cancer.
The intervention of interest is thermography.
The following test is currently being used to make decisions about breast cancer diagnosis: mammography.
The outcome of interest for diagnostic accuracy is test validity (ie, sensitivity, specificity). The primary outcomes of interest for clinical utility are overall survival and breast cancer-specific survival rates.
The potential beneficial outcomes of primary interest in the case of a true-negative would be the avoidance of unnecessary surgery and associated consequences (eg, morbidity, mortality, resource utilization, patient anxiety). The potential harms from a false-positive could be inappropriate assessment and improper management of patients with breast malignancies, which could result in the following: inappropriate surgical decisions, high frequency of unnecessary further testing, and unnecessary patient anxiety. The potential harms from a false-negative could be a determination that the patient does not have malignancy, which would lead to a delay in surgery and tumor diagnosis.
The timing for routine screening can be guided by national guidelines on breast cancer screening. The timing for diagnosis would be after an initial screening test or clinical examination.
For the evaluation of clinical validity of thermography for breast cancer, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores);
Included a suitable reference standard;
Patient/sample clinical characteristics were described;
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Several systematic reviews of the published literature on the diagnostic accuracy of thermography were identified. A systematic review by Vreugdenburg et al (2013) identified 8 studies on thermography for diagnosis of breast cancer that included a valid reference standard (eg, biopsy with histopathologic confirmation).1, A previous systematic review by Fitzgerald and Berentson-Shaw (2012) identified 6 studies, 1 using thermography for breast cancer screening and the others using thermography to diagnose breast cancer among symptomatic women or those with a positive mammogram.2, A summary of the characteristics of clinical validity for these systematic reviews is provided in Table 2. A summary of the clinical validity results is provided in Table 3. Study findings were not pooled due to heterogeneity in data reporting and assessment methodology utilized.
Study | Study Population | Designa | Reference Standard | Threshold for Positive Index Test | Timing of Reference and Index Tests | Blinding of Assessors | Commentb |
Vreugdenburg et al (2013)1, | For screening studies:
| Diagnostic, cross-sectional studies:
| Biopsy with histopathologic confirmation | Various | Reference Test Prior to Index Test: 1/8; Reference Test During Course of Study: 7/8 | Studies blind to reference:
| All 8 studies utilized different measurement scales and cut-off scores. Poor reporting of index and reference test timing. |
Fitzgerald et al (2012)2, | For screening studies:
| Screening studies:
| Screening studies:
| Various | In screening studies, only patients with a positive index test received the reference test. In diagnostic studies, timing of index and reference tests poorly reported. | In all studies, blinding was poorly reported. | Studies utilized various measurement scales and cut-off scores. Thermograms were scored by software, manually, or through a combination of methods. Screening study utilized more than one thermography device. Poor reporting of index and reference test timing. |
Study; Subgroup | Initial N (Range) | Final N (Range) | Excluded N | Prevalence of Condition | Sensitivity | Specificity | PPV | NPV |
Vreugdenburg et al (2013)1,; Diagnostic studies | NR | 1709 (29 to 769) | 565 (13 to 524)* | NR | 25-97% | 12-85% | 24-81% | 36-95% |
Fitzgerald et al (2012)2,; Diagnostic studies | 1224 (63 to 769) | NR | NR | NR | 25-97% | 12-85% | 24-83% | 36-95% |
Fitzgerald et al (2012)2,; Screening studies, at initial screening | 10,229 (NR) | NR | NR | NR | 61% | 74% | 0.01% | 1.00% |
Fitzgerald et al (2012)2,; Screening studies, at 5-yr follow-up | 10,229 (NR) | NR | NR | NR | 28% | 74% | 0.01% | 0.99% |
Several studies have been published since the systematic reviews. Morales-Cervantes et al (2018) compared the accuracy of automated or manual thermography screening in 206 women scheduled for mammography in Mexico.3, A retrospective study conducted in the U.S. by Neal et al (2018) assessed outcomes in 38 women referred for further breast imaging following abnormal thermography testing.4, Omranipour et al (2016) compared the accuracy of thermography and mammography in 132 patients in Iran who had breast lesions and were candidates for breast biopsy.5, Rassiwala et al (2014) in India reported on 1008 women being screened for breast cancer.6, Summaries of characteristics and results of clinical validity for these diagnostic studies are provided in Tables 4 and 5.
Study | Study Population | Designa | Reference Standard | Threshold for Positive Index Test | Timing of Reference and Index Tests | Blinding of Assessors | Commentb |
Morales-Cervantes et al (2018)3, | For screening study:
| Prospective cohort, NR sample allocation | Biopsy with histopathologic confirmation | Automated Thermography (Thermal Score)c
| Reference testing performed for women with mammography BI-RADS score indicating suspicion for cancer. Mammography performed after thermography. | Blinding of mammography assessor with respect to thermography not described. Double-blinding indicated for manual assessment of thermograms by oncologist. Blinding of biopsy assessor not described. | Blinding and allocation poorly described. No data reported for mammography despite inclusion as comparator. Reported results may be biased and inaccurate due to selective use of reference tests. |
Neal et al (2018)4, | For diagnostic study:
| Retrospective cohort, NR sample allocation | Biopsy with histopathologic confirmation or at least 1 year of clinical and/or imaging follow-up | Abnormal Thermography:
| Thermography testing performed prior to mammography and/or ultrasound. Reference testing performed after index tests. Histopathological reference testing offered for women with BI-RADS score 4-5. | Blinding of assessors not described. | Blinding and allocation not described. Limited data reporting. Reference testing not uniform for all patients. Small study size with retrospective design. Long-term health outcomes not described. |
Omranipour et al (2016)5, | For diagnostic study:
| Prospective cohort, NR sample selection | Core needle or surgical biopsy with histopathologic confirmation | Mammography (BI-RADS Rating):
| Reference testing performed after imaging index tests. | Mammography assessors blinded to thermography test results. Blinding of thermography and histopathology assessors not described. | Blinding and allocation poorly described. Concordance of risk classification cannot be assessed due to limited data reporting. |
Rassiwala et al (2014)6, | For screening study:
| Prospective cohort, NR sample allocation | For women with normal thermograms: clinical examination only. For women with ΔT ≥ 2.5: clinical, radiologic, and histopathologic examination. | Positive (Potentially having breast cancer)
| Reference test provided only to women with abnormal or elevated thermography index test results. | NR | Blinding and allocation not described. Reported results may be biased and inaccurate due to selective use of reference tests. |
Study; Subgroup | Initial N | Final N | Excluded Samples | Prevalence of Condition | Sensitivity | Specificity | PPV | NPV |
Morales-Cervantes et al (2018)3, |
Automated Thermography* | NR | 206 | NR | 198 benign; 8 malignant | 100% | 68.68% | 11.42% | 100% |
Manual Thermography* | NR | 206 | NR | 198 benign; 8 malignant | 87.50% | 56.06% | 7.44% | 99.10% |
Mammography | NR | 206 | NR | 198 benign; 8 malignant | NR | NR | NR | NR |
Neal et al (2018)4, |
Abnormal Thermography | 45 | 38 | 7 | 36 benign; 2 malignant | NA | NA | NR (2/38) | NA |
Mammography following Abnormal Thermography | 45 | 38 | 7 | 36 benign; 2 malignant | NR | NR | 33.3% | 100% |
Omranipour et al (2016)5, |
Thermography | NR | 132 | NR | 45 benign; 87 malignant | 81.6% | 57.8% | 78.9% | 61.9% |
Mammography | NR | 132 | NR | 45 benign; 87 malignant | 80.5% | 73.3% | 85.4% | 66.0% |
Rassiwala et al (2014)6, |
Thermography** | NR | 1,008 | NR | 41 malignant in 49 women with positive or abnormal thermograms | 97.6% | 99.17% | 83.67% | 99.89% |
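The validity measures in the table above are linked through the underlying 2x2 table of index-test results against the reference standard. The following is a minimal sketch, not a method used by the study authors, that back-calculates whole-number cell counts for the automated thermography arm of Morales-Cervantes et al (2018) from the reported prevalence, sensitivity, and specificity. The `TwoByTwo` class and `reconstruct` function are hypothetical names introduced for illustration, and the sketch assumes the published percentages were computed from the same 8 malignant and 198 benign cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index-test results cross-classified against the reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


def reconstruct(n_malignant: int, n_benign: int,
                sensitivity: float, specificity: float) -> TwoByTwo:
    """Back-calculate cell counts from reported rates.

    Assumption: the reported rates were derived from these same totals,
    so rounding to the nearest whole patient recovers the original cells.
    """
    tp = round(sensitivity * n_malignant)
    tn = round(specificity * n_benign)
    return TwoByTwo(tp=tp, fp=n_benign - tn, fn=n_malignant - tp, tn=tn)


# Values reported for automated thermography: 8 malignant, 198 benign,
# sensitivity 100%, specificity 68.68%.
automated = reconstruct(n_malignant=8, n_benign=198,
                        sensitivity=1.0, specificity=0.6868)

print(automated)
print(f"PPV      {automated.ppv():.2%}")
print(f"NPV      {automated.npv():.2%}")
print(f"Accuracy {automated.accuracy():.1%}")
```

Under these assumptions the reconstructed table has 8 true positives and 62 false positives. The recomputed PPV agrees with the reported 11.42% to within the last digit, which illustrates how a test with 100% sensitivity can still have a low PPV when only 8 of 206 participants have the condition.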
The diagnostic accuracy of automated thermography in the study by Morales-Cervantes et al (2018) was 69.9%.3, The authors did not report on the diagnostic accuracy of manual thermography. While automated thermographic screening improved the sensitivity and specificity of the test compared to a manual, qualitative approach, reported values must be interpreted with caution as only patients with positive mammograms were subjected to diagnostic reference testing. Neal et al (2018) indicated that 95% of patients referred for follow-up imaging evaluation following abnormal thermography testing did not have breast cancer, concluding that conventional breast imaging appears sufficient to manage patients.4, According to Omranipour et al (2016),5, the diagnostic accuracy of thermography (67.7%) was lower than for mammography (76.9%; p values not reported). The reported false-negative rate was not accurately calculated in Rassiwala et al (2014) because women who had normal thermograms only had a clinical examination and did not undergo radiologic and histopathologic reference tests for confirmation, highlighting a major limitation of this study.6, For patients with positive or abnormal thermograms, 8 results were considered false-positive. One false-negative was reported, but it is unclear which subgroup this patient belonged to or how this was determined, given that patients with normal thermograms were only assessed with a clinical examination. Tables 6 and 7 display further notable limitations identified in each study. This information is synthesized as a summary of the body of evidence following each table and provides the conclusions on the sufficiency of the evidence supporting the position statement.
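The caution about selective reference testing can be made concrete with a small calculation. The sketch below uses an entirely hypothetical cohort (none of the numbers come from the studies reviewed here) to show how sensitivity is overstated when only test-positive patients receive the reference standard and test-negative patients are assumed to be disease-free, which is the pattern described for Rassiwala et al (2014).

```python
def rates(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) for a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening cohort (illustrative assumptions only):
# 1,000 women, 50 with cancer; the index test has a true sensitivity
# of 60% and a true specificity of 90%.
true_tp, true_fn = 30, 20    # the 50 women with cancer
true_fp, true_tn = 95, 855   # the 950 women without cancer

true_sens, true_spec = rates(true_tp, true_fp, true_fn, true_tn)

# Partial verification: only test-positive women receive the reference
# standard. Test-negative women are assumed disease-free, so the 20 missed
# cancers are counted as true negatives rather than false negatives.
apparent_sens, apparent_spec = rates(
    tp=true_tp, fp=true_fp, fn=0, tn=true_tn + true_fn
)

print(f"true:     sensitivity {true_sens:.1%}, specificity {true_spec:.1%}")
print(f"apparent: sensitivity {apparent_sens:.1%}, specificity {apparent_spec:.1%}")
```

In this hypothetical cohort the apparent sensitivity rises from 60% to 100% because no false negatives can be observed, while specificity changes only slightly. This is why sensitivity and NPV estimates from studies that verify only positive index tests are difficult to interpret.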
Study | Populationa | Interventionb | Comparatorc | Outcomesd | Duration of Follow-Upe |
Morales-Cervantes et al (2018)3, | 1, 4. Intended use population unclear; study population not representative of intended use (screening study enriched with patients with clinical symptoms). | 1, 2. Classification thresholds for manual thermographic assessment not described; BI-RADS version used unclear with no description of classification thresholds. | 1, 2. BI-RADS classification thresholds for mammography not defined; normal mammograms not compared to credible reference standard. | 1, 3, 5. Study does not directly assess a key health outcome; key clinical validity outcomes not reported; adverse events of the test not described. | |
Neal et al (2018)4, | 1. Classification thresholds for patients receiving ultrasounds after declining mammography not described; classification thresholds for thermography not evaluated. | 1. Not compared to consistent reference standard. | 1. Study does not report on key long-term health outcomes; key clinical validity outcomes not reported. | 1. Follow-up duration not sufficient for patients not evaluated by biopsy. | |
Omranipour et al (2016)5, | 1, 5. Study does not directly assess a key health outcome; adverse events of the test not described. | ||||
Rassiwala et al (2014)6, | 4. Study population not representative of intended use (age for screening). | 1, 2. Classification thresholds not defined; normal index tests not compared to credible reference standard. | 1, 4, 5. Study does not directly assess a key health outcome; reclassification of diagnostic or risk categories not reported; adverse events of the test not described. |
Study | Selectiona | Blindingb | Delivery of Testc | Selective Reportingd | Data Completenesse | Statisticalf |
Morales-Cervantes et al (2018)3, | 1. Selection not described. | 1. Blinding to index and reference tests not fully described. | 3, 4. Procedure for manual interpretation of thermograms and mammograms not described; expertise of all evaluators not described. | 1-2. Not registered; evidence of selective reporting (mammography data not reported). | 1. No description of indeterminate or missing samples. | 1-2. Confidence intervals and/or p values not reported; comparison to mammography not reported. |
Neal et al (2018)4, | 1. Selection not described. | 1. Blinding not described. | 2-3. Timing of index and comparator tests not same; procedures for interpreting all tests not described. | 1. Not registered. | 3. High loss to follow-up or missing data. | 1-2. Confidence intervals and/or p values not reported; comparison to other tests not reported. |
Omranipour et al (2016)5, | 1. Selection not described. | 1. Blinding to index and reference tests not described. | 1. Timing of delivery of index and reference tests not fully described. | 1. Not registered. | 1. No description of indeterminate or missing samples. | 1. Confidence intervals and/or p values not reported. |
Rassiwala et al (2014)6, | 1. Selection not described. | 1. Blinding not described. | 1, 3-4. Timing of delivery of index and reference tests not fully described; procedure for interpreting reference tests not described; expertise of evaluators not described. | 1. Not registered. | 1. Inadequate description of indeterminate or missing samples. | 1. Confidence intervals and/or p values not reported. |
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials (RCTs).
No studies have demonstrated how the results of thermography could be used to enhance the management of breast cancer patients in a manner that would improve their health outcomes.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
It is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence that the diagnostic accuracy of thermography is at least as high as mammographic techniques for breast cancer screening and diagnosis.
Systematic reviews of studies evaluating the accuracy of thermography for diagnosing breast cancer found wide ranges of sensitivities and specificities and, where data are available, relatively low diagnostic accuracy compared with mammography. To date, no study has demonstrated that thermography is sufficiently accurate to replace or supplement mammography for breast cancer diagnosis. Moreover, there are no studies on the impact of thermography on patient management or health outcomes for patients with breast cancer.
For individuals who have an indication for breast cancer screening or diagnosis who receive thermography, the evidence includes diagnostic accuracy studies and systematic reviews. Relevant outcomes are overall survival, disease-specific survival, and test validity. Using histopathologic findings as the reference standard, a series of systematic reviews have evaluated the accuracy of thermography to screen for and/or diagnose breast cancer and reported wide ranges of sensitivities and specificities. To date, no study has demonstrated that thermography is sufficiently accurate to replace or supplement mammography for breast cancer diagnosis. Moreover, there are no studies on the impact of thermography on patient management or health outcomes for patients with breast cancer. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 1 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of using thermography in individuals who have a musculoskeletal injury is to inform a decision whether to proceed to appropriate treatment or not.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with musculoskeletal injuries.
The intervention of interest is thermography.
The following tests and practices are currently being used to make decisions about musculoskeletal injuries: standard care without imaging and other forms of imaging (eg, with radiography, magnetic resonance imaging).
The outcomes of interest for diagnostic accuracy include test accuracy and test validity (ie, sensitivity, specificity). The primary outcomes of interest for clinical utility are a reduction in pain symptoms and improvement in functional ability. The timing would be following a musculoskeletal injury.
For the evaluation of clinical validity of thermography, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores);
Included a suitable reference standard;
Patient/sample clinical characteristics were described;
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
A systematic review by Sanchis-Sanchez et al (2014) evaluated the literature on thermography for diagnosing musculoskeletal injuries.7, Six studies met the eligibility criteria (N=416); 3 included patients with suspected stress fractures (n=119) and the remainder addressed other musculoskeletal injuries. Characteristics and results of clinical validity were reported for the stress fracture diagnostic studies and are summarized in Tables 8 and 9. A systematic review by Vardasca et al (2019) evaluated the literature on musculoskeletal applications of thermography specific to the arm and forearm. However, the review mainly focused on correlations between skin surface temperatures and physical condition or health recovery monitoring. As diagnostic accuracy data were not extracted or pooled from included studies, this review was not assessed for evidence of clinical validity.
Study | Study Population | Designa | Reference Standard | Threshold for Positive Index Test | Timing of Reference and Index Tests | Blinding of Assessors | Commentb |
Sanchis-Sanchez (2014)7, | For diagnostic studies:
|
| High-quality radiographic imaging (various) | NR; various methodologies utilized | Reported (1/6 studies) Unclear (4/6 studies, including all studies on stress fractures) NR (1/6 studies) | Reported (2/6 studies) Unclear (4/6 studies, including all studies on stress fractures) | High heterogeneity in thermography index test methodologies and diagnostic accuracy. QUADAS assessment by authors indicates moderate-to-high risk of bias in studies on stress fractures. |
Study; Subgroup | Initial N (Range) | Final N (Range) | Excluded N | Prevalence of Condition | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
Sanchis-Sanchez et al (2014)7,; Stress fractures | NR | 119 (17 to 84) | NR | NR | NR; Range: 45.3% to 82% | 69% (49% to 85%); Range: 60% to 100%; p=.17 | NR; Positive Likelihood Ratio: 2.31 (0.63 to 8.47); Range: 1.13 to 6.25; p=.12 | NR; Negative Likelihood Ratio: NR; Range: 0.22 to 0.91 |
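A likelihood ratio is easier to interpret when it is converted into a change in the probability of disease. The following sketch applies the standard odds form of Bayes' theorem to the positive likelihood ratio and confidence interval reported in the table. The pre-test probability of 30% is a hypothetical value chosen only for illustration and is not taken from the review; the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


LR_POSITIVE = 2.31             # summary estimate reported in the table
LR_POSITIVE_CI = (0.63, 8.47)  # reported 95% confidence interval
ASSUMED_PRE_TEST = 0.30        # hypothetical; not taken from the review

point = post_test_probability(ASSUMED_PRE_TEST, LR_POSITIVE)
low, high = (post_test_probability(ASSUMED_PRE_TEST, lr)
             for lr in LR_POSITIVE_CI)

print(f"post-test probability: {point:.1%} (from {low:.1%} to {high:.1%})")
```

Because the reported confidence interval for the positive likelihood ratio includes 1, the lower bound of the post-test probability falls below the assumed pre-test probability. Under these assumptions, a positive thermogram could not be relied on to raise the probability of a stress fracture, which is consistent with the review's conclusion that the evidence does not establish sufficient accuracy.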
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
Côrte et al (2019) published pilot data from a longitudinal prospective study on the screening and prevention of muscle injuries in 28 professional Brazilian soccer players.8, Players were monitored with ultrasound imaging during the 2015 and 2016 seasons. In the second season, thermographic monitoring was added twice weekly, 48 hours after matches, and an injury prevention protocol was followed based on the results of thermographic imaging. The number of muscle injuries was compared between the 2 seasons based on these management protocols. The total number of muscle injuries reported decreased from 11 in 2015 to 4 in 2016 (p=.04). Seven players were on the team roster across both seasons. There was no statistically significant reduction in muscle injury in this subgroup (p=.06). Limitations of this study are addressed in Tables 10 and 11.
Study | Populationa | Interventionb | Comparatorc | Outcomesd | Duration of Follow-Upe |
Côrte et al (2019)8, | 2. Clinical context is unclear (definition and reporting of muscle injuries are subjective). | 2. Version used unclear (therapy utilized in prevention protocol was based on physician discretion and not standardized). | 1, 2. Classification thresholds for ultrasound not defined; comparison to credible reference standard unclear. | 3, 4, 5. Key clinical validity outcomes not reported; reclassification of diagnostic or risk categories not reported; adverse events of the test not described. |
Study | Selectiona | Blindingb | Delivery of Testc | Selective Reportingd | Data Completenesse | Statisticalf |
Côrte et al (2019)8, | 1. Selection not random or consecutive. | 1. Blinding to index and reference tests not described. | 1-4. Timing of delivery of index or reference tests not described; timing of index and comparator tests not described; procedure for interpreting comparator and/or reference tests not described; expertise of evaluators not described. | 1. Not registered. | 1. No description of indeterminate or missing samples. | 1, 2. Confidence intervals and/or p values not reported; diagnostic comparison to other tests not reported. |
No high-quality or randomized studies have been published that evaluate health outcomes in patients with musculoskeletal injuries who were managed with and without thermography.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
It is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence that the diagnostic accuracy of thermography is at least as high as standard techniques for diagnosing musculoskeletal injuries.
A systematic review of studies on thermography for diagnosing musculoskeletal injuries found moderate levels of accuracy compared with other diagnostic imaging tests. There was a lack of a consistent reference standard. This evidence does not permit conclusions as to whether thermography is sufficiently accurate to replace or supplement standard testing. Moreover, there are insufficient studies on the impact of thermography on patient management or health outcomes for patients with musculoskeletal injuries.
For individuals who have musculoskeletal injuries who receive thermography, the evidence includes diagnostic accuracy studies, a longitudinal prospective study, and a systematic review. Relevant outcomes are test validity, symptoms, and functional outcomes. A systematic review of studies on thermography for diagnosing musculoskeletal injuries found moderate levels of accuracy compared with other diagnostic imaging tests. There is no consistent reference standard. This evidence does not permit conclusions as to whether thermography is sufficiently accurate to replace or supplement standard testing. Moreover, there are no high-quality or randomized studies on the impact of thermography on patient management or health outcomes for patients with musculoskeletal injuries. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 2 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of using thermography in individuals who have temporomandibular joint (TMJ) disorder is to inform a decision whether to proceed to appropriate treatment or not.
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with TMJ disorder.
The intervention of interest is thermography.
The following tests and practices are currently being used to make decisions about TMJ disorder: standard clinical examination without imaging, diagnostic scales (eg, Research Diagnostic Criteria for Temporomandibular Disorders [RDC/TMD], Fonseca Anamnestic Index, Anamnestic Index), and other forms of imaging (eg, radiography, arthrotomography, magnetic resonance imaging).
The outcomes of interest for diagnostic accuracy include test accuracy and test validity (ie, sensitivity, specificity). The primary outcomes of interest for clinical utility are a reduction in pain symptoms and improvement in functional ability.
For the evaluation of clinical validity of thermography for TMJ disorder, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores);
Included a suitable reference standard;
Patient/sample clinical characteristics were described;
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
A systematic review by de Melo et al (2019) evaluated the diagnostic accuracy of infrared thermography in TMJ disorder.9, Nine studies were identified, utilizing a variety of comparators. The authors note that while no specific diagnostic tool is currently considered the gold standard for the diagnosis of TMJ disorder, the RDC/TMD diagnostic is commonly used, with a reported sensitivity and specificity of 87% and 92%, respectively. Four of the 9 studies utilized RDC/TMD, whereas the remaining studies utilized clinical examination or other methods. Characteristics and results of clinical validity for TMJ disorder diagnostic accuracy in this systematic review are summarized in Tables 12 and 13.
Study | Study Population | Designa | Reference Standard | Threshold for Positive Index Test | Timing of Reference and Index Tests | Blinding of Assessors | Commentb |
de Melo et al (2019)9, | For diagnostic studies: | | RDC/TMD diagnostic, clinical examination, or other imaging methods | NR | NR; High risk of bias based on flow and timing: 4/9 studies; Unclear risk of bias based on flow and timing: 5/9 studies. | NR | Thermography index test methodologies unclear. Heterogeneity in use of comparator and/or reference standard. Assessment by authors indicates high risk of bias in all studies. |
Study | Initial N (Range) | Final N (Range) | Excluded N | Prevalence of Condition | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
de Melo et al (2019)9, | NR | 548 (23 to 104) | NR | NR | NR; Range: 38.5% to 90% | NR; Range: 22.8% to 95.5% | NR | NR |
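For context on the commonly used reference standard cited above, sensitivity and specificity can be combined into likelihood ratios. The following minimal Python sketch applies the standard definitions to the 87% sensitivity and 92% specificity reported for RDC/TMD. The function name is an assumption of this sketch, and no equivalent calculation is attempted for thermography because the review reported only ranges across studies.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (positive LR, negative LR) from sensitivity and specificity.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must be strictly between 0 and 1")
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


# Values reported for the RDC/TMD diagnostic in the systematic review.
lr_positive, lr_negative = likelihood_ratios(sensitivity=0.87, specificity=0.92)
print(f"RDC/TMD LR+: {lr_positive:.2f}")
print(f"RDC/TMD LR-: {lr_negative:.2f}")
```

The same function would apply to any single study of thermography that reports a paired sensitivity and specificity. It should not be applied to the lower and upper ends of the pooled ranges, which come from different studies.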
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No studies have been published that evaluate health outcomes in patients with TMJ disorder who were managed with and without thermography.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
It is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence that the diagnostic accuracy of thermography is at least as high as standard techniques for diagnosing TMJ disorder.
A systematic review of studies on thermography for diagnosing TMJ disorder found a wide variation in accuracy compared with other diagnostics. There was a lack of a consistent reference standard. This evidence does not permit conclusions as to whether thermography is sufficiently accurate to replace or supplement standard testing. Moreover, there are no studies on the impact of thermography on patient management or health outcomes for patients with TMJ disorder.
For individuals who have temporomandibular joint (TMJ) disorder who receive thermography, the evidence includes a systematic review. Relevant outcomes are test validity, symptoms, and functional outcomes. A systematic review of studies on thermography for diagnosing TMJ disorder found a wide variation in accuracy compared with other diagnostics. There is no consistent reference standard. The evidence does not permit conclusions as to whether thermography is sufficiently accurate to replace or supplement standard testing. Moreover, there are no studies on the impact of thermography on patient management or health outcomes for patients with TMJ disorder. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 3 Policy Statement | [ ] Medically Necessary | [X] Investigational |
A number of studies have assessed a range of potential thermography applications. To date, no randomized study has examined the impact of thermography on patient management decisions or health outcomes. Examples of other studies on thermography, mainly conducted outside of the U.S., include those evaluating the association between thermographic findings and post-herpetic neuralgia in patients with herpes zoster,10,11, surgical site healing in patients who underwent knee replacements,12, prediction of pressure ulcers13, and pressure ulcer healing,14,15, posttreatment pain in patients with coccygodynia,16, evaluation of allergic conjunctivitis,17, evaluation of burn depth,18,19, association between thermographic findings and burn treatment,20, detection of cervical lymph node metastasis from oral cavity cancer,21, monitoring of lesions or inflammation in patients with scleroderma,22,23, detection of vascular obstruction24, or perforator vessels during surgery,25,26, diagnosis of lower extremity cellulitis,27, prediction of infrainguinal bypass surgery,28, detection of melanoma,29, detection of contact dermatitis during allergy patch testing,30, diagnosis of acute appendicitis,31, and measurement of disease activity in patients with rheumatoid arthritis, osteoarthritis, or other rheumatic diseases.32,33,34,35,
Several studies evaluating the clinical validity of thermography to assess potential complications of the diabetic foot have been conducted. Thermographic images of nondiabetic feet, nonulcerated diabetic feet, and ulcerated diabetic feet have been compared.36,37,38,39,40, Another study used thermography to diagnose infections in patients admitted with diabetic foot complications.41, The only study to date to investigate the clinical utility of thermography compared with no thermography assessed diabetic foot ulcer incidence in 110 participants with a history of diabetic neuropathy and foot ulcers.42, After 12 months of follow-up, the study found no significant difference between monthly thermography and no thermography in foot ulcer incidence (62% vs. 56%; adjusted odds ratio, 0.55; 95% confidence interval [CI], 0.21 to 1.40) or in time to ulcer recurrence (adjusted hazard ratio, 0.67; 95% CI, 0.34 to 1.3).
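The finding of no significant difference in the paragraph above is consistent with both confidence intervals containing 1, the null value for a ratio measure. A minimal Python sketch of that check, using the adjusted estimates quoted in the paragraph (the helper name is an assumption of this sketch, not part of the cited study's analysis):

```python
def interval_includes_null(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """Return True when a confidence interval contains the null value."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower <= null_value <= upper


# Adjusted estimates and 95% CIs as quoted in the paragraph above.
estimates = {
    "adjusted odds ratio (foot ulcer incidence)": (0.55, 0.21, 1.40),
    "adjusted hazard ratio (time to ulcer recurrence)": (0.67, 0.34, 1.30),
}

for name, (point, lower, upper) in estimates.items():
    verdict = "includes" if interval_includes_null(lower, upper) else "excludes"
    print(f"{name}: {point} (95% CI, {lower} to {upper}) {verdict} 1")
```

An interval that includes 1 does not show that thermography has no effect. It indicates only that this study could not distinguish a benefit from no difference.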
For most of these potential indications, there are only 1 or 2 preliminary studies. Several studies evaluated the clinical validity of thermography in assessing diabetic foot and related complications. For all indications, the studies described temperature gradients or the association between temperature differences and the clinical condition. Due to the small number of studies for each indication, the diagnostic accuracy could not adequately be evaluated. With the exception of a single study of diabetic foot ulcers, the clinical utility of thermography for these miscellaneous conditions has not been investigated.
For individuals who have miscellaneous conditions (eg, herpes zoster, pressure ulcers, diabetic foot) who receive thermography, the evidence primarily includes diagnostic accuracy studies. Outcomes in these studies are test validity, symptoms, and functional outcomes. Most studies assessed temperature gradients or the association between temperature differences and the clinical condition. Due to the small number of studies for each indication, diagnostic accuracy could not adequately be evaluated. The clinical utility of thermography has only been considered in a single study of diabetic foot ulcers. For other miscellaneous conditions, the clinical utility of thermography has not been investigated. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Population Reference No. 4 Policy Statement | [ ] Medically Necessary | [X] Investigational |
The purpose of the following information is to provide reference material. Inclusion does not imply endorsement or alignment with the evidence review conclusions.
Guidelines or position statements will be considered for inclusion in 'Supplemental Information' if they were issued by, or jointly by, a US professional society, an international society with US representation, or the National Institute for Health and Care Excellence (NICE). Priority will be given to guidelines that are informed by a systematic review, include strength of evidence ratings, and include a description of management of conflict of interest.
A position paper by the European Society of Breast Imaging (2017) and 30 other national breast radiology bodies on screening for breast cancer stated that "screening with thermography or other optical tools as alternatives to mammography is discouraged."43,
The American College of Obstetricians and Gynecologists (ACOG) practice bulletin for breast cancer risk assessment and screening in average-risk women does not mention the use of thermography for breast cancer screening.44,
The American College of Physicians (2019) issued a guidance statement for breast cancer screening in average-risk women that reviews existing screening guidelines.45, While the use of thermography was not mentioned in this statement, the authors concluded that evidence is insufficient to understand the benefits and harms of primary or adjunctive screening strategies in women who are found to have dense breasts on screening mammography.
The American College of Radiology (2023) updated guidelines for female breast cancer screening do not mention the use of thermography for breast cancer screening.46,
The National Comprehensive Cancer Network guideline on breast cancer screening and diagnosis (v.2.2024) states that: "Current evidence does not support the routine use of thermography as screening procedures."47,
The U.S. Preventive Services Task Force (2024) recommendations on breast cancer screening (currently undergoing an update) do not mention thermography. Additionally, there is insufficient evidence for the use of adjunctive screening methods for breast cancer (ultrasonography or magnetic resonance imaging) in women identified to have dense breasts on a negative screening mammogram.48,
Medicare does not cover thermography. Current Medicare coverage policy states: "Thermography for any indication (including breast lesions which were excluded from Medicare coverage …) is excluded from Medicare coverage because the available evidence does not support this test as a useful aid in the diagnosis or treatment of illness or injury. Therefore, it is not considered effective..."49,
Some currently ongoing and unpublished trials that might influence this review are listed in Table 14.
NCT No. | Trial Name | Planned Enrollment | Completion Date |
Ongoing | |||
NCT06266026 | ThermoBreast - Non-contact Breast Cancer Imaging Using AI-enhanced Thermography. An International, Multicenter, Prospective Development and Validation Trial. | 28000 | Dec 2029 |
Unpublished | |||
NCT04013711 | Quantitative Thermal Imaging to Evaluate Skin Toxicity from Radiation Treatment | 200 | Jul 2022 (completed) |
NCT03735550 | Investigation of the Effectiveness of Liquid Crystal Contact Thermography in Detecting Pathological Changes in Female Breasts Compared to Standard Diagnostic Methods of Breast Cancer | 3000 | Jan 2019 (unknown status) |
NCT03217214 | Investigation of Contact Based Method for Diagnosis of Cardiovascular Disease (INDICES) | 67 | Sep 2019 (completed) |
NCT02776995 | Tumor Monitoring Using Thermography During Radiation Therapy | 80 | Dec 2020 (unknown status) |
Codes | Number | Description |
---|---|---|
CPT | 93740 | Temperature gradient studies (note: there is no specific code for skin surface infrared thermography) |
CPT | 93799 | Unlisted cardiovascular service or procedure |
HCPCS | No code | |
ICD-10-CM | Investigational for all diagnoses | |
ICD-10-PCS | ICD-10-PCS codes are only used for inpatient services | |
ICD-10-PCS | 4A0ZXKZ | Measurement of temperature, external approach |
Type of service | Radiology | |
Place of service | Inpatient/Outpatient; Physician’s Office | |
Date | Action | Description |
12/03/24 | Policy Archived | No expected changes in policy statement. |
10/15/24 | Annual Review | Policy updated with literature review through July 26, 2024; references added. Policy statement unchanged. |
10/12/23 | Annual Review | Policy updated with literature review through July 11, 2023; no references added. Policy statement unchanged. |
10/05/22 | Annual Review | Policy updated with literature review through July 12, 2022; references added. Policy statement unchanged. |
10/05/21 | Annual Review | Policy updated with literature review through August 8, 2021; references added. Policy statement unchanged. |
10/06/20 | Annual Review | Policy updated with literature review through August 4, 2020; references added. Policy statement unchanged. |
10/08/19 | Created | New policy |