Article Category: Research Article
Online Publication Date: 01 Mar 2009

Validity and Reliability of Devices That Assess Body Temperature During Indoor Exercise in the Heat

MS,
MA ATC,
PhD ATC FNATA FACSM,
MA ATC,
PhD ATC,
MS ATC,
BS ATC,
BS ATC,
PhD FACSM, and
PhD FACSM
Page Range: 124 – 135
DOI: 10.4085/1062-6050-44.2.124

Abstract

Context:

When assessing exercise hyperthermia outdoors, the validity of certain commonly used body temperature measuring devices has been questioned. A controlled laboratory environment is generally less influenced by environmental factors (eg, ambient temperature, solar radiation, wind) than an outdoor setting. The validity of these temperature measuring devices in a controlled environment may be more acceptable.

Objective:

To assess the validity and reliability of commonly used temperature devices compared with rectal temperature in individuals exercising in a controlled, high environmental temperature indoor setting and then resting in a cool environment.

Design:

Time series study.

Setting:

Laboratory environmental chamber (temperature  =  36.4 ± 1.2°C [97.5 ± 2.16°F], relative humidity  =  52%) and cool laboratory (temperature  =  approximately 23.3°C [74.0°F], relative humidity  =  40%).

Patients or Other Participants:

Fifteen males and 10 females.

Intervention(s):

Rectal, gastrointestinal, forehead, oral, aural, temporal, and axillary temperatures were measured with commonly used temperature devices. Temperature was measured before and 20 minutes after entering the environmental chamber, every 30 minutes during a 90-minute treadmill walk in the heat, and every 20 minutes during a 60-minute rest in mild conditions. Device validity and reliability were assessed with various statistical measures to compare the measurements using each device with rectal temperature. A device was considered invalid if the mean bias (average difference between rectal and device temperatures) was more than ±0.27°C (±0.50°F).

Main Outcome Measure(s):

Measured temperature from each device (mean and across time).

Results:

The following devices provided invalid estimates of rectal temperature: forehead sticker (0.29°C [0.52°F]), oral temperature using an inexpensive device (−1.13°C [−2.03°F]), temporal temperature measured according to the instruction manual (−0.87°C [−1.56°F]), temporal temperature using a modified technique (−0.63°C [−1.13°F]), oral temperature using an expensive device (−0.86°C [−1.55°F]), aural temperature (−0.67°C [−1.20°F]), axillary temperature using an inexpensive device (−1.25°C [−2.24°F]), and axillary temperature using an expensive device (−0.94°C [−1.70°F]). Measurement of intestinal temperature (mean bias of −0.02°C [−0.03°F]) was the only method considered valid. Devices measured in succession (intestinal, forehead, temporal, and aural) showed acceptable reliability (all had a mean bias  =  0.09°C [0.16°F] and r ≥ 0.94).

Conclusions:

Even during laboratory exercise in a controlled environment, devices used to measure forehead, temporal, oral, aural, and axillary body sites did not provide valid estimates of rectal temperature. Only intestinal temperature measurement met the criterion. Therefore, we recommend that rectal or intestinal temperature be used to assess hyperthermia in individuals exercising indoors in the heat.

In a recent study from our laboratory,1 we assessed the validity of numerous temperature devices to estimate rectal temperature while participants exercised in an outdoor environment. Results indicated that most instruments were invalid. However, the validity of many body temperature devices during indoor exercise is unknown. Core body temperature is often measured during controlled research experiments when hyperthermia is an outcome variable or a safety criterion. Body temperature also may be measured when athletes participate in indoor sports (eg, basketball, team handball, squash, wrestling, racquetball). Similarly, many athletes use indoor practice facilities during the off-season (eg, strength and conditioning sessions, soccer, football). Accurate measurement of body temperature indoors is particularly important when exercise is intense or the facility is not air conditioned. Some authorities have speculated that inaccurate measures from temperature devices used during outdoor exercise are due to the influences of cloud cover, wind, rain, and direct sunlight on the device and the person.1 Compared with the outdoor setting, these same temperature devices may not be as adversely affected in a controlled, consistent indoor environment and, thus, may be more valid. However, they still may not provide valid estimates of core body temperature.

Although rectal temperature (RCT) is the preferred and recommended method of the National Athletic Trainers' Association (NATA) for assessing core body temperature,2 athletic trainers use a variety of body sites and devices to measure temperature.3 Common alternatives include oral, aural (tympanic), temporal, and axillary temperature. The validity of these sites in estimating core body temperature has been examined in toddlers,4–6 hospital patients,7,8 and participants exercising outdoors.1 However, many of these authors did not validate measurement sites when hyperthermia was exercise induced. Therefore, it is difficult to extrapolate these findings to healthy individuals exercising indoors.

As recommended by the NATA's position statement on exertional heat illnesses,2 RCT is used as the reference standard because of its validity9 and practicality in most settings. The evidence indicates that RCT is valid and reliable for individuals at rest and while exercising8–12 and is considered the “gold standard” for temperature measurement in hyperthermic athletes.13–15 Despite a reported lag in response time (eg, versus esophageal or pulmonary artery temperature), RCT provides a valid and reliable core temperature measurement in the field for diagnosis and treatment of exertional heat stroke.13,16–21 It is also an acceptable means of assessing core body temperature in an indoor setting because of its ease of use, low associated cost, and validity.22

Our purpose was to determine the validity of temperature devices commonly used by sports medicine staff and researchers during indoor exercise in the heat. Indoor exercise was performed in controlled laboratory conditions (ie, an environmental chamber). Based on our previous findings,1 we expected that devices directly and immediately influenced by changes in skin temperature, the presence of sweat, the ingestion of fluids, or another external influence would not provide the degree of validity and reliability considered acceptable for medical diagnosis or research participant safety monitoring. Specifically, we hypothesized that of the measurements we studied, intestinal temperature, measured via a telemetric pill, would be the only assessment providing a valid estimate of RCT.

Methods

After signing an informed consent form approved by the institutional review board, which also approved the study, participants completed a self-administered medical history questionnaire and were excluded if contraindications for exercise in the heat were present. All volunteers were physically active (at least 2 workouts or 4 hours of exercise per week). Before testing, we recorded sex, age, height, mass (model BWB-800A; Tanita Corp, Tokyo, Japan), and skinfold thickness at 3 sites to estimate percentage of body fat.23–25

In order to aid in euhydration, participants were instructed to drink 473 mL (16 oz) of fluid both the night before testing and the morning before arrival at the laboratory. We contacted them the night before testing to ensure compliance. Three hours before their scheduled arrival at the laboratory, each volunteer ingested an intestinal thermistor (CorTemp; HQ Inc, Palmetto, FL).

Measurement Sites and Devices

Temperature measurements were taken by a team of 4 trained researchers: one responsible for oral measurements, one for axillary measures, one for aural (ie, ear canal) and temporal measures, and one for recording intestinal (INT), RCT, and forehead sticker (FST) readings. To maintain consistency and accuracy of placement for temporal, aural, and axillary measures, the same researchers measured these sites for each participant.

Rectal temperature was measured with a rectal thermistor (model 401; Yellow Springs Instruments, Yellow Springs, OH) inserted 10 cm past the anal sphincter. Rectal probe calibration was verified by comparing each probe with a certified glass thermometer in water baths of various temperatures (24.5 to 41°C [76.1 to 105.8°F]) and measuring a mean difference of −0.12 ± 0.12°C (−0.22 ± 0.22°F). The forehead was cleaned with rubbing alcohol and dried, and a sticker (Sportstemp, Greenwood Village, CO) was affixed vertically in the middle of the forehead above the left eyebrow for continuous measurement.

Temporal artery temperature was assessed via a temporal artery scanner (model 2000C; Exergen Corp, Watertown, MA) using the method described in the instruction manual (TEMINST: with no visible sweat, drag along the skin from forehead to hairline; with sweat, hold behind the ear just anterior to the mastoid process) and a modified method observed at local road races (TEMMOD, swipe from forehead to hairline and then around the back edge of the ear ending at just anterior to the mastoid process). Oral temperature was measured using an expensive (SureTemp model 679; WelchAllyn Inc, Skaneateles, NY; ORLE) and an inexpensive (model VT-801BWT; Walgreens Co, Deerfield, IL; ORLIE) digital thermometer; both were used according to the instruction manuals (tip placed below the tongue, toward the back of the mouth). Aural temperature was measured using a “tympanic” ear thermometer (Thermoscan ExacTemp IRT 4520; Braun, Boston, MA; AUR) according to the instruction manual. Axillary temperature was measured by expensive (DataTherm model 00703; RG Medical Diagnostics, Southfield, MI; AXLE) and inexpensive temperature devices (model VT-801BWT; Walgreens Co; AXLIE) placed high into the central axillary region, with the volunteer's arm adducted after being wiped free of sweat. Before data analysis, we adjusted AXLE measures following the procedure described in the instruction manual to estimate rectal temperature: add 1°C [1.8°F]. Thermal sensation was evaluated at each time point using a visual scale (0  =  unbearably cold, 8  =  unbearably hot) adapted from Toner et al.26 Participants, while observing the scale, were asked, “How hot or cold do you feel right now?”

In an attempt to examine if cost differences among devices were responsible for various validity and reliability values, an “inexpensive” and an “expensive” model were used for oral and axillary temperature measurements. Because of patent laws and trade secrets, we do not know the specific technologic differences between an inexpensive and an expensive model. However, we hypothesized that the more expensive device had technologic advances that made it more valid and reliable than its inexpensive counterpart.

Measurements at each site were started at the same time by each researcher. For each researcher, the order of measurement was as follows: Researcher A: RCT, INT, FST; Researcher B: ORLE, ORLIE; Researcher C: AUR (twice), TEMINST (twice), TEMMOD (twice); Researcher D: AXLE in left axilla and AXLIE in right axilla.

Measurements took less than 2 minutes except for those with the inexpensive thermometers (ORLIE, AXLIE). These devices took up to 5 minutes to stabilize and provide readings. At the end of all measures, researcher A repeated his or her measures to assess any change in temperature by these devices, and the continuous reading of AXLE was recorded.

Protocol

Upon arrival, participants inserted a rectal thermistor 10 cm past the anal sphincter, and we applied a forehead sticker to each person. Volunteers were evaluated with 9 different temperature devices after 20 minutes of standing in a cool environment (minute −20; temperature  =  approximately 23.3°C [74.0°F], relative humidity  =  40%) and after 20 minutes of standing in an environmental chamber (minute 0; temperature  =  36.4 ± 1.2°C [97.5 ± 2.16°F], relative humidity  =  52%). Participants then walked on a treadmill (model TR-9100; Life Stride, Franklin Park, IL) at 5.8 to 6.8 km/h (3.6 to 4.2 mph) at a 5% grade with no fan directed on them. Every 30 minutes, during 90 minutes of exercise, temperature measures were taken during a 5-minute standing break (minutes 30, 60, and 90). Just before temperature measurements, volunteers were moved off the treadmill and away from direct air movement from the fan. After 90 minutes of exercise, they exited the chamber and stood or sat in a cool environment (temperature  =  approximately 23.3°C [74.0°F]) for an additional 60 minutes; measurements were repeated every 20 minutes (minutes 110, 130, and 150). Participants were allowed to drink water ad libitum throughout the testing protocol except in the 5 minutes before temperature measurement, when no fluid ingestion was allowed.
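The paired Celsius and Fahrenheit values reported throughout follow two different conversions: an absolute temperature is scaled and shifted by 32, whereas a difference (an SD, a mean bias, or a ± range) is only scaled. The following is a minimal sketch of that arithmetic, checked against the chamber conditions stated above; the function names are illustrative and are not part of the study.

```python
def c_to_f(temp_c: float) -> float:
    """Convert an absolute temperature from Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def delta_c_to_f(delta_c: float) -> float:
    """Convert a temperature difference (SD, bias, +/- range).

    A difference is scaled by 9/5 only; the 32-degree offset cancels out.
    """
    return delta_c * 9.0 / 5.0


# Chamber conditions reported in the article: 36.4 +/- 1.2 degrees C
chamber_mean_f = c_to_f(36.4)        # absolute temperature
chamber_sd_f = delta_c_to_f(1.2)     # SD is a difference, so scale only

print(round(chamber_mean_f, 1), round(chamber_sd_f, 2))
```

The same scale-only rule applies to every mean bias reported in both units.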

Statistical Analyses

Temperature Device Validity

Values used for RCT were an average of the RCTs at the beginning and end of the 5-minute temperature-measuring time period. Measures from other devices that were taken twice in a given time period (INT, FST, TEMINST, TEMMOD, AUR) were also averaged when comparing values with RCT.

A 2-way (temperature device × time) repeated-measures analysis of variance was conducted to test the significance of mean differences in devices over time. To evaluate differences in a given device versus RCT, follow-up repeated-measures t tests with the Bonferroni α correction were used. Greenhouse-Geisser corrections were made when the assumption of sphericity was violated.
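As a rough illustration of the Bonferroni step, the sketch below divides the familywise α by the number of follow-up comparisons and flags only P values below that threshold. The article does not state how many comparisons were included in each family, so the count here is simply the number of P values supplied; the device labels and P values are placeholders, not study data.

```python
ALPHA = 0.05  # familywise alpha level stated in the article


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


def significant_after_bonferroni(p_values: dict, alpha: float = ALPHA) -> dict:
    """Flag each comparison whose P value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}


# Placeholder P values for three hypothetical device-versus-RCT comparisons
example_p = {"device_a": 0.001, "device_b": 0.020, "device_c": 0.400}
flags = significant_after_bonferroni(example_p)
print(flags)
```

Some statistical packages instead report adjusted P values (each P multiplied by the number of comparisons and capped at 1), which leads to the same decisions when compared against the uncorrected α.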

Validity of each device versus RCT was evaluated with a range of measurement error statistics.27,28 Mean bias and limits of agreement were calculated as described by Bland and Altman.29 Briefly, limits of agreement were calculated by multiplying the SD of the differences between temperature device and RCT by 1.96 (approximately 2 SDs).29 The difference between the measurements with the temperature device and RCT, with a 95% probability, is expected to lie within the limits of agreement.27 Intraclass correlation coefficients, standard error of the mean, and coefficient of variation were calculated as outlined by Atkinson and Nevill.27 Pearson correlation coefficients (r), corrected for repeated measures,30 were calculated to evaluate relative agreement of devices. Although limits of agreement, intraclass correlation coefficient, standard error of the mean, coefficient of variation, and r provide insight regarding the validity of a device, we determined that, for practicality and accuracy when used by an athletic trainer to assess degree of hyperthermia, a mean bias for a given device of more than ±0.27°C (±0.50°F) from RCT would bring that device's validity into question.1
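The mean bias, limits of agreement, and ±0.27°C criterion described above can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: differences are taken as device minus RCT (which matches the sign of the reported values, negative when a device reads lower than RCT), the sample SD is used, and the readings are invented for demonstration.

```python
from statistics import mean, stdev

VALIDITY_CUTOFF_C = 0.27  # criterion stated in the article (degrees C)


def bland_altman(device, reference):
    """Return (mean bias, lower limit, upper limit) of agreement.

    Assumption: each difference is device minus reference, and the limits
    are bias +/- 1.96 x the sample SD of those differences.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def within_cutoff(bias, cutoff=VALIDITY_CUTOFF_C):
    """True when the absolute mean bias does not exceed the cutoff."""
    return abs(bias) <= cutoff


# Invented readings in degrees C (not study data)
rct = [37.0, 37.5, 38.0, 38.5, 39.0]
device = [36.4, 36.8, 37.1, 37.9, 38.0]

bias, lower, upper = bland_altman(device, rct)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), "
      f"within cutoff={within_cutoff(bias)}")
```

A device can show narrow limits of agreement and still fail the criterion if its mean bias is consistently offset from RCT, which is why the two quantities are reported separately.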

Temperature Device Reliability

Device measures that were taken twice during a given time period (INT, FST, TEMINST, TEMMOD, AUR) were evaluated for intradevice reliability. The 1st and 2nd measurements of the same device at a given time point were compared. We calculated measurement error statistics similar to those used to assess temperature validity (ie, mean difference, limits of agreement, intraclass correlation coefficient, standard error of the mean, coefficient of variation, and r). All statistical tests were performed with SPSS (version 10 for Windows; SPSS Inc, Chicago, IL) with α set at .05.
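A minimal sketch of two of these intradevice statistics, the mean difference between the 1st and 2nd readings and their correlation, is given below. It uses a plain Pearson r rather than the repeated-measures correction applied in the study, and the paired readings are invented for illustration.

```python
from math import sqrt
from statistics import mean


def mean_difference(first, second):
    """Mean of (second reading minus first reading) across paired measurements."""
    return mean([b - a for a, b in zip(first, second)])


def pearson_r(x, y):
    """Plain Pearson correlation (no repeated-measures correction)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented 1st and 2nd readings from one device (degrees C, not study data)
first_reading = [37.1, 37.6, 38.2, 38.7]
second_reading = [37.2, 37.6, 38.1, 38.8]

md = mean_difference(first_reading, second_reading)
r = pearson_r(first_reading, second_reading)
print(f"mean difference={md:.3f}, r={r:.3f}")
```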

Results

Fifteen males and 10 females (mean age  =  26.5 ± 5.3 years, height  =  174.3 ± 11.1 cm, mass  =  72.73 ± 15.95 kg, body composition  =  16.2 ± 5.5% body fat) participated in this study. Between 60 and 90 minutes of exercise, 3 volunteers discontinued exercise due to one or a combination of the following termination criteria: RCT greater than 40.00°C, voluntary exhaustion, and observed or self-reported central nervous system dysfunction. Therefore, data at minute 90 are provided for 22 participants.

The interaction of time and temperature device was significant (F7,63  =  24.80, P < .001; Figure 1). From the beginning to the end of each 5-minute temperature measuring period, RCT did not change (averages  =  37.94 ± 0.73°C and 37.93 ± 0.73°C, respectively; F1,193  =  0.21, P  =  .649).

Figure 1. Mean ± SD of each temperature device over time compared with rectal temperature (RCT). ORLE indicates oral temperature with expensive thermometer; ORLIE, oral temperature with inexpensive thermometer; AXLE, axillary temperature with expensive thermometer; AXLIE, axillary temperature with inexpensive thermometer; INT, intestinal temperature; AUR, aural temperature; TEMINST, temporal temperature measured with the method described by the instructional manual; TEMMOD, temporal temperature measured in a modified method; FST, forehead sticker temperature. (See text for further descriptions.) a Indicates difference from RCT at the same time point (P < .05).

Citation: Journal of Athletic Training 44, 2; 10.4085/1062-6050-44.2.124

Temperature at Rest

In ambient conditions, differences among devices were evident before exercise began (F9,216  =  30.48, P < .001). The INT, FST, ORLE, ORLIE, AUR, AXLE, and AXLIE were all different from RCT (P < .001). The TEMINST and TEMMOD were the only devices providing measurements that did not differ from RCT (P  =  .148 and .181, respectively). After 20 minutes in the heat (minute 0), FST (P < .001), TEMINST (P  =  .006), TEMMOD (P < .001), ORLIE (P < .001), AXLE (P  =  .004), AUR (P  =  .040), and AXLIE (P < .001) were different from RCT. The INT and ORLE did not differ from RCT (P  =  .414 and 1.000, respectively).

Temperature During Exercise

At each exercise time point (minutes 30, 60, and 90), TEMINST, TEMMOD, ORLE, ORLIE, AXLE, AXLIE, and AUR were all lower than RCT (P < .001–.011). The FST was greater than RCT at all exercise time points (P < .001). The INT and RCT were not different at 30, 60, and 90 minutes (P  =  1.00).

Temperature Postexercise

At each postexercise time point (minutes 110, 130, and 150), TEMINST, TEMMOD, ORLE, ORLIE, AUR, AXLE, and AXLIE were all lower than RCT (P < .001). The FST was less than RCT at minutes 110 and 130 (P < .001) but not at minute 150 (P  =  .543). The INT and RCT were not different at 110, 130, and 150 min (P  =  .607, 1.00, and 1.00, respectively).

Temperature Device Validity

Mean bias, r, limits of agreement, intraclass correlation coefficient, standard error of the mean, and coefficient of variation are presented in Table 1. Mean bias and limits of agreement are represented graphically with Bland-Altman plots in Figure 2.

Figure 2. Bland-Altman plots indicating the mean bias (bold dashed line) and limits of agreement (dashed lines) for each temperature device compared with RCT. ORLE indicates oral temperature with expensive thermometer; ORLIE, oral temperature with inexpensive thermometer; AXLE, axillary temperature with expensive thermometer; AXLIE, axillary temperature with inexpensive thermometer; INT, intestinal temperature; AUR, aural temperature; TEMINST, temporal temperature measured with the method described by the instructional manual; TEMMOD, temporal temperature measured in a modified method; FST, forehead sticker temperature. (See text for further descriptions.)

Table 1 Measures of Validity Using Rectal Temperature as the Criterion Standard

Temperature Device Reliability

Mean difference between measures, r, limits of agreement, intraclass correlation coefficient, standard error of the mean, and coefficient of variation for INT, FST, TEMINST, TEMMOD, and AUR are presented in Table 2.

Table 2 Reliability of Temperature Devices That Were Taken Twice in Each Measurement Time Period

Discussion

Our purpose was to evaluate the reliability and validity of selected body temperature-measuring devices commonly used to estimate internal body temperature. The recommended method for the evaluation of internal body temperature in the diagnosis of exertional heat stroke in a field setting is RCT.2,15 Previous research1 from our laboratory indicated that many commonly used temperature devices are invalid in an outdoor setting. We examined if controlling environmental factors (eg, sun, wind) would decrease the variability and, hence, increase the performance of these devices. Our stated hypotheses were supported in that, when compared with RCT, only the ingestible temperature device provided a viable means of measuring internal body temperature in individuals who become hyperthermic (average  =  38.80 ± 0.72°C [101.84 ± 1.30°F]) during exercise in a controlled, indoor environment.

Oral Temperature

At rest, in mild air conditions (minute −20), ORLE and ORLIE were both lower than RCT. The mean differences between ORLE and ORLIE and RCT at rest were −0.58°C and −0.85°C, respectively. Underestimation of core temperature by oral device has been reported in the range of 0.12°C to 0.53°C.8,31 After the participant stood for 20 minutes in a hot environment (36.6°C [97.9°F]), RCT did not increase (P  =  .096), but ORLE and ORLIE increased 0.78°C (P  =  .006) and 0.75°C (P  =  .019), respectively. These changes decreased the differences between ORLE or ORLIE and RCT (Figure 1). Because RCT did not change when the participant moved to a warmer environment and ORLE and ORLIE did, our data supported the hypothesis that, in this setting, oral temperature was influenced by ambient temperature.

During and after exercise, when RCT was greater than resting values, ORLE and ORLIE were lower than RCT at every time point measured. Of the devices used in this study, at peak RCT (minute 90: 38.80 ± 0.72°C [101.84 ± 1.30°F]), ORLE and ORLIE had the largest mean differences from RCT (−1.25 ± 0.72°C [−2.25 ± 1.30°F] and −1.35 ± 0.45°C [−2.43 ± 0.81°F], respectively). Others31,32 have reported differences between oral temperature and RCT during exercise of −0.55°C to −0.33°C. Differences may be attributed to the low peak RCT reached in these studies (37.57°C–37.64°C).31,32 Mairiaux et al31 reported that the difference between RCT and oral temperatures increased as RCT increased.

Compared with oral temperature measurement during outdoor exercise,1 ORLE and ORLIE measured indoors more closely agreed with RCT. For example, Casa et al1 reported mean differences of −1.20°C (−2.17°F) and −1.67°C (−3.00°F) between the same 2 devices used in this study (ORLE and ORLIE) and RCT. Despite the closer agreement in the present study (Table 1), ORLE and ORLIE were still considered invalid for estimating RCT, using the cutoff of ±0.27°C (±0.50°F). A change in mean difference between these devices and RCT in a variety of settings further supports the hypothesis that oral temperature is influenced by environmental factors and is not a valid measurement for estimating core body temperature. Also in the present study, ORLE measured slightly higher than ORLIE (0.27°C [0.49°F]), but both devices were affected by environmental changes (−20 to 0 minutes) (Figure 1) and provided consistently lower measurements than RCT. Therefore, we conclude that oral temperature is an invalid measurement for estimating RCT in exercising individuals.

Axillary Temperature

Although some authors11 have observed a strong correlation between axillary temperature and RCT in resting individuals, correlation does not imply validity.33 Indeed, we found moderate correlations between AXLIE and AXLE and RCT (r  =  0.77 and 0.60, respectively), but the mean bias for both devices was greater than the cutoff (−1.25°C [−2.24°F] and −0.94°C [−1.70°F] for AXLIE and AXLE, respectively). Kistemaker et al34 also observed that axillary temperature with indoor exercise failed to increase to the same degree as RCT.
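The distinction between correlation and validity can be made concrete with a small constructed example. In the sketch below, a hypothetical device reads exactly 1.0°C below the reference at every measurement: the plain Pearson r is essentially 1, yet the mean bias far exceeds the ±0.27°C cutoff. The readings and function name are invented for illustration and are not study data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented reference readings (degrees C) and a device offset by -1.0 degrees C
reference = [37.0, 37.5, 38.0, 38.5, 39.0]
offset_device = [t - 1.0 for t in reference]

r = pearson_r(offset_device, reference)
bias = mean([d - t for d, t in zip(offset_device, reference)])

print(f"r={r:.2f}, mean bias={bias:.2f}")
```

Because r is unchanged by a constant offset, it cannot reveal a systematic under- or overestimation; the mean bias is needed for that.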

We1 have previously shown that axillary temperature during outdoor exercise is invalid. It appears that the mean bias of axillary versus RCT is greater outdoors (−2.07°C [−3.73°F] and −2.58°C [−4.65°F] for AXLIE and AXLE, respectively) than indoors in the present study (Table 1).1 This finding may reflect the influence of uncontrollable environmental factors present outdoors (eg, wind, cloud cover). Axillary temperature measured by a standard temperature probe placed on the skin over the axillary artery often reflects skin temperature and not core temperature.8 Therefore, the higher temperature in this study (36.4°C [97.5°F]) versus outdoor exercise (29.7°C [85.5°F])1 may be a result of higher skin temperature and, thus, closer agreement between axillary temperature and RCT. This suggestion is supported by the increased mean difference between RCT and axillary temperature postexercise when participants were moved to a cooler environment (∼23.3°C [74.0°F]; Figure 1).

Axillary temperature was lower than RCT at every time point during and after exercise in the heat. The mean biases between axillary (AXLIE and AXLE) temperature and RCT were greater than for the other devices tested. Although axillary temperature measured with a more expensive device (AXLE) improved the agreement with RCT, both axillary devices (AXLIE and AXLE) were considered invalid. Therefore, we recommend that axillary temperature not be used to estimate internal body temperature in exercising individuals.

Intestinal Temperature

In numerous studies,1,22,28,35 INT, measured via a telemetric pill, has been shown to be a valid estimate of core body temperature during exercise. We found that INT versus other measures had the smallest bias, limits of agreement, standard error of the mean, and coefficient of variation; it also had the highest correlation (r) and intraclass correlation coefficient (Table 1). The mean bias between RCT and INT (−0.02°C [−0.03°F]) was lower than in other studies of indoor exercise (−0.15°C to 0.26°C).22,28 Differences may be due to different environmental temperatures, exercise protocols, and degrees of hyperthermia.

Telemetric pills have been used in outdoor settings to estimate core body temperature in professional athletes during practice36 and competition.37 According to the results of an outdoor validity study conducted by our laboratory,1 INT was a valid estimate of internal body temperature. Mean bias was higher outdoors (−0.19°C [−0.34°F])1 than indoors (−0.02°C [−0.03°F]), but these levels of bias are similar to those found in other studies22,28 and are still considered valid. The small bias we found may be due to the steady-state walking exercise.

The interday reliability of INT is very high (mean difference  =  0.01°C).28 Similarly, when repeated measures are taken within a temperature assessment time period, the reliability of INT is very high (mean difference  =  0.02°C).38 Our finding of a low mean difference (0.01°C [0.02°F]) between readings supports the reliability of telemetric pills.

Byrne and Lim,39 after systematically reviewing studies comparing RCT, esophageal temperature, and INT, concluded that INT is a valid and reliable estimate of core body temperature in a variety of settings, including outdoor exercise. Our data similarly show that INT was reliable and valid when assessing hyperthermia in individuals exercising indoors. Therefore, if circumstances allow for the ingestion of a telemetric pill at least 2 to 4 hours before exercise and retention of that pill during exercise, INT will be a valid alternative for measuring core body temperature.

Aural (tympanic) Temperature

Although a direct measure of tympanic temperature (temperature probe touching the tympanic membrane) may be a valid estimation of core body temperature,40 commercial devices attempt to estimate tympanic temperature by measuring infrared radiation from the tympanic membrane and then calculating temperature with a derived algorithm. The infrared radiation arises from the ear canal,40 so these devices are measuring AUR, but this method often results in an invalid estimate of core body temperature.8,19,41,42

Observing athletes exercising outdoors in the heat, we1 previously found that AUR had a strong correlation with RCT (r  =  0.70) but a large mean measurement difference (mean bias  =  −1.00°C [−1.80°F]). Similarly, with indoor exercise, the strong correlation (r  =  0.77) does not represent the validity of AUR (mean bias  =  −0.67°C [−1.20°F]). Correlation coefficients do not determine validity.33
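The distinction between correlation and agreement can be illustrated with a minimal Python sketch. The readings below are hypothetical values chosen for illustration, not study data, and the helper names `pearson_r` and `mean_bias` are our own. Bias is taken as device minus rectal temperature, an assumption consistent with the negative values reported for devices that underestimate RCT.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_bias(device, reference):
    """Average of (device - reference); negative means underestimation."""
    return mean(d - r for d, r in zip(device, reference))

# Hypothetical rectal readings in deg C (illustration only, not study data).
rectal = [37.0, 37.4, 37.9, 38.3, 38.8]
# Hypothetical device that tracks rectal temperature exactly but reads 0.67 deg C low.
device = [t - 0.67 for t in rectal]

r = pearson_r(device, rectal)
bias = mean_bias(device, rectal)
print(f"r = {r:.2f}, mean bias = {bias:.2f} deg C")
```

In this constructed case the correlation is perfect, yet every reading is 0.67°C too low, so the device would still exceed the ±0.27°C cutoff.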

In this study, AUR consistently underestimated RCT at every time point (mean bias  =  −0.67°C [−1.20°F]; Figure 1). This level of underestimation is consistent with that in other published studies (mean underestimation  =  0.16°C to 1.07°C).1,8,42,43 Variations in AUR agreement with RCT can be attributed to differences in airflow,42 device used,8 and degree of hyperthermia41 and changes in skin temperature.19 Because all these factors may influence the ability of AUR to accurately estimate core body temperature, this method is not often recommended as a valid measure.1,19

Temporal Temperature

Previous authors examining TEM primarily studied infants at rest.5,6 Temporal temperature in this setting may5 or may not6 be valid. We found that at rest in mild air temperatures (23.3°C [74.0°F]), TEM was not different from RCT (P  =  .148 [TEMINST] and .181 [TEMMOD]). However, at rest and during and after exercise in a hot environment, TEM was different from RCT at each time point (Figure 1). Although the mean bias of TEMINST (−0.87°C [−1.56°F]) improved slightly when using a modified technique (TEMMOD mean bias  =  −0.63°C [−1.13°F]), neither method is a valid estimate of RCT.

Using the same device we used in the present study, Low et al38 recently reported that temporal measurements failed to detect a 0.7°C increase in INT. They found that temporal temperature and INT were inversely related (slope  =  −0.34 ± 0.14°C) and poorly correlated (R²  =  0.29). Other authors1 measuring temporal temperature during outdoor exercise in the heat have observed an inverse relationship between RCT and temporal temperature. We also noted weak correlations between RCT and temporal temperature (r  =  .25 [TEMINST] and .38 [TEMMOD]; Table 1).

When compared with outdoor observations of TEM,1 indoor exercise in a controlled environment lowered the degree of discrepancy between RCT and TEM. Mean bias outdoors was −1.46°C (−2.64°F) and −1.36°C (−2.44°F) for TEMINST and TEMMOD, respectively1; in the present study, mean bias was −0.87°C (−1.56°F) and −0.63°C (−1.13°F) for TEMINST and TEMMOD, respectively. Kistemaker et al34 examined TEM when participants cycled indoors for 30 minutes at 30°C. Temporal temperature underestimated internal body temperature at rest but overestimated it during exercise. On average, TEM overestimated core temperature by 0.50 ± 0.50°C.34 In the present study, by contrast, RCT was underestimated during exercise (mean bias  =  −0.87°C [TEMINST] and −0.63°C [TEMMOD]). The inability of TEM to accurately estimate RCT may be due to inconsistent blood flow in the superficial temporal artery,34 different exercise durations, or different amounts of sweat present. Regardless, our data indicate that the measurement of TEM is an invalid estimate of RCT in exercising athletes.

Device reliability to measure TEM was assessed by repeating each measure during each temperature measuring period. The mean differences among subsequent readings were 0.04 ± 0.33°C (0.10 ± 0.50°F) and 0.09 ± 0.23°C (0.16 ± 0.42°F) for TEMINST and TEMMOD, respectively (Table 2). Others38 have measured greater variability among subsequent readings (0.15 ± 0.01°C). Therefore, it appears that the device used to measure TEM in this study was reliable but not valid. Because a valid measure of RCT is the main objective when assessing an exercising individual for exertional heat stroke, we do not recommend TEM as an accurate estimate of RCT.
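As a rough illustration of this kind of reliability summary, the following Python sketch computes the mean ± SD of differences between paired repeat readings. The readings are hypothetical, the function name `repeat_reading_difference` is ours, and the use of signed (second minus first) differences is an assumption; the exact computation used in the study is not restated here.

```python
from statistics import mean, stdev

def repeat_reading_difference(first, second):
    """Mean and sample SD of signed differences (second - first)."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs), stdev(diffs)

# Hypothetical paired repeat readings in deg C (illustration only).
first = [36.1, 36.4, 36.0, 36.6]
second = [36.2, 36.3, 36.1, 36.7]

mean_diff, sd_diff = repeat_reading_difference(first, second)
print(f"mean difference = {mean_diff:.2f} +/- {sd_diff:.2f} deg C")
```

A small mean difference of this kind describes repeatability only; it says nothing about how far the readings sit from RCT.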

Liquid Crystal Forehead Strip

In resting individuals, forehead temperature measured with a liquid crystal forehead strip underestimated4,7 or overestimated44 internal body temperature. With individuals exercising outdoors, we1 previously found that in the shade (ie, absence of solar radiation), FST underestimated core body temperature, but on the field (ie, with solar radiation), FST overestimated RCT. This led to the conclusion that FST is influenced by environmental conditions and, thus, not recommended as a valid estimate of RCT.

With environmental conditions controlled indoors, we found that FST overestimated RCT during rest and exercise in the heat (mean difference  =  1.28 ± 0.48°C [2.30 ± 0.86°F]). However, when before- and after-exercise measurements were taken in cooler ambient temperatures, FST underestimated RCT (mean difference  =  −0.68 ± 0.71°C [−1.22 ± 1.28°F]; Figure 1). Because skin temperature often correlates with ambient conditions and because FST is measured in various ambient temperatures, FST is likely more influenced by changes in skin temperature than RCT.

When subsequent visual readings are acquired by the same person, FST shows strong reliability (r  =  0.97) with small differences between readings (mean difference  =  0.03 ± 0.35°C [0.06 ± 0.64°F]; Table 2). We do not know if the reliability is high between temperature assessors (interrater reliability).

Despite the influence of ambient temperature, mean bias of FST was 0.29°C (0.59°F) (Table 1). Although this bias was just beyond the designated cutoff for being considered a valid estimate of RCT, it should be noted that the bias depended on ambient temperature. Thus, FST is an invalid device to estimate internal body temperature in exercising individuals.
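The way opposing biases can offset each other in an overall mean can be sketched in Python. The two condition-specific values are those reported above; the equal weighting of the two environments is our assumption for illustration, because the number of readings contributing to each value is not restated here. The function name `pooled_bias` is hypothetical.

```python
# Reported mean differences between FST and RCT (deg C).
hot_bias_c = 1.28    # rest and exercise in the heat
cool_bias_c = -0.68  # before and after exercise in the cooler room

def pooled_bias(bias_a, n_a, bias_b, n_b):
    """Weighted average of two condition-specific biases."""
    return (bias_a * n_a + bias_b * n_b) / (n_a + n_b)

# Assumption: the two environments contribute equally to the overall mean.
overall = pooled_bias(hot_bias_c, 1, cool_bias_c, 1)
print(f"pooled bias = {overall:.2f} deg C")
```

Under that assumption the pooled value is about 0.30°C, close to the reported overall bias of 0.29°C, even though the bias within either environment is more than twice the validity cutoff.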

Thermal Sensation

In order to evaluate participants' thermal sensations, we used a visual scale adapted from Toner et al.26 Ratings of thermal sensation weakly correlated with RCT (r  =  0.42, P < .001). In the cooler environment before and after exercise, thermal sensation was 3.5 ± 0.8, but it was not correlated with RCT (r  =  −0.07, P  =  .497). In the heat before and during exercise, thermal sensation was 5.8 ± 1.0 and was correlated with RCT (r  =  0.59, P < .001). The differences in thermal sensation and RCT correlations are most likely due to rapid changes in skin temperature.26

We1 previously showed that thermal sensation during outdoor exercise was moderately correlated with RCT (r  =  0.72). Therefore, it is evident that the strength of correlation between thermal sensation and RCT depends on the environment. Although thermal sensation may exhibit the same (increasing) trend as RCT, it is not a valid tool for estimating internal body temperature and should be avoided as a sole diagnostic tool for the evaluation of exertional heat stroke.

Conclusions

The purpose of our study was to evaluate the validity and reliability of commonly used devices to estimate core body temperature in hyperthermic individuals exercising indoors. Indoor exercise was performed in controlled laboratory conditions (ie, an environmental chamber). Future investigators should examine the validity and reliability of these temperature devices in other indoor settings and exercise modes.

Rectal temperature was used as the criterion standard because of its previously established validity,8–12 ease of use, and practicability for the athletic trainer.2 We conclude that measuring oral, forehead, aural, temporal, and axillary temperatures using the tested devices provided invalid measurements for the estimation of RCT. Although some devices may possess high reliability, the devices used to measure these sites all had an average mean difference greater than the allowed cutoff for validity (± 0.27°C [± 0.50°F]). The INT measured via a telemetric pill was the only measurement considered valid (mean bias  =  −0.02°C [−0.03°F]). Therefore, we recommend that the internal body temperature of exercising individuals be assessed with RCT. A previously ingested and retained intestinal telemetric pill is an acceptable alternative, but RCT must always remain a viable option in the event that measurement with telemetric pills is not possible (eg, malfunction, passed, not ingested, not enough time to pass through stomach).
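The validity criterion can be expressed as a short Python sketch that applies the ±0.27°C cutoff to the mean biases quoted in this discussion. Only those devices are included; oral and axillary values are omitted because they are not restated here. The names `reported_mean_bias_c` and `is_valid` are ours.

```python
VALIDITY_CUTOFF_C = 0.27  # a device is invalid if |mean bias| exceeds this

# Mean bias versus RCT (deg C) as quoted in the discussion above.
reported_mean_bias_c = {
    "INT": -0.02,
    "AUR": -0.67,
    "TEMINST": -0.87,
    "TEMMOD": -0.63,
    "FST": 0.29,
}

def is_valid(bias_c, cutoff_c=VALIDITY_CUTOFF_C):
    """True when the magnitude of the mean bias is within the cutoff."""
    return abs(bias_c) <= cutoff_c

verdicts = {name: is_valid(b) for name, b in reported_mean_bias_c.items()}
valid_devices = [name for name, ok in verdicts.items() if ok]
print(valid_devices)
```

Applied to these values, only INT falls within the cutoff; FST misses it narrowly.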

Previous work in our laboratory1 and our current findings reinforce the concept that temperature devices commonly used by medical professionals provide invalid estimates of core body temperature in indoor and outdoor settings. At this time, INT and RCT assessment are the only valid and reliable measurements for practical means of estimating core body temperature.

Acknowledgments

The authors gratefully acknowledge the researchers and technicians who made this study such a success: Chris Casa, Tutita Casa, Mike D’Alfonso, Christy Eason, Brian Gallagher, Ashleigh Gauvain, Neal Glaviano, Rob Huggins, Camille James, Nick Kalra, Jennifer Klau, Elaine Lee, Stephanie Mazerolle, Melissa Roti, Ian Scruggs, Barry Spiering, Kristin Stroly, Jakob Vingren, Greig Watson, Linda Yamamoto, and Brad Yeargin.

References

1. Casa, D. J., S. M. Becker, M. S. Ganio, et al. Validity of devices that assess body temperature during outdoor exercise in the heat. J Athl Train 2007. 42(3):333–342.
2. Binkley, H. M., J. Beckett, D. J. Casa, D. M. Kleiner, and P. E. Plummer. National Athletic Trainers' Association position statement: Exertional heat illnesses. J Athl Train 2002. 37(3):329–343.
3. Dombek, P. M., D. Casa, S. W. Yeargin, et al. Athletic trainers' knowledge and behavior regarding the prevention, recognition and treatment of exertional heat stroke at the high school level [abstract]. J Athl Train 2006. 41(2 suppl):S47.
4. Kongpanichkul, A. and S. Bunjongpak. A comparative study on accuracy of liquid crystal forehead, digital electronic axillary, infrared tympanic with glass-mercury rectal thermometer in infants and young children. J Med Assoc Thai 2000. 83(9):1068–1076.
5. Greenes, D. S. and G. R. Fleisher. When body temperature changes, does rectal temperature lag? J Pediatr 2004. 144(6):824–826.
6. Greenes, D. S. and G. R. Fleisher. Accuracy of a noninvasive temporal artery thermometer for use in infants. Arch Pediatr Adolesc Med 2001. 155(3):376–381.
7. Allen, G. C., J. C. Horrow, and H. Rosenberg. Does forehead liquid crystal temperature accurately reflect “core” temperature? Can J Anaesth 1990. 37(6):659–662.
8. Jensen, B. N., F. S. Jensen, S. N. Madsen, and K. Lossl. Accuracy of digital tympanic, oral, axillary, and rectal thermometers compared with standard rectal mercury thermometers. Eur J Surg 2000. 166(11):848–851.
9. Lee, S. M., W. J. Williams, and S. M. Fortney Schneider. Core temperature measurement during supine exercise: esophageal, rectal, and intestinal temperatures. Aviat Space Environ Med 2000. 71(9):939–945.
10. Lefrant, J. Y., L. Muller, J. E. de La Coussaye, et al. Temperature measurement in intensive care patients: Comparison of urinary bladder, oesophageal, rectal, axillary, and inguinal methods versus pulmonary artery core method. Intensive Care Med 2003. 29(3):414–418.
11. Chaturvedi, D., K. Y. Vilhekar, P. Chaturvedi, and M. S. Bharambe. Comparison of axillary temperature with rectal or oral temperature and determination of optimum placement time in children. Indian Pediatr 2004. 41(6):600–603.
12. Casa, D. J., L. E. Armstrong, M. S. Ganio, and S. W. Yeargin. Exertional heat stroke in competitive athletes. Curr Sports Med Rep 2005. 4(6):309–317.
13. Roberts, W. O. Assessing core temperature in collapsed athletes: what's the best method? Physician Sportsmed 1994. 22(8):49–55.
14. Moran, D. S. and L. Mendal. Core temperature measurement: methods and current insights. Sports Med 2002. 32(14):879–885.
15. Casa, D. J. and W. O. Roberts. Considerations for the medical staff: Preventing, identifying, and treating exertional heat illnesses. In: Armstrong, L. E. Exertional Heat Illnesses. Champaign, IL: Human Kinetics. 2003. 169–196.
16. Brown, G. A. and G. M. Williams. The effect of head cooling on deep body temperature and thermal comfort in man. Aviat Space Environ Med 1982. 53(6):583–586.
17. Livingstone, S. D., J. Grayson, J. Frim, C. L. Allen, and R. E. Limmer. Effect of cold exposure on various sites of core temperature measurements. J Appl Physiol 1983. 54(4):1025–1031.
18. Zehner, W. J. and T. E. Terndrup. The impact of moderate ambient temperature variance on the relationship between oral, rectal, and tympanic membrane temperatures. Clin Pediatr (Phila) 1991. 30(4 suppl):61–72.
19. Deschamps, A., R. D. Levy, M. G. Cosio, E. B. Marliss, and S. Magder. Tympanic temperature should not be used to assess exercise induced hyperthermia. Clin J Sport Med 1992. 2(1):27–32.
20. Cabanac, M. and M. Caputa. Natural selective cooling of the human brain: evidence of its occurrence and magnitude. J Physiol 1979. 286(1):255–264.
21. Shiraki, K., S. Sagawa, F. Tajima, A. Yokota, M. Hashimoto, and G. L. Brengelmann. Independence of brain and tympanic temperatures in an unanesthetized human. J Appl Physiol 1988. 65(1):482–486.
22. Kolka, M. A., M. D. Quigley, L. A. Blanchard, D. A. Toyota, and L. A. Stephenson. Validation of a temperature telemetry system during moderate and strenuous exercise. J Therm Biol 1993. 18(4):203–210.
23. Jackson, A. S. and M. L. Pollock. Practical assessment of body composition. Physician Sportsmed 1985. 13(5):76–90.
24. Pollock, M. L., D. H. Schmidt, and A. S. Jackson. Measurement of cardiorespiratory fitness and body composition in the clinical setting. Compr Ther 1980. 6(9):12–27.
25. Heyward, V. H. and L. M. Stolarczyk. Applied Body Composition Assessment. Champaign, IL: Human Kinetics. 1996. 12.
26. Toner, M. M., L. L. Drolet, and K. B. Pandolf. Perceptual and physiological responses during exercise in cool and cold water. Percept Mot Skills 1986. 62(1):211–220.
27. Atkinson, G. and A. M. Nevill. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Med 1998. 26(4):217–238.
28. Gant, N., G. Atkinson, and C. Williams. The validity and reliability of intestinal temperature during intermittent running. Med Sci Sports Exerc 2006. 38(11):1926–1931.
29. Bland, J. M. and D. G. Altman. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986. 1(8476):307–310.
30. Bland, J. M. and D. G. Altman. Calculating correlation coefficients with repeated observations, part I: correlation within subjects. BMJ 1995. 310(6977):446.
31. Mairiaux, P., J. C. Sagot, and V. Candas. Oral temperature as an index of core temperature during heat transients. Eur J Appl Physiol Occup Physiol 1983. 50(3):331–341.
32. Edwards, R. J., A. J. Belyavin, and M. H. Harrison. Core temperature measurement in man. Aviat Space Environ Med 1978. 49(11):1289–1294.
33. Bland, J. M. and D. G. Altman. Measuring agreement in method comparison studies. Stat Methods Med Res 1999. 8(2):135–160.
34. Kistemaker, J. A., E. A. Den Hartog, and H. A. Daanen. Reliability of an infrared forehead skin thermometer for core temperature measurements. J Med Eng Technol 2006. 30(4):252–261.
35. Kolka, M. A., L. Levine, and L. A. Stephenson. Use of an ingestible telemetry system to measure core temperature under chemical protective clothing. J Therm Biol 1997. 22(4–5):343–349.
36. Godek, S. F., A. R. Bartolozzi, R. Burkholder, E. Sugarman, and G. Dorshimer. Core temperature and percentage of dehydration in professional football linemen and backs during preseason practices. J Athl Train 2006. 41(1):8–17.
37. Edwards, A. M. and N. A. Clark. Thermoregulatory observations in soccer match play: professional and recreational level applications using an intestinal pill system to measure core temperature. Br J Sports Med 2006. 40(2):133–138.
38. Low, D. A., A. Vu, M. Brown, et al. Temporal thermometry fails to track body core temperature during heat stress. Med Sci Sports Exerc 2007. 39(7):1029–1035.
39. Byrne, C. and C. L. Lim. The ingestible telemetric body core temperature sensor: a review of validity and exercise applications. Br J Sports Med 2007. 41(3):126–133.
40. Sato, K. T., N. L. Kane, G. Soos, C. V. Gisolfi, N. Kondo, and K. Sato. Reexamination of tympanic membrane temperature as a core temperature. J Appl Physiol 1996. 80(4):1233–1239.
41. Armstrong, L. E., C. M. Maresh, A. E. Crago, R. Adams, and W. O. Roberts. Interpretation of aural temperatures during exercise, hyperthermia, and cooling therapy. Med Exerc Nutr Health 1994. 3(1):9–16.
42. Hansen, R. D., W. H. Daley, and B. Leelarthaepin. The effect of facial airflow on the estimation of exercise core temperature by infrared tympanic thermometry. Aust J Sci Med Sport 1993. 25(1):26–31.
43. Roth, R. N., V. P. Verdile, L. J. Grollman, and D. A. Stone. Agreement between rectal and tympanic membrane temperatures in marathon runners. Ann Emerg Med 1996. 28(4):414–417.
44. Patel, N., C. E. Smith, A. C. Pinchak, and J. F. Hagen. Comparison of esophageal, tympanic, and forehead skin temperatures in adult patients. J Clin Anesth 1996. 8(6):462–468.
Copyright: the National Athletic Trainers' Association, Inc
Figure 1

Mean ± SD of each temperature device over time compared with rectal temperature (RCT). ORLE indicates oral temperature with expensive thermometer; ORLIE, oral temperature with inexpensive thermometer; AXLE, axillary temperature with expensive thermometer; AXLIE, axillary temperature with inexpensive thermometer; INT, intestinal temperature; AUR, aural temperature; TEMINST, temporal temperature measured with the method described by the instructional manual; TEMMOD, temporal temperature measured in a modified method; FST, forehead sticker temperature. (See text for further descriptions.) a Indicates difference from RCT at the same time point (P < .05).


Figure 2

Bland-Altman plots indicating the mean bias (bold dashed line) and limits of agreement (dashed lines) for each temperature device compared with RCT. ORLE indicates oral temperature with expensive thermometer; ORLIE, oral temperature with inexpensive thermometer; AXLE, axillary temperature with expensive thermometer; AXLIE, axillary temperature with inexpensive thermometer; INT, intestinal temperature; AUR, aural temperature; TEMINST, temporal temperature measured with the method described by the instructional manual; TEMMOD, temporal temperature measured in a modified method; FST, forehead sticker temperature. (See text for further descriptions.)
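For readers unfamiliar with the quantities plotted, the mean bias and limits of agreement can be sketched in Python as follows. This is a minimal version using the basic formula (bias ± 1.96 SD of the differences) on hypothetical readings; it treats every pair as independent and does not include the adjustment for repeated measurements within participants, so it should not be read as the exact computation behind the figure. The function name `bland_altman` is ours.

```python
from statistics import mean, stdev

def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) for device - reference."""
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings in deg C (illustration only, not study data).
device = [36.5, 36.9, 37.0, 37.6]
rectal = [37.0, 37.5, 37.8, 38.2]

bias, lower, upper = bland_altman(device, rectal)
print(f"bias = {bias:.3f}, limits of agreement = {lower:.3f} to {upper:.3f} deg C")
```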




Contributor Notes

Matthew S. Ganio, MS, contributed to conception and design; acquisition and analysis and interpretation of the data; and drafting, critical revision, and final approval of the article. Christopher M. Brown, MA, ATC, contributed to conception and design, acquisition and analysis and interpretation of the data, and critical revision and final approval of the article. Douglas J. Casa, PhD, ATC, FNATA, FACSM, contributed to conception and design; acquisition and analysis and interpretation of the data; and drafting, critical revision, and final approval of the article. Shannon M. Becker, MA, ATC, contributed to conception and design, acquisition and analysis and interpretation of the data, and critical revision and final approval of the article. Susan W. Yeargin, PhD, ATC; Brendon P. McDermott, MS, ATC; and Lindsay M. Boots, BS, ATC, contributed to conception and design, acquisition of the data, and critical revision and final approval of the article. Paul W. Boyd, BS, ATC, contributed to acquisition of the data and critical revision and final approval of the article. Lawrence E. Armstrong, PhD, FACSM, and Carl M. Maresh, PhD, FACSM, contributed to conception and design, acquisition of the data, and critical revision and final approval of the article.

Address correspondence to Matthew S. Ganio, MS, University of Connecticut, 2095 Hillside Road, U-1110, Storrs, CT 06269. Address e-mail to matthew.ganio@uconn.edu