Instrumented Static and Reactive Balance in Collegiate Athletes: Normative Values and Minimal Detectable Change
Context: Wearable sensors are increasingly popular in concussion research because of their objective quantification of subtle balance deficits. However, normative data and minimal detectable change (MDC) values are necessary to serve as references for diagnostic use and tracking longitudinal recovery.
Objective: To identify normative and MDC values for instrumented static- and reactive-balance tests, an instrumented static mediolateral (ML) root mean square (RMS) sway standing balance assessment and the instrumented, modified push and release (I-mP&R), respectively.
Design: Cross-sectional study.
Setting: Clinical setting.
Patients or Other Participants: Normative static ML RMS sway and I-mP&R data were collected on 377 (n = 184 female) healthy National Collegiate Athletic Association Division I athletes at the beginning of their competitive seasons. Test-retest data were collected in 36 healthy control athletes based on standard recovery timelines after concussion.
Main Outcome Measure(s): Descriptive statistics, intraclass correlation coefficients (ICCs), and MDC values were calculated for primary outcomes of ML RMS sway in a static double-limb stance on firm ground and a foam block, and time to stability and latency from the I-mP&R in single- and dual-task conditions.
Results: Normative outcomes across static ML RMS sway and I-mP&R were sensitive to sex and type of footwear. Mediolateral RMS sway demonstrated moderate reliability in the firm condition (ICC = 0.73; MDC = 2.7 cm/s2) but poor reliability in the foam condition (ICC = 0.43; MDC = 11.1 cm/s2). Single- and dual-task times to stability from the I-mP&R exhibited good reliability (ICC = 0.84 and 0.80, respectively; MDC = 0.25 and 0.29 seconds, respectively). Latency from the I-mP&R had poor to moderate reliability (ICC = 0.38 and 0.55; MDC = 107 and 105 milliseconds).
Conclusions: Sex-matched references should be used for instrumented static- and reactive-balance assessments. Footwear may explain variability in static ML RMS sway and time to stability of the I-mP&R. Moderate-to-good reliability suggests time to stability from the I-mP&R and ML RMS static sway on firm ground can be used for longitudinal assessments.
Wearable inertial sensors have recently become a practical addition to clinical balance assessments because of their ability to objectively quantify static, dynamic, and reactive balance.1–5 These measures are derived from triaxial acceleration and angular velocity data obtained during standard clinical tests, and they can increase the sensitivity and clinical utility of common assessments. For example, instrumenting the Balance Error Scoring System (BESS) with inertial sensors improved the sensitivity in acute concussion evaluation (63%) compared with using the subjective clinical score alone (35%).6 Objectively measuring these subtle deficits is important; instrumented measures of dynamic and reactive balance are associated with the risk of future musculoskeletal injuries in collegiate athletes.7,8 This growing evidence suggests that wearable sensors can improve the utility of clinical balance assessments by providing sensitive, objective information to inform the diagnosis and care for athletes. Although much of the focus of instrumented clinical balance assessments is on identifying deficits postconcussion, instrumenting baseline assessments of concussion allows for comparisons of instrumented balance performance preinjury to postinjury or from injured to normative data on healthy athletes.
When these clinical balance assessments are used in concussion recovery, baseline measurements allow comparisons with postinjury values specific to the athlete, indicating a within-athletes effect of injury. However, such baseline comparisons are not always possible, as obtaining baseline data can significantly burden sports medicine staff and athletes.9,10 Concussions might occur before completion of baseline testing, and although comparisons with baseline values provide more individualized results, comparisons with normative reference data have been a viable alternative for simple reaction time and for postural control during static and dynamic balance when baseline measures are unavailable.11–13 Notably, few results have been published regarding normative values for reactive balance. The push and release is an assessment of reactive balance that has previously been used to identify balance deficits and postural control in fall-risk populations.14 However, the instrumented, modified push and release (I-mP&R) adds trials in the forward, left, and right directions and a cognitive dual-task condition, and all trials are completed with eyes closed to increase the difficulty of a balance assessment for elite athletes. Further, the I-mP&R demonstrates the ability to complement current baseline assessments of concussion in athletes.15 Thus, there is a need to identify normative values of reactive balance to further the utility and inclusion of reactive balance in concussion assessment. In regard to static and dynamic balance, previous cross-sectional and longitudinal studies of instrumented balance tests often contained small or limited samples of athletes.6,16,17 Further, it may be necessary to establish athlete-specific reference values, as prior work has suggested athletes need to be compared with their athletic peers as opposed to nonathletic controls.
Although Parrington et al provided normative reference values for static balance using an instrumented BESS protocol, their sample contained only 82 healthy normative athletes, which did not allow for normative data to be stratified by sex, sport, or other important demographic characteristics that have been shown to influence balance recovery, such as height.18,19 Consequently, reference values for instrumented balance in normal, healthy athletes across various demographic and sport contexts are necessary to enable comparisons of individual athletes to their normative peers.
Further, the test-retest reliability and minimal detectable change (MDC) remain unclear for common instrumented static- and reactive-balance assessments. The MDC is an essential clinical measurement representing the change in an outcome measure that falls outside of measurement error.20 This MDC enables clinicians to identify whether a change in balance exceeds the normal variability in balance between assessments. For example, the MDC for the BESS is 7.3 to 9.4 errors, which is significantly more than the typical postconcussion change of 5 errors.21,22 Consequently, the clinical scoring of the BESS has little to no clinical utility in assessing changes in errors over time during serial concussion assessment. Establishing the test-retest and MDC for instrumented assessments of balance is necessary to provide valuable information to clinicians regarding how changes across serial assessments should be interpreted.
The purpose of this study was 2-fold: (1) to determine normative reference values for instrumented static mediolateral (ML) root mean square (RMS) sway balance and the I-mP&R in healthy National Collegiate Athletic Association (NCAA) Division I athletes and (2) to determine test-retest reliability and MDC values over a common injury recovery timeline. Specifically, we aimed to identify the test-retest reliability and MDC in both static ML RMS sway and I-mP&R during the timeline associated with the acute-to-asymptomatic time frame after concussion.
METHODS
Before participation, all athletes provided informed written consent, and all protocols were approved by the local institutional review board. Inclusion criteria for all participants were being a current NCAA Division I athlete and over 18 years old. Exclusion criteria for all participants were lower extremity surgery within the last 2 years, any planned upcoming lower extremity surgery that would cause the athlete to miss a significant amount of practice or competition, or concussion (self-reported) within the past year.
Participants—Normative Data
Baseline testing was completed on 377 (n = 184 female) healthy collegiate athletes (Table 1) who completed static ML RMS sway and reactive-balance testing (I-mP&R) before their respective competitive seasons. All procedures were conducted in an applied athletic training room setting; therefore, participants’ footwear included sneakers, socks, or none (barefoot), consistent with the applied nature of balance tests in clinical practice. Athletes with flip-flops or cleats were asked to perform the assessments in socks or barefoot, and footwear was kept constant across all tasks within each participant.

Participants—Test-Retest Reliability
A separate, but overlapping, cohort of 36 (n = 21 female) healthy collegiate athletes (Table 2) was enrolled in a longitudinal study of balance as controls for a teammate who had experienced a concussion. Participants completed at least 2 balance assessments (not including a baseline balance assessment). The timing of these balance assessments (static ML RMS sway and I-mP&R) was dictated by the recovery of the concussed teammate: acute (<72 hours after the concussion) and after resolution of symptoms. Because of the heterogeneity of recovery, this between-tests interval varied between 1 and 29 days (Table 2) but accurately represented the true intertest interval relevant to clinical practice.

Procedures
All procedures were part of a larger study investigating reactive balance in collegiate athletes.23 A subset of the procedures relevant to this study is briefly described below.
Demographic Data
Demographic data (age, sex, race, ethnicity, height, weight, and body mass index) and additional data on the sport, years of experience in primary sport, lower extremity injury history (within the past 2 years), and concussion history were collected for each participant before completing any balance tests.
Instrumented Methods—ML RMS Sway and I-mP&R
Before any balance testing, participants donned 5 inertial measurement units (IMUs; APDM) as previously reported.23 Sensors were placed on top of the metatarsals of the athlete’s left and right feet, the anterior shank of the right lower leg (about one-third of the way down), the lumbar region of the spine (about L3-L4), and the midpoint of the sternum. For the I-mP&R, the administrator wore an IMU on their right hand to determine the release point.4,8,15
The static ML RMS sway assessment was completed on firm ground and an Airex foam pad surface with feet together in a double-limb stance, following the standard clinical protocol for the BESS.24 Each trial required the participant to have their hands on their hips and their eyes closed. Each trial lasted 30 seconds to obtain reliable instrumented measures.25,26 Based on prior work, we used a double-limb–stance static-balance assessment, and the ML RMS sway was extracted as the primary outcome from Mobility Lab (version 2; APDM, Inc) based on its sensitivity to detecting concussion compared with other instrumented outcome measures from the BESS.6,18,25
The I-mP&R
The I-mP&R was administered as described previously.4,15,23 Briefly, trials were completed in 4 directions (forward, backward, left, and right) in 2 conditions (with and without a simultaneous cognitive dual-task) for a total of 8 trials. The order of single and dual task for each participant was randomized but kept constant to avoid a confounding effect of trial order when the participant completed multiple testing sessions (ie, test-retest participants). During the forward and backward trials, the participants’ foot placement was standardized using a foot plate that was 8 inches (20.32 cm) long, 5.75 inches (14.61 cm) wide on the toe side, and 4 inches (10.16 cm) wide on the heel side4 (this plate was removed before the trial), and in the left and right trials, participants’ feet were together. Participants leaned in a plank-like, straight body position for each direction into an administrator’s hands until the administrator identified an inflection point where the participant’s center of mass was outside their base of support. The participant was then asked to close their eyes, and the administrator released the participant. The participant was instructed to regain their balance and avoid a fall by whatever means necessary, including taking a step or steps, after release. For the dual-task condition, participants were asked to begin the dual task after they closed their eyes but before the administrator released the participant. Four different cognitive dual tasks were randomly assigned to each direction for each participant, including serial subtraction by 3s, ABCs by every other letter, FAS test (the participant is asked to speak as many words as possible in 1 minute that begin with F, A, or S), and animal or fruit recital.15,23 The performance on the cognitive task was not recorded. 
Kinematic data from the IMUs were analyzed through a custom MATLAB script (MathWorks, Inc) to yield outcomes of time to stability and response latency.4,8 Time to stability reflects the time taken to regain balance and is akin to a measure of reactive-balance performance. Time to stability was defined using the time from the release of support to stabilization, where stabilization was defined using thresholded acceleration (a < 1.07 × g) and rotational velocity (ω < 14°/s) with both feet being still (Supplemental Material, available online at https://dx.doi.org/10.4085/1062-6050-0403.23.S1; Morris et al4). Latency reflects the time from release to the first initiation of movement, similar to measurement of reaction time, and was defined using the time from the release of support to the first foot movement. A trial was considered invalid if the data recording ended before the recovery of stability, there was a hardware malfunction, or the participant was moving at the end of the trial; these invalid trials were treated as missing data (more details about the data processing and specific criteria are available in the Supplemental Material, Morris et al4). The maximum latency and median of time to stability across all 4 directions were used as summary metrics based on prior interrater reliability studies.4 Single-task or dual-task conditions with more than one missing direction (ie, ≥50% missing trials) were considered invalid and removed from the analyses.
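The summary step described above can be sketched as follows. This is a minimal illustration and not the authors' MATLAB script: the function names, the use of `None` for invalid trials, and the choice to apply the missing-direction rule separately to each outcome are assumptions. The per-sample check encodes only the two thresholds and the feet-still condition stated in the text; the full stabilization algorithm is in the Supplemental Material and is not reproduced here.

```python
from statistics import median

ACC_LIMIT_G = 1.07      # acceleration threshold from the text, in units of g
GYRO_LIMIT_DPS = 14.0   # rotational-velocity threshold from the text, in deg/s


def is_stable(acc_g, gyro_dps, feet_still):
    """Return True when a single sample meets all three stabilization criteria."""
    return acc_g < ACC_LIMIT_G and gyro_dps < GYRO_LIMIT_DPS and feet_still


def summarize_condition(times_to_stability, latencies):
    """Summarize one condition (single or dual task) across the 4 directions.

    Each argument is a list of 4 values; None marks an invalid (missing) trial.
    Returns (median time to stability, maximum latency). An entry is None when
    more than one direction is missing for that outcome (assumed per-outcome rule).
    """
    def valid(values):
        kept = [v for v in values if v is not None]
        n_missing = len(values) - len(kept)
        return kept if n_missing <= 1 else None

    tts = valid(times_to_stability)
    lat = valid(latencies)
    return (median(tts) if tts else None, max(lat) if lat else None)
```

With one missing direction the condition is still summarized from the remaining three trials; with two or more it is dropped, matching the ≥50% rule.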
Statistical Analysis
Participant demographics and characteristics were summarized using means and standard deviations for continuous variables and using frequencies and percentages for nominal variables. We summarized ML RMS sway, time to stability, and response latency outcomes by participant characteristics using means, standard deviations, and quartiles (minimum, first quartile, median, third quartile, and maximum). As an exploratory analysis, we evaluated differences in mean outcomes by participant groups using analysis of variance and a significance level of 0.05.
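As an illustration of the descriptive summary described above, the sketch below computes the mean, SD, and five-number summary for one outcome. The analyses in this study were run in SAS; this Python version, the function name `describe`, and the `inclusive` quartile method are assumptions chosen for illustration, and quartile values can differ slightly between software packages.

```python
from statistics import mean, stdev, quantiles


def describe(values):
    """Return mean, sample SD, and quartiles (min, Q1, median, Q3, max)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "sd": stdev(values),   # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical ML RMS sway values (cm/s2), not study data
summary = describe([4.0, 5.0, 6.0, 7.0, 8.0])
```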
We assessed test-retest reliability using intraclass correlation coefficients (ICCs). The ICC was calculated using a 2-way mixed-effects, absolute agreement, multiple raters/measurements model.27 The ICC values were interpreted as less than 0.5 = poor, 0.5 to 0.75 = moderate, 0.75 to 0.9 = good, and greater than 0.9 = excellent. The MDC values were defined as $\mathrm{MDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$, where the standard error of measurement (SEM) is $\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}}$, and SD represents the SD of the given outcome variable at the first assessment. All analyses were conducted in SAS (version 9.4; SAS Institute, Inc).
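A minimal sketch of the reliability calculations, assuming the conventional 95% definitions MDC = 1.96 × √2 × SEM and SEM = SD × √(1 − ICC). The function names are hypothetical, the SD in the usage line is an illustrative input rather than a value taken from the study tables, and assigning an ICC of exactly 0.75 to the "good" band is an assumption because the stated bands share that boundary.

```python
import math


def sem(sd, icc):
    """Standard error of measurement from the first-assessment SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at the 95% level (assumed conventional form)."""
    return 1.96 * math.sqrt(2.0) * sem(sd, icc)


def interpret_icc(icc):
    """Map an ICC to the qualitative bands given in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Illustrative input: SD = 0.22 s with ICC = 0.84 gives an MDC near 0.24 s
example_mdc = mdc95(0.22, 0.84)
```

Because the MDC scales with √(1 − ICC), a lower ICC inflates the MDC even when the spread of scores is unchanged.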
RESULTS
Athlete characteristics are described in Table 1. Data were collected on 377 (n = 184 female) athletes with a mean age of 19.3 ± 1.6 years, and with an average of 10.5 years of competitive experience in their respective sports. Almost 40% of athletes competed in a contact sport, and roughly 30% and 40% of participants had a history of concussion and musculoskeletal injury, respectively, in the past 2 years. Eighty-five trials (5.6%) from single-task time to stability, 134 trials (8.9%) from dual-task time to stability, 94 trials (6.2%) from single-task latency, and 71 trials (4.7%) from dual-task latency were deemed invalid. Missing data led to the exclusion of 11 normative participants’ single-task time-to-stability outcomes, 25 participants’ dual-task time-to-stability outcomes, 15 participants’ single-task latency outcomes, and 9 participants’ dual-task latency outcomes. For our test-retest reliability analyses, we collected and reported data on a separate but overlapping healthy matched control cohort of 36 (n = 21 female) athletes with an average age of 19.6 ± 1.5 years and an average of 11.2 ± 3.4 years of competitive experience (Table 2). Of these athletes, 52.8% competed in contact sports, 47.2% had a history of concussion, and 30% had suffered a musculoskeletal injury in the past 2 years. There were 15 trials (5.2%) from single-task time to stability, 25 trials (8.7%) from dual-task time to stability, 7 trials (2.4%) from single-task latency, and 9 trials (3.1%) from dual-task latency deemed invalid. Missing data led to the exclusion of 1 test-retest participant’s single-task time-to-stability outcome, 3 participants’ dual-task time-to-stability and latency outcomes, and 3 participants’ ML RMS sway outcomes.
Normative Reference Values
Women had smaller ML RMS sway than men in the firm condition (mean 5.2 ± 1.8 vs 5.9 ± 2.0 cm/s2, respectively) but not in the foam condition (Table 3). Shorter participants also exhibited less ML RMS sway compared with taller participants (Table 3) in the firm condition. The type of footwear affected ML RMS sway in both firm and foam conditions. Participants wearing shoes (5.3 ± 1.8 cm/s2) exhibited less sway than those with socks (6.2 ± 1.7 cm/s2) or barefoot (6.5 ± 2.1 cm/s2) in the firm condition and in the foam condition (17.0 ± 5.6 vs 19.8 ± 7.3 vs 19.0 ± 5.6 cm/s2, respectively).

For the I-mP&R, women took longer to recover stability compared with men during both single- (1.05 ± 0.22 vs 0.95 ± 0.18 seconds, respectively) and dual-task (1.17 ± 0.25 vs 1.07 ± 0.24 seconds, respectively) conditions (Table 4). In the dual-task condition, time to stability also differed by height; the shortest athletes exhibited the longest time to stability (Table 4). Athletes wearing shoes (1.09 ± 0.25 seconds) also recovered stability in the dual-task condition slightly faster than athletes wearing socks (1.13 ± 0.26 seconds) or barefoot (1.18 ± 0.22 seconds).

Latency values from the I-mP&R were affected by sex (dual task) and height (single and dual task; Table 5). In the dual-task condition, women had shorter (ie, faster) response latencies compared with men (243 ± 63 vs 271 ± 68 milliseconds).

For both ML RMS sway and I-mP&R, we suspected that the significant differences across height could be accounted for by sex differences. When data were stratified by sex, the effect of height was retained only for ML RMS sway on firm ground in women and for response latency during the single-task condition for men; no other outcomes retained a height-related difference (Supplemental Material).
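One way a clinician might use these sex-stratified means and SDs is to express an athlete's score as a z-score against the matching reference group. The sketch below is illustrative only: the dictionary keys and function name are hypothetical, the values are the means ± SDs reported in the text above, and a z-score assumes the outcome is approximately normally distributed, which is not established here. Percentile-based comparison against the tabulated quartiles is an alternative.

```python
# (outcome, sex) -> (mean, SD), taken from the normative results in the text
REFERENCE = {
    ("firm_sway_cm_s2", "female"): (5.2, 1.8),
    ("firm_sway_cm_s2", "male"): (5.9, 2.0),
    ("single_task_tts_s", "female"): (1.05, 0.22),
    ("single_task_tts_s", "male"): (0.95, 0.18),
    ("dual_task_tts_s", "female"): (1.17, 0.25),
    ("dual_task_tts_s", "male"): (1.07, 0.24),
}


def z_score(outcome, sex, value):
    """Standardize a score against the sex-matched normative mean and SD."""
    ref_mean, ref_sd = REFERENCE[(outcome, sex)]
    return (value - ref_mean) / ref_sd


# A female athlete with firm-surface sway of 7.0 cm/s2 sits about 1 SD above
# the female reference mean.
z = z_score("firm_sway_cm_s2", "female", 7.0)
```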
Test-Retest Reliability
Demographic characteristics for participants included in the test-retest aim are described in Table 2. There was an average of 8.6 days (range, 1–29 days) between assessments. For ML RMS sway (Table 6), sway in the firm condition had moderate test-retest reliability (ICC = 0.73, MDC = 2.7 cm/s2) whereas sway in the foam condition had poor reliability (ICC = 0.43, MDC = 11.1 cm/s2). For the I-mP&R, time to stability in the single task (ICC = 0.84, MDC = 0.25 seconds) and dual task (ICC = 0.80, MDC = 0.29 seconds) exhibited good test-retest reliability. Response latency had poor reliability during single-task (ICC = 0.38, MDC = 107 milliseconds) but moderate reliability during dual-task (ICC = 0.55, MDC = 105 milliseconds) conditions.
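To show how the MDC values reported above could be applied to serial assessments, the following sketch flags whether the change between two tests exceeds measurement error. The dictionary keys, the function name, and the use of a strict inequality are assumptions; the thresholds are the MDC values from this section.

```python
# MDC values reported in the test-retest results
MDC = {
    "firm_sway_cm_s2": 2.7,
    "foam_sway_cm_s2": 11.1,
    "single_task_tts_s": 0.25,
    "dual_task_tts_s": 0.29,
    "single_task_latency_ms": 107.0,
    "dual_task_latency_ms": 105.0,
}


def exceeds_mdc(outcome, first, second):
    """Return True when the absolute change between tests is larger than the MDC."""
    return abs(second - first) > MDC[outcome]


# Hypothetical athlete: dual-task time to stability rises from 1.05 s to 1.40 s
changed = exceeds_mdc("dual_task_tts_s", 1.05, 1.40)
```

A change that does not exceed the MDC cannot be distinguished from normal test-to-test variability, so it should not be read as recovery or decline on its own.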

DISCUSSION
The purpose of this study was 2-fold: (1) to determine normative reference values for instrumented ML RMS sway and the I-mP&R in healthy NCAA Division I athletes, and (2) to determine the test-retest reliability and MDC values over a common injury recovery timeline. Specifically, we aimed to identify the test-retest reliability and MDC in both ML RMS sway and I-mP&R during the timeline associated with the acute-to-asymptomatic time frame after concussion. Our results support the use of instrumented assessments to gather objective data that can improve the clinical utility of commonly used and emerging balance tests. The large sample size gathered for this study allowed a focused examination of the influence of sex, footwear, and body height and weight on postural stability measures. In addition, examination of test-retest reliability provided insight into clinically relevant changes that may occur in the absence of any injury. Overall, these results provide valuable information about normal instrumented outcomes of static- and reactive-balance assessments and how stable such values are over time that can aid future researchers and clinicians seeking to interpret instrumented measures of static and reactive balance in injured athletes and longitudinal studies.
Normative Values
Overall, our findings indicate that ML RMS sway and I-mP&R normative outcomes are sensitive to sex differences, and, in some cases, to the type of footwear and the height of an athlete. In our results for the ML RMS sway, the 50th percentiles in the firm condition were 5.1 cm/s2 and 5.6 cm/s2 for women and men, respectively. These findings further support those from a previous study of ML RMS sway in athletes in the firm condition (50th percentile = 4.7 cm/s2).18 However, the authors of that study did not specify the type of footwear the control athletes were wearing, and our results suggest a potential effect due to footwear. Wearing shoes was associated with less sway. Therefore, these differences between our outcomes and those of the previous study might be explained when considering footwear type and the larger sample size in our study. Specifically, we found that in the firm condition of the ML RMS static sway, stratification by sex and the type of footwear may need to be considered when comparing athletes. In the foam condition, only footwear may need to be considered, as wearing shoes was associated with smaller sway compared with wearing socks or barefoot. Similarly, it may be necessary to stratify the I-mP&R results by sex and footwear as well, when comparing athletes. There were significant sex differences in both single- and dual-task conditions in time to stability and dual-task latency. However, consideration of footwear may be important only when comparing the dual-task time-to-stability outcomes from the I-mP&R. Further studies are needed to confirm these findings.
Test-Retest Reliability
Regarding test-retest reliability, our results indicate that ML RMS static sway and I-mP&R can be reliable tools to use during the recovery of concussion, but their reliability depends on the condition and the specific outcome. The most reliable and useful measure for ML RMS static sway was the firm condition. The I-mP&R had greater reliability and utility when using the time-to-stability measures instead of latency.
Root mean squares of acceleration data have previously demonstrated good validity and retest reliability (ICC = 0.71).25,28 Our results with ML RMS static sway further support these findings, with an ICC value for ML RMS of 0.73 for healthy control athletes using only the firm, double-limb stance condition. Conversely, ML RMS sway in the foam condition was less reliable (ICC = 0.43). Therefore, we propose that the firm condition might have better utility in a concussion-recovery setting. These results complement previous work suggesting that only the double-limb–stance, firm-ground condition of the BESS may be useful for instrumented outcomes.6,18 Time to stability from both the single- and dual-task I-mP&R exhibited good reliability (ICC = 0.84 and 0.80, respectively) over the time frame of concussion recovery with MDC values of 0.25 and 0.29 seconds, respectively. Although prior work established interadministrator ICC values of 0.73 for the I-mP&R, such results were obtained over a single testing session and using different administrators.4 Thus, the higher ICCs reported here are better indicators of the test-retest reliability and MDC that would be observed in clinical testing (ie, several days between assessments). Further, the MDC values for time to stability are similar to clinically meaningful effects; previous authors found a 36% increase in 6-month prospective musculoskeletal injury risk for every 0.25 seconds increase in dual-task time to stability at a baseline assessment.8 The similarity between the MDC values established here and the clinically relevant values reported previously supports the use of the I-mP&R as a recovery tool in sports medicine settings and contrasts it with other common clinical tests. 
For example, the MDC for the BESS error count is 7 to 9 errors, which is 140% to 180% of the typical clinically relevant change after concussion of 5 BESS errors.22 Thus, instrumentation of the double-limb stance of the BESS may provide better utility for assessing static balance in clinical settings than the current subjective scoring of errors. Similarly, for dynamic balance, the MDC for the single-task timed tandem-gait test is 5.5 seconds,29 which is 200% to 300% of the typical postconcussion change of approximately 2 seconds.30 These current measures have large MDC values that highlight the need to use objective assessments that can effectively identify relevant changes postconcussion. Although the clinically relevant concussion-related change in instrumented static and reactive balance remains unclear due to few longitudinal studies, these MDC values provide valuable information that will allow future researchers to conclude whether longitudinal changes are clinically meaningful or if they may be normal fluctuations over time.
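The comparison between an MDC and a typical postconcussion change reduces to a simple ratio, sketched here with the figures quoted in the text. The function name is hypothetical; a ratio above 1 means the typical change is smaller than measurement error and therefore cannot be detected reliably in an individual athlete.

```python
def mdc_to_change_ratio(mdc, typical_change):
    """Ratio of the MDC to the typical postconcussion change for a test."""
    return mdc / typical_change


# BESS error count: MDC of 7 to 9 errors vs a typical change of 5 errors
bess_low = mdc_to_change_ratio(7.0, 5.0)
bess_high = mdc_to_change_ratio(9.0, 5.0)

# Single-task timed tandem gait: MDC of 5.5 s vs a typical change of about 2 s
tandem = mdc_to_change_ratio(5.5, 2.0)
```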
In contrast to time to stability, however, response latency exhibited poor to moderate reliability. This poor reliability was characterized by shorter latencies at the second assessment, indicating potential learning effects over time. Our data suggest that repeated I-mP&R assessments may yield comparable time-to-stability values but not response latencies over serial administration.
Limitations and Directions for Future Research
It is important to note that excluding tests with ≥50% missing trials of the I-mP&R improved reliability and MDC values of the dual-task time to stability. These trials were excluded due to an error in test administration or hardware malfunction, as the administrator prematurely ended the data recording before the participant regained stability, thereby prohibiting a calculation of the time to stability. This error can occur when there are small corrective movements, indicating the athlete has not yet regained stability, after the administrator perceives the athlete as stable and ends the recording. Although these errors are relatively uncommon (8.3% of data for dual-task time to stability), they are consistent across studies seeking to instrument reactive-balance paradigms.4,19 These errors highlight a limitation of instrumented assessments: the need for immediate and rapid data processing to enable administrators to identify whether the trial was successfully completed, or, if necessary, to redo the trial. These real-time objective measurements of balance may increase testing reliability by limiting, or eliminating, missing data due to administrator error. These administrator errors also highlight the problematic nature of subjective assessments of balance and the need for instrumented, objective balance assessments; visual observation, even by a highly trained administrator, cannot accurately identify the precise moment that balance is recovered. The association between the type of footwear and instrumented measures of ML RMS sway and I-mP&R observed here highlights another limitation, as this association was observed between rather than within participants. Without a within-participants design, we cannot directly conclude that footwear has a significant effect on these metrics. Future researchers should use a design to investigate within-participants footwear changes to identify the true effect of footwear. 
Lastly, administration of the I-mP&R throughout the test-retest assessment may not be the same across all participant time points. There is a potential for interadministration differences that may affect our reliability scores. However, based on previous studies, we have chosen the most reliable outcome metrics to report that limit the effect of different administrators.
This study adds evidence to support the clinical utility of ML RMS sway and I-mP&R as reliable and valid assessments by providing large normative data and clinically useful MDC values to use during baseline and longitudinal assessments of recovery, respectively. We found that sex and, in some cases, footwear play a significant role when comparing values for ML RMS static sway and I-mP&R across athletes. We were unable to look at whether the effects of sex may have been confounded by sport; only 4 of the 17 sports in our sample contained both men’s and women’s teams. Future analysis on the effects of sex within a specific sport may be warranted. The utility of instrumented assessments of balance depends on the specific conditions and outcomes that are used. Instrumented outcomes of static balance on firm ground were reliable, but static balance on foam provided poor test-retest reliability and may not be useful for serial assessments. Similarly, reactive-balance measures of time to stability may be useful in both single- and dual-task conditions, but response latency may have limited clinical value over repeated assessments. Overall, these clinical reference values (normative and MDC) can help clinicians incorporate instrumented balance assessments into clinical practice.