Normative Performance on the Balance Error Scoring System by Youth, High School, and Collegiate Athletes
Context: Annually, more than 1 million youth athletes in the United States receive or are suspected of receiving a concussion. The Balance Error Scoring System (BESS) is the most commonly used clinical balance evaluation designed to provide a better understanding of the motor-control processes of individuals with concussion. Despite the widespread use of the BESS, a fundamental gap exists in applying this tool to young athletes, as normative values are lacking for this population.
Objective: To determine age- and sex-specific normative values for the BESS in youth, high school, and collegiate athletes.
Design: Cross-sectional study.
Setting: Local youth sport organizations, high schools, and colleges.
Patients or Other Participants: Student-athletes (N = 6762) completed preseason baseline concussion testing as part of a comprehensive concussion-management program. Groups were youth males aged 5 to 13 years (n = 360), high school males aged 14 to 18 years (n = 3743), collegiate males aged 19 to 23 years (n = 497), youth females aged 5 to 13 years (n = 246), high school females aged 14 to 18 years (n = 1673), and collegiate females aged 19 to 23 years (n = 243).
Main Outcome Measure(s): Errors according to the BESS specifications.
Results: Performance on the BESS was worse (P < .01) in youth athletes than in high school and collegiate athletes. In the youth and high school cohorts, females exhibited better scores than males (P < .05). Sex was not a factor for collegiate athletes. Data from the youth cohort were further subdivided into 4-year bins to evaluate potential motor-development differences. The error count was highest for 5- to 9-year-old males and decreased with age.
Conclusions: Performance on the BESS depended on sex and age, particularly in youth athletes. These sex- and age-specific normative values provide a reference to facilitate and unify clinical decision making across multiple providers caring for youth athletes with concussions.
An estimated 30 to 45 million children participate in nonscholastic athletic programs annually in the United States.1 Concussion is a common injury among youth, adolescent, and collegiate athletes,2 with an estimated 1.6 to 3.8 million sport-related concussions occurring annually in the United States.3 Whereas epidemiologic and observational studies of concussion have been prevalent among high school and collegiate athletes,4 specific methods to evaluate concussive injuries among youth athletes are lacking, leaving this youngest, and potentially most vulnerable, population underrepresented and unstudied.5 Concussive injuries induce a complex pathophysiologic response, resulting in a myriad of symptoms and short-lived neurologic impairment involving the cognitive, motor, visual, and vestibular domains of function.6 Postural instability is a hallmark of acute concussion. Using subjective clinical evaluation, researchers7 have proposed that postural-stability declines may resolve in high school and collegiate athletes within 3 to 10 days postinjury; however, investigators8 using more sophisticated biomechanical measures of balance have indicated that deficits may persist for weeks or months. Declines in postural stability after concussion are caused by disruptions in the integration of sensory information from the visual, vestibular, and somatosensory inputs and the resultant inadequate motor response.9 The incidence of postural instability and its importance in motor-control processes have led to the recommendation that balance assessment be a cornerstone of concussion evaluation from healthy baseline to return-to-play decision making for high school and collegiate athletes.10 A gap in managing youth concussion is the lack of understanding about how children perform on standard clinical balance assessments designed for concussion assessment.
The Balance Error Scoring System (BESS) is a clinical measure that was originally designed to characterize postural stability in high school and collegiate athletes in order to detect differences between concussed and healthy athletes up to 3 to 5 days postinjury.11 In its entirety, the BESS involves testing individuals in 3 stances (double legged, single legged, and tandem) on firm and foam surfaces. After the 3rd International Conference on Concussion in Sport held in 2008, a modified version of the BESS, in which the foam surface is omitted, was adopted for use in the Sport Concussion Assessment Tool 2 (SCAT2) and became the most commonly used sideline measure of balance.12 Whereas subsequent consensus statements on concussion in sport have acknowledged that the modified BESS is not well studied and using the foam surface improves its sensitivity,13 both updated versions of the SCAT (SCAT3 and SCAT5) continue to use only the stances on a firm surface.6,14 Nonetheless, given the ceiling effect evident with the modified BESS and the presumed prevalence of in-clinic use of the entire BESS, it is critical to understand the relationship among healthy children's, adolescents', and young adults' BESS performances.
In the absence of individual healthy baseline values, normative data are compared with postinjury balance-performance values to determine the residual balance impairment or degree of recovery. Whereas normative BESS data are available for high school and collegiate athletes,15–17 studies17–19 quantifying BESS performance in youth athletes, particularly those less than age 14 years, have produced results that were limited and conflicting. Successfully managing youth concussion requires age- and sex-specific normative values due to the rapidly developing central nervous system processes underlying postural stability.
The need for age-specific normative values for the BESS is clear, as authors20,21 of biomechanical studies have observed developmental and sex differences in postural control throughout childhood, adolescence, and adulthood. Females generally exhibit less postural sway than males up to approximately age 11 years, suggesting more rapid maturation of the integration of afferent information to maintain a steady posture.20 However, clinical balance assessments, such as the BESS, may not have sufficient resolution to detect the subtle differences in the maturation of balance control between males and females that are evident with biomechanical assessments. Olson18 reported that sex and age did not affect BESS scores when comparing 11- to 13-year-old individuals with 17- to 18-year-old individuals. Others19,22 have reported that sex, but not age, influenced BESS scores. Whereas normative BESS performance data have been obtained in smaller cohorts of youth,18,22 high school,18,23 and collegiate23,24 athletes, to date no cross-sectional data exist that systematically characterize BESS performance across a wide range of ages among males and females. The lack of cross-sectional data reflects a fundamental gap that complicates the effective management of concussion in the pediatric athlete. Therefore, the purpose of our study was to determine age- and sex-specific normative values for the BESS among a large cohort of healthy youth, high school, and collegiate athletes.
METHODS
Sample
We performed a retrospective chart review of 6762 athletes (4600 males, 2162 females; age range = 5–23 years) who completed baseline testing during the 2013–2014 athletic seasons. Participant demographics are provided in Table 1. Balance assessments were completed as part of routine preseason baseline testing at sport clubs, high schools, and colleges conducted by personnel from the Cleveland Clinic's Concussion Center. All participants were neurologically healthy, and none presented with active musculoskeletal impairments that affected postural stability or precluded their participation in sport. Athletes recovering from musculoskeletal injuries, such as sprains, strains, and fractures, and those rehabilitating after surgery were excluded from the sample. The Cleveland Clinic Institutional Review Board approved this retrospective analysis.

Data Collection
Before data collection, experienced raters on the study staff (S.J.O., S.M.L., J.C., J.L.A.) performed hands-on training for each athletic trainer to ensure optimal consistency of administration. The BESS was then administered by 48 certified athletic trainers using standardized testing procedures.25 Each participant completed the six 20-second trials according to the BESS protocol: double-legged, single-legged, and tandem stances on firm and foam (Balance Pad; Airex AG, Sins, Switzerland) surfaces in a quiet room (Figure 1). Participants placed their hands on their iliac crests and were instructed to stand as still as possible and close their eyes. When they closed their eyes, the 20-second trial was initiated. Errors were recorded using standardized procedures, and the maximum score per stance was 10.6 In addition to the BESS, participants completed a neurologic test battery of iPad (Apple, Inc, Cupertino, CA) applications that provided quantitative assessments of cognitive, fine-motor, and visual performance, which are beyond the scope of this paper.
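The scoring described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the function name `composite_scores`, the dictionary layout, and the example athlete are assumptions. It caps each of the 6 conditions at 10 errors, as stated in the text, and sums the capped counts into firm, foam, and total composites.

```python
MAX_ERRORS_PER_CONDITION = 10  # maximum score per stance, as stated in the text
STANCES = ("double", "single", "tandem")
SURFACES = ("firm", "foam")


def composite_scores(errors):
    """Sum capped error counts into firm, foam, and total composites.

    `errors` maps (surface, stance) -> raw error count for one athlete.
    The dictionary layout is a hypothetical choice made for this sketch.
    """
    capped = {}
    for surface in SURFACES:
        for stance in STANCES:
            raw = errors[(surface, stance)]
            if raw < 0:
                raise ValueError("error counts cannot be negative")
            # Any count above the cap is recorded as the maximum score.
            capped[(surface, stance)] = min(raw, MAX_ERRORS_PER_CONDITION)
    firm = sum(capped[("firm", stance)] for stance in STANCES)
    foam = sum(capped[("foam", stance)] for stance in STANCES)
    return {"firm": firm, "foam": foam, "total": firm + foam}


# Hypothetical athlete: 12 raw errors on single-legged foam are capped at 10.
example = {
    ("firm", "double"): 0, ("firm", "single"): 2, ("firm", "tandem"): 0,
    ("foam", "double"): 0, ("foam", "single"): 12, ("foam", "tandem"): 3,
}
print(composite_scores(example))
```

With 6 conditions capped at 10 errors each, the firm and foam composites each range from 0 to 30 and the total from 0 to 60.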



Citation: Journal of Athletic Training 53, 7; 10.4085/1062-6050-129-17
Statistical Analysis
Data were stratified according to the following sex and age cohorts: males aged 5 to 13 (n = 360), 14 to 18 (n = 3743), and 19 to 23 (n = 497) years and females aged 5 to 13 (n = 246), 14 to 18 (n = 1673), and 19 to 23 (n = 243) years. These age groups were chosen because they roughly corresponded to academic grade level (elementary and middle school, high school, and college, respectively; Table 1). One 2-way multivariate analysis of variance (MANOVA) was performed using sex, age, and sex × age as independent variables and 9 dependent variables: total BESS score summed across all 6 conditions (total BESS), summed BESS scores for the 3 stances on a firm surface (firm BESS), summed BESS scores for the 3 trials on a foam surface (foam BESS), and BESS scores for each of the 6 conditions. Effect sizes are presented as η2 and interpreted according to Cohen26 as small (0 < η2 < 0.01), medium (0.01 ≤ η2 < 0.07), or large (η2 ≥ 0.07). Given the unequal sample sizes and variances, which were determined by the Levene test, we used the Pillai trace criterion to determine differences in the MANOVA.27 If the MANOVA indicated that sex, age, or sex × age was a predictor across the 9 dependent variables, the post hoc analysis included nine 2-way analyses of variance (ANOVAs) of each dependent variable, with no equal variance assumptions.28 The resulting P values were corrected for multiple comparisons using the false discovery rate.29 If warranted, additional post hoc analysis after the 2-way ANOVAs included Welch ANOVAs using the Tamhane correction30 to account for multiple comparisons with unequal sample sizes and variances. An α level of .05 was set for all corrected P values.
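Two steps in this analysis plan lend themselves to a short illustration: labeling an effect size and adjusting P values for the false discovery rate. The sketch below is hedged on two counts. First, it assumes the medium band runs from 0.01 up to (but not including) 0.07, which matches the η2p values of 0.011 and 0.015 that the Results label as medium. Second, it assumes the false discovery rate correction is the Benjamini-Hochberg step-up procedure; the text does not name the exact method, and the analyses themselves were run in SPSS. The function names and the example P values are hypothetical.

```python
def effect_size_label(eta_squared):
    """Label an eta-squared value using the cutoffs assumed in the lead-in."""
    if eta_squared < 0:
        raise ValueError("eta squared cannot be negative")
    if eta_squared < 0.01:
        return "small"
    if eta_squared < 0.07:
        return "medium"
    return "large"


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: this is the false discovery rate method intended by the
    text; the source only says "false discovery rate".
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values from four post hoc tests.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
print(effect_size_label(0.011))
```

An adjusted P value below the α level of .05 would then be treated as significant, in line with the threshold stated above.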
Data were also stratified for males and females in the following 4 age cohorts: 5 to 9, 10 to 13, 14 to 18, and 19 to 23 years, dividing youth athletes into smaller age brackets (Table 1). To compute percentile scores across the sex and age cohorts, we converted standardized values for each normal distribution of scores to normal random variables (X) using the following equation: X = μ + Z × σ, where μ is the mean and σ is the standard deviation of scores within each BESS condition and Z is the standardized value from the standard normal distribution for the desired percentile. Normal random variables were calculated at each 10th percentile between the 10th and the 90th percentiles and at the 2.5th, 5th, 95th, and 97.5th percentiles for the total BESS, firm BESS, foam BESS, and scores across each BESS condition. All statistical analyses were performed using SPSS (version 19; IBM Corp, Armonk, NY).
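The percentile conversion X = μ + Z × σ can be sketched with Python's standard library. This is an illustration, not the SPSS procedure used in the study: the function name `percentile_scores` is hypothetical, and the example uses the overall total BESS mean and standard deviation (14.2 ± 6.9) reported in the Discussion rather than any cohort-specific value from Table 4.

```python
from statistics import NormalDist

# Percentiles named in the text: every 10th from 10 to 90, plus the tails.
PERCENTILES = (2.5, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 97.5)


def percentile_scores(mean, sd, percentiles=PERCENTILES):
    """Return {percentile: X} where X = mean + Z * sd.

    Z is the standard normal quantile for the requested percentile.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return {
        p: mean + standard_normal.inv_cdf(p / 100) * sd
        for p in percentiles
    }


# Overall total BESS mean and SD quoted in the Discussion.
scores = percentile_scores(14.2, 6.9)
for percentile, value in scores.items():
    print(f"{percentile:>5}th percentile: {value:.1f} errors")
```

Because the conversion assumes normally distributed scores, the sketch does not clamp results to the possible scoring range; conditions with many scores at the minimum or maximum could yield values outside that range.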
RESULTS
The median (minimum to maximum) scores for the total BESS, firm BESS, and foam BESS for the entire sample (N = 6762) were 13 (0–50), 3 (0–22), and 10 (0–30), respectively. The greatest number of errors occurred during single-legged stance on a foam surface (7 [0–10]), followed by tandem stance on a foam surface (3 [0–10]) and single-legged stance on a firm surface (2 [0–10]). During the double-legged stance on a firm surface, a ceiling effect was evident (0 [0–6]), which was similar to that for the double-legged stance on a foam surface (0 [0–10]) and tandem stance on a firm surface (0 [0–10]). These error-count trends were consistent across sex and age stratifications as depicted in Figures 2 and 3.






Young Males Exhibited the Greatest BESS Errors
The multivariate comparisons of total, firm, and foam BESS scores revealed an age effect across all cohorts, specifically youth (5–13 years), high school (14–18 years), and collegiate (19–23 years) athletes, and across all 6 balance stances (Pillai trace = 0.022, F12,13504 = 12.44, P < .001; η2p = 0.011; medium effect size). Table 2 provides between-subjects effects. When we accounted for sex, youth males committed 31%, 59%, and 20% more errors than high school males and 51%, 98%, and 35% more errors than collegiate males for the total BESS, firm BESS, and foam BESS, respectively (Figure 2 and Table 3). In general, less variability and slightly different trends were observed among females. Youth females exhibited 4% and 8% lower error rates than high school females for the total BESS and foam BESS, respectively, and performed similarly to collegiate females on the foam BESS. The inverse was observed for performance on the firm BESS, with 9% and 20% more errors committed by youth females than their high school and collegiate female counterparts, respectively.


When we further stratified youth athletes into smaller age brackets, a performance gap was evident between 5- to 9-year-old males and all other athletes for the total, firm, and foam BESS. This trend was not observed among the youngest female cohort, as 5- to 9-year-old females performed as well as or better than their older female counterparts. Post hoc univariate ANOVAs demonstrated differences between 5- to 9-year-old and 10- to 13-year-old athletes during the total (F3,6756 = 33.56, P < .001), firm (F3,6756 = 45.59, P < .001), and foam (F3,6756 = 14.43, P = .02) BESS and during the single-legged stance on a firm surface (F3,6756 = 37.06, P < .001) and the tandem stance on firm (F3,6756 = 18.32, P < .001) and foam (F3,6756 = 8.09, P = .02) surfaces.
Females Had Fewer BESS Errors Than Males
In the MANOVA, we observed an effect of sex for the total, firm, and foam BESS without adjusting for age (Pillai trace = 0.015, F6,6751 = 17.47, P < .005; η2p = 0.015; medium effect size; Table 2 and Figure 3). When adjusting for age, youth males committed more errors than youth females on the total (P < .001), firm (P < .001), and foam (P < .001) BESS and each of the BESS conditions except for the double-legged stance on a firm surface (P = .23). Within the high school cohort, males committed more errors than females on the total BESS (P < .001), firm BESS (P < .001), single-legged stance on a firm surface (P < .001), and tandem stance on firm (P < .001) and foam (P = .004) surfaces. Within the collegiate cohort, we observed no sex effect when examining scores for the total, firm, or foam BESS or across any BESS condition.
Percentile Scores Based on Age and Sex Cohorts
Percentile scores for the total, firm, and foam BESS and each BESS condition are presented according to age and sex cohorts (Table 4). Given the performance gap observed in the youth male cohort, 5- to 13-year-old athletes were subdivided into 2 age brackets: 5- to 9-year-old and 10- to 13-year-old athletes. Whereas this performance gap was not evident in our sample of youth females, percentile scores are still reported separately for 5- to 9-year-old and 10- to 13-year-old athletes, as variability within the youngest cohort was higher than for the high school and collegiate athletes.

DISCUSSION
Evaluating postural stability is essential for effective management of athletes with concussion. The BESS is the most common clinical balance assessment used to determine deficits in postural stability after a concussion. Interpreting the BESS is critical to informing clinical decisions about the concussion diagnosis, recovery, and return to play. However, the absence of age- and sex-specific normative values complicates its use in pediatric patients. In the absence of baseline assessments, our results underscore the need to use age- and sex-specific norms for youth, high school, and collegiate athletes. They indicated that age and sex affected BESS performance in each defined cohort. In general, (1) BESS performance in collegiate athletes was superior to that in high school followed by youth athletes; (2) BESS performance in females was superior to that in males, particularly in the youth and high school populations; and (3) ceiling or floor effects were evident in 3 of the 6 BESS conditions (ie, double-legged stance on firm and foam surfaces and single-legged stance on a foam surface).
Improved BESS performance as a function of age is likely due to the maturation of 2 physiological processes: the integration of sensory systems (ie, visual, vestibular, and somatosensory) and the development of automatic motor processes.31 Motor processes mature early in childhood (at 3–4 years of age), whereas sensory integration does not reach adult levels until 14 to 16 years of age.20,21 Therefore, maturation of the sensorimotor systems responsible for maintaining balance is likely complete in older adolescents. The age-related reduction in BESS error-count variability is consistent with data21 demonstrating that postural stability matures with age; youths committed more errors than high school and collegiate athletes on the total, firm, and foam BESS and across all 6 BESS conditions. Further dividing the broader age cohorts (youth, high school, and collegiate) into smaller age ranges demonstrated the effect of maturation, as the youngest males (5–9 years of age) committed more total BESS errors than did older cohorts. Heterogeneity in BESS errors was observed in the youngest bracket (5–9 years of age), likely due to variability in the rate of development and in the maturation of the motor control necessary for balance maintenance. The neuronal domains of postural control challenged by the BESS appeared to stabilize in our sample in the early teen years, as evidenced by the reduction in total errors and error-score variability in the high school and collegiate cohorts. The normative values across age groups in our dataset were consistent with previous findings15,19 indicating improved postural stability as children and adolescents age.
Analyzing normal performance on a stance-by-stance basis serves multiple purposes. Namely, the degree to which each stance taxes the balance system can be compared. One would expect that youths would demonstrate worse performance for all stances, as their control of postural stability is still maturing. However, whereas age-specific differences were statistically significant across all stances, differences in performance on both double-legged stances (foam and firm surfaces) were likely not clinically meaningful, as the mean error scores across the 3 age cohorts ranged from 0.02 to 0.04 on a firm surface and 0.11 to 0.34 on a foam surface. All ages included in the sample likely had sufficient sensory integration and motor control to not exhibit “errors” or discrete losses of balance in these relatively simple stances. In fact, 99.5% and 96.6% of all participants incurred either zero errors or 1 error on the double-legged stance on firm and foam surfaces, respectively.
Sex differences have also been reported20,21 in the maintenance of postural stability and vary with age due to sensorimotor function and anthropometric and psychological factors. Vestibular system maturity has been shown to occur earlier in females than in males.21 Postural stability also depends on anthropometric properties that vary between sexes throughout childhood and adolescence, including body mass and center of mass (ie, a point in space that represents the weighted average of each body segment in space).32 Postural stability is inversely related to the height of the center of mass, which is lower among females due to a smaller waist-to-hip ratio, greater maximal thigh girth, and narrower shoulder width.33 Researchers have established that standing on the foam pad decreases postural stability by disrupting the reliability of the somatosensory input34; however, because males are likely to have a larger mass than females, the increased deflection of the balance pad could theoretically provide an advantage in the BESS foam conditions, particularly for those weighing more than 90 kg.35 Lastly, psychological factors, including motivation and attentiveness, may explain sex differences among age cohorts.21 Males have been reported to be less attentive and more agitated during postural-stability tasks, which could result in worse performance during balance testing.20,21 Our normative values across sexes are consistent with previous findings on the BESS,22,24,36 except for those of Olson,18 who did not find sex differences between youth and high school athletes. This absence of a sex effect may have been due to the smaller age ranges and smaller sample sizes of males and females within each group than in our study.
Youths and adolescents exhibit continual and often rapid physical growth relative to young adults. During this time of development, sex is an important factor that affects the development of postural control.37 Our normative dataset indicated that the developmental factors affecting postural stability (ie, maturation of sensory information, anthropometrics, and psychological factors) did not affect performance on the BESS in either sex after the age of 14 years. However, differences were evident in male and female youths, signifying the need to further stratify the youngest cohort into 5- to 9-year-old and 10- to 13-year-old age brackets when reporting normative values. Therefore, BESS error scores are presented as a function of percentiles of the normative dataset (see Table 4). Whereas we did not examine the psychometric properties of the BESS, Finnoff et al38 reported relatively low interrater and intrarater reliability for the BESS scores and resultant interrater and intrarater minimal detectable changes of 9.4 and 7.3 points, respectively. In addition, researchers39–41 have found evidence of a practice or learning effect from previous exposure to the BESS or retention of improvement over time (or both) in youth, high school, and collegiate athletes up to 60 days after baseline assessment, potentially resulting in premature clearance before full recovery. As such, minimal detectable change values and possible practice or learning effects highlight the subjectivity and limitations of the BESS and should be considered when interpreting results as a whole and when determining clinically relevant differences in BESS performance.
Although the BESS is portable and quick to administer, our normative data indicated that it has floor and ceiling effects. During both the double-legged stances on firm and foam surfaces and the tandem stance on a firm surface, BESS errors were infrequent, with 99.5%, 96.6%, and 76.5% of all athletes producing no errors or 1 error, respectively. Conversely, during the single-legged stance on a foam surface, 53.2% of all athletes committed 8 to 10 (maximum number) errors, and 20% of all athletes committed 10 errors. The large proportion of nonconcussed athletes committing 8 to 10 errors at baseline during the single-legged stance on a foam surface raises concerns over the utility of this BESS condition in identifying impaired balance after a suspected concussion. A minimum or maximum error score in a given condition threatens the sensitivity of the measure and limits its resolution in detecting balance differences that may otherwise be present due to age, sex, or injury status.
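Floor and ceiling effects of this kind can be quantified as the share of athletes whose score falls at one end of the scale. The sketch below is illustrative only: the helper name `proportion_in_range` and both score lists are hypothetical and are not taken from the study dataset.

```python
def proportion_in_range(scores, low, high):
    """Return the fraction of scores with low <= score <= high."""
    if not scores:
        raise ValueError("scores must not be empty")
    hits = sum(1 for score in scores if low <= score <= high)
    return hits / len(scores)


# Hypothetical error counts for 10 athletes in two conditions.
double_leg_firm = [0, 0, 1, 2, 0, 1, 0, 0, 3, 0]
single_leg_foam = [10, 8, 7, 9, 10, 6, 5, 8, 10, 4]

# Floor effect: share of athletes with zero errors or 1 error.
floor_share = proportion_in_range(double_leg_firm, 0, 1)
# Ceiling effect: share of athletes with 8 to 10 errors.
ceiling_share = proportion_in_range(single_leg_foam, 8, 10)
print(f"floor: {floor_share:.0%}, ceiling: {ceiling_share:.0%}")
```

Applied to real baseline data, a large share at either end would flag a condition with limited room to register a postinjury change.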
The modified version of the BESS (without the use of a foam pad) was recommended for assessing postural stability by the 3rd, 4th, and 5th International Conferences on Concussion in Sport and has been incorporated into the widely used SCAT (versions SCAT2, SCAT3, and SCAT5).6,12,14 In discussing the utility of the modified BESS and its use in the SCAT3, Guskiewicz et al13 acknowledged that most of the research on the BESS has included firm and foam surfaces with variable interrater and intrarater reliability and that additional study was required to determine if the firm conditions alone were adequate to evaluate balance. For example, Finnoff et al38 reported the lowest interrater reliability (intraclass correlation coefficient = 0.44) for the tandem stance on a firm surface relative to all other BESS stances, but they could not determine the intraclass correlation coefficient for the double-legged stance on a firm surface due to a lack of between-subjects variability. Our results revealed floor effects in 2 of the 3 stances, and the mean score for the 3 firm stances combined was only 3.9, or 27% of the total BESS score of 14.2 ± 6.9. Based on previous literature and our results, the modified BESS alone may be insufficient for detecting deficits in postural stability or adequately informing the appropriate timing of the return-to-play phase of concussion management.
Whereas our normative data were an attempt to improve the utility of the BESS by stratifying according to sex and age (youth, high school, and collegiate) cohorts, BESS performance should always be interpreted in the context of all clinical information, including the athlete's medical history, to avoid clinical decision-making errors when preinjury baseline data are unavailable.
CONCLUSIONS
Accurately assessing and interpreting postural stability is critical for effectively managing athletes with concussion. Given the developmental differences between age and sex cohorts, using the normative reference values we reported will enable a more reliable interpretation of postural-stability assessment for clinical decision making. Researchers should focus on improving the interpretation of the BESS by determining the minimal clinically important difference in scores when comparing healthy and injured populations and on developing biomechanical measures to reduce the subjectivity associated with balance testing.

Stances performed during the Balance Error Scoring System (BESS) on a firm surface: A, double-legged stance; B, single-legged stance; C, tandem stance. Stances performed during the BESS on a foam surface: D, double-legged stance; E, single-legged stance; F, tandem stance.

Errors (mean ± standard deviation) in Balance Error Scoring System (BESS) performance across age cohorts combined and stratified across male and female populations. A, Age cohorts combined. B, Youth cohort. C, High school cohort. D, Collegiate cohort. a P < .001. b P < .05.

Errors (mean ± standard deviation) in Balance Error Scoring System (BESS) performance across sex cohorts combined and stratified across youth, high school, and collegiate cohorts. A, Sex cohorts combined. B, Males. C, Females. a P < .001. b P < .05.