Retest Reliability of Force-Time Variables of Neck Muscles Under Isometric Conditions
Abstract
Context: Proper conditioning of the neck muscles may play a role in reducing the risk of neck injury and, possibly, concussions in contact sports. However, the ability to reliably measure the force-time–based variables that might be relevant for this purpose has not been addressed.
Objective: To assess the between-days reliability of discrete force-time–based variables of neck muscles during maximal voluntary isometric contractions in 5 directions.
Design: Cohort study.
Setting: University research center.
Patients or Other Participants: Twenty-six highly physically active men (age = 21.6 ± 2.1 years, height = 1.85 ± 0.09 m, mass = 81.6 ± 9.9 kg, head circumference = 0.58 ± 0.01 m, neck circumference = 0.39 ± 0.02 m).
Intervention(s): We used a custom-built testing apparatus to measure maximal voluntary isometric contractions of the neck muscles in 5 directions (extension, flexion, protraction, left lateral bending, and right lateral bending) on 2 occasions separated by 7 to 8 days.
Main Outcome Measure(s): Variables measured were peak force (PF), rate of force development (RFD), and time to 50% of PF (T50PF). Reliability indices calculated for each variable comprised the difference in scores between the testing sessions, with corresponding 95% confidence intervals, the coefficient of variation of the typical error of measurement (CVTE), and intraclass correlation coefficients (ICC [3,3]).
Results: No evidence of systematic bias was detected for the dependent measures across any movement direction; retest differences in measurements were between −1.8% and 2.7%, with corresponding 95% confidence interval ranges of less than 10% and overlapping zero. The CVTE was lowest for PF (range, 2.4%–6.3%) across all testing directions, followed by RFD (range, 4.8%–9.0%) and T50PF (range, 7.1%–9.3%). The ICC score range for all dependent measures was 0.90 to 0.99.
Conclusions: Discrete variables representative of the force-generating capacity of neck muscles under isometric conditions can be measured with an acceptable degree of reliability. This finding has possible applications for investigating the role of neck muscle strength-training programs in reducing the risk of injuries in sport settings.
Researchers1–3 have proposed conditioning of the neck muscles as a simple, cost-effective method for decreasing the risk of cervical spine injuries and concussions in individuals participating in contact sports. However, these recommendations are based on anecdotal observations or modeling work and, thus, require validation through prospective, interventional studies. As a prerequisite to achieving this goal, 2 methodologic issues related to the measurement of neck muscle capacity need to be considered. The first issue concerns the selection of appropriate outcome variables to assess neck muscle function in the context of contact sports. To date, most investigators4–11 of isometric neck testing have reported only on peak force (PF) or moments as an outcome. However, PF may not be completely relevant to investigating the role of neck muscle strength in injury prevention because in real play an athlete may not attain maximal muscle force before contact. This may be due to poor relative awareness of the impending collision and a limited time in which to generate sufficient muscle force before impact. Therefore, quantifying the early force-generating capacity of neck muscles would seem to be more meaningful because these variables might provide insight into the short-term damping response of the neck. In addition to PF, such relevant variables include the rate of force development (RFD) and the time needed to reach a percentage of PF that might result in meaningful increases in neck stiffening.3 The second issue concerns our ability to reliably measure these aforementioned variables. In their recent review of the reliability of neck strength measures, Dvir and Prushansky12 noted an overreliance on interpretation of relative reliability indices (ie, Pearson r or intraclass correlation coefficients [ICCs]), with scores that might have been inflated by use of heterogeneous participant samples. 
Specifically, relative reliability measures are based on the ratio of the between-subjects variability to the total variability observed.13 Thus, if the participants' score range is wide (ie, scores exhibit large variability among participants), then the numerator of this ratio is inherently large, which inflates the resulting coefficient. In strength testing this may happen, for example, if a mixed-sex sample is used or if the participants differ in their physical conditioning status. In addition, the reliability of force-time measures other than PF has not been reported in testing of the neck muscles.12 Therefore, the purpose of our study was to determine the retest reliability of PF, RFD, and time to reach 50% of PF (T50PF) in athletes performing maximal voluntary isometric contractions (MVICs) of the neck muscles in 5 directions. The results will be relevant to investigating the potential role of neck muscles in modifying the mechanical response of the head to imposed loads and to quantifying the effects of strengthening programs on the force-generating capacity of the neck.
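The sensitivity of relative reliability indices to sample heterogeneity can be illustrated with a small numerical sketch. It assumes the simplest variance-ratio form (between-subjects variance divided by the sum of between-subjects and error variance); the function name and the standard deviations are hypothetical values chosen for illustration and are not data from this study.

```python
def relative_reliability(sd_between, sd_error):
    """Ratio of between-subjects variance to total variance.

    Simplified illustration of a relative reliability index; the
    inputs are standard deviations in the same units (e.g., newtons).
    """
    var_between = sd_between ** 2
    var_error = sd_error ** 2
    return var_between / (var_between + var_error)


# Same measurement error (10 N) in both samples; only the spread
# among participants differs (hypothetical values).
homogeneous = relative_reliability(sd_between=20.0, sd_error=10.0)
heterogeneous = relative_reliability(sd_between=60.0, sd_error=10.0)

print(round(homogeneous, 3), round(heterogeneous, 3))  # 0.8 0.973
```

With identical measurement error, the more heterogeneous sample yields the higher coefficient, which is why absolute indices such as the typical error are reported alongside the ICC.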
METHODS
Participants
Male athletes were recruited within the university community. Prospective participants were screened using a self-report questionnaire on the set of exclusion criteria proposed by Sommerich et al14: neck injury or pain; head injury; recurrent episodes of fainting or dizziness; surgical interventions to the head, neck, or shoulder regions; current use of medications to control high blood pressure; and high risk for carotid or coronary artery disease. Twenty-six individuals were tested (age = 21.6 ± 2.1 years, height = 1.85 ± 0.09 m, mass = 81.6 ± 9.9 kg, head circumference = 0.58 ± 0.01 m, neck circumference = 0.39 ± 0.02 m). All athletes participated in regular physical activity 4 to 8 times per week in team or individual sports at the competitive university, national, or elite level; none of these sports involved specific conditioning of the neck muscles as part of the training routines. Participants provided written informed consent before testing, and the University Research Ethics Board approved the study.
Testing Device
A custom-built neck-strength testing device was developed based on the work of Vasavada et al11 (Figure 1). The device includes the following relevant features: (1) a 6–degrees-of-freedom load cell (model MC5-6-2500; AMTI, Watertown, MA) to record the 3-dimensional components of MVICs; (2) a hockey helmet and face cage (model 8500; Bauer Hockey Corp, St Jerome, QC, Canada) to couple the participant's head and neck to the load cell and a reinforced chin strap to permit application of maximal exertions of flexion; (3) a 3-point attachment system for the helmet to stabilize the head in the sagittal plane within a 70° flexion-extension range of motion; and (4) a seating system with 4-point restraint for shoulder and trunk stabilization, which is fully adjustable to accommodate participants of different dimensions.



Citation: Journal of Athletic Training 45, 5; 10.4085/1062-6050-45.5.453
The accuracy of the load cell calibration provided by the manufacturer was verified before testing using known weights measuring between 0.9 and 8.9 kg placed along the orthogonal directions of the load cell. The root mean square error for all force and moment channels was less than 1 N and 0.1 N·m, respectively, and the coefficient of determination (R²) values were greater than 0.99, indicating a linear response within the measurement range.
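A calibration check of this kind can be sketched as follows. This is a minimal illustration, not the authors' procedure: the masses, the simulated readings, the function name `calibration_check`, and the use of standard gravity to convert mass to force are all assumptions, and R² is computed here as the squared Pearson correlation between applied and measured loads.

```python
import math

G = 9.81  # m/s^2; assumed value for converting mass to weight


def calibration_check(known_n, measured_n):
    """Return (RMSE, R^2) for measured versus known loads in newtons."""
    n = len(known_n)
    rmse = math.sqrt(
        sum((m - k) ** 2 for k, m in zip(known_n, measured_n)) / n
    )
    mean_k = sum(known_n) / n
    mean_m = sum(measured_n) / n
    sxy = sum((k - mean_k) * (m - mean_m) for k, m in zip(known_n, measured_n))
    sxx = sum((k - mean_k) ** 2 for k in known_n)
    syy = sum((m - mean_m) ** 2 for m in measured_n)
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return rmse, r2


# Hypothetical masses spanning the 0.9 to 8.9 kg range given in the text,
# with made-up reading errors (in newtons) added to the true weights.
masses_kg = [0.9, 2.9, 4.9, 6.9, 8.9]
known = [m * G for m in masses_kg]
errors = [0.3, -0.2, 0.4, -0.5, 0.1]
measured = [k + e for k, e in zip(known, errors)]

rmse, r2 = calibration_check(known, measured)
print(f"RMSE = {rmse:.3f} N, R^2 = {r2:.4f}")
```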
Experimental Methods and Procedures
Before testing, participants completed a 5-minute warm-up session consisting of movement of the head and neck through partial and full range of motion with passive stretching at end range, 3 to 5 self-resisted submaximal isometric contractions in the directions of testing (extension, flexion, protraction, left lateral bending, right lateral bending), and 1 to 2 self-resisted MVICs. Participants reported that none of these maneuvers caused pain or discomfort. Thereafter, participants were fitted with the hockey helmet, seated in the device, and restrained. They were instructed to assume a comfortable, neutral position of the head and neck, and this self-determined neutral position was recorded using a 3-Space Isotrak digitizer system (Polhemus Inc, Colchester, VT). Next, the helmet was coupled firmly to the fixed frame, and its position was adjusted to correspond with the recorded, uncoupled, neutral position. Participants performed 2 to 3 submaximal 3-second contractions in each of the 5 directions until comfortable with the experimental tasks. Next, they performed 1 MVIC practice trial in each direction.
For the experimental session, participants were instructed to reach their maximum forces as fast as possible and to hold the force levels to the end of the trial. Participants performed 4 MVICs in 5 directions: extension, flexion, protraction, left lateral bending, and right lateral bending. Protraction was defined as the maximal forward gliding or anterior translation of the head without sagittal-plane rotation.15 This movement effort was achieved by instructing participants to push against the anterior padding of the helmet. To achieve flexion efforts, participants were instructed to push their chins against the chin pad in an effort to rotate their heads in the sagittal plane. Each trial lasted for 4 seconds, and a 30-second rest period was provided between trials. The order of directions was randomized fully both within and between participants using a computerized random-number generator.
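The randomization of trial order could be generated along the following lines. This sketch assumes that all 20 trials (4 MVICs in each of 5 directions) were shuffled together for each participant; the text does not state whether trials were blocked by direction, and the function name and seeding scheme are hypothetical.

```python
import random

DIRECTIONS = [
    "extension",
    "flexion",
    "protraction",
    "left lateral bending",
    "right lateral bending",
]
TRIALS_PER_DIRECTION = 4


def randomized_trial_order(seed=None):
    """Return a shuffled list of 20 trial labels for one participant."""
    rng = random.Random(seed)  # a seed makes the order reproducible
    trials = [d for d in DIRECTIONS for _ in range(TRIALS_PER_DIRECTION)]
    rng.shuffle(trials)
    return trials


# One order per participant; storing the seed would allow the same
# order to be regenerated for a retest session.
order = randomized_trial_order(seed=1)
print(order[:5])
```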
For testing, participants were instructed to keep both hands on their thighs and to rest their feet on a cardboard box. The latter enabled the examiner to audibly and visually detect whether the legs were contributing to the recorded force because pushing down on the box would collapse it and pushing to the sides would translate the box across the surface. When such contribution of the lower limbs was detected, the trial was discontinued and repeated after a 30-second rest period. Real-time visual feedback on the direction and magnitude of force application and oral feedback on performance were provided for all trials.
Signal Acquisition and Processing
Analog force signals were amplified using a Modular 600 multi-channel amplifier (RDP Group, Pottstown, PA) with a peak-to-peak range of ±10 V, frequency response of 0 to 1 kHz, common-mode rejection ratio of 110 dB at 60 Hz, and input impedance of 100 MΩ. The amplified signals were converted to digital form through a 16-bit converter (model PCI-6036E; National Instruments Corporation, Austin, TX) at a sampling rate of 2048 Hz with a dynamic range of ±5 V and analyzed offline using LabVIEW (version 8.6; National Instruments Corporation). All signals were zero-offset and low-pass filtered using a second-order, zero–phase-shift Butterworth filter with a 15-Hz cutoff before extraction of the following dependent variables (Figure 2): (1) PF (N), which was defined as the maximum force value over the trial; (2) RFD (N·s−1), which was defined as the maximal value of the slope of the force-time curve, calculated using a 50-millisecond sliding window from onset to PF; and (3) T50PF (milliseconds), which was defined as the time to reach 50% of PF. The onset of force was defined as the instant the force-time curve exceeded a value of 2 SDs above baseline levels and remained above this value for 100 milliseconds. All onsets were verified by visual inspection.
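The variable definitions above can be sketched in code. This is an illustrative reading of the definitions, not the authors' LabVIEW implementation, and it makes several assumptions: the input is already zero-offset and low-pass filtered, the baseline is taken from the first 0.5 seconds of the trial, the threshold is the baseline mean plus 2 SDs, and T50PF is timed from force onset. All function and parameter names are hypothetical.

```python
import math

FS = 2048  # sampling rate in Hz, as stated in the text


def detect_onset(force, fs=FS, baseline_s=0.5, n_sd=2.0, hold_s=0.100):
    """Index of the first sample that exceeds baseline mean + n_sd * SD
    and stays above it for hold_s seconds; None if no onset is found.
    The 0.5-s baseline window is an assumption."""
    n_base = int(baseline_s * fs)
    base = force[:n_base]
    mean = sum(base) / n_base
    sd = math.sqrt(sum((x - mean) ** 2 for x in base) / (n_base - 1))
    threshold = mean + n_sd * sd
    hold = int(hold_s * fs)
    run = 0  # count of consecutive samples above threshold
    for i, x in enumerate(force):
        run = run + 1 if x > threshold else 0
        if run >= hold:
            return i - hold + 1
    return None


def force_time_variables(force, fs=FS, window_s=0.050):
    """Return (PF in N, RFD in N/s, T50PF in ms) for one filtered trial."""
    onset = detect_onset(force, fs)
    if onset is None:
        raise ValueError("no force onset detected")
    peak_idx = max(range(onset, len(force)), key=lambda i: force[i])
    pf = force[peak_idx]
    w = int(window_s * fs)  # 50-ms window expressed in samples
    if peak_idx - w < onset:
        raise ValueError("rise from onset to peak is shorter than the window")
    # Maximal slope over a sliding window between onset and peak.
    rfd = max(
        (force[i + w] - force[i]) / (w / fs)
        for i in range(onset, peak_idx - w + 1)
    )
    # First sample at or above 50% of PF, timed from onset (assumption).
    half_idx = next(i for i in range(onset, peak_idx + 1) if force[i] >= 0.5 * pf)
    t50pf_ms = 1000.0 * (half_idx - onset) / fs
    return pf, rfd, t50pf_ms


# Synthetic trial: 1 s of rest, a linear ramp to 200 N, then a plateau.
trial = [0.0] * 2048 + [200 * k / 410 for k in range(410)] + [200.0] * 2048
pf, rfd, t50pf_ms = force_time_variables(trial)
print(pf, round(rfd, 1), round(t50pf_ms, 1))
```

The synthetic ramp is only a convenience for checking the logic; real trials are noisy and curvilinear, which is why the text adds visual verification of every detected onset.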



Retest Methods
To determine retest reliability, a second testing session was completed 7 to 8 days later and scheduled at the same time of day to control for diurnal effects. Previously recorded positions were used to standardize the participant's posture within the device across the 2 testing sessions, and the same order of MVICs was used. The same investigator (S.A.) performed all measurements and provided all oral instructions in both testing sessions.
Statistical Analysis
For each outcome variable, the 3 best scores in each of the 5 testing directions were used to calculate an average participant score for that particular direction. These average scores were used in the analyses. We used only 3 of the 4 scores obtained in each direction to calculate the average score to minimize the effects of outliers on the average score value.
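The scoring rule described above amounts to a trimmed average, sketched here. The function name is hypothetical, and treating the lowest values as "best" for T50PF (a shorter time being better) is an assumption; the text does not define "best" separately for each variable.

```python
def mean_of_best(scores, n_best=3, higher_is_better=True):
    """Average the n_best best scores from one direction's trials."""
    ranked = sorted(scores, reverse=higher_is_better)
    return sum(ranked[:n_best]) / n_best


# Hypothetical values for one participant in one direction.
pf_trials_n = [200.0, 210.0, 190.0, 205.0]
t50pf_trials_ms = [90.0, 100.0, 95.0, 120.0]

pf_score = mean_of_best(pf_trials_n)  # drops the 190 N trial
t50pf_score = mean_of_best(t50pf_trials_ms, higher_is_better=False)  # drops 120 ms

print(pf_score, t50pf_score)  # 205.0 95.0
```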
All data were evaluated for heteroscedasticity16–18 and normality of distribution (Shapiro-Wilk normality tests, P ≤ .05). To correct for positive findings of heteroscedasticity and nonnormality, data for all measures were log transformed and multiplied by 100.18,19 The corrective effects of this transformation on distribution characteristics were verified.
For indications of systematic bias, the difference in average scores between testing sessions (ie, day 2 score − day 1 score) and the corresponding 95% confidence intervals were calculated18,20:
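The expression is not reproduced in this copy of the text. A standard form for the mean retest difference and its 95% confidence interval, offered here as an assumption rather than as the authors' exact equation, is shown below, where $x_{i,1}$ and $x_{i,2}$ are the (log-transformed) day 1 and day 2 scores of participant $i$, $n$ is the number of participants, $s_d$ is the standard deviation of the difference scores, and $t_{0.975,\,n-1}$ is the corresponding critical value of the t distribution.

```latex
\bar{d} = \frac{1}{n}\sum_{i=1}^{n}\left(x_{i,2} - x_{i,1}\right),
\qquad
95\%\ \mathrm{CI} = \bar{d} \pm t_{0.975,\,n-1}\,\frac{s_d}{\sqrt{n}}
```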

Measurement precision was assessed using the typical error (TE) of measurement,18 expressed as a coefficient of variation (CVTE) to permit comparison across the different variables18–20:
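The expression is not reproduced in this copy of the text. A commonly used form for 2 testing sessions, given here as an assumption rather than as the authors' exact equation, takes the typical error as the standard deviation of the difference scores, $s_d$, divided by $\sqrt{2}$. When scores have been transformed as 100 times the natural logarithm, the typical error on that scale can be back-transformed to a percentage coefficient of variation.

```latex
\mathrm{TE} = \frac{s_d}{\sqrt{2}},
\qquad
\mathrm{CV_{TE}}\,(\%) = 100\left(e^{\mathrm{TE}/100} - 1\right)
```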

We also calculated the smallest detectable difference (SDD) value, which was used to determine the smallest change necessary for declaration of differences between measurements from the 2 testing sessions.18,20 The SDD was calculated by multiplying TE by 2.77.18 Retest correlation was calculated using ICC (3,3).13,21
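The absolute reliability indices described in this section can be combined in one short sketch. It is an illustration under stated assumptions, not the authors' SPSS, Excel, or MATLAB code: scores are transformed as 100 times the natural logarithm, the typical error is the SD of the difference scores divided by the square root of 2, results are back-transformed to percentages, and the factor 2.77 (approximately 1.96 times the square root of 2) is applied to the percent-scale typical error. The function name is hypothetical, the data are made up, and the t critical value is supplied by the caller because the Python standard library has no t distribution.

```python
import math


def reliability_stats(day1, day2, t_crit):
    """Retest indices from paired raw scores (one pair per participant).

    t_crit is the two-sided 95% critical t value for n - 1 degrees of
    freedom (e.g., about 2.060 for n = 26).
    """
    x1 = [100.0 * math.log(v) for v in day1]
    x2 = [100.0 * math.log(v) for v in day2]
    diffs = [b - a for a, b in zip(x1, x2)]  # day 2 minus day 1
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    half_width = t_crit * sd_d / math.sqrt(n)
    te = sd_d / math.sqrt(2.0)  # typical error on the log scale

    def to_percent(v):
        # Back-transform a 100*ln value to a percentage.
        return 100.0 * (math.exp(v / 100.0) - 1.0)

    cv_te = to_percent(te)
    return {
        "change_pct": to_percent(mean_d),
        "ci_pct": (to_percent(mean_d - half_width), to_percent(mean_d + half_width)),
        "cv_te_pct": cv_te,
        "sdd_pct": 2.77 * cv_te,  # assumption: applied to percent-scale TE
    }


# Hypothetical peak-force scores (N) for 4 participants; t(0.975, 3) = 3.182.
day1 = [180.0, 210.0, 195.0, 240.0]
day2 = [186.0, 204.0, 199.0, 238.0]
stats = reliability_stats(day1, day2, t_crit=3.182)
print({k: v for k, v in stats.items() if k != "ci_pct"}, stats["ci_pct"])
```

A confidence interval for the change that overlaps zero would be read, as in the Results, as no evidence of systematic bias between sessions.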
All statistical analyses were performed with SPSS (version 15; SPSS Inc, Chicago, IL), Excel (2007; Microsoft Corporation, Redmond, WA), and MATLAB (version 7.5; The MathWorks Inc, Natick, MA).
RESULTS
The results for each of the dependent variables are summarized in the Table. Percent differences in measurements across the 2 testing sessions were low across all directions (range, −1.8% to 2.7%), and the corresponding 95% confidence interval ranges were less than 10% and overlapped zero, which was indicative of no differences in measurements between the testing sessions. The CVTE across all directions ranged from 2.4% to 6.3% for PF and increased modestly for RFD (range, 4.8%–9.0%) and T50PF (range, 7.1%–9.3%). Accordingly, SDD values were lowest for PF (range, 6.6%–17.4%) and higher for RFD (range, 13.3%–25.0%) and T50PF (range, 19.8%–25.7%). Retest correlations were high across all variables and testing directions, with ICC (3,3) values ranging from 0.90 to 0.99.
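As a consistency check on the reported ranges, the SDD values can be compared with 2.77 times the corresponding CVTE values. The sketch below uses only the range endpoints quoted in this section and assumes that the lowest CVTE pairs with the lowest SDD (and likewise for the highest); small discrepancies are expected because the published values are rounded.

```python
# (CVTE range, SDD range) in percent, as reported in the Results.
reported = {
    "PF": ((2.4, 6.3), (6.6, 17.4)),
    "RFD": ((4.8, 9.0), (13.3, 25.0)),
    "T50PF": ((7.1, 9.3), (19.8, 25.7)),
}

max_gap = 0.0
for name, (cv_range, sdd_range) in reported.items():
    for cv, sdd in zip(cv_range, sdd_range):
        predicted = 2.77 * cv
        max_gap = max(max_gap, abs(predicted - sdd))
        print(f"{name}: 2.77 x {cv} = {predicted:.1f} (reported {sdd})")

print(f"largest discrepancy: {max_gap:.2f} percentage points")
```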

DISCUSSION
We used a custom-built strength-testing apparatus that is specific to sport populations, and we provide evidence of retest measurement reliability for discrete variables that might be meaningful for the training of athletes involved in collision sports. The testing apparatus addresses some of the limitations in technology and methods regarding measurement of neck muscle capabilities.12,22 Specifically, the device uses a commercially available hockey helmet to couple the head and neck to the load cell. We believe this improves participants' comfort and willingness to exert ballistic efforts, especially if the helmet is worn routinely during sport participation. In addition, the use of an adjustable sport protective helmet that includes a chin-strap system allows for the independent measurement of forces in the directions of flexion and protraction. However, the type of helmet used does not allow for valid recording of head rotational efforts because no support is provided to the sides of the mandible and skull, against which participants can push.
When comparing the values we obtained with previously reported data10,11,23–25 in healthy populations, we observed that our results agreed with the observations that the largest maximal force occurs with efforts in extension. In addition, lateral-bending PFs in our investigation were within 10% of those measured for flexion, which is in agreement with the results of others.24,26,27 We found only 1 study in which investigators reported neck muscle strength indices other than PF. Specifically, Valkeinen et al28 reported RFD values for men and women of different ages performing isometric extension and flexion efforts in a neutral head posture. When comparing our results with those of the age-matched and sex-matched groups, the RFD values in our study were 10% higher for flexion and 30% higher for extension. These differences might be attributed to the methods for calculating RFD values, which Valkeinen et al28 did not specify, and to possible differences in physical conditioning between the 2 participant populations.
Regarding the reliability indices we reported, the quantification of the percent difference of measurements between testing sessions was meant to identify systematic bias that might be introduced by factors such as test-retest learning, motivational differences, or insufficient recovery time.29 We found no indications of systematic bias for any of the dependent variables between the testing sessions. We attributed this finding to the high physical activity level of our participants, which we believe facilitated fast learning of task performance.
The TE is a measure of the within-subjects variation in performance between testing sessions and, as such, provides an estimate of the precision of the measured variables. Therefore, the magnitude of the TE directly influences the ability to detect statistically meaningful changes in performance. Regarding variables measured in our study, the low magnitude of the TE and the corresponding SDDs for PF would allow changes in performance of 18% to be declared as significant differences. This is well within the expected range of change in neck PF after intervention programs, which has been reported4–6 to range between 24% and 64%. However, the magnitudes of the TEs and corresponding SDDs obtained for RFD and T50PF, in particular, were higher and would prevent changes in measured performance of less than 20% from being declared different. Improving the precision of measurement for these variables would be pertinent for ensuring that changes in performance result from an intervention and also for validating the modeling predictions of Viano et al3 that small increases in neck stiffness have a substantive effect on damping the initial mechanical response of the head to impact loads during a player-to-player collision. To address this issue, future researchers should evaluate the effectiveness of including a familiarization session before testing.22,23 However, RFD values obtained from different muscle groups after training have been documented30–32 to improve between 17% and 33%, and improvements are expected to be accentuated in individuals who are unaccustomed to or untrained in this type of effort.33 Thus, whereas lower TE and corresponding SDDs for RFD and T50PF clearly would be beneficial, the magnitudes we obtained probably would suffice to detect changes in performance after an intervention program, especially if participants are not accustomed to performing MVICs of the neck musculature.
Relative reliability, which is commonly assessed by some form of retest correlation (Pearson r or ICC), allows for the assessment of rank-order maintenance among participants between testing sessions.13,18,34 The ICC values we obtained for all dependent measures were considered high and were in agreement with previous investigations in which the authors10,23,26,27 reported this measure for PF values of isometric neck exertions assessed in a seated position using fixed-frame dynamometry. Some investigators12,16,18,22 have debated the usefulness of the ICC in the assessment of reliability, particularly because of the inherent sensitivity of the ICC to score heterogeneity. We tried to minimize the effect of this confounding factor by studying a population sample that was highly homogeneous in physical characteristics and amount of athletic training. In addition, ICC values might be affected by the inclusion of systematic error components and by the use of single trial values versus the average of several trials.13,34 We used an ICC model that does not take systematic bias into account because we found no evidence that such error existed.13,34 In addition, the use of the average scores, as opposed to the single best score in the ICC calculation, is justified by our interest in measuring variables that would be relevant to contact sport settings, in which athletes exert maximal muscle efforts over a short period but repetitively over the course of a game.
CONCLUSIONS
Measures of neck muscle force generation, which might be important for decreasing the chances of injury to the head and neck in contact sports, can be assessed with an acceptable degree of reliability using the standardized measurement device and testing protocol detailed in our study. Future research using our methods will aim to clarify the association between neck muscle strength and the incidence of neck injuries and concussions in at-risk athletic populations and to determine the responsiveness of measurements in athletes participating in strength-training programs as part of a multifaceted approach to the prevention of head and neck injuries in contact sports.

Figure 1. Standardized position of the participant in the neck strength testing device. Adapted from Almosnino S, Pelland L, Pedlow SV, Stevenson JM. Between-day reliability of electromechanical delay of selected neck muscles during performance of maximal isometric efforts. Sports Med Arthrosc Rehabil Ther Technol. 2009;1:22. doi:10.1186/1758-2555-1-22. Published September 23, 2009. BioMed Central Ltd, London, United Kingdom.

Figure 2. A typical force versus time curve of maximal effort. Peak force, the time to 50% of peak force, and the rate of force development, calculated as the change in force (dF) over a unit of time (dt), are indicated.