Treatment Success of Hip and Core or Knee Strengthening for Patellofemoral Pain: Development of Clinical Prediction Rules
Context: Patellofemoral pain (PFP) is a common injury that interferes with quality of life and physical activity. Clinical subgroups of patients may exist, one of which is caused by proximal muscle dysfunction.
Objectives: To develop clinical prediction rules that predict a positive outcome after either a hip and core- or knee-focused strengthening program for individuals with PFP.
Design: Secondary analysis of data from a randomized control trial.
Setting: Four university laboratories.
Patients or Other Participants: A total of 199 participants with PFP.
Intervention(s): Participants were randomly allocated to either a hip and core-focused (n = 111) or knee-focused (n = 88) rehabilitation group for a 6-week program.
Main Outcome Measure(s): Demographics, self-reported knee pain (visual analog scale) and function (Anterior Knee Pain Scale), hip strength, abdominal muscle endurance, and hip range of motion were evaluated at baseline. Treatment success was defined as a decrease in visual analog scale score by ≥2 cm or an increase in the Anterior Knee Pain Scale score by ≥8 points or both. Bivariate relationships between the outcome (treatment success) and the predictor variables were explored, followed by a forward stepwise logistic regression to predict a successful outcome.
Results: Patients with more pain, better function, greater lateral core endurance, and less anterior core endurance were more likely to have a successful outcome after hip and core strengthening (88% sensitivity and 54% specificity). Patients with lower weight, weaker hip internal rotation, stronger hip extension, and greater trunk-extension endurance were more likely to have success after knee strengthening (82% sensitivity and 58% specificity).
Conclusion: The patients with PFP who have more baseline pain and yet maintain a high level of function may experience additional benefit from hip and core strengthening. The clinical prediction rules from this study remain in the developmental phase and should be applied with caution until externally validated.
Patellofemoral pain (PFP) is characterized by aching pain in the peripatellar area that is exacerbated by activities such as climbing stairs, squatting, jumping, running, and sitting with the knees flexed for prolonged periods of time.1 This is the most often reported musculoskeletal overuse injury, with incidence rates from 9% to 15% in active individuals including runners, military recruits, and triathletes.2–5 In a prospective study6 of runners, knee pain was the most frequently reported running-related injury. The economic burden of running-related injuries, of which PFP is the most common, has been estimated to be €172 (US $202) in health care utilization costs per running-related injury and €1849 (US $2172) per 1000 hours of running.6 The chronic pain associated with PFP often interferes with work, daily activities, and exercise, leading to reductions in both quality of life and overall physical activity.7 Reduced physical activity is a significant problem because it leads to concomitant health concerns such as obesity and cardiovascular disease. Furthermore, PFP is a known precursor to the development of knee osteoarthritis.8,9 Thus, we have health care and economic reasons to improve the prevention and treatment of PFP.
Most individuals with PFP have successful short-term outcomes after rehabilitation,10,11 but the majority of patients continue to exhibit recurring bouts of knee pain over the long term. For example, follow-up studies5,12,13 ranging from 5 to 20 years after rehabilitation have shown that 25% to 91% of patients with PFP reported symptoms that affected their daily life or physical activity. The multifactorial causes make identifying and treating the source of the symptoms in a targeted way very difficult, and the resulting body of literature is conflicting. The inconsistent outcomes may be due in part to subgroups of PFP, such as that caused by proximal muscle, quadriceps muscle, or foot dysfunction; some participants might not have the dysfunction that the intervention is attempting to correct.7,10,14,15 Another reason may be the paucity of research concerning the development of clinical prediction rules (CPRs) that would help clinicians make evidence-informed decisions regarding optimal rehabilitation.
A CPR is a tool used to identify the clinical characteristics of patients who are likely to respond positively to a specific type of rehabilitation intervention.16,17 For other musculoskeletal injuries, such as low back pain, CPRs have been developed and successfully applied to improve patient outcomes.17,18 The first step in developing a CPR is to derive a rule by proposing a hypothesis and testing the factors that may have predictive ability. The second step is to validate its accuracy in narrowly focused and then in broadly focused populations and settings. The third step is to verify the ability of the rule to change clinician behavior, improve patient outcomes, or reduce health care costs.16 In this study, we focus on step 1, deriving a rule based on sound theoretical hypotheses that a set of clinical factors will predict the outcome of an intervention for PFP.
The hypothesis on which proximal strengthening is based is that dynamic malalignment during movement is caused by poor control of the pelvis, hip, and lower extremity.19 Abnormal alignment, including a contralateral pelvic drop, increased femoral adduction with internal rotation, and dynamic knee valgus, causes unusual stresses on the patellofemoral joint that eventually lead to an inflammatory response and pain. The cause of dynamic malalignment is proposed to be dysfunction (ie, strength, neuromuscular control, fatigability) of the proximal hip and core (defined as the abdominal and trunk muscles that stabilize or move the spine) muscles.15,20 Authors of a systematic review21 reported that proximal-strengthening interventions (in 8 studies) improved pain and function. Whereas emerging evidence has shown positive outcomes for patients who performed a hip- and core-focused strengthening protocol,21–23 we do not know which clinical factors could identify the subgroup of patients who may benefit most from this treatment strategy versus those who may benefit more from a knee-focused strengthening approach. Furthermore, although a knee-focused muscle-strengthening program is considered the criterion standard, only 80% of knee-focused interventions improved pain and only 75% improved function.21 Identifying the clinical factors that predict patient success in a knee-focused strengthening program will also be useful to clinicians.
Clinical prediction rules for PFP have been developed for foot orthoses,14,24,25 spinal manipulation,26,27 and patellar-taping28 interventions. Researchers29–32 have identified prognostic factors that can be measured at baseline to recognize individuals who are likely to have poor outcomes. To date, no investigators have included hip or pelvic measurements in the models to predict treatment success after rehabilitation, despite growing evidence that proximal dysfunction is characteristic of a potential PFP subgroup.33 Therefore, the primary purpose of our study was to develop a CPR that would incorporate a set of clinically measurable factors that predict a positive outcome after a hip- and core-strengthening program for individuals with PFP. The second purpose was to develop a CPR that would predict a positive outcome after a traditional knee-focused strengthening program. The goal of this study was to increase knowledge of the characteristics that may identify subgroups of patients with PFP.
METHODS
We conducted a secondary analysis of data from a published randomized controlled trial comparing the effectiveness of a hip- and core-focused versus a knee-focused strengthening program for adults with PFP. An overview of the trial's methods is presented here, with details available elsewhere.23 The time point for determining treatment success was the end of the 6-week intervention.
Participants
Inclusion and exclusion criteria were consistent with those used for PFP-related research.22,34 A total of 199 participants were randomly allocated to either a hip- and core-focused (n = 111) or a knee-focused (n = 88) rehabilitation group by a blinded investigator. Baseline demographics of the group are available elsewhere.23
Intervention
Participants completed a 6-week rehabilitation program consisting of visits to an athletic trainer up to 3 times per week along with a home exercise program. Details of the intervention have been previously published23; however, a brief description follows. The hip and core rehabilitation program began with non–weight-bearing exercises to establish good volitional control of the hip musculature and progressed to exercises focusing on strengthening those muscles in a functional position. The program also included exercises that focused on increasing abdominal muscle activation and balance control. The knee-focused program started with non–weight-bearing quadriceps strengthening and progressed to closed chain, double-legged squatting activities. Activating the hip or core musculature during these activities was not emphasized. Participants progressed through the protocol at individual paces as guided by the treating athletic trainer.
Definition of Treatment Success
A 10-cm visual analog scale (VAS) was used to measure self-reported “worst pain during the previous week's physical activity,” and the Kujala Anterior Knee Pain Scale (AKPS)35 measured self-reported functional ability. A successful outcome was defined as a decrease in the VAS score by ≥2 cm or an increase in the AKPS score by ≥8 points or both.36
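The success criterion above can be written as a small check. The following is a minimal Python sketch, not the authors' code; the function name, argument names, and the example values are hypothetical and chosen only for illustration.

```python
def is_treatment_success(vas_pre_cm, vas_post_cm, akps_pre, akps_post):
    """True if VAS pain fell by >= 2 cm, AKPS function rose by >= 8 points, or both."""
    vas_improvement = vas_pre_cm - vas_post_cm   # positive means less pain
    akps_improvement = akps_post - akps_pre      # positive means better function
    return vas_improvement >= 2.0 or akps_improvement >= 8


# Hypothetical participants (values invented for illustration only)
print(is_treatment_success(5.5, 3.0, 75, 78))   # pain criterion met
print(is_treatment_success(3.0, 2.5, 70, 80))   # function criterion met
print(is_treatment_success(3.0, 2.0, 70, 75))   # neither criterion met
```

Because the two criteria are joined by "or," a participant who improves on only one scale still counts as a success.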
Baseline Predictor Variables
A range of baseline predictor variables have been included in previous CPR studies. Although the inclusion of demographic and symptom-related predictors was fairly consistent, other clinical measures were selected to align with the theoretical model on which the intervention was based.14,24–28,37 We also examined prognostic studies30–32 on treating PFP for baseline variables that predicted treatment success. On the basis of previous studies showing predictive relationships,24,25,27,30–32 along with the theoretical framework on which the hip and core and knee exercise interventions were based,19 the following baseline predictor variables were included: demographic factors; characterization of knee symptoms; strength of the knee extensors; strength of the hip muscles; endurance of the anterior, lateral, and posterior core muscles; and flexibility and range of motion of the hip (Table 1). In addition, baseline VAS and AKPS scores were included as predictive variables. We used a handheld dynamometer and strapping technique to assess hip-muscle and quadriceps strength.23,38 Core endurance (ability to hold the plank position, seconds) was assessed using the front-plank (anterior), side-bridge (affected side, lateral), and horizontal-extension tests.22,39 Hip-abductor–muscle and iliotibial band flexibility was assessed using the modified Ober test.40 Hip-extension flexibility was assessed using the Thomas test,41 and the thigh angle was measured using a digital inclinometer.42 Passive range of motion of the hip in internal and external rotation was measured with the participant in a seated position with the legs hanging off the examination table. A goniometer was used to assess the angle of the lower leg in degrees from vertical.

Statistical Analysis
Many previous CPRs in the rehabilitation literature have methodologic limitations or involve inappropriate statistical analyses or both. Therefore, they must be interpreted with caution and often are not carried into the validation phase of development.43 A common statistical approach is to dichotomize the continuous variables and determine cutoff scores using receiver operating characteristic (ROC) curves. The benefit of this approach for clinicians is the ease of measuring a patient's characteristics and then comparing these measures with a known score that may predict treatment success.43 However, the early dichotomization of continuous variables can weaken the predictive ability of the model and has even yielded contrasting results when compared with retaining variables as continuous.44 Thus, using the logistic regression approach with continuous variables to derive the prediction rule is recommended because it maximizes the classification accuracy and avoids the misclassification errors that are often associated with early dichotomization of the continuous variables.43 Our analyses were based on these recommendations.43
Two predictive models were developed: 1 to predict success after hip and core strengthening and 1 to predict success after knee strengthening. The first step of the analysis was to investigate bivariate relationships between the outcome of treatment success and the predictor variables. The association of the outcome with the dichotomous variable of sex was analyzed using a χ2 test, and associations with all continuous variables were analyzed using independent-samples t tests. The distributions of continuous variables were summarized separately for successes and failures using means and standard deviations. Sex was summarized by group-specific frequencies.
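For readers who want to see the bivariate step concretely, the sketch below computes the two test statistics in plain Python. It is an illustration under stated assumptions rather than the analysis actually run in R: the equal-variance (pooled) form of the t test and the uncorrected Pearson χ2 are assumed, because the text does not specify which variants were used, and all function names are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(successes, failures):
    """Independent-samples t statistic (equal variances assumed) and its degrees of freedom."""
    n1, n2 = len(successes), len(failures)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(successes) + (n2 - 1) * variance(failures)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(successes) - mean(failures)) / standard_error, df


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


# Invented data: a continuous predictor split by outcome, and a sex-by-outcome table
t_stat, df = pooled_t_statistic([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
chi2, p = chi_square_2x2(20, 10, 10, 20)
print(t_stat, df, chi2, p)
```

A P value for the t statistic would require the t distribution, which the standard library does not provide; a statistics package would normally supply it.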
Logistic regression was used to predict a successful outcome, and a forward stepwise variable selection method was used to build the final parsimonious model defined by the smallest Akaike Information Criterion. All 2-way interactions between the variables retained in the model were investigated for inclusion in the model along with the main effects. All variables significant at P = .2 were investigated for inclusion in the parsimonious models. The ROC curve, area under the ROC curve, sensitivity, and specificity at selected cutoffs were assessed using leave-one-out cross-validation (LOOCV). The ROC curve was used to determine the cutoff value that minimized the misclassification error of the model, and this cutoff was used to calculate the sensitivity and specificity. The LOOCV method is an internal-validation method used to substitute external validation when an external-validation dataset is not available. The LOOCV cannot fully substitute external validation but produces more realistic estimates of various quantities such as the area under the ROC curve, sensitivity, and specificity. The statistical analysis was performed using the open-source software R 3.1.1 (https://www.r-project.org). Two-tailed Wald tests were used for statistical significance testing as defined by P < .05. Positive likelihood ratios (+LR) were determined for both the hip- and core-focused and knee-focused groups by dividing the sensitivity by (1–specificity) for both predictive models.
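The cutoff selection and the accuracy measures described above can be sketched as follows. This Python example is illustrative only: it assumes that cross-validated predicted probabilities are already available (model fitting, stepwise selection, and the LOOCV loop are not shown), and the function names and example data are hypothetical.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """+LR = sensitivity / (1 - specificity); infinite when specificity is 1."""
    if specificity >= 1.0:
        return float("inf")
    return sensitivity / (1.0 - specificity)


def confusion_counts(probs, outcomes, cutoff):
    """Count TP, FP, TN, FN when success is predicted for probability > cutoff."""
    tp = fp = tn = fn = 0
    for prob, outcome in zip(probs, outcomes):
        predicted_success = prob > cutoff
        if predicted_success and outcome:
            tp += 1
        elif predicted_success:
            fp += 1
        elif outcome:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def misclassification_minimizing_cutoff(probs, outcomes):
    """Return the candidate cutoff with the fewest misclassified cases (first one on ties)."""
    candidates = [0.0] + sorted(set(probs))

    def errors(cutoff):
        _, fp, _, fn = confusion_counts(probs, outcomes, cutoff)
        return fp + fn

    return min(candidates, key=errors)


def sensitivity_specificity(probs, outcomes, cutoff):
    tp, fp, tn, fn = confusion_counts(probs, outcomes, cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Invented predicted probabilities and observed outcomes (1 = success, 0 = failure)
probs = [0.9, 0.85, 0.8, 0.7, 0.6, 0.4]
outcomes = [1, 1, 1, 0, 1, 0]
cutoff = misclassification_minimizing_cutoff(probs, outcomes)
sens, spec = sensitivity_specificity(probs, outcomes, cutoff)
print(cutoff, sens, spec, positive_likelihood_ratio(sens, spec))
```

Minimizing total misclassification weights false positives and false negatives equally; when successes far outnumber failures, this tends to favor sensitivity over specificity.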
RESULTS
Outcomes and Compliance
According to an a priori definition of treatment success, 89 participants in the hip and core group were successful and 22 were not successful at the completion of the 6-week intervention. In the knee group, 88 participants were analyzed; 68 participants were successful and 20 were not. Participants were compliant if they self-reported completing their exercises 6 days a week for 6 weeks. For the hip and core group, 80.3% of participants were compliant (mean = 4.82 ± 1.90 d/wk). For the knee group, 81.7% of participants were compliant (mean = 4.90 ± 1.82 d/wk).23 The moderate (6-month) and long-term (24-month) outcomes of the rehabilitation intervention were also excellent. Participants in both groups who had successful outcomes were able to maintain their improvements in pain and function 6 months after the rehabilitation program while maintaining their level of physical activity.45 Furthermore, the recurrence of PFP symptoms was only 5.10% over the 24 months postrehabilitation.45
Predictive Model for the Proximal-Strengthening Approach
Of the predictors entered in the bivariate regression, age, pain at baseline, self-reported function at baseline, and endurance of the lateral, posterior, and anterior core muscles were carried forward into the stepwise logistic regression due to their univariate associations or clinical reasoning or both. The final model was built to predict success after hip and core strengthening and consisted of 6 simple effects and 1 interaction (age and posterior trunk endurance; Table 2). These patients were more likely to achieve treatment success if they exhibited greater self-reported pain, higher self-reported function, greater endurance of the lateral trunk muscles, and less endurance of the anterior trunk muscles at baseline. The presence of a significant interaction between age and posterior core endurance creates certain difficulties with the interpretation of their combined effects on success, but the presence of these 2 variables along with their interaction significantly improves the predictive properties of the logistic regression model. The cross-validated area under the ROC curve was 78.8%. Using Table 2, we built a linear predictor score:
.

Then we can calculate the probability of success as $P = e^{\text{score}} / (1 + e^{\text{score}})$.
If P > .765, then we expect a patient to have successful treatment. This prediction tool showed 88% sensitivity and 54% specificity after internal validation (on the basis of the LOOCV analysis).
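To illustrate how such a rule would be applied to a single patient, the sketch below converts a linear predictor score into a probability with the standard logistic function and compares it with the reported cutoff. The coefficients, intercept, and patient values are placeholders invented for this example; they are not the Table 2 estimates, and the function names are hypothetical.

```python
import math


def success_probability(intercept, coefficients, values):
    """Linear predictor score -> probability via the logistic function."""
    score = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-score))


def predicts_success(probability, cutoff=0.765):
    """Default cutoff is the value reported for the hip and core model (P > .765)."""
    return probability > cutoff


# Placeholder coefficients: NOT the published model, for illustration only
demo_coefficients = {"vas_cm": 0.30, "akps": 0.02}
demo_patient = {"vas_cm": 6.0, "akps": 75}
p = success_probability(-1.5, demo_coefficients, demo_patient)
print(round(p, 3), predicts_success(p))
```

An interaction term, such as the age by posterior core endurance interaction in the hip and core model, could be handled by passing the product of the two measurements as an additional entry in `values` with its own coefficient.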
Predictive Model for the Knee-Focused Strengthening Approach
For the knee-focused group, the predictors that were carried forward into the stepwise logistic regression were age, weight, sex, hip internal-rotation strength at baseline, hip-extension strength at baseline, posterior core endurance at baseline, and iliotibial band (ITB) flexibility at baseline. The final model was built to predict success after the knee-focused strengthening and consisted of 7 simple main effects (Table 3). The factors associated with treatment success were lower weight, less hip internal-rotation strength, greater hip-extension strength, less posterior core endurance, and less ITB flexibility. Although age and sex were not formally significant at the .05 level, removing these factors did not improve the quality of the model, so we left them in during this preliminary development phase. The cross-validated area under the ROC curve was 74.7%. The predictive equation for the knee group was
.

Then we can calculate the probability of success as $P = e^{\text{score}} / (1 + e^{\text{score}})$.
If a patient's resulting P was greater than .70, he or she would be predicted to have a successful treatment using a knee-focused muscle-strengthening approach. This prediction tool showed 82% sensitivity and 58% specificity after internal validation (based on the LOOCV analysis).
DISCUSSION
As previously reported, both the hip- and core- and knee-focused exercise programs resulted in reduced pain, improved function, and greater strength in patients with PFP.23 However, participants in the proximally focused program demonstrated an earlier resolution of symptoms, greater overall strength gains, and greater improvements in core endurance compared with the quadriceps-focused (standard-of-care) group.23 Thus, it was not surprising that the CPR model for the hip and core group differed from that for the knee-focused group. The predictive model for the hip and core group revealed that individuals with PFP who had more pain and greater functional ability and those who had greater lateral trunk muscle endurance but less anterior trunk muscle endurance were most likely to have successful outcomes after the hip- and core-focused exercise program. Moreover, this model was much better at predicting who would be successful (sensitivity = 88%) than predicting who would not (specificity = 54%).
The predictive model for the knee-focused group revealed that patients with PFP who had less weight, less hip internal-rotation strength, greater hip-extension strength, less posterior core endurance, and less flexible ITBs were most likely to have successful outcomes after quadriceps strengthening. The model was again better at predicting true successes (sensitivity = 82%) than true failures (specificity = 58%).
The +LR is a useful tool for clinicians to evaluate how much a positive test result changes the probability of an outcome. A rule of thumb for interpreting likelihood ratios is that if the +LR is 2, one can expect an approximate 15% change in the probability of a successful outcome.46 If a patient with PFP in the proximal group had a score on the predictive model above 0.76, the +LR was 1.9, indicating that the probability for success with proximal strengthening was only slightly increased as compared with not using the predictive model. For a patient in the quadriceps group whose score from the predictive model was above 0.70, the +LR was 1.93. In our study, the success rate in the proximal group was 80%.23 Applying the approximate 15% change associated with this +LR raised that to 95% for the proximal group. In the quadriceps group, the success rate was 77%, and applying the +LR from the predictive model raised it to 92%. Although it could be argued that a 77% to 80% success rate is already very high, and thus the clinical effect of the CPR is minimal, we feel that this is a marked improvement considering the known negative outcomes of prolonged or recurring PFP (eg, reduced activity, patellofemoral osteoarthritis).
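The rule of thumb cited above is a linear approximation. As a cross-check, and not as the calculation used in this study, the posttest probability can also be obtained exactly by converting the pretest probability to odds, multiplying by the likelihood ratio, and converting back. The sketch below does this in Python; the function name is hypothetical, and the inputs are the success rates and +LR values reported in the text.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Exact odds-based update: posttest odds = pretest odds x likelihood ratio."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Reported success rates and +LRs: proximal group (80%, 1.9), quadriceps group (77%, 1.93)
for label, pretest, plr in [("proximal", 0.80, 1.9), ("quadriceps", 0.77, 1.93)]:
    print(label, round(posttest_probability(pretest, plr), 3))
```

When the pretest probability is already high, the exact conversion tends to yield a smaller gain than the 15-percentage-point approximation, so the two approaches should not be expected to agree closely at success rates near 80%.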
Previous authors showed trends toward better success for those with less pain and better reported function after an orthosis intervention24,25,47 and exercise therapy.30,32 However, in these studies, self-reported pain was dichotomized into “low pain” or “high pain” on the basis of an ROC curve analysis, and the cutoff values for low pain varied from 22 mm24 to 53 mm.25 Thus, we chose to maintain pain level as a continuous variable in order to maximize its predictive power. We found it interesting that the hip and core model demonstrated that more pain at baseline predicted treatment success, though this factor did not predict success in the knee-focused group. Regardless, it is also possible that patients with PFP who had low levels of self-reported pain demonstrated improvements but were not able to meet our a priori definition of treatment success due to less room for improvement on the basis of a low level of initial pain. We also found that higher patient-reported function predicted success after the hip- and core-focused exercise program. This finding agrees with the results of a previous study25 that also showed higher function predicted success after an orthotic intervention. We found it interesting that symptom duration was not a predictive factor in our study; this factor has been reported by previous researchers,14,24,25,27 although the direction of the relationship (shorter or longer duration leads to greater success) was not clear.
Increased pain severity and duration were associated with poor prognoses in other musculoskeletal conditions.48 However, assessing pain chronicity using only duration has been shown to be of limited value in patients with low back pain.49 Contrary to common thought, a relationship between pain intensity and disability was not present in patients with PFP.50 Whereas pain is well known as a multidimensional phenomenon, including both the sensory and motivational-affective domains,51 most clinical researchers have considered only the sensory domain (severity). Thus, 1 explanation for the lack of success in establishing repeatable, clinically relevant CPRs is that the fundamental approach toward evaluating and treating this condition may be flawed. Traditional clinical research uses a biomedical approach to identify a condition and correct it via exercise or other interventions. However, adopting a biopsychosocial model to evaluate and rehabilitate PFP has recently been suggested.52 Factors that assess the motivational-affective domain of pain, such as allodynia,53 hyperalgesia,54 catastrophizing,50 and kinesiophobia,55,56 have all been related to PFP. The authors of 2 studies55,56 reported that high levels of catastrophizing and kinesiophobia predicted poor treatment outcomes for patients with PFP. Contrast this finding with the breadth of clinical measures that have not been related to treatment success,33 and it seems clear that a broader patient evaluation, including both psychosocial and clinical measurements, is needed to identify additional factors that may better predict treatment outcome.
We chose to measure factors that were anchored in the clinical theories of the causes of PFP. However, few of the predominant theoretical factors were included in the models. The theory that proximal muscle dysfunction leads to dynamic malalignment during movement, and thus increased stresses around the patellofemoral joint, formed the basis of the proximal-strengthening intervention.15,19 It is interesting that none of the hip-strength variables were retained in the final predictive model for the proximal group. In the quadriceps model, a person with weaker hip internal rotators and stronger hip extensors had a better chance of success. Whereas many investigators19,57–60 supported the theory of proximal dysfunction being related to PFP, prospective studies57,61,62 have not shown a relationship between hip strength and PFP. Based on a recent systematic review,57 the researchers concluded that only moderate evidence was available from cross-sectional investigations to support the premise that weak hip musculature was associated with PFP in men and women. One explanation of these results is that various methods of assessing hip-muscle function have been used. Pooling males and females when comparing hip strength may be improper, given that hip weakness was not found in males with PFP.63 Additional studies using multiple measures of hip-muscle function are necessary to fully understand this topic.
Core endurance has been related to PFP and included in exercise programs.64 Our proximally focused exercise program incorporated weight-bearing stability and balance exercises designed to engage the core, and cues and instructions to focus on activating and bracing the core were given. However, isolated “core-stability” exercises were not included. The contrasting results of less anterior core endurance and more lateral core endurance being related to a more successful outcome in the proximal-strengthening group are difficult to interpret. The side-plank test was used to measure lateral core endurance, as has been done in previous work.22,39 However, electromyography has demonstrated that this exercise also activates the gluteus medius muscle.65 So perhaps this position does not isolate the function of the abdominal core muscles as well as the anterior-plank test does, thereby confounding the relationship between core muscle function and the probability of success. Furthermore, in the quadriceps group, higher posterior core endurance was related to a more successful outcome. The relatively small number of unsuccessful patients also potentially limits the strength of the predictive model and could lead to spurious relationships.
Patients in the knee-strengthening group who had a successful outcome had lower weights than those who did not succeed. These factors, among others that were not included in the final predictive model (ie, weight, height, sex, duration of symptoms, hip internal and external range of motion, ITB flexibility, and hip-flexor flexibility), have not consistently predicted outcomes in previous studies.33 Height and weight were neither prognostic31 nor related to successful outcomes after an orthosis intervention.24,25 Similarly, our results could not describe a clear relationship using age as a factor due to the interaction with posterior core endurance (in the proximal model). Other authors also reported conflicting results, with older age predicting success after orthoses24,25 and yet younger age predicting success with exercise.29 Thus, future research is necessary to better understand the complex associations among these variables and their relationship to treatment success.
The benefit of CPRs is giving clinicians evidence to support clinical decisions as to which intervention to suggest for a particular patient. This is particularly true when a condition is multifactorial and subgroups of patients may exist, as is the case with PFP. However, if a patient does not appear to have a strong probability of a successful outcome from either the hip and core- or knee-strengthening intervention, several options are available. The clinician could consider other exercise-based strategies, such as a distal-based therapeutic exercise protocol along with foot orthoses32 or gait retraining.66 Or non–exercise-based interventions, such as taping, lumbopelvic manipulation, or pain-management strategies, could be considered. Finally, the clinician may want to refer the patient for further diagnostic testing to rule out a diagnosis such as a plica or internal derangement.
LIMITATIONS
Methodologic factors such as small sample size, early categorization of continuous data, and inconsistency in the definition of treatment success have been noted in previous studies. However, a post hoc analysis demonstrated that our sample size was more than adequate for the number of predictive factors examined.23 Moreover, we chose to maintain the variables as continuous throughout the analysis, which strengthens the opportunity for a successful predictive model. Predictive model building is an exploratory procedure; we acknowledge the possibility of error. We selected this approach so as not to miss any possibly significant relationships at this exploratory stage. We validated the findings of the predictive model using the LOOCV, which is a weaker yet acceptable form of validation at this stage. An important next step is external validation of the model using a different patient sample and dataset. Similar to many previous investigators, we did not include a control group in our study design. Whereas this does potentially limit the validity of the prediction rule's accuracy, we had ethical concerns regarding a control group that did not exercise, given the known benefit of exercise therapy for PFP.
CONCLUSIONS
Our purpose was to develop CPRs that would identify a set of clinically measurable characteristics to predict success after proximal strengthening and contrast them with a set of factors that predicted success after a knee-focused strengthening protocol. The predictors of success were different in each group. This finding may offer preliminary support of the theory that patients with PFP represent subgroups with different causes, though further validation of these models is necessary. In a large sample of individuals with PFP, those with a higher level of pain and function and those who had greater lateral core endurance but less anterior core endurance were most likely to have successful outcomes after the hip and core-strengthening program. Patients with less body weight, weaker hip internal-rotation strength and stronger hip-extension strength, and greater trunk endurance were more likely to be successful after the knee-strengthening program. It is important to emphasize that this analysis was in the rule-development phase of the process of establishing the CPRs and the results should be applied to patient care decisions with caution until further validation can be done.