Machine Learning Algorithm Identifies Candidates for Hip and Knee Arthroplasty Without In‑Person Evaluation


Aging populations, along with the early retirement of late-career surgeons during the COVID-19 pandemic, have resulted in a growing supply–demand mismatch for total joint arthroplasty. In the U.S., these trends are exacerbated by the consolidation of hospitals, clinics, providers, service lines, and physician organizations through mergers and acquisitions.

Within complex healthcare systems, it’s crucial to decrease what business experts call “transactional friction.” Each patient must be matched with the optimal provider at the correct time in their disease course to achieve the best therapeutic outcome. For example, it may be premature to have a patient see an orthopedic surgeon about hip arthroplasty if they don’t have radiographically confirmed advanced arthritis.

With this context in mind, Andrew K. Simpson, MD, MBA, MHS, director of Minimally Invasive Spine Surgery in the Department of Orthopaedic Surgery at Brigham and Women’s Hospital, and colleagues created the first machine learning algorithm that identifies who may or may not be a potential candidate for hip or knee arthroplasty based solely on information that could be readily available in the electronic health record. The team provides details in the Archives of Orthopaedic and Trauma Surgery.


Key to this retrospective study was a set of electronic medical records generated during the COVID-19 pandemic between March 1 and July 31, 2020, at Mass General Brigham. Surgeons evaluated 158 new patients via telemedicine alone and made specific recommendations without in-person physical examination.

All patients had osteoarthritis and were being considered for primary total hip arthroplasty (THA), total knee arthroplasty (TKA), or unicompartmental knee arthroplasty (UKA).

Using data from 70% of the patients (n=112), five machine learning algorithms (stochastic gradient boosting, random forest, support vector machine, neural network, and elastic-net penalized logistic regression) were trained to predict the primary outcome: the indication for operative intervention based on the telemedicine encounter.
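A 70/30 split of this kind can be sketched in a few lines. The article does not say how the patients were partitioned, so the example below is only an illustration: it assumes a stratified split that samples each outcome class separately and rounds the training share up, and it uses made-up labels built from the assumption that the 103 patients indicated for surgery are counted out of all 158. The function name `stratified_split` and the seed are hypothetical.

```python
import math
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), sampling each class separately.

    Assumption: the training share of each class is rounded up.
    This is one common convention, not necessarily the study's.
    """
    rng = random.Random(seed)

    # Group patient indices by outcome label.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    # Draw the training sample within each class.
    train_idx = []
    for y in sorted(by_class):
        members = by_class[y]
        k = math.ceil(train_fraction * len(members))
        train_idx.extend(rng.sample(members, k))

    # Everything not drawn for training goes to the test set.
    train_set = set(train_idx)
    test_idx = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train_idx), test_idx


# Hypothetical labels: 1 = indicated for surgery, 0 = not indicated.
labels = [1] * 103 + [0] * 55
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 112 46
```

Under these assumed class counts the split yields 112 training and 46 test patients, the same sizes reported in the study, although that agreement does not confirm how the split was actually made.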

Algorithm Components

A total of 103 patients were indicated for surgery (43% THA, 57% TKA or UKA). Variables associated with an indication for operative intervention were:

  • Radiographic degree of arthritis
  • Prior trial of intra-articular injection
  • Trial of physical therapy
  • Current opioid use
  • Current tobacco use

Algorithm Performance

In the remaining 30% of patients (n=46), the stochastic gradient boosting algorithm performed better than the other four. Its metrics were:

  • Area under the receiver operating characteristic curve: 0.83 (95% CI, 0.67–0.95)
  • Calibration intercept (measures whether the model is over- or underestimating the probabilities; a perfect score is 0): 0.13 (95% CI, −0.65 to 0.92)
  • Calibration slope (measures whether the predictor effects in the training and test set are the same; a perfect score is 1): 1.03 (95% CI, 0.52–1.86)
  • Brier score (assesses overall model performance; scores closer to zero are better): 0.15 (95% CI, 0.09–0.22), relative to 0.23 for a null model
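Two of the metrics above are simple enough to compute by hand. The sketch below is for illustration only and is not the study's code: the outcomes and predicted probabilities are invented, and the helper names (`auc`, `brier_score`, `null_brier`) are hypothetical. It also assumes that a "null model" means predicting the observed prevalence for every patient, which the article does not specify. The calibration intercept and slope are left out because they require fitting a logistic regression.

```python
def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher predicted probability than a randomly chosen
    negative case (ties count as one half).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between prediction and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def null_brier(y_true):
    """Brier score when every patient gets the observed prevalence.

    Assumption: this is what is meant by a null model here.
    """
    prevalence = sum(y_true) / len(y_true)
    return prevalence * (1 - prevalence)


# Invented toy data: 1 = indicated for surgery, 0 = not indicated.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

print("AUC:", round(auc(y_true, y_prob), 3))
print("Brier:", round(brier_score(y_true, y_prob), 3))
print("Null Brier:", round(null_brier(y_true), 3))
```

A model adds value over the null model when its Brier score is lower. If the cohort-wide rate of 103 indicated patients out of 158 is taken as the prevalence, the null Brier score under this definition works out to about 0.23, in line with the figure reported above.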

Guidance for Clinicians

The degree of radiographic arthritis may be an intuitive factor in determining whether a patient needs surgery, but the other components of the algorithm deserve emphasis. They are modifiable risk factors or treatment options, and these study results suggest they should be pursued before arthroplasty is considered.
