Artificial intelligence (AI) shows promise as a legitimate boon to the field of orthopedics, but its potential is being diluted by meaningless applications and misguided methodologies. As one measure of the hype surrounding AI, a commentary or review was published for every two original AI-based reports between 2018 and 2021.
Prem N. Ramkumar, MD, MBA, clinical fellow and principal investigator of the Orthopaedic Machine Learning Laboratory at Brigham & Women’s Hospital, and colleagues want orthopedic surgeons to better understand AI but also critically assess AI-related reporting in the peer-reviewed literature. They recently published an invited editorial in Arthroscopy: The Journal of Arthroscopic & Related Surgery.
Common but critical errors for those engaging in AI-related research include failure to (1) ensure the question is important and previously unknown or unanswered; (2) establish that AI is necessary to answer the question; and (3) recognize that model performance is more commonly a reflection of the data than of the AI itself. Without appropriate guardrails on the use of artificial intelligence in orthopedic research, there is a risk of repackaging registry data and low-quality research in a recursive peer-reviewed loop.
“AI” is an umbrella term for a machine that can simulate human behavior. Defined most simply, AI is computer automation.
“Machine learning” is a subset of AI that relies on pattern recognition. By reinforcing complex associations with positive and negative feedback, computer algorithms can be built to yield predictions that iteratively improve with newly introduced data.
AI and machine learning are not interchangeable terms.
Unlike regression models, which retrospectively analyze relatively homogeneous data formats, AI can analyze disorganized data and prospectively predict outcomes.
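The "positive and negative feedback" loop described above can be sketched in a few lines. This is a toy illustration only, using a simple perceptron on synthetic data (the two "risk factors" and the labeling rule are invented for the example, not drawn from any clinical dataset): each new example the model gets wrong nudges its weights, so predictions iteratively improve as more data arrive.

```python
# Toy sketch (synthetic data, hypothetical "risk factors"): a perceptron
# whose predictions improve as it receives feedback on new examples.
import random

random.seed(0)

def make_example():
    # Two synthetic risk factors; the true label depends on their sum.
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    y = 1 if x[0] + x[1] > 0 else 0
    return x, y

weights, bias, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0

test_set = [make_example() for _ in range(500)]

def accuracy():
    return sum(predict(x) == y for x, y in test_set) / len(test_set)

for batch in range(3):
    for _ in range(200):
        x, y = make_example()
        error = y - predict(x)   # 0 if correct; +/-1 is the "feedback"
        weights[0] += lr * error * x[0]
        weights[1] += lr * error * x[1]
        bias += lr * error
    print(f"after batch {batch + 1}: accuracy = {accuracy():.2f}")
```

The point of the sketch is the loop, not the model: accuracy starts near chance and climbs with each batch of feedback, which is the behavior the paragraph above describes.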
Misperceptions About AI-based Research
A frequent point of confusion when a journal article reports the use of AI is the focus of the study. The authors may say they were evaluating an AI-based process, but in most cases, they were using AI as a tool to evaluate data, just as in any other research report.
AI is only as good as the data that fuel it. When a report refutes the ability of an AI-based process to supersede the predictive power of traditional regression analysis, the underlying data are often low in volume or quality. Researchers engaging in AI-related research should understand that a model's power cannot be appraised by evaluating its predictive performance in isolation.
Some authors speak of applying unique ethical principles to the use of AI, which is unnecessary. The ethics of AI don’t differ from those of any other method for statistical analysis.
Hallmarks of High-Quality Publications About AI
Guidelines about reporting AI-related research are forthcoming. In the meantime, readers should expect authors to:
- State where the full code is available online, to maximize generalizability and transparency
- Report how the algorithm performs, along with the algorithm's "rationale," such as SHAP (SHapley Additive exPlanations) analysis for weighting risk factors or heatmaps for analyzing features on medical imaging
- Validate the results in an external population before making an online prediction tool available to the general public
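The last expectation, external validation, can be illustrated with a deliberately simple sketch. All data here are synthetic and the "model" is just a single tuned cutoff, chosen to make one point: a model fitted to one institution's cohort can look excellent internally yet degrade on an external population whose case mix differs.

```python
# Toy sketch (synthetic cohorts, hypothetical feature): why external
# validation matters before releasing a prediction tool.
import random

random.seed(1)

def make_cohort(n, shift=0.0):
    # One synthetic feature; the outcome rule drifts by `shift` in the
    # external population (a stand-in for a different case mix).
    cohort = []
    for _ in range(n):
        x = random.gauss(0, 1)
        y = 1 if x > shift else 0
        cohort.append((x, y))
    return cohort

internal = make_cohort(2000)
external = make_cohort(2000, shift=0.8)

def acc(cohort, cutoff):
    return sum((x > cutoff) == bool(y) for x, y in cohort) / len(cohort)

# "Model": pick the cutoff that best fits the internal cohort.
cutoffs = [c / 10 for c in range(-20, 21)]
best = max(cutoffs, key=lambda c: acc(internal, c))

print(f"internal accuracy: {acc(internal, best):.2f}")
print(f"external accuracy: {acc(external, best):.2f}")
```

The internal number flatters the model; the external number is the honest one, which is why the editorial asks for validation in an outside population before an online tool goes public.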
Misuses in AI Research
Researchers familiar with machine learning can easily use it to “repackage” previous studies, using existing registry data, to generate a clinical prediction tool. Unless the tool is externally validated in multiple populations and is prospectively applied for evaluation, the study has not accomplished much.
When evaluating a study that boasts the use of AI or machine learning, readers should ask themselves whether the question has been answered before—and whether AI was necessary to answer the question.
Forceful application of TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) guidelines to AI-based studies is inappropriate, as these guidelines were designed for regression modeling.
Meaningful Applications of AI
AI is an exciting potential tool for personalizing patient care, quantifying important metrics, and reducing the burden of certain administrative tasks. For example, research to date suggests it can:
- Predict an individual patient’s future injury risk
- Project patient-reported outcomes
- Interpret advanced imaging
- Identify implants
- Autocomplete certain fields in electronic health records
- Generate billing codes
- Report value-based metrics
- Enable remote monitoring through sensors on smartphones and wearable devices
- Assess an individual’s preoperative risk profile so reimbursement can be predetermined accordingly, reducing time spent on prior authorization
Physicians spend more than a decade training in the art, science, and humanity of diagnosis and treatment. AI should never be used to replace those activities.