New statistical tool enhances prediction accuracy

Stephanie Baum
scientific editor

Robert Egan
associate editor

An international team of mathematicians, led by Lehigh University statistician Taeho Kim, has introduced an innovative method that could significantly improve how scientists make predictions, especially in fields like health, biology, and the social sciences.
The new approach is designed to make predictions that better agree with actual outcomes. Based on this idea, researchers named it the Maximum Agreement Linear Predictor, or MALP. This prediction approach achieves higher agreement in predictions by optimizing the concordance correlation coefficient (CCC), which measures how well pairs of observations fall on the 45-degree line of a scatter plot, combining both precision—how tightly points cluster—and accuracy—how close they are to the line.
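For readers who want the formula, Lin's concordance correlation coefficient can be written as follows. The symbols are notation chosen here for illustration: $Y$ stands for the actual values and $\hat{Y}$ for the predictions, with means $\mu$, variances $\sigma^2$, and covariance $\sigma_{Y\hat{Y}}$.

```latex
\rho_c(Y, \hat{Y}) =
  \frac{2\,\sigma_{Y\hat{Y}}}
       {\sigma_Y^2 + \sigma_{\hat{Y}}^2 + \left(\mu_Y - \mu_{\hat{Y}}\right)^2}
```

The coefficient equals 1 only when every point lies exactly on the 45-degree line. A shift in the mean or a mismatch in spread inflates the denominator and pulls the value toward zero, even if the points still follow a straight line.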
The work is available on the arXiv preprint server.
Traditional methods, such as the well-known least-squares approach, tend to focus solely on minimizing average errors. While effective, they can fall short when what matters most is agreement between predicted and actual values, says Kim, an assistant professor of mathematics.
"Sometimes, we don't just want our predictions to be close—we want them to have the highest agreement with the real values," he says. "The issue is, how can we define the agreement of two objects in a scientifically meaningful way? One way we can conceptualize this is how close the points are aligned with a 45 degree line on a scatter plot between the predicted value and the actual values. So, if the scatter plot of these shows a strong alignment with this 45 degree line, then we could say there is a good level of agreement between these two."
When agreement comes up, people often think first of Pearson's correlation coefficient, since it is typically introduced early in introductory statistics courses. It remains a powerful tool. Yet Pearson's correlation measures only the strength of the linear relationship between two continuous variables, that is, how closely the data follow any straight line, not specifically the 45-degree line. For example, it could indicate a strong correlation for points that fall along a line inclined at 50 degrees or 75 degrees, as long as the data closely align with some straight line, Kim says.
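The difference between the two measures can be checked with a few lines of Python. This is a minimal sketch using made-up numbers, not data from the study; the helper names `pearson` and `ccc` are illustrative, and the CCC is computed from population-style moments (dividing by n).

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation: strength of any linear relationship."""
    return float(np.corrcoef(x, y)[0, 1])


def ccc(x, y):
    """Lin's concordance correlation coefficient (moments divide by n)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    denom = x.var() + y.var() + (x.mean() - y.mean()) ** 2
    return float(2.0 * cov_xy / denom)


# Made-up "actual" values and two sets of "predictions".
actual = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
on_45 = actual.copy()      # points exactly on the 45-degree line
steeper = 2.0 * actual     # perfectly straight, but on a steeper line

r_45, c_45 = pearson(actual, on_45), ccc(actual, on_45)
r_steep, c_steep = pearson(actual, steeper), ccc(actual, steeper)

print(f"on 45-degree line: Pearson = {r_45:.3f}, CCC = {c_45:.3f}")
print(f"steeper line:      Pearson = {r_steep:.3f}, CCC = {c_steep:.3f}")
```

Both sets of predictions give a Pearson correlation of 1, because both lie on a straight line. Only the first gives a CCC of 1; for the steeper line the CCC drops to 8/19, about 0.42.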
"In our case, we are specifically interested in alignment with a 45-degree line. For that, we use a different measure: the concordance correlation coefficient, introduced by Lin in 1989. This metric focuses specifically on how well the data align with a 45-degree line. What we've developed is a predictor designed to maximize the concordance correlation between predicted values and actual values."
The team tested the MALP approach using both computer simulations and real-world data sets, such as measurements from eye scans and body fat tests. To demonstrate the method in action, the researchers applied MALP to eye scan data from an ophthalmology study comparing two types of optical coherence tomography (OCT) devices: the older Stratus OCT and the newer, more advanced Cirrus OCT.
As clinics transition to the Cirrus system, doctors need a reliable way to convert measurements so they can compare results across time and equipment. Using high-quality scans from 26 left eyes and 30 right eyes, the team tested how well MALP could predict Stratus OCT measurements from Cirrus OCT measurements, comparing it with the traditional least-squares approach.
The results showed that MALP produced predictions that agreed more closely with actual Stratus readings, while the least-squares method was slightly better at minimizing average error, highlighting the trade-off between maximizing agreement and minimizing error.
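The same trade-off can be reproduced on synthetic data. The sketch below is not the team's code and does not use the OCT measurements. It assumes a single predictor and simulated values, and it uses a slope of sign(r) times sd(y)/sd(x) for the agreement-maximizing line, which follows from maximizing the CCC over straight lines in the one-predictor case. The function names are hypothetical.

```python
import numpy as np


def ccc(x, y):
    """Lin's concordance correlation coefficient (moments divide by n)."""
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return float(2.0 * cov_xy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2))


def fit_least_squares(x, y):
    """Intercept and slope minimizing mean squared error."""
    slope = np.mean((x - x.mean()) * (y - y.mean())) / x.var()
    return y.mean() - slope * x.mean(), slope


def fit_max_agreement(x, y):
    """Intercept and slope maximizing CCC among straight lines (one predictor).

    Assumption for this sketch: predictions are given the same mean and
    the same spread as y, i.e. slope = sign(r) * sd(y) / sd(x).
    """
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * y.std() / x.std()
    return y.mean() - slope * x.mean(), slope


# Simulated data: a noisy linear relationship (illustrative values only).
rng = np.random.default_rng(0)
x = rng.normal(50.0, 10.0, size=500)
y = 5.0 + 0.8 * x + rng.normal(0.0, 8.0, size=500)

a_ls, b_ls = fit_least_squares(x, y)
a_ma, b_ma = fit_max_agreement(x, y)
pred_ls = a_ls + b_ls * x
pred_ma = a_ma + b_ma * x

ccc_ls, ccc_ma = ccc(y, pred_ls), ccc(y, pred_ma)
mse_ls = float(np.mean((y - pred_ls) ** 2))
mse_ma = float(np.mean((y - pred_ma) ** 2))

print(f"least squares:  CCC = {ccc_ls:.3f}, MSE = {mse_ls:.2f}")
print(f"max agreement:  CCC = {ccc_ma:.3f}, MSE = {mse_ma:.2f}")
```

On the fitted sample, the agreement-maximizing line has the higher CCC and the least-squares line has the lower mean squared error, mirroring the pattern reported for the eye scan data.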
The team also tested MALP on a body fat data set containing measurements from 252 adults, including weight, abdomen size and other body dimensions. Because direct methods of measuring body fat, such as underwater weighing, are accurate but costly, researchers often rely on estimates from easier measurements.
Using these measurements to predict body fat percentage, MALP was compared with the standard least-squares method. The results echoed the eye scan study: MALP delivered predictions that more closely matched actual values, while the least-squares approach produced slightly smaller average errors—underscoring the balance between agreement and error reduction.
Kim and his colleagues found that MALP often provided predictions that better matched the actual data compared with traditional methods. However, the choice between MALP and conventional methods should depend on the goal and context of individual projects. If minimizing errors is most important, the classic methods still perform well; if agreement is key, MALP is the better choice.
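The contrast between the two goals can be made concrete for a single predictor. The derivation below is an illustrative sketch under stated assumptions (one predictor $X$, known means, variances and covariance, and a nonzero correlation $\rho$ between $X$ and $Y$). It is worked out from the CCC definition and is not a reproduction of the paper's general result.

```latex
% Linear predictor and its CCC with the actual values Y
\hat{Y} = a + bX, \qquad
\rho_c(Y, \hat{Y}) =
  \frac{2\,b\,\sigma_{XY}}
       {\sigma_Y^2 + b^2\sigma_X^2 + \left(\mu_Y - a - b\mu_X\right)^2}

% Matching the means removes the squared-bias term
a = \mu_Y - b\,\mu_X

% Setting the derivative with respect to b to zero gives b^2 = \sigma_Y^2 / \sigma_X^2
b_{\text{agreement}} = \operatorname{sign}(\rho)\,\frac{\sigma_Y}{\sigma_X},
\qquad
\rho_c^{\max} = |\rho|

% Least squares, for comparison
b_{\text{LS}} = \frac{\sigma_{XY}}{\sigma_X^2} = \rho\,\frac{\sigma_Y}{\sigma_X},
\qquad
\rho_c^{\text{LS}} = \frac{2\rho^2}{1 + \rho^2} \le |\rho|
```

In this setting the least-squares slope is the agreement slope shrunk by a factor of $|\rho|$. That shrinkage reduces average error but compresses the spread of the predictions, which lowers agreement with the actual values.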
The findings could have major implications for improving prediction tools in various fields, from medicine and public health to economics and engineering. For data scientists and researchers working on predictive models, MALP offers a promising new tool, especially when the goal is not just for predictions to be close to the truth, but to be in agreement with it.
"We need to investigate further," Kim says. "Currently, our setting is within the class of linear predictors. This set is large enough to be practically used in various fields, but it is still restricted mathematically speaking. So, we wish to extend this to the general class so that our goal is to remove the linear part and so it becomes the Maximum Agreement Predictor."
More information: Taeho Kim et al, Maximum Agreement Linear Predictors, arXiv (2023).
Journal information: arXiv
Provided by Lehigh University