Research workflow. The framework consisted of four major steps: calculating aboveground biomass using forest inventory and analysis field data; identifying remote sensing data layers and extracting data; training a random forest model; and applying the model. Credit: Scientific Reports (2025). DOI: 10.1038/s41598-025-15585-6
A tree can sequester quite a bit of carbon, and forests can sequester and store quite a bit more, but knowing exactly how much is important for many reasons, from planning and management decisions to assessing a forest's health. However, estimating how much carbon is stored in a forested landscape is a tedious and time-consuming process.
Researchers from the University of Connecticut are working on a way to speed up the process using remote sensing data. Their findings are in Scientific Reports.
The team, from the Department of Natural Resources and the Environment in the College of Agriculture, Health and Natural Resources, includes Ph.D. student Shashika Himandi Gardeye Lamahewage, assistant professor Chandi Witharana, associate professor Robert Fahey, and associate extension educator Thomas Worthley.
Aboveground biomass (AGB) basically describes the components of the trees above ground level, explains lead author Lamahewage, who says the conventional methods for measuring AGB are involved and often impractical.
"You can think of it as if we were going to count each grain of sand on the beach. It's not really possible, both on the state or regional level," says Lamahewage. "Forests are also not homogeneous, and we usually end up with an inaccurate estimate because it takes so long and by the time you are done measuring, the trees have grown."
Knowing how much AGB is stored in a forest has broader implications for things like carbon modeling, forest management, and conservation decisions, says Witharana.
"For example, say a company would like to clear-cut 50 acres of forest and convert it to a solar farm. It is important for the planners and permitting agencies to estimate whether this is worthwhile. Either you leave the forest as it is so that it can continue to sequester carbon, or you clear it and put up a solar farm that gives green electricity. How much carbon could this forest sequester in a certain given time? We need a quantifiable estimate of the biomass."
Knowing how much carbon is already stored and how much a forest is capable of storing is vital for making management decisions. In heavily forested areas like Connecticut, current methods rely on a sparse network of field plots from which the U.S. Forest Service derives rough estimates.
"The plots are spatially scattered, so you can get some broad estimates at county level, but we need more spatially detailed estimates," says Witharana.
The researchers sought to build a fast and efficient way to estimate AGB. For this, they developed a model to correlate existing field measurements of tree structure data with remote sensing data, including Landsat and Sentinel-2 satellite images, and LiDAR data. Then they analyzed the relationship between these sources, Witharana explains.
"Once you figure out the most accurate or most powerful variables then we can teach our model. When the model is strong, the idea is that we can easily estimate biomass of a given area using remote sensing," says Witharana.
The researchers focused on Connecticut because there is a wealth of LiDAR data available from the state. This is valuable, says Lamahewage, since LiDAR point cloud data helps gain an understanding of the structure of the forests.
"One of our objectives was to utilize as much publicly available data as possible," says Lamahewage. "We trained our machine learning models based on one year of data, therefore we only had around 100 samples."
Relying on publicly available data meant working with a small sample size, a tricky but important trade-off, but the researchers made it work. Lamahewage says that because the training sample was small, they had to tune the model extensively, almost like finding a clear signal on a radio, to ensure it performed optimally.
They identified 67 variables across the plot data and remote sensing observations to train the model, then used machine learning to analyze the non-linear relationships among the variables. The model identified 28 variables as highly valuable for estimating biomass, says Lamahewage, and of those, 68% were LiDAR-derived.
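As a quick arithmetic check on those figures, the sketch below works out how many of the selected variables would be LiDAR-derived. The article gives only the percentage, so the count of 19 is inferred here rather than reported by the researchers, and the names used in the code are purely illustrative.

```python
# Figures quoted in the article.
candidate_variables = 67   # variables assembled from plot and remote sensing data
selected_variables = 28    # variables the model flagged as highly valuable
lidar_share = 0.68         # reported share of selected variables derived from LiDAR

# Inferred, not reported: the whole-number count closest to 68% of 28.
lidar_selected = round(selected_variables * lidar_share)
other_selected = selected_variables - lidar_selected

# Share of the candidate pool that survived selection.
retained_fraction = selected_variables / candidate_variables

print(f"LiDAR-derived variables kept: {lidar_selected} of {selected_variables}")
print(f"Check: {lidar_selected / selected_variables:.0%} of the selected set")
print(f"Variables from other sources: {other_selected}")
print(f"Share of candidates retained: {retained_fraction:.1%}")
```

Nineteen of 28 works out to about 68%, which matches the reported share, so roughly nine of the selected variables would have come from the satellite imagery and other sources.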
"Another interesting fact is from Sentinel-2 data, our model picked up a short-wave infrared variable that tells us indirect information about the tree's physiology and gives us information about tree health as well," says Lamahewage.
Through this extensive tuning process, Lamahewage developed nine models trained with different hyperparameter settings to achieve the best performance. Witharana says they can now use the model to connect the gold mines of data from the U.S. Forest Service and the State of Connecticut to produce higher-resolution, spatially accurate AGB estimates.
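One common way to arrive at nine candidate settings is a three-by-three grid over two hyperparameters, with each combination scored by cross-validation. The article does not say which hyperparameters were tuned or how they were combined, so the sketch below is only an illustration under stated assumptions: the data are synthetic, the grid values are invented, and a simple nearest-neighbor regressor stands in for the study's random forest so that the example needs nothing beyond the Python standard library.

```python
import itertools
import math
import random


def make_synthetic_plots(n=100, seed=0):
    """Synthetic stand-in for about 100 field plots (not real data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.random()  # hypothetical LiDAR-style predictor, scaled 0-1
        x2 = rng.random()  # hypothetical satellite-style predictor, scaled 0-1
        # Invented non-linear response plus noise, playing the role of AGB.
        agb = 120.0 * x1 ** 1.5 - 40.0 * x2 + rng.gauss(0.0, 5.0)
        rows.append(((x1, x2), agb))
    return rows


def knn_predict(train, x, k, power):
    """Distance-weighted k-NN prediction (stand-in model, not a random forest)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    # power=0 gives every neighbor equal weight; larger values favor closer plots.
    weights = [1.0 / (math.dist(f, x) + 1e-9) ** power for f, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def cv_rmse(rows, k, power, folds=5):
    """Mean root-mean-square error across contiguous cross-validation folds."""
    fold_size = len(rows) // folds
    scores = []
    for i in range(folds):
        held_out = rows[i * fold_size:(i + 1) * fold_size]
        train = rows[:i * fold_size] + rows[(i + 1) * fold_size:]
        sq_err = [(knn_predict(train, x, k, power) - y) ** 2 for x, y in held_out]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / folds


rows = make_synthetic_plots()

# Assumed 3 x 3 grid, giving nine candidate settings.
grid = list(itertools.product([3, 5, 9], [0, 1, 2]))
results = {(k, power): cv_rmse(rows, k, power) for k, power in grid}
best = min(results, key=results.get)

for setting, rmse in sorted(results.items(), key=lambda item: item[1]):
    print(setting, round(rmse, 2))
print("best setting:", best)
```

With only about 100 samples, holding out a fifth of the plots in turn lets every plot contribute to both training and evaluation, which is one reason cross-validation is a natural fit for small datasets like the one described here.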
"This study is the starting point," says Witharana. "We are connecting dots and making useful, open-source products."
Witharana says future studies will apply the model to larger publicly available data sets from plots in New Hampshire and New York to further reduce uncertainties, improve the model, and develop new models.
Lamahewage says it is important to push forward with this work, which shows the value of open-source data, even relatively small datasets, for tackling massive challenges like climate change.
"When I talk to peer researchers, they sometimes let go of this kind of research because of the lack of field data," says Lamahewage. "This is also an effort of seeing how much of our publicly available data can build accurate models that can potentially be expanded to larger areas, in this case for biomass mapping or carbon mapping or modeling.
"It is quite unfortunate to see that people are letting go of this research or decades-worth of valuable field data because of the constraints of accessing it."
More information: Shashika Himandi Gardeye Lamahewage et al, Aboveground biomass estimation using multimodal remote sensing observations and machine learning in mixed temperate forest, Scientific Reports (2025).
Journal information: Scientific Reports
Provided by University of Connecticut