Salazar Coronel, Wilian; Carbajal Llosa, Carlos Miguel; Chuchon Remon, Rodolfo Juan
2026-04-07
2026-04-07
2026-03-26
Salazar-Coronel, W., Carbajal-Llosa, C., & Chuchon-Remon, R. (2026). Soil organic carbon content mapping along the coast of northern Peru: an ensemble machine learning approach. Frontiers in Soil Science, 6, 1745154. https://doi.org/10.3389/fsoil.2026.1745154
2673-8619
http://hdl.handle.net/20.500.12955/3082
Introduction: Soil organic carbon (SOC) content plays a fundamental role in regulating the global carbon cycle and mitigating climate change. It is also a key indicator of soil health and vital for plant growth. Its spatial distribution varies across dry ecosystems, where it is shaped by climate and land use. This study aimed to estimate and map SOC in the Motupe River Basin, northern Peru, by applying machine learning algorithms and ensemble methods.
Methods: Four predictive models were evaluated: Support Vector Regression (SVR), Random Forest (RF), Artificial Neural Network (ANN), and Extreme Gradient Boosting (XGBoost), together with two ensemble approaches (simple averaging and weighted averaging), integrating topographic, climatic, edaphic, and vegetation-index variables. The influence of spatial autocorrelation was minimized by spatial block cross-validation. Uncertainty was quantified with bootstrapping and the Prediction Interval Ratio (PIR) derived from 90% prediction intervals.
Results and discussion: The best performance was achieved by XGBoost (R² = 0.83), the weighted ensemble (R² = 0.70), and RF (R² = 0.63). The most influential predictors were EVI, GNDVI, temperature, TRI, and pH. SOC contents were relatively higher (>0.7%) in areas with greater vegetation density, within a semi-arid context where SOC levels are generally low. In contrast, lower areas exhibited reduced SOC contents (<0.6%).
The uncertainty analysis indicated that SOC predictions had high to moderate confidence (PIR < 0.2) in the middle and upper zones of the basin, and moderate confidence (PIR 0.1–0.2) in the lower areas. The results suggest that machine learning and ensemble methods improve SOC prediction, benefiting the sustainable management of soil fertility and quality in arid and semi-arid ecosystems of northern Peru.
application/pdf
eng
info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by/4.0/
Machine learning; Aprendizaje automático; Soil organic carbon; Carbono orgánico del suelo; Topographic indices; Indices topográficos; Vegetation indices; Indices de vegetación; Digital soil mapping; Cartografía digital del suelo; Ensemble modeling; Modelado ensemble
Soil organic carbon content mapping along the coast of northern Peru: an ensemble machine learning approach
info:eu-repo/semantics/article
https://purl.org/pe-repo/ocde/ford#4.01.04
http://doi.org/10.3389/fsoil.2026.1745154
Fertilidad del suelo; Soil fertility; Zona árida; Arid zones; Cuencas hidrográficas; Watersheds