Trees, Forests and People 14 (2023) 100440
Journal homepage: www.sciencedirect.com/journal/trees-forests-and-people

Classifying the risk of forest loss in the Peruvian Amazon rainforest: An alternative approach for sustainable forest management using artificial intelligence

Gianmarco Goycochea Casas a,*, Juan Rodrigo Baselly-Villanueva b, Mathaus Messias Coimbra Limeira a, Carlos Moreira Miquelino Eleto Torres a, Hélio Garcia Leite a

a Department of Forest Engineering, Federal University of Viçosa, Av. Purdue, s/n, Viçosa Campus, Zip Code 36570-900, Viçosa, MG, Brazil
b Estación Experimental Agraria San Roque, Dirección de Desarrollo Tecnológico Agrario, Instituto Nacional de Innovación Agraria (INIA), Calle San Roque 209, San Juan Bautista, Maynas, Loreto 16430, Peru

Keywords: Kohonen neural network; Forest conservation; Forest prevention; Risk classification

Abstract

Peruvian Amazonian rainforests are constantly threatened by forest loss. Understanding changes in forest cover and assessing the level of risk is a permanent concern for numerous scientists and forest authorities. There are many conservation programs for Peruvian forests that involve collaborative efforts and employ diverse methodologies for forest monitoring. In this study, we propose an alternative approach to decision-making for forest preservation, aiming to classify the risk of forest loss in districts within the Peruvian Amazon rainforest. This classification enables sustainable forest management. To accomplish this, we used unsupervised learning through Kohonen's neural network. The network was trained using a historical database spanning from 2001 to 2021, which includes variables such as forest cover and loss, climate, topography, hydrographic networks, and timber forest concessions. Through this approach, the network successfully established five clusters.
Following preliminary analysis, we designated these clusters as: low, medium, high, very high, and extremely high risk of forest loss. Kohonen networks demonstrated their effectiveness in clustering forest loss and forest cover. The results indicate a shifting trend among the classes over time, with an increase in the categories exhibiting high and very high risk of forest cover loss. This study provides valuable information for decision-making in the prevention and conservation of Peruvian forests. We strongly recommend maintaining vigilance, particularly in districts classified as at very high or extremely high risk of losing forest cover.

* Corresponding author. E-mail address: gianmarco.casas@ufv.br (G.G. Casas).
https://doi.org/10.1016/j.tfp.2023.100440
Available online 21 September 2023
2666-7193/© 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Peru has 56.51% of its surface area under forest cover, making it one of the ten countries with the highest forest density in the world and the second in South America (FRA, 2020). The Peruvian Amazon occupies the largest forest area (68,188,726 ha), followed by the Andean and dry forests (MINAM, 2016); therefore, it conserves a high biodiversity (Alvarez-Montalván et al., 2021), but ranks fifth worldwide in forest loss (Sierra Praeli, 2021). According to the GeoBosques platform of the Peruvian Ministry of Environment, 2,774,562 ha of land area was deforested between 2001 and 2021 alone (GeoBosques, 2021).

Primary forests in Peru are losing forest cover at an alarming rate each year. The highest rate of deforestation in the last 20 years occurred in 2020, when 203,272 ha of forest were lost. This was determined by satellite monitoring carried out by the Ministry of Environment through its National Forest Conservation Program for Climate Change Mitigation (PNCBMCC). The preservation of biodiversity, regulation of the hydrological cycle and carbon storage are among the main ecosystem services of the rainforest (Fearnside, 2008; Phillips and Brienen, 2017), services that are threatened by an accelerated loss of forest cover (Brienen et al., 2015). The loss of forest cover in the Peruvian Amazon is caused by deforestation, which is mainly due to land use change driven by the search for land for agriculture and cattle ranching by local people (Soares-Filho et al., 2006; Coomes et al., 2021; Lal, 2021).

The Peruvian government, in order to take strategic actions and implementation lines to reduce emissions from deforestation and forest degradation, approved the National Strategy on Forests and Climate Change (ENBCC) (Supreme Decree No. 007-2016-MINAM), which contains as a main action the monitoring of forests, which is being carried out progressively. In this context, governmental institutions have undertaken the task of classifying Peru's diverse forest regions. This classification process involves the use of advanced techniques such as remote sensing and geographic information systems. As a result of these prior efforts, six distinct ecozones have been identified: Coast, Highlands, High Accessible Forest, High Forest of Difficult Access, Low Forest, and Hydromorphic Zone. Remarkably, the Amazon rainforest contributes to four of these ecozones: High Accessible Forest, High Forest of Difficult Access, Low Forest, and Hydromorphic Zone. This stratification was guided by multifaceted criteria including floristic composition, physiographic characteristics, physiognomic attributes, carbon storage capacity, and accessibility factors (MINAM, 2014).

The Peruvian National Vegetation Cover Map was also used, conformed by special units defined and classified based on geographic, physiognomic, humidity condition and exceptionally floristic criteria (MINAM, 2015). The vegetation cover combines the criteria of vegetation formations with other physiographic, climatic, physiognomic, and anthropic criteria. It is stratified by six coverages: wetland forest, dryland forest, dryland scrub, wetland grassland, other vegetation formations and anthropic cover (SERFOR, 2016). The latter stands for the coverages in which there was human intervention, such as forest plantations, deforested areas, and agriculture.

A specific classification of the risk of losing forest cover in the Peruvian Amazon rainforest has not yet been studied. The classification of a forest is extremely complex due to the great biological interactions that exist within it, in addition to the physiographic and environmental conditions in which the areas are located (Moncrieff et al., 2016). The relationship between climate and plants occurs in multiple ways, influenced mainly by climate and geographic location (Zevallos and Lavado-Casimiro, 2022). To understand the numerous interactions in forest ecosystems, such as carbon deposition, climate buffering, and hydrological and erosional control, deeper investigations are essential (Ivanova et al., 2022). Simultaneously, the ability to collect extensive sets of forest-related data brings new perspectives, not only for the classification and monitoring of natural and altered habitats but also for their management (Gao et al., 2020; Xu et al., 2022; Zevallos and Lavado-Casimiro, 2022). However, the considerable amount of new data, characterized by their higher quality, structure, and analysis, poses a challenge (Ivanova et al., 2022).

Artificial neural networks (ANNs) can be used to classify forests in an alternative way. ANNs are a type of machine learning that can be used to solve a variety of problems, including regression, classification, and data compression (Ran and Hu, 2017). Among the various methods classified as ANNs, an alternative classification approach is represented by the self-organizing map (SOM) (Asan and Ercan, 2012).

The SOM was introduced by Kohonen in the 1980s (Kohonen, 1982; Kohonen and Honkela, 2007) and is known as Kohonen's neural network or Kohonen map; it can organize complex data into groups according to their relationships. The network is composed of an input layer and a Kohonen layer, in which the neurons of the input layer distribute the standard values for the Kohonen layer in a tabular form and keep the process updated during the whole training process (Kohonen, 1982; Dutra et al., 2021).

The SOM is an abstract mathematical model for mapping the topographical structure of visual sensors inspired by the structure of the cerebral cortex, which is organized in a two-dimensional grid of nodes, which are the individual neurons in the network. Each node represents a particular region of the input space, so that similar data points are mapped to neighboring nodes. This allows the SOM to represent the topological structure of the input data in a two-dimensional space (Yin, 2008). The SOM is an unsupervised neural network designed to condense input dimensions, thus portraying the distribution as a comprehensive map (Sacco et al., 2017); it has been used to develop new data compression and classification algorithms (Kohonen, 2001).

The learning algorithm employed by the SOM method can be classified into two distinct modalities: online learning and batch learning (Kikugawa et al., 2019). The application of batch learning in the evolution of the SOM has proven particularly relevant in practical applicability contexts (Kohonen, 2013). This learning process is based on the construction of a model from historical data, which is subsequently used to evaluate the quality of recently obtained results (Ouidadi et al., 2023). Similarly, in operational scenarios where data is processed sequentially, the incremental training strategy emerges as the preferred approach (Mariño and Carvalho, 2022). The online learning method involves continuous analysis of input data acquired during the monitoring process, promoting real-time updates of model parameters as new predictions are generated (Ouidadi et al., 2023).

The SOM has been applied in a broad sense that includes visualizations, feature map generation, pattern recognition and classification (Miljkovic, 2017). The SOM has helped to understand different behaviors of the biological sector through classification, such as classifying biological relevés and finding epigean species significantly associated with forest phytocenoses (Wolski and Kruk, 2020). The SOM also shows superior performance in environmental studies, such as the assessment of carbon stocks in forests, where it can reproduce the spatial pattern of carbon stocks (Stümer et al., 2010), and the assessment of long-term changes in forest vegetation (Adamczyk et al., 2013). Likewise, the SOM is workable for the classification of land use from remote sensing data (Ji, 2000).

These neural networks could be used for classification problems where the output variable is difficult to obtain, such as classifying a forest area. The objective of this study was to classify the risk of forest loss in districts of the Peruvian Amazon rainforest using Kohonen neural networks, which allows and guides conservation and sustainable forest management actions.

2. Materials and methods

2.1. Localization and database

The study was conducted in the Amazon rainforest within the territory of Peru (Fig. 1). The Forest Cover Loss (Ha) and Forest Cover (Ha) database in tabular form used for classification was obtained from the Geobosques platform, which is managed by the Peruvian Ministry of Environment. This database comprised historical data from 2001 to 2021, encompassing 400 districts located in the Amazon rainforest and distributed across 15 departments. Additionally, climate, topography, hydrographic networks, and timber forest concessions data were incorporated to enhance the classification process, as these variables are closely associated with deforestation in the Peruvian Amazon (Bax and Francesconi, 2018; Cotrina Sánchez et al., 2021).

Climate data, including Corrected Precipitation (mm/day), Maximum Temperature at 2 m (°C), and Minimum Temperature at 2 m (°C), were acquired from the NASA Prediction of Worldwide Energy Resources. Topographic data, such as elevation (meters above sea level), and hydrographic networks information were sourced from the national chart generated by the Instituto Geográfico Nacional – IGN, while the timber forest concessions data was obtained from the Servicio Nacional Forestal y de Fauna Silvestre – SERFOR. Table 1 shows the source data used in the assessment of the risk of forest loss in the Peruvian Amazon rainforest. The variables used in the processing were rescaled to a spatial resolution of 250 m (Cotrina Sánchez et al., 2021).

Fig. 1. Geographical distribution of districts in the Peruvian Amazon Rainforest biome.

Table 1. Geospatial information related to the variables used in the assessment of the risk of forest loss in the Peruvian Amazon rainforest.

Data | Type | Source | Link | Spatial resolution
Cover Loss (Ha) | .xls | Geobosques | https://geobosques.minam.gob.pe/geobosque/view/descargas.php?122345gxxe345w34gg | -
Forest Cover (Ha) | .xls | Geobosques | https://geobosques.minam.gob.pe/geobosque/view/descargas.php?122345gxxe345w34gg | -
Corrected Precipitation (mm/day) | Raster | NASA | https://power.larc.nasa.gov/data-access-viewer/ | 10 kilometers
Maximum Temperature (°C) | Raster | NASA | https://power.larc.nasa.gov/data-access-viewer/ | 2 meters
Minimum Temperature (°C) | Raster | NASA | https://power.larc.nasa.gov/data-access-viewer/ | 2 meters
Topographic data (elevation) | Raster | Instituto Geográfico Nacional - IGN | https://www.geoidep.gob.pe/instituto-geografico-nacional | 1:500 000
Hydrographic networks | Vector | Instituto Geográfico Nacional - IGN | https://www.geoidep.gob.pe/instituto-geografico-nacional | 1:500 000
Forest concessions | Vector | Servicio Nacional Forestal y de Fauna Silvestre - SERFOR | https://geo.serfor.gob.pe/visor/ | -
15 departments and 400 districts | Vector | Ministerio del Ambiente - MINAM | https://geoservidor.minam.gob.pe/ | -

Note: The data downloaded in .xls format from the Geobosques source are derived from a raster with a spatial resolution of 30 meters.

2.2. Configuration of the Kohonen neural network

The data were processed in the R programming language (R Core Team, 2020), using the functionalities provided by the aweSOM package (Boelaert et al., 2022). Specifically, the aweSOM package assisted in implementing a Self-Organizing Map (SOM) algorithm, a form of unsupervised machine learning widely used for pattern recognition and data visualization tasks.

As part of our methodology, we constructed a grid structure using the somgrid class, an object designed to systematically organize and record grid coordinates. To visualize and interpret the results of this grid arrangement, we employed the plot method (Venables and Ripley, 2002), which aided in the graphical representation of the grid's layout and associated data points.

In the input layer, all possible combinations of the variables in our database were selected, establishing and evaluating three training models. To facilitate effective learning and pattern extraction, we presented the complete dataset to the SOM network a total of 100 times. The learning rate, a crucial parameter influencing the rate of convergence during training, was set to decrease linearly from an initial value of 0.05 to 0.01 over successive iterations.

In addition, the neighborhood radius, representing the extent to which neighboring nodes influence each other's updates, was operationalized as a vector. This vector spanned a spectrum from 2.65 to -2.65, thus modulating the extent of node interaction during training. To ensure the compatibility of data structures and dimensions, the configuration of the array list was aligned with both the length of the data list and the total number of variables present within the dataset.

We used two distinct learning algorithms: batch (offline) and online. Notably, the batch learning algorithm does not use the learning rate, which implies a different approach to weight updates and convergence dynamics compared to its online counterpart.

2.3. Training quality

The quality of training was evaluated according to the following criteria (Boelaert et al., 2022; Kaski and Lagus, 1996; Kohonen, 2001): (1) Quantization error (QE): It is defined as the mean squared distance between the data point and the map prototype to which it is assigned. The lower the distance, the better the quality of the model. (2) Percentage of explained variance (EV%): As with other clustering methods, the part of the total variance explained by clustering is one minus the ratio of the quantization error to the total variance. The higher the value, the better the quality of the model.
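The configuration and the first two quality criteria described above can be illustrated with a minimal Python sketch. It is not the aweSOM implementation used in the study: the function names (`linear_schedule`, `train_online_som`, and so on) and the toy data are hypothetical, the "bubble" neighbourhood is an assumption, and so is the rule that only the winning node is updated once the radius falls below the distance to its nearest neighbour. Only the numbers 100 passes, 0.05 to 0.01, and 2.65 to -2.65 come from the text.

```python
import math
import random


def linear_schedule(start, end, steps):
    """Return `steps` values moving linearly from `start` to `end`."""
    if steps == 1:
        return [start]
    return [start + (end - start) * k / (steps - 1) for k in range(steps)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(point, prototypes):
    """Index of the prototype closest to `point`."""
    return min(range(len(prototypes)), key=lambda j: sq_dist(point, prototypes[j]))


def quantization_error(data, prototypes):
    """QE: mean squared distance between each point and its assigned prototype."""
    total = sum(sq_dist(p, prototypes[best_matching_unit(p, prototypes)]) for p in data)
    return total / len(data)


def explained_variance(data, prototypes):
    """EV as a fraction: one minus the ratio of QE to the total variance."""
    dims = len(data[0])
    mean = [sum(p[d] for p in data) / len(data) for d in range(dims)]
    total_variance = sum(sq_dist(p, mean) for p in data) / len(data)
    return 1.0 - quantization_error(data, prototypes) / total_variance


def train_online_som(data, rows, cols, epochs, lr, radius, seed=0):
    """Online SOM training: one prototype update per presented data point."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: prototypes start as copies of randomly chosen data points.
    prototypes = [list(rng.choice(data)) for _ in grid]
    for epoch in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            point = data[i]
            bmu = best_matching_unit(point, prototypes)
            for j, node in enumerate(grid):
                # Bubble neighbourhood (assumption): nodes within the current
                # radius move with the winner; the winner itself always moves.
                if j == bmu or math.dist(grid[bmu], node) <= radius[epoch]:
                    for d in range(len(point)):
                        prototypes[j][d] += lr[epoch] * (point[d] - prototypes[j][d])
    return prototypes


# Hand-checkable toy example for the two criteria.
toy = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
centres = [[0.0, 1.0], [10.0, 1.0]]
qe = quantization_error(toy, centres)
ev = explained_variance(toy, centres)

# Schedules with the values quoted in the text, then a tiny 2 x 2 map.
lr = linear_schedule(0.05, 0.01, 100)
radius = linear_schedule(2.65, -2.65, 100)
som = train_online_som(toy, 2, 2, 100, lr, radius)
```

In the toy example every point lies at squared distance 1 from its nearest centre and at squared distance 26 from the overall mean, so `qe` is 1 and `ev` is 1 - 1/26. The paper reports EV as a percentage, which corresponds to multiplying this fraction by 100.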
(3) Topographic error (TE): Measures the degree of preservation of the topographic structure of the data on the map. It is calculated as the percentage of observations for which the best-matching node is not adjacent to the second-best-matching node on the map. A low value indicates better model quality. (4) Kaski-Lagus error (K-L): It is the sum of the average distance between each data point and its best-matching prototype and the average geodesic distance between each data point and its second-best-matching prototype.

Similarly, the classification achieved by the Kohonen neural network was validated by comparing it with the deforested areas (observed data), which are reported on the Geobosques platform in vector format. This validation process helped assess the accuracy and reliability of the network's classification results.

2.4. Establishment and assessment of the number of classes

To set the number of classes, the scree graph (Cattell, 1966) was evaluated; the selected number corresponds to the maximum value of the numerical approximation of the second derivative by finite differences, which can be defined as:

$$f''(i) \approx f(i+1) - 2f(i) + f(i-1)$$

Note that the number of classes to be kept should be above the elbow of the graph; moreover, where the classes are important the slope should be steep, whereas where the classes correspond to the error variance the slope should be flat (Basto and Pereira, 2012).

2.5. Map of Kohonen

The Kohonen map was created using a cloud as a scatter plot within each cell. We set the same continuous variables shown in the model training and added district location as a categorical variable. The size of each cell was 400 pixels; each cell is a visual representation of a specific unit in the self-organizing map. A color palette was used to stand for the classes as cell backgrounds, and transparency was also set up to focus the variables. The graph was configured to show the name of the categorical variable used in this study, which were the districts belonging to the Amazon rainforest, from the observations in each cell and class. Subsequently, the Kohonen map classification results were exported to set up the district political map with their respective classifications.

3. Results

3.1. Training quality assessment

Table 2 provides a comprehensive overview of multiple models developed to classify the risk of forest cover loss in the Peruvian Amazon rainforest using the Kohonen neural network. Each model is characterized by the number of variables considered, the algorithms employed, and various performance metrics.

In the first model, which used three variables (loss of forest cover, forest cover, and climate), both the batch and online algorithms were applied. The models demonstrated quantization errors (QE) of 0.7135 and 0.7022, respectively. The explained variance (EV) percentages were 91.06% and 91.20%. The topographic errors (TE) were 0.125 and 0.113, while the Kaski-Lagus errors (K-L) were 2.8526 and 2.5488, respectively.

The second model expanded the analysis by including an additional variable, topography, alongside the variables used in the first model. The batch and online algorithms were again applied, resulting in quantization errors (QE) of 0.7504 and 0.7005, respectively. The explained variance (EV) percentages improved to 91.64% and 92.20%. The topographic errors (TE) were 0.125 and 0.158, while the Kaski-Lagus errors (K-L) showed values of 2.3575 and 2.7730.

The third model incorporated a comprehensive set of six variables, including loss of forest cover, forest cover, topography, climate, hydrographic networks, and timber forest concessions. The batch and online algorithms were once again used, resulting in quantization errors (QE) of 0.4540 and 0.4065, respectively. The explained variance (EV) percentages increased to 94.32% and 94.92%, indicating the enhanced ability of the models to capture the underlying patterns in the data. The topographic error (TE) increased to 0.261 and 0.291, while the Kaski-Lagus error (K-L) values were recorded as 1.9467 and 2.0938.

The inclusion of additional variables led to improved explained variance percentages, indicating a better understanding of the underlying factors contributing to classifying forest cover loss, as was the case with model three. It was also evident that the online algorithm presented better training quality, with a lower QE and a higher EV than the batch algorithm in all three models.

3.2. Analysis of the clusters

We have analyzed the clusters and cells of the best-trained model (Model 3 – Online Algorithm). According to the test and the scree plot, it was necessary to classify five clusters (Fig. 2A). Kohonen's map consisted of a 10 × 10 grid, which successfully classified the districts into

Table 2. Evaluation of training and clustering for each model and algorithm developed for forest cover loss risk assessment in the Peruvian Amazon rainforest.
Model number | Variables | Algorithm | Quantization error (QE) | Explained variance, EV (%) | Topographic error (TE) | Kaski-Lagus error (K-L)
1 | Loss of forest cover, Forest cover, Climate | Batch | 0.7135 | 91.06 | 0.125 | 2.8526
1 | | Online | 0.7022 | 91.20 | 0.113 | 2.5488
2 | Loss of forest cover, Forest cover, Topography, Climate | Batch | 0.7504 | 91.64 | 0.125 | 2.3575
2 | | Online | 0.7005 | 92.20 | 0.158 | 2.7730
3 | Loss of forest cover, Forest cover, Topography, Climate, Hydrographic networks, Timber forest concessions | Batch | 0.4540 | 94.32 | 0.261 | 1.9467
3 | | Online | 0.4065 | 94.92 | 0.291 | 2.0938

five distinct groups. Each cluster exhibited a varying number of cells, with 6, 11, 20, 31, and 32 cells, respectively (Fig. 2B).

The analysis involved calculating the average loss and forest cover for each year within each cluster. To facilitate comparison, the data needed to be normalized and analyzed separately. Based on the grouping of the data, a classification system was implemented, assigning ratings of low, medium, high, very high, and extremely high risk of forest cover loss to each cluster, agreeing with the classification used by the PNCBMCC (MINAM, 2019). In this classification, a low rating indicated that the forest cover exceeded the loss, while an extremely high rating indicated that the loss of forest cover exceeded the available forested area (Fig. 3). By applying this rating system, the study was able to effectively assess the relative risk levels associated with forest cover loss within each cluster.

Fig. 2. (A) Determination of the number of classes using the scree plot, and (B) Kohonen map with their respective clusters. Note that the conglomerated points in each cell represent the districts as categorical variables.

Fig. 3. Analysis of normalized average forest loss and forest cover data for the years 2001-2021 in the Peruvian Amazon rainforest.

3.3. Assessment of forest cover loss in Peruvian districts

Forest loss and forest cover data were exported and categorized in vector form on the map of political districts, allowing visualization of the risk classification of each district. The procedure was performed for 4 years (2006, 2011, 2016, and 2021) to evaluate the succession and variation of the risk of cover loss for each district. Additionally, the deforested areas (observed data) were used, showing the effectiveness of the neural network classification (Fig. 4).

The results reveal a notable trend over time, with a shift in the classification of risk levels for forest cover loss. Specifically, there is a tendency towards an increase in the number of districts classified as high or very high risk, mainly in districts located in central and southern Peru.

In 2006, the districts of central Peru began to show an extremely high-risk classification of forest cover loss, as shown in Fig. 4A. However, it is important to highlight that during that year, the country still maintained considerable forest cover compared to the affected areas. The distribution of risk classifications for that year was: 53 districts classified as low risk, 9 districts as medium risk, 168 districts as high risk, 158 districts as very high risk, and 12 districts as extremely high risk.

In 2011, there was an expansion of districts categorized as extremely high-risk in central and southern Peru, as shown in Fig. 4B. The distribution of risk classifications for that year was: 46 low-risk districts, 9 medium-risk districts, 177 high-risk districts, 147 very high-risk districts, and 21 extremely high-risk districts.

In 2016, the focus remained on the central part of the country, with variations in risk classifications observed in the northern regions (Fig. 4C). The distribution of risk classifications for that year was: 47 low-risk districts, 9 medium-risk districts, 176 high-risk districts, 144 very high-risk districts, and 24 extremely high-risk districts.

In 2021, both the central and southern parts of the country experienced significant forest loss, resulting in higher risk classifications (Fig. 4D). The distribution of risk classifications for that year was: 45 districts classified as low risk, 9 districts as medium risk, 166 districts as high risk, 155 districts as very high risk, and 25 districts as extremely high risk.

Fig. 4. District map of the risk of forest cover loss in the Peruvian Amazon rainforest in (A) 2006, (B) 2011, (C) 2016, and (D) 2021. Note: The blue areas represent the observed data on forest cover loss for each year.

Fig. 5A illustrates the dispersion of forest cover and forest cover loss in hectares for each risk classification in the years 2006, 2011, 2016, and 2021. Fig. 5B highlights the variation in risk classification, demonstrating an overall increase in the area classified as high or very high risk from 2001 to 2021. These graphs provide a visual representation of the changing landscape of risk levels over time, showing the increase in areas and districts with deforestation.

Fig. 5. (A) Scatter plot by risk classification between forest cover and forest cover loss per hectare and (B) Variation in risk classification, for the Peruvian Amazon rainforest districts.

4. Discussion

Kohonen's self-organizing map has attracted researchers in different areas due to its great classification potential (Ribeiro et al., 2014; Wolski and Kruk, 2020; Yu et al., 2019), including unsupervised learning, which makes it easy to visualize and interpret a classification with two or more variables linked to categorical variables, relationships that are sometimes not visible with conventional statistical methods (Moreira et al., 2019). Explaining how Kohonen networks work is complex, for reasons that are not self-explanatory (Stümer et al., 2010); in view of this, the evaluation of the quality of training is based on clustering methods. In this study, we evaluated different variables and their combination related to forest cover loss, keeping in mind that clustering is a complex task that requires careful consideration of many factors, and we observed that the greater the number of variables related to each other, the better the quality of the clustering of the data (Table 2). Additionally, a high EV% was noted, indicating that the clustered data reflect the characteristics of the original dataset. This means that the technique successfully preserved the essential information and structure of the data while discarding noise or redundant features (Boelaert et al., 2022).

The classification obtained in the present study agrees with the districts reported for forest cover loss by the Peruvian Forestry Authority (MINAM, 2019). However, in this study, we have evaluated different variables, understanding the relationship among all factors as the variables used, thus determining the risk of cover loss in specific districts. It is worth mentioning that this type of unsupervised learning helps to identify hidden patterns since the algorithm does not receive guidance (Latif et al., 2019; Raza and Singh, 2021). Furthermore, unsupervised learning possesses the advantage of being unaffected by the presence of outliers (Lassoued and Abderrahim, 2013). This capacity stems from its integration of the missing value imputation (MVI) method, deeply rooted in machine learning methodologies (Hasan et al., 2021). Notably, the MVI method, specifically exemplified by the SOM, enhances the capacity to assign a singular data point to multiple clusters, a capability frequently harnessed within the realm of pattern recognition (Singh et al., 2015).

In the present study, the Kohonen networks not only allowed us to classify, but also to visualize, analyze and understand the historical successions of the risk of loss of forest cover over the years (Fig. 5), even though it is difficult to process forest data due to its complexity and the existence of outliers.

According to Lal (2021), the deforestation rampant in the Peruvian Amazon can be traced back to evolving land use patterns, predominantly driven by agricultural expansion and livestock rearing, as well as the extensive engagement in gold mining operations (Caballero Espejo et al., 2018). The construction of the Interoceanic Highway, which cuts through the Madre de Dios region, is highly likely to have a detrimental impact on forest loss. This is because it facilitates the influx of people from various parts of Peru into the Amazonia (Asner and Tupayachi, 2016). This demographic impact gives rise to the fulfillment of subsistence requirements, including agricultural and mining pursuits, which, in turn, engenders an extensive array of adverse socio-environmental consequences (Alarcón et al., 2016; Moody et al., 2020; Swenson et al., 2011; Velásquez Ramírez et al., 2020).

By 2021, 25 districts were classified as extremely high-risk, predominantly those located in the central and southern zones, followed by those located in northern Peru. The causes of cover change or deforestation in the central zone are the easy transportation of timber to the city of Lima (Bax et al., 2016); likewise, in recent years, cover changes have been made for the installation of coca and oil palm plantations in the Departments of Ucayali and Huánuco (Bax and Francesconi, 2018; Vijay et al., 2018). The increase in the number of districts with an extremely high risk of cover loss in southern Peru is mainly due to illegal mining in the Department of Madre de Dios (Asner and Tupayachi, 2016; Cotrina Sánchez et al., 2021).

In general terms, the number of districts classified as being at extremely high risk of deforestation increased between 2001 and 2021. These districts were predominantly located in the center and south of Peru, followed by those in the north. These findings are consistent with the study reported by Vicencio and de Vivanco (2023), who found that the concentration of extremely high forest loss is mainly located in the center and south of Peru.

There are many forest conservation programs that share efforts and use different methodologies for forest monitoring (Cappello et al., 2022). It is most important to focus on the districts classified in this study as having an extremely high risk of losing cover and also on those districts that are about to move from a very high to an extremely high-risk class, as shown in Fig. 5. While the high risk of losing forest cover may persist in the long term, our classification study can be an alternative to prevent forest loss. This study provides valuable insights for forest management in Peru, offering essential information to guide decision-making and conservation efforts in the region.

The removal of vegetation cover and the felling of forest resources without the corresponding authorization is a very serious offense under the Peruvian Forestry Law (Supreme Decree No. 007-2021-MIDAGRI). By intensifying management and conservation practices starting today, the world can also reduce one of the main causes of climate change, a phenomenon that has already been modifying some ecosystem services (Reygadas et al., 2023).

Forests are complex ecosystems that interact dynamically. Many factors can influence forest loss, but some of them were not included in the study because monitored data were lacking at the time of the study. These include human footprint, roads, and socioeconomic and ecological factors. It is important to continuously monitor these factors. This is especially important for modeling, as accurate modeling requires accurate data. Data analysis allows us to better understand the complex dynamics of forest loss. This knowledge can be used to make informed decisions about sustainable forest management and conservation.

5. Conclusions

The use of Kohonen networks in this study has demonstrated their effectiveness in clustering forest loss and cover, enabling a quantitative assessment and facilitating the establishment of a risk classification for forest cover loss in the districts of the Peruvian Amazon rainforest. This methodology also allows for the inclusion of various other variables that can influence forest cover, such as ecological, biological, and geological information, as well as changes in cover and the interaction with other biomes. These aspects should be further investigated in future studies.

The number of districts classified as at extremely high risk of losing their forest cover rose from 12 in 2006 to 25 in 2021, indicating an alarming trend. This study provides crucial information to support decision-making for forest preservation and conservation in Peru. It offers technical expertise that can be used to develop Supreme Decrees or Ministerial Resolutions that prioritize monitoring and surveillance efforts in districts with a high or extremely high risk of forest cover loss.

CRediT authorship contribution statement

Gianmarco Goycochea Casas: Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing. Juan Rodrigo Baselly-Villanueva: Conceptualization, Methodology, Data curation, Formal analysis, Writing – review & editing. Mathaus Messias Coimbra Limeira: Formal analysis, Data curation. Carlos Moreira Miquelino Eleto Torres: Methodology, Supervision. Hélio Garcia Leite: Methodology, Conceptualization, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The data are available online and the sources are given in the manuscript.

Acknowledgments

The present work was carried out with support from the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Financing Code 001.

References

Adamczyk, J.J., Kurzac, M., Park, Y.S., Kruk, A., 2013. Application of a Kohonen's self-organizing map for evaluation of long-term changes in forest vegetation. J. Veg. Sci. 24, 405–414. https://doi.org/10.1111/j.1654-1103.2012.01468.x.
Gao, Y., Skutsch, M., Paneque-Gálvez, J., Ghilardi, A., 2020. Remote sensing of forest degradation: a review. Environ. Res. Lett. 15, 103001. https://doi.org/10.1088/1748-9326/abaad7.
GeoBosques, 2021. Cobertura y pérdida de bosque húmedo amazónico. https://geobosques.minam.gob.pe/geobosque/descargas_geobosque/perdida/documentos/Reporte_Cobertura_y_Perdida_de_Bosque_Humedo_Amazonico_2021.pdf?Tue%20Feb%2021%202023%2012:05:58%20GMT-0500%20(hora%20est%C3%A1ndar%20de%20Per%C3%BA) (accessed 6.11.23).
Hasan, M.K., Alam, M.A., Roy, S., Dutta, A., Jawad, M.T., Das, S., 2021. Missing value imputation affects the performance of machine learning: A review and analysis of the literature (2010–2021). Inform. Med. Unlocked 27, 100799. https://doi.org/10.1016/j.imu.2021.100799.
Alarcón, G., Díaz, J., Vela, M., García, M., Gutiérrez, J., 2016.
Deforestación en el sureste Ivanova, N., Fomin, V., Kusbach, A., 2022. Experience of forest ecological classification de la amazonia del Perú entre los años 1999 - 2013; caso Regional de Madre de Dios in assessment of vegetation dynamics. Sustainability 14, 3384. https://doi.org/ (Puerto Maldonado – Inambari). Rev. Investig. Altoandin. J. High Andean Res. 18 10.3390/su14063384. https://doi.org/10.18271/ria.2016.221. Ji, C.Y., 2000. Land-use classification of remotely sensed data using Kohonen self- Alvarez-Montalván, C.E., Manrique-León, S., Fonseca, V.D., Cardozo-Soarez, J., Callo- organizing feature map neural networks. Photogramm. Eng. Remote Sens. 66, Ccorcca, J., Bravo-Camara, P., Castañeda-Tinco, I., Alvarez-Orellana, J., 2021. 1451–1460. Floristic composition, structure and tree diversity of an amazon forest in Peru. Sci. Kaski, S., Lagus, K., 1996. Comparing self-organizing maps. pp. 809–814. 10.1007/3- Agropecu. 12, 73–82. https://doi.org/10.17268/sci.agropecu.2021.009. 540-61510-5_136. Asan, U., Ercan, S., 2012. An introduction to self-organizing maps. In: Kahraman, C. (eds) Kikugawa, G., Nishimura, Y., Shimoyama, K., Ohara, T., Okabe, T., Ohuchi, F.S., 2019. Computational Intelligence Systems in Industrial Engineering. Atlantis Data analysis of multi-dimensional thermophysical properties of liquid substances Computational Intelligence Systems 6, 295–31. 10.2991/978-94-91216-77-0_14. based on clustering approach of machine learning. Chem. Phys. Lett. 728, 109–114. Asner, G.P., Tupayachi, R., 2016. Accelerated losses of protected forests from gold https://doi.org/10.1016/j.cplett.2019.04.075. mining in the Peruvian Amazon. Environ. Res. Lett. 12, 094004 https://doi.org/ Kohonen, T., 1982. Self-organized formation of topologically correct feature maps. Biol. 10.1088/1748-9326/aa7dab. Cybern. 43, 59–69. https://doi.org/10.1007/BF00337288. Basto, M., Pereira, J.M., 2012. An SPSS R -menu for ordinal factor analysis. J. Stat. Softw. Kohonen, T., 2001. 
Self-Organizing Maps. Springer Berlin Heidelberg, Berlin, Heidelberg. 46 https://doi.org/10.18637/jss.v046.i04. https://doi.org/10.1007/978-3-642-56927-2. Bax, V., Francesconi, W., 2018. Environmental predictors of forest change: an analysis of Kohonen, T., 2013. Essentials of the self-organizing map. Neural Netw. natural predisposition to deforestation in the tropical Andes region, Peru. Appl. Kohonen, T., Honkela, T., 2007. Kohonen network. Scholarpedia 2, 1568. https://doi. Geogr. 91, 99–110. https://doi.org/10.1016/j.apgeog.2018.01.002. org/10.4249/scholarpedia.1568. Bax, V., Francesconi, W., Quintero, M., 2016. Spatial modeling of deforestation processes Lal, R., 2021. Soil Organic Carbon and Feeding the Future, 1st ed. CRC Press, Boca Raton. in the Central Peruvian Amazon. J. Nat. Conserv. 29, 79–88. https://doi.org/ https://doi.org/10.1201/9781003243090. ed. 10.1016/j.jnc.2015.12.002. Lassoued, F.A.Z., Abderrahim, S.B.K., 2013. A Kohonen neural network based method for Boelaert, J., Ollion, E., Sodoge, J., Megdoud, M., Naji, O., Kote, A.L., Renoud, T., Hym, S., PWARX identification. IFAC Proc. Vol. 46, 742–747. https://doi.org/10.3182/ 2022. Interactive self-organizing maps. 20130703-3-FR-4038.00088. Brienen, R.J.W., Phillips, O.L., Feldpausch, T.R., Gloor, E., Baker, T.R., Lloyd, J., Lopez- Latif, J., Xiao, C., Imran, A., Tu, S., 2019. Medical imaging using machine learning and Gonzalez, G., Monteagudo-Mendoza, A., Malhi, Y., Lewis, S.L., Vásquez Martinez, R., deep learning algorithms: a review. In: Proceedings of the 2nd International Alexiades, M., Álvarez Dávila, E., Alvarez-Loayza, P., Andrade, A., Aragão, L.E.O.C., Conference on Computing, Mathematics and Engineering Technologies (ICoMET). Araujo-Murakami, A., Arets, E.J.M.M., Arroyo, L., Aymard, C.G.A., Bánki, O.S., IEEE, pp. 1–5. https://doi.org/10.1109/ICOMET.2019.8673502. Baraloto, C., Barroso, J., Bonal, D., Boot, R.G.A., Camargo, J.L.C., Castilho, C.V., Mariño, L.M., de Carvalho, F.D.A., 2022. 
Vector batch SOM algorithms for multi-view Chama, V., Chao, K.J., Chave, J., Comiskey, J.A., Cornejo Valverde, F., da Costa, L., dissimilarity data. Knowl. Based Syst. 258, 109994. de Oliveira, E.A., Di Fiore, A., Erwin, T.L., Fauset, S., Forsthofer, M., Galbraith, D.R., Miljkovic, D., 2017. Brief review of self-organizing maps. In: Proceedings of the 40th Grahame, E.S., Groot, N., Hérault, B., Higuchi, N., Honorio Coronado, E.N., International Convention on Information and Communication Technology, Keeling, H., Killeen, T.J., Laurance, W.F., Laurance, S., Licona, J., Magnussen, W.E., Electronics and Microelectronics (MIPRO). IEEE, pp. 1061–1066. https://doi.org/ Marimon, B.S., Marimon-Junior, B.H., Mendoza, C., Neill, D.A., Nogueira, E.M., 10.23919/MIPRO.2017.7973581. Núñez, P., Pallqui Camacho, N.C., Parada, A., Pardo-Molina, G., Peacock, J., Peña- MINAM, 2014. Estimación de los contenidos de carbono de la biomasa aérea en los Claros, M., Pickavance, G.C., Pitman, N.C.A., Poorter, L., Prieto, A., Quesada, C.A., bosques de Perú. Ramírez, F., Ramírez-Angulo, H., Restrepo, Z., Roopsind, A., Rudas, A., Salomão, R. MINAM, 2015. Mapa nacional de cobertura vegetal: memoria descriptiva /Ministerio del P., Schwarz, M., Silva, N., Silva-Espejo, J.E., Silveira, M., Stropp, J., Talbot, J., ter Ambiente, Dirección General de Evaluación. Valoración y Financiamiento del Steege, H., Teran-Aguilar, J., Terborgh, J., Thomas-Caesar, R., Toledo, M., Torello- Patrimonio Natural, Lima, Perú. Raventos, M., Umetsu, R.K., van der Heijden, G.M.F., van der Hout, P., Guimarães MINAM, 2016. Programa nacional de conservación de bosques para la mitigación del Vieira, I.C., Vieira, S.A., Vilanova, E., Vos, V.A., Zagt, R.J., 2015. Long-term decline cambio climático (accessed 6.11.23). o of the Amazon carbon sink. Nature 519, 344–348. https://doi.org/10.1038/ MINAM, 2019. Apuntes del Bosque N 1: Cobertura y Deforestación en Los Bosques nature14283. Húmedos Amazónicos 2018. 
Programa Nacional de Conservación de Bosque para la Caballero Espejo, J., Messinger, M., Román-Dañobeytia, F., Ascorra, C., Fernandez, L., Mitigación del Cambio Climático. https://sinia.minam.gob. Silman, M., 2018. Deforestation and forest degradation due to gold mining in the pe/documentos/apuntes-bosque-no-1-cobertura-deforestacion-bosques-humedos\. peruvian amazon: a 34-year perspective. Remote Sens. 10, 1903. https://doi.org/ Moncrieff, G.R., Bond, W.J., Higgins, S.I., 2016. Revising the biome concept for 10.3390/rs10121903 (Basel). understanding and predicting global change impacts. J. Biogeogr. 43, 863–873. Cappello, C., Pratihast, A.K., Pérez Ojeda del Arco, A., Reiche, J., de Sy, V., Herold, M., https://doi.org/10.1111/jbi.12701. Vivanco Vicencio, R.E., Castillo Soto, D., 2022. Alert-driven community-based forest Moody, K.H., Hasan, K.M., Aljic, S., Blakeman, V.M., Hicks, L.P., Loving, D.C., Moore, M. monitoring: a case of the peruvian amazon. Remote Sens. 14, 4284. https://doi.org/ E., Hammett, B.S., Silva-González, M., Seney, C.S., Kiefer, A.M., 2020. Mercury 10.3390/rs14174284 (Basel). emissions from Peruvian gold shops: potential ramifications for Minamata Cattell, R.B., 1966. The scree test for the number of factors. Multivar. Behav. Res. 1, compliance in artisanal and small-scale gold mining communities. Environ. Res. 182, 245–276. https://doi.org/10.1207/s15327906mbr0102_10. 109042 https://doi.org/10.1016/j.envres.2019.109042. Coomes, O.T., Cheng, Y., Takasaki, Y., Abizaid, C., 2021. What drives clearing of old- Moreira, L.S., Chagas, B.C., Pacheco, C.S.V., Santos, H.M., de Menezes, L.H.S., growth forest over secondary forests in tropical shifting cultivation systems? Nascimento, M.M., Batista, M.A.S., de Jesus, R.M., Amorim, F.A.C., Santos, L.N., da Evidence from the Peruvian Amazon. Ecol. Econ. 189, 107170 https://doi.org/ Silva, E.G.P., 2019. Development of procedure for sample preparation of cashew 10.1016/j.ecolecon.2021.107170. 
nuts using mixture design and evaluation of nutrient profiles by Kohonen neural Cotrina Sánchez, A., Bandopadhyay, S., Rojas Briceño, N.B., Banerjee, P., Torres network. Food Chem. 273, 136–143. https://doi.org/10.1016/j. Guzmán, C., Oliva, M., 2021. Peruvian Amazon disappearing: Transformation of foodchem.2018.01.050. protected areas during the last two decades (2001–2019) and potential future Ouidadi, H., Guo, S., Zamiela, C., Bian, L., 2023. Real-time defect detection using online deforestation modelling using cloud computing and MaxEnt approach. J. Nat. learning for laser metal deposition. J. Manuf. Process. 99, 898–910. Conserv. 64, 126081 https://doi.org/10.1016/j.jnc.2021.126081. Phillips, O.L., Brienen, R.J.W., 2017. Carbon uptake by mature Amazon forests has Dutra, R.A., Reis, E.L., Reis, C., Fidêncio, P.H., Reis, C.D.G., de Carvalho Damasceno, O. mitigated Amazon nations’ carbon emissions. Carbon Balanc. Manag. 12 (1) https:// I., 2021. Spatial and temporal analysis of physical and chemical data of superficial doi.org/10.1186/s13021-016-0069-2. waters, by self-organizing maps (SOM). Braz. J. Dev. 7, 57578–57594. https://doi. R Core Team, 2020. R: a language and environment for statistical computing (en línea). org/10.34117/bjdv7n6-251. Ran, Z., Hu, B., 2017. Parameter identifiability in statistical machine learning: a review. Fearnside, P.M., 2008. Amazon Forest maintenance as a source of environmental Neural Comput. 29, 1151–1203. https://doi.org/10.1162/NECO_a_00947. services. Acad. Bras. Cienc. 80, 101–114. https://doi.org/10.1590/S0001- Raza, K., Singh, N.K., 2021. A tour of unsupervised deep learning for medical image 37652008000100006. analysis. Curr. Med. Imaging Former. Curr. Med. Imaging Rev. 17, 1059–1077. FRA, 2020. Global forest resources assessment 2020: main report. https://fra-data.fao.or https://doi.org/10.2174/1573405617666210127154257. g/PER/fra2020/home/ (accessed 11.13.22). Reygadas, Y., Spera, S.A., Salisbury, D.S., 2023. 
Effects of deforestation and forest degradation on ecosystem service indicators across the Southwestern Amazon. Ecol. Indic. 147, 109996 https://doi.org/10.1016/j.ecolind.2023.109996. 9 G.G. Casas et al. T r e e s , F o r e s t s a n d P e o p le 14 (2023) 100440 Ribeiro, F.A.L., Rosário, F.F., Bezerra, M.C.M., Wagner, R.de C.C., Bastos, A.L.M., Velásquez Ramírez, M.G., Barrantes, J.A.G., Thomas, E., Gamarra Miranda, L.A., Melo, V.L.A., Poppi, R.J., 2014. Evaluation of chemical composition of waters Pillaca, M., Tello Peramas, L.D., Bazán Tapia, L.R., 2020. Heavy metals in alluvial associated with petroleum production using Kohonen neural networks. Fuel 117, gold mine spoils in the peruvian amazon. Catena 189, 104454. https://doi.org/ 381–390. https://doi.org/10.1016/j.fuel.2013.08.086. 10.1016/j.catena.2020.104454 (Amst). Sacco, D., Motta, G., You, L., Bertolazzo, N., Carini, F., Ma, T., 2017. Smart cities, urban Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S. Springer New York, sensing, and big data: mining geo-location in social networks. In Big Data and Smart New York, NY. https://doi.org/10.1007/978-0-387-21706-2. Service Systems, 59–84). 10.1016/B978-0-12-812013-2.00005-8. Vicencio, R.V., de Vivanco, D.D.R., 2023. Análisis geográfico de la concentración de la SERFOR, 2016. Memoria descriptiva del mapa de ecozonas. Inventario nacional forestal pérdida de bosques húmedos amazónicos del Perú. High Tech-Eng. J. 3, 2–20. y de fauna silvestre (INFFS) - Perú. Vijay, V., Reid, C.D., Finer, M., Jenkins, C.N., Pimm, S.L., 2018. Deforestation risks posed Sierra Praeli, Y., 2021. Perú alcanza cifra de deforestación más alta en los últimos 20 by oil palm expansion in the Peruvian Amazon. Environ. Res. Lett. 13, 114010 años. https://es.mongabay.com/2021/10/peru-aumenta-deforestacion-cifras-bos https://doi.org/10.1088/1748-9326/aae540. ques/ (accessed 10.13.22). Wolski, G.J., Kruk, A., 2020. 
Determination of plant communities based on bryophytes: Singh, N., Javeed, A., Chhabra, S., Kumar, P., 2015. Missing Value Imputation with the combined use of Kohonen artificial neural network and indicator species Unsupervised Kohonen Self Organizing Map, in: Emerging Research in Computing, analysis. Ecol. Indic. 113, 106160 https://doi.org/10.1016/j.ecolind.2020.106160. Information, Communication and Applications. Springer India, New Delhi, Xu, C., Zhang, X., Hernandez-Clemente, R., Lu, W., Manzanedo, R.D., 2022. Global forest pp. 61–76. https://doi.org/10.1007/978-81-322-2550-8_7. types based on climatic and vegetation data. Sustainability 14, 634. https://doi.org/ Soares-Filho, B., Nepstad, D., Curran, L., Cerqueira, G.C., Garcia, R.A., Ramos, C.A., 10.3390/su14020634. Voll, E., McDonald, A., Lefebvre, P., Schlesinger, P., 2006. Modelling conservation in Yin, H., 2008. The self-organizing maps: background, theories, extensions and the Amazon basin. Nature 440, 520–523. https://doi.org/10.1038/nature04389. applications. pp. 715–762. 10.1007/978-3-540-78293-3_17. Stümer, W., Kenter, B., Köhl, M., 2010. Spatial interpolation of in situ data by self- Yu, X., Xiao, F., Zhou, Y., Wang, Y., Wang, K., 2019. Application of hierarchical organizing map algorithms (neural networks) for the assessment of carbon stocks in clustering, singularity mapping, and Kohonen neural network to identify Ag-Au-Pb- European forests. For. Ecol. Manag. 260, 287–293. https://doi.org/10.1016/j. Zn polymetallic mineralization associated geochemical anomaly in Pangxidong foreco.2010.04.008. district. J. Geochem. Explor. 203, 87–95. https://doi.org/10.1016/j. Swenson, J.J., Carter, C.E., Domec, J.C., Delgado, C.I., 2011. Gold mining in the Peruvian gexplo.2019.04.007. amazon: global prices, deforestation, and mercury imports. PLoS One 6, e18875. Zevallos, J., Lavado-Casimiro, W., 2022. Climate change impact on Peruvian biomes. https://doi.org/10.1371/journal.pone.0018875. Forests 13, 238. 
https://doi.org/10.3390/f13020238. 10