A Novel Machine Learning Framework for Predicting 232Th Distribution in Radionuclide-Contaminated Soils Using Physicochemical Environmental Factors

Document Type: Original Research Paper

Authors

1 Department of Chemistry, Abubakar Tafawa Balewa University, Gubi Campus, 740102, Bauchi, Nigeria

2 Department of Environmental Management Technology, Abubakar Tafawa Balewa University, Yelwa Campus, 740272, Bauchi, Nigeria

3 Interdisciplinary Research Centre for Membranes and Water Security, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia

4 Operational Research Center in Healthcare, Near East University, TRNC, Mersin 10, 99138, Nicosia, Turkey

5 Department of Analytical Chemistry, Faculty of Pharmacy, Near East University, TRNC, Mersin 10, 99138, Nicosia, Turkey

6 Department of Chemical Engineering, Prince Mohammad Bin Fahd University, Al Khobar, Saudi Arabia

DOI: 10.22059/poll.2025.400497.3073

Abstract

This study investigates how soil chemistry, specifically pH, organic carbon (OC), organic matter (OM), and cation exchange capacity (CEC), influences the mobility and distribution of 232Th in abandoned mine soils, using advanced machine learning (ML) models. Soil samples were collected from multiple locations across different seasons. Gaussian Process Regression (GPR), Long Short-Term Memory (LSTM) networks, Adaptive Neuro-Fuzzy Inference System (ANFIS), and Random Forest (RF) models were employed to predict 232Th distribution, with feature selection used to identify the optimal model combinations (C1, C2, and C3). Performance was assessed using the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), determination coefficient (DC), and Pearson correlation coefficient (PCC). GPR-C1 exhibited the highest predictive accuracy, with MAPE falling from 8.9909 to 3.0468 and MAE from 3.5236 to 1.6044 in the verification phase. GPR-C1 was also the top-performing model in both training (RMSE = 7.0851, DC = 0.6482) and testing (RMSE = 4.5808, DC = 0.5848), demonstrating its robustness in capturing non-linear relationships between soil properties (pH, OC, OM, CEC) and 232Th mobility. In contrast, the RF models (RF-C1, RF-C3) performed worst (training RMSE > 11.5123; testing RMSE > 7.6855), likely because they could not resolve complex geochemical interactions, as evidenced by their low DC (< 0.2) and PCC (< 0.3) values. Several models, including GPR-C1 (testing RMSE of 4.5808 against a training RMSE of 7.0851), exhibited lower RMSE in the testing set than in calibration, reflecting the reduced variance within the held-out site–season blocks and underscoring the role of seasonal dynamics in 232Th redistribution; nevertheless, nested cross-validation and a leave-site-out analysis consistently identified GPR-C1 as the most reliable and accurate model. The seasonal pattern is consistent with field data showing higher 232Th mobility during wet seasons due to leaching and runoff transport (p < 0.05). GPR-C1 therefore shows strong potential for predicting 232Th behaviour and distribution, which is crucial for environmental risk assessment; such predictions can guide targeted remediation efforts and inform land management practices, mitigating the risks associated with 232Th exposure.
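
The abstract reports a lower RMSE in testing than in calibration alongside a lower DC in testing. The sketch below may help show why the two can move in opposite directions: RMSE is expressed in the units of the target, whereas DC scales the squared error by the variance of the observations. The formulas are the conventional definitions and are assumed here, since the abstract does not state them; in particular, DC is taken to be the Nash-Sutcliffe-type determination coefficient. All numbers are hypothetical illustrative values, not data from the study.

```python
import math


def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error (observations must be non-zero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def dc(obs, pred):
    """Determination coefficient, assumed form: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pcc(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov / math.sqrt(var_obs * var_pred)


# Hypothetical calibration block: widely spread observations, errors of +/- 8.
calib_obs = [10.0, 30.0, 50.0, 70.0, 90.0]
calib_pred = [18.0, 22.0, 58.0, 62.0, 98.0]

# Hypothetical held-out block: narrowly spread observations, errors of +/- 4.
test_obs = [40.0, 45.0, 50.0, 55.0, 60.0]
test_pred = [44.0, 41.0, 54.0, 51.0, 64.0]

for name, obs, pred in [("calibration", calib_obs, calib_pred),
                        ("testing", test_obs, test_pred)]:
    print(f"{name}: RMSE={rmse(obs, pred):.2f}  MAE={mae(obs, pred):.2f}  "
          f"MAPE={mape(obs, pred):.2f}%  DC={dc(obs, pred):.2f}  "
          f"PCC={pcc(obs, pred):.2f}")
# calibration gives RMSE=8.00 and DC=0.92; testing gives RMSE=4.00 and DC=0.68
```

In this toy case the held-out block has half the RMSE of the calibration block yet a lower DC, because its observations vary far less. This is why a low-variance test set can produce a smaller absolute error without indicating a better fit.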
