A prediction distribution of atmospheric pollutants using support vector machines, discriminant analysis and mapping tools (Case study: Tunisia)

Document Type: Original Research Paper


1 Research Laboratory on Electronics and Information Technologies (LETI), National School of Engineers of Sfax, University of Sfax, Tunisia

2 Micro Electro Thermal Systems (METS) Laboratory, National School of Engineers of Sfax, University of Sfax, Tunisia



Monitoring and controlling air quality parameters is an important topic in atmospheric and environmental research because of the health effects of the pollutants present in urban areas. The support vector machine (SVM), a supervised learning method, is considered an effective statistical tool for predicting and analyzing air quality. This work examines the feasibility of applying the SVM to predict ozone and particle concentrations in two Tunisian cities, Tunis and Sfax. SVMs with linear, polynomial and radial basis function (RBF) kernels were used to predict the ozone and particle concentrations over one year. The RBF kernel gave the best results, with a 0% error rate for both pollutants. The polynomial and linear kernels gave acceptably low error rates of 9.09% and 18.18%, respectively. Discriminant analysis (DA) was then applied to the datasets of the two air quality parameters, ozone (O3) and suspended particles (SP). The DA results show that spatial characterization successfully discriminates between the two cities, with an error rate of 4.35% for linear DA and 0% for quadratic DA. A thematic map of Tunisia was created using the MapInfo software.
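The kernel and discriminant comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, which the abstract does not name, and it uses synthetic (O3, SP) readings for two hypothetical cities rather than the Tunis and Sfax measurements. The function names, the 75/25 split and the feature scaling are illustrative choices, not the authors' procedure, and every model here is framed as a two-city classifier for simplicity.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_city_data(n_per_city=60, seed=0):
    """Generate hypothetical (O3, SP) readings for two cities.

    These values are synthetic placeholders, not real measurements.
    """
    rng = np.random.default_rng(seed)
    city_a = rng.normal(loc=[60.0, 40.0], scale=[10.0, 8.0], size=(n_per_city, 2))
    city_b = rng.normal(loc=[45.0, 70.0], scale=[12.0, 15.0], size=(n_per_city, 2))
    X = np.vstack([city_a, city_b])
    y = np.array([0] * n_per_city + [1] * n_per_city)  # 0 = city A, 1 = city B
    return X, y


def compare_classifiers(X, y, seed=0):
    """Return the held-out error rate of each SVM kernel and DA variant."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "svm_linear": SVC(kernel="linear"),
        "svm_poly": SVC(kernel="poly", degree=3),
        "svm_rbf": SVC(kernel="rbf"),
        "lda": LinearDiscriminantAnalysis(),
        "qda": QuadraticDiscriminantAnalysis(),
    }
    error_rates = {}
    for name, model in models.items():
        # Standardize features so the kernels see comparable scales.
        clf = make_pipeline(StandardScaler(), model)
        clf.fit(X_train, y_train)
        # Error rate = fraction of misclassified held-out samples.
        error_rates[name] = 1.0 - clf.score(X_test, y_test)
    return error_rates


if __name__ == "__main__":
    X, y = make_synthetic_city_data()
    for name, err in compare_classifiers(X, y).items():
        print(f"{name}: {100 * err:.2f}% error")
```

The error rates this prints depend entirely on the synthetic data and the random split, so they are not comparable with the figures reported above. The sketch only shows how such a comparison can be set up.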

