Douha L, Benoudjit N, Douak F, Melgani F.
Support Vector Regression in Spectrophotometry: An Experimental Study. Critical Reviews in Analytical Chemistry [Internet]. 2012;42 (03) :214-219.
Abstract: In this work, we present a detailed experimental assessment of an interesting regression approach based on support vector machines (SVMs), a technique relatively recently introduced in the literature. The experimental framework reports a thorough investigation of the performance of SVMs from different viewpoints, including: (i) the influence of the kernel type in the SVM regression task; (ii) the sensitivity to the number of input variables (spectra dimension); (iii) the sensitivity to the available number of training samples; and (iv) the overall stability. The obtained results are compared with those yielded by the radial basis function (RBF) and the multilayer perceptron (MLP) neural networks as well as the traditional multiple linear regression (MLR) method on two different spectrophotometric datasets.
Ferroudji K, Benoudjit N, Bouakaz A.
Microemboli Classification using Non-linear Kernel Support Vectors Machines and RF Signals. Journal of Automation & Systems Engineering (JASE). 2012;6 (2) :123-132.
Douha L, Benoudjit N, Melgani F.
A Robust Regression Approach For Spectrophotometric Signal Analysis. Journal of Chemometrics [Internet]. 2012;26 (07) :400-405.
Abstract: The effectiveness of a regression method strongly depends on the characteristics of the considered regression problem. As a consequence, it is difficult to choose a priori the most appropriate algorithm for a given dataset. This issue is addressed in this work through a novel regression approach based on the fusion of an ensemble of different regressors. In order to implement the proposed robust multiple system (RMS), four different fusion strategies are explored. In this context, we propose a novel fusion strategy named selection‐based strategy (SBS) that provides as output the estimate obtained by the regression algorithm (included in the ensemble) characterized by the highest expected accuracy in the region of the feature space associated with the considered model. The SBS is based not on a direct combination of the estimates yielded by all the regressors but on a selection mechanism that identifies the expected best available estimate. For this purpose, it exploits the accuracies of the regressors included in the ensemble in different portions of the input feature space. The experimental assessment of the RMS was carried out on three different datasets: wine, orange juice, and apple datasets. The obtained experimental results suggest that, in general, the fusion of an ensemble of different regression algorithms leads to a regression process that is more robust and sometimes also more accurate than traditional regression methods. In particular, the proposed SBS method represents an effective solution to carry out the fusion process.
Douak F, Melgani F, Alajlan N, Pasolli E, Bazi Y, Benoudjit N.
Active Learning for Spectroscopic Data Regression. Journal of Chemometrics [Internet]. 2012;26 (07) :374-383.
Abstract: In this work, we introduce an active learning approach for the estimation of chemical concentrations from spectroscopic data. Its main objective is to collect training samples in such a way as to minimize the error of the regression process while also minimizing the number of training samples used, and thus to reduce the costs related to training sample collection. In particular, we propose two different active learning strategies developed for regression approaches based on partial least squares regression, ridge regression, kernel ridge regression, and support vector regression. The first strategy uses a pool of regressors in order to select the samples with the greatest disagreements among the different regressors of the pool, while the second one is based on adding samples that are distant from the current training samples in the feature space. For support vector regression, a specific strategy based on the selection of the samples distant from the support vectors is proposed. Experimental results on three different real data sets are reported and discussed.
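Illustrative note: the second active learning strategy in the abstract above (adding samples that are distant from the current training samples in the feature space) can be pictured with a minimal sketch. This is not the authors' implementation. The function name `farthest_first_selection`, the use of Euclidean distance, and the greedy rule that each newly chosen sample immediately counts as a training sample are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def farthest_first_selection(train, pool, n_select):
    """Greedily pick pool samples that lie far from the training set.

    Hypothetical helper: `train` and `pool` are sequences of equal-length
    numeric tuples (feature vectors); `train` must be non-empty.
    Returns the indices of the selected pool samples, in selection order.
    """
    remaining = list(range(len(pool)))
    chosen = []
    # Distance from each pool sample to its nearest current training sample.
    min_dist = [min(math.dist(p, t) for t in train) for p in pool]
    for _ in range(min(n_select, len(pool))):
        # Select the candidate whose nearest training sample is farthest away.
        best = max(remaining, key=lambda i: min_dist[i])
        chosen.append(best)
        remaining.remove(best)
        # Assumption: the chosen sample joins the training set at once,
        # so distances of the remaining candidates are updated against it.
        for i in remaining:
            min_dist[i] = min(min_dist[i], math.dist(pool[i], pool[best]))
    return chosen


if __name__ == "__main__":
    train = [(0.0, 0.0)]
    pool = [(1.0, 0.0), (5.0, 0.0), (4.5, 0.0), (-3.0, 0.0)]
    print(farthest_first_selection(train, pool, 2))
```

In this toy run the sample at (4.5, 0) is skipped on the second pick because it sits next to the already selected (5, 0); updating the distances after each pick is what keeps a batch of selected samples spread out.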