In this paper, machine learning techniques are introduced to forestry machinery, and harvesting-target data are classified with a fuzzy support vector machine (FSVM). Because the fuzzy membership function largely determines the classifier's results, a new membership function is proposed to improve classification performance. The clustering center is selected with a K-means algorithm, and the membership function is then determined by comparing the distances from each sample to the positive and negative clustering centers in the feature space. Experiments against a common SVM and a representative fuzzy SVM show good learning ability and generalization performance. The model is also applied to harvesting-target detection, where it improves classification accuracy and meets the requirements of forestry machinery.
Open Peer Review Details

Manuscript submitted on 14-03-2016

Original Manuscript: Optimization and Analysis on Fuzzy SVM for Targets Classification in Forest
Support vector machine (SVM) is a machine learning algorithm first proposed by Cortes and Vapnik in 1995 [1]. Compared with traditional artificial neural networks, SVM not only simplifies the learning algorithm but also improves performance, especially generalization. Consequently, SVM has become a hot topic in machine learning in recent years.
Currently, the SVM algorithm has been widely used in pattern recognition, regression estimation, probability density function estimation, etc. Agrawal et al. proposed a novel color image classification approach using SVM, and the experimental results showed that this SVM-based method could improve color image classification accuracy [2]. Chen et al. established an SVM-based model to analyze turbine failures in thermal power facilities, and the results showed that SVM outperformed linear discriminant analysis (LDA) and back-propagation neural networks (BPN) in classification performance [3]. Qian et al. put forward a system to recognize multiple kinds of activities in video with an SVM multi-class classifier of binary tree architecture, and tests on a home-brewed activity dataset and the public Schüldt dataset both confirmed the system's strong identification performance and high robustness [4].
However, in practical applications, outliers often make the sample data less ideal than expected, which significantly weakens the generalization performance of SVM. In view of this, Lin and Wang put forward the fuzzy support vector machine (FSVM) algorithm, which assigns a fuzzy membership to each sample and can thereby effectively eliminate the effects of noise on SVM [5]. Building on FSVM, many methods for estimating fuzzy membership functions have been presented to improve performance. Jiang et al. calculated fuzzy membership in the feature space, represented by kernels, which improved classification accuracy and generalization [6]. Batuwita and Palade combined class imbalance learning (CIL) methods with FSVM to solve imbalanced classification problems, and tests on many different datasets indicated that the proposed FSVM-CIL method was very effective for CIL, especially in the presence of outliers and noise [7]. Sevakula and Verma used a clustering algorithm to find outliers, then assigned these outliers membership values lower than 1 while all other samples received a value of 1 [8]. Cheng et al. proposed a hybrid system that applies a k-NN preprocessor to locally evaluate the contamination level of degenerated training data and uses FSVM as a postprocessor to globally learn an optimized classifier [9]. Almasi and Rouhani presented a method to calculate the membership function and solve model selection problems with the firefly algorithm, a nature-inspired optimization algorithm [10].
All the above FSVM models performed well in their experiments. However, they ignore an important problem: outliers and support vectors are both far from the clustering center, so assigning low membership values to support vectors also reduces classifier accuracy. To solve this problem, a new fuzzy membership estimating method for FSVM is proposed in this paper. The proposed model was compared with a common SVM and a representative fuzzy SVM, and it was also applied to target classification in forest. The experimental results showed that the proposed method can effectively eliminate the impact of noise and improve classification accuracy.
The core idea of SVM is to map a non-linear problem from the low-dimensional input space to a high-dimensional space so that it becomes linearly separable. SVM only concerns the points nearest to the hyperplane (called support vectors), and the hyperplane is defined by maximizing its distance to the support vectors. Thus, the objective function for solving the hyperplane can be expressed as follows:
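The equation itself is not reproduced in this extraction; the standard hard-margin SVM objective that this passage describes, for l training samples (xi, yi), is:

```latex
\min_{w,\,b}\ \frac{1}{2}\|w\|^{2}
\quad \text{s.t.} \quad y_i \left( w^{\top} x_i + b \right) \ge 1, \quad i = 1, \dots, l
```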
However, outliers can easily shift the hyperplane. Considering the high sensitivity of SVM to outliers, a slack variable ζ is introduced into the quadratic programming problem to permit the existence of outliers. The objective function can then be converted into the following form:
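Again the equation is missing from the extraction; with slack variables ζi, the standard soft-margin form is:

```latex
\min_{w,\,b,\,\zeta}\ \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{l} \zeta_i
\quad \text{s.t.} \quad y_i \left( w^{\top} x_i + b \right) \ge 1 - \zeta_i, \quad \zeta_i \ge 0
```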
where the penalty parameter C limits the impact of outliers on the objective function. This modified model is called the soft-margin classifier.
To solve the objective function, Lagrange multipliers are introduced into the formula. After simplification and conversion, the decision function of the hyperplane can be expressed as follows:
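The formula did not survive extraction; in the standard dual form, with αi denoting the Lagrange multipliers, the decision function is:

```latex
f(x) = \operatorname{sgn}\!\left( \sum_{i=1}^{l} \alpha_i\, y_i\, K(x_i, x) + b \right)
```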
where K(xi, x) is the kernel function for solving non-linearly separable problems. In this paper, the radial basis function (RBF) is selected as the kernel, which can be expressed as follows:
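The RBF kernel takes the standard form below; the σ bandwidth parameterization is one common convention (some texts use γ = 1/(2σ²) instead), and which one the original used is not recoverable here:

```latex
K(x_i, x) = \exp\!\left( -\frac{\|x_i - x\|^{2}}{2\sigma^{2}} \right)
```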
In the soft-margin classifier, the penalty parameter C can be neither too large nor too small if the classifier is to work well. Hence, a fuzzy membership si is introduced into SVM. For a training dataset {x1, x2, …}, each xi has a label yi, so the dataset is denoted {(x1, y1), (x2, y2), …}. After si is introduced, the dataset becomes {(x1, y1, s1), (x2, y2, s2), …}, and the objective function can be rewritten as:
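The rewritten objective is missing from the extraction; the standard FSVM objective of Lin and Wang [5], in which the membership si weights each slack penalty, is:

```latex
\min_{w,\,b,\,\zeta}\ \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{l} s_i\, \zeta_i
\quad \text{s.t.} \quad y_i \left( w^{\top} x_i + b \right) \ge 1 - \zeta_i, \quad \zeta_i \ge 0
```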
where si takes a value between 0 and 1 (0 < si ≤ 1) and represents the probability that each sample belongs to its label yi. Meanwhile, it counteracts the impact of the penalty parameter C on the classifier.
Therefore, an appropriate membership function is very important for the FSVM model. First, the lower bound of the membership should be defined; second, the membership function should be constructed according to the characteristics of the dataset. At present, the commonly used membership calculation methods are mainly based on the distances from sample points to the class center.
To calculate fuzzy membership, the class center of the samples must first be determined. Commonly used clustering algorithms include k-means, STING, CLIQUE, and CURE. K-means is a partition-based clustering algorithm and the most widely used clustering method. In this research, the class center was calculated with the K-means center selection algorithm proposed in Ref [11]. This method clusters samples by calculating the density distribution, with the following steps:
1. Calculate d(xi, xj), the distance between every pair of points in the sample set.
2. Calculate the mean of all pairwise distances.
3. For each of the k classes, calculate the density of every sample.
4. Choose the sample with the maximum density value as the class center.
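The steps above can be sketched as follows. Ref [11] is not reproduced here, so the density definition is an assumption: a sample's density is taken as the number of other samples within the mean pairwise distance, which is a common reading of density-based center selection.

```python
import numpy as np

def density_center(X):
    """Pick a class center as the densest sample (sketch of steps 1-4).

    Assumption: density of x_i = number of samples whose distance to x_i
    is below the mean pairwise distance of the whole set.
    """
    # Step 1: pairwise Euclidean distances d(x_i, x_j).
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Step 2: mean of all off-diagonal pairwise distances.
    n = len(X)
    mean_d = d.sum() / (n * (n - 1))
    # Step 3: density = count of neighbours closer than the mean distance
    # (subtract 1 to exclude the point itself, whose distance is 0).
    density = (d < mean_d).sum(axis=1) - 1
    # Step 4: the sample with maximum density is the class center.
    return X[np.argmax(density)]

# Tight cluster near the origin plus one far outlier: the selected
# center should come from the dense cluster, not the outlier.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
center = density_center(X)
```

Selecting a representative sample rather than a coordinate mean keeps the center robust to outliers, which is the point of using it for membership assignment.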
Although SVMs often produce effective solutions for balanced datasets, they are sensitive to imbalance in the datasets and then produce sub-optimal models. Ref [12] improved the standard FSVM method by combining it with the different error costs (DEC) method, yielding FSVM-CIL. In the FSVM-CIL method, the membership values of data points are assigned to satisfy the following two goals:
1. To suppress the effect of between-class imbalance.
2. To reflect the within-class importance of different training examples, so as to suppress the effect of outliers and noise.
Our research data, like most real-world data, are imbalanced, so the FSVM-CIL method is adopted in the experiments. In FSVM-CIL, the membership functions are defined as follows:
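The definitions are missing from the extraction; in Ref [12] the memberships combine a within-class importance term f(xi) with class-dependent scaling factors, which matches the description below:

```latex
s_i^{+} = f\!\left(x_i^{+}\right) r^{+}, \qquad s_i^{-} = f\!\left(x_i^{-}\right) r^{-}
```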
where f(xi) generates a value between 0 and 1 that reflects the importance of xi in its own class. The values of r+ and r- are assigned to reflect the class imbalance, such that r+ = 1 and r- = r, where r is the minority-to-majority class ratio (so r+ > r-).
Many membership calculation methods do not discriminate between outliers and support vectors. To solve this problem, a new membership function is proposed in this paper based on Ref [6], in which the membership was calculated in the feature space. In the proposed FSVM model, the samples nearest the boundary between the two classes are first acquired to distinguish support vectors from outliers. All calculations are conducted in the feature space, so the membership can be obtained by the following formulas:
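Equations (11) and (12) themselves are not reproduced here. The linear-decay and exponential-decay forms commonly used in this FSVM literature (e.g., in Refs [6] and [12]) are consistent with the description that follows; δ > 0 is a small offset and β > 0 a steepness parameter, and whether the original used exactly these forms is an assumption:

```latex
s_i^{\text{lin}} = 1 - \frac{d_i}{\max_j d_j + \delta}, \qquad
s_i^{\exp} = \frac{2}{1 + \exp\!\left(\beta d_i\right)}
```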
where formula (12) gives the linear-decay and exponential-decay functions used to calculate the membership value. In membership function (11), Φ(xi+) is a sample mapped to the feature space with label yi = 1, and Φ(xi–) is one with label yi = –1; Φ(xcen+) is the positive class center in the feature space and Φ(xcen–) the negative one; Φ(x*+) is the positive sample nearest the boundary and Φ(x*–) the negative one. di is the distance from a sample to the positive or negative class center, and di* is the distance from a sample to the sample nearest the class boundary. In the feature space they can be calculated as follows:
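The distance formula is missing from the extraction. Taking the class center as the mean of the n mapped class samples (an assumption consistent with the kernel-space formulation of Ref [6]), the distance expands entirely into kernel evaluations:

```latex
d_i = \left\| \Phi(x_i) - \Phi(x_{cen}) \right\|
    = \sqrt{ K(x_i, x_i) - \frac{2}{n} \sum_{j=1}^{n} K(x_i, x_j)
             + \frac{1}{n^{2}} \sum_{j=1}^{n} \sum_{k=1}^{n} K(x_j, x_k) }
```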
where all inner products of mapped samples are evaluated through the kernel function, so the distance never requires an explicit mapping Φ. di* can be obtained in the same way.
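As a concrete sketch of this kernel-trick distance, the snippet below computes the distance from a point to a class center in the feature space. It assumes the center is the mean of the mapped class samples and uses the RBF kernel; the function names are illustrative, not from the paper.

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    """RBF kernel K(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def dist_to_class_center(x, X_class, sigma=1.0):
    """Distance from Phi(x) to the class mean in feature space.

    Uses the kernel expansion
      ||Phi(x) - m||^2 = K(x,x) - (2/n) sum_j K(x, x_j)
                         + (1/n^2) sum_{j,k} K(x_j, x_k),
    where m is the mean of the mapped class samples (assumption:
    the paper's class center is treated here as that mean).
    """
    n = len(X_class)
    k_xx = rbf(x, x, sigma)  # equals 1 for the RBF kernel
    cross = sum(rbf(x, xj, sigma) for xj in X_class)
    within = sum(rbf(xj, xk, sigma) for xj in X_class for xk in X_class)
    sq = k_xx - 2.0 * cross / n + within / (n * n)
    return np.sqrt(max(sq, 0.0))  # clamp tiny negative rounding error

# A point inside the positive cluster should be closer to its
# feature-space center than a distant point.
X_pos = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2]])
d_near = dist_to_class_center(np.array([0.1, 0.1]), X_pos)
d_far = dist_to_class_center(np.array([3.0, 3.0]), X_pos)
```

The same expansion with the nearest-boundary sample in place of the class mean gives di*.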
The experiments were implemented with the widely used LIBSVM toolbox in the MATLAB environment. The classifier models were tested on six different datasets, and the geometric mean of the class-wise accuracies was used to evaluate the models, so as to reflect comprehensive classification performance. In the results, SE denotes the positive-class classification accuracy, SP the negative-class classification accuracy, and the geometric mean of SE and SP the overall performance.
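The overall metric described here is the geometric mean (G-mean) of the two class-wise accuracies, which penalizes a classifier that sacrifices one class for the other. A minimal sketch:

```python
import math

def g_mean(se, sp):
    """Geometric mean of positive-class (SE) and negative-class (SP) accuracy."""
    return math.sqrt(se * sp)

# A classifier that is 90% accurate on positives and 80% on negatives:
g = g_mean(0.90, 0.80)  # sqrt(0.72), about 0.849
```

Unlike plain accuracy, G-mean drops to 0 if either class is classified entirely wrong, which is why it suits the imbalanced datasets used here.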
In this study, the Tree dataset was constructed by our research group, and the other five are benchmark real-world datasets obtained from the UCI machine learning repository. The Tree data were obtained by multi-sensor fusion: tree features such as height, width, shape, temperature, and color were extracted from 2D laser points, visible images, and infrared images. Details of the six datasets are listed in Table 1. These datasets all consist of two classes and contain different numbers of samples and attributes, covering a variety of domains. Hence, they can be considered representative for verifying the proposed FSVM models.
First, the Tree dataset was used to verify the proposed algorithm. A common SVM and the FSVM proposed in Ref [6] were included in the experiment as comparisons, denoted SVM and FSVM-1 respectively. Using the linear-decay and exponential-decay formulas, classification results were calculated with different values of the penalty parameter C, as shown in Table 2.
Table 2 shows that when C ≤ 10, the models with the linear-decay formula do not perform well, while when C ≥ 50, the proposed FSVM-2 with the linear-decay formula has the best classification accuracy. As C increases further, the results of the proposed FSVM converge to those of the other models, because a large C weakens the effect of the fuzzy membership function. Since the models with C = 50 performed best, the subsequent experiments were implemented with C = 50. Overall, the proposed FSVM-2 outperformed the common SVM and FSVM-1, and it has better generalization ability.
To further verify classification performance, the FSVM-2 models were tested on another five benchmark datasets, with the memberships again calculated using the linear and exponential functions respectively, as shown in Table 3.
From Table 3, it is obvious that FSVM-2 has the best classification performance except on the Wine and Pima datasets. According to our analysis, this discrepancy may be related to the datasets themselves: in these two datasets the data are closely correlated, and the class centers may overlap or approach each other, so a wholly positive or negative classification may occur depending on the number of training samples in FSVM. In addition, each sample is multidimensional with many attributes, so results improve when the data are mapped to the high-dimensional space in FSVM-1 and FSVM-2. The proposed model considers the importance of support vectors and the class imbalance problem, and thus performed better than FSVM-1.
In summary, a new membership function was proposed to improve harvesting-target classification accuracy. In this method, a K-means algorithm is first applied to determine the class center; then the class sample ratio is determined to address the class imbalance problem; next, the distances from sample points to the positive and negative class centers are calculated, along with the distances from sample points to the vectors nearest the class boundary, in order to locate support vectors; finally, fuzzy memberships are calculated by the linear-decay and exponential-decay formulas with these distances as input. The method operates in the feature space, with a common SVM and FSVM-1 as comparisons. After model construction, the performance of each model was evaluated on six datasets. The results showed that the proposed FSVM-2 could effectively eliminate the impact of noise and improve classification accuracy for data with strong correlations and many outliers. It also improved harvesting-target classification accuracy and met the requirements of forestry machinery.
The authors confirm that this article content has no conflict of interest.
This study was financially supported by the National Natural Science Foundation of China (51275272) and the National Science Foundation for Young Scholars (51405263).
[1] C. Cortes, and V. Vapnik, "Support-vector networks", Mach. Learn., vol. 20, no. 3, pp. 273-297, 1995. [http://dx.doi.org/10.1007/BF00994018]
[2] S. Agrawal, N.K. Verma, and P. Tamrakar, "Content based color image classification using SVM", In: Information Technology: New Generations (ITNG), 8th International Conference on, IEEE, 2011, pp. 1090-1094. [http://dx.doi.org/10.1109/ITNG.2011.202]
[3] K.Y. Chen, L.S. Chen, and M.C. Chen, "Using SVM based method for equipment fault detection in a thermal power plant", Comput. Ind., vol. 62, no. 1, pp. 42-50, 2011. [http://dx.doi.org/10.1016/j.compind.2010.05.013]
[4] H. Qian, Y. Mao, and W. Xiang, "Recognition of human activities using SVM multi-class classifier", Pattern Recogn. Lett., vol. 31, no. 2, pp. 100-111, 2010. [http://dx.doi.org/10.1016/j.patrec.2009.09.019]
[5] C.F. Lin, and S.D. Wang, "Fuzzy support vector machines", IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 464-471, 2002.
[6] X. Jiang, Z. Yi, and J.C. Lv, "Fuzzy SVM with a new fuzzy membership function", Neural Comput. Appl., vol. 15, no. 3-4, pp. 268-276, 2006. [http://dx.doi.org/10.1007/s00521-006-0028-z]
[7] R. Batuwita, and V. Palade, "FSVM-CIL: fuzzy support vector machines for class imbalance learning", IEEE Trans. Fuzzy Syst., vol. 18, no. 3, pp. 558-571, 2010.
[8] R.K. Sevakula, and N.K. Verma, "Clustering based outlier detection in fuzzy SVM", In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2014, pp. 1172-1177. [http://dx.doi.org/10.1109/FUZZ-IEEE.2014.6891600]
[9] Y.W. Cheng, T.J. Wen, and H.C. Cheng, "Distance weighted fuzzy k-NN SVM", In: International Conference on Networking, Sensing, and Control, IEEE, 2016.
[10] O.N. Almasi, and M. Rouhani, "A new fuzzy membership assignment and model selection approach based on dynamic class centers for fuzzy SVM family using the firefly algorithm", Turk. J. Electr. Eng. Comput. Sci., vol. 24, no. 3, pp. 1797-1814, 2016. [http://dx.doi.org/10.3906/elk-1310-253]
[11] M. Huang, Z. He, and X. Xing, "New k-means clustering center select algorithm", Comput. Eng. App., vol. 47, no. 35, pp. 132-134, 2011.
[12] R. Batuwita, and V. Palade, "FSVM-CIL: fuzzy support vector machines for class imbalance learning", IEEE Trans. Fuzzy Syst., vol. 18, no. 3, pp. 558-571, 2010. [http://dx.doi.org/10.1109/TFUZZ.2010.2042721]