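Objective At present, garbage is mainly classified by looking up its name. Such methods usually rely on a predefined data classification, which can hardly cover all existing kinds of garbage, let alone cope with the kinds that will keep emerging. To address this problem, this paper proposes a long-term garbage classification method based on self-training for domestic garbage. Method First, Bagging is used to combine two kinds of base classifiers with different classification abilities and training mechanisms (KNNs and SVMs) according to their independent votes and weights, yielding a novel ensemble classifier for domestic garbage. Second, based on intuitive image-based interactive feedback, the classifier's confidence in the corresponding classification results and the cloud-based training sample set are updated dynamically, which improves the accuracy of subsequent classifications and the self-learning ability of the method. Result The prototype system was trained on a training set of 233 domestic garbage samples and tested on 151 garbage samples. The experiments show that the proposed ensemble classifier reaches a classification accuracy of about 95% on domestic garbage. The proportion of incorrect samples in the training set was then increased step by step (≤30%), the ensemble classifier was retrained each time, and a total of 150 classification tests were carried out with the same 151 samples. The average accuracy analysis shows that the ensemble classifier maintains a high and fairly stable classification accuracy (≥93%). In addition, with the feedback mechanism added to these experiments, the average accuracy analysis shows that the mechanism effectively mitigates the accuracy loss caused by incorrect samples. Conclusion The proposed method classifies domestic garbage with high accuracy and robustness and remains effective over the long term.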
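The combination of base-classifier votes by weight, as described in the Method part, can be illustrated with a minimal sketch. The abstract does not state the weighting formula or the Bagging configuration, so everything here is an assumption: the function names `bootstrap`, `weighted_vote` and `ensemble_predict`, the example weights, and the class labels are hypothetical, and the base classifiers are plain stand-in functions rather than trained KNN or SVM models.

```python
import random
from collections import defaultdict


def bootstrap(samples, rng):
    """Draw a bootstrap replicate (sampling with replacement), as Bagging does."""
    return [rng.choice(samples) for _ in range(len(samples))]


def weighted_vote(labels, weights):
    """Return the label whose votes carry the largest total weight."""
    scores = defaultdict(float)
    for label, weight in zip(labels, weights):
        scores[label] += weight
    # Iterate in sorted order so that ties resolve deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


def ensemble_predict(members, sample):
    """members: list of (predict_fn, weight); predict_fn maps a sample to a label."""
    labels = [predict(sample) for predict, _ in members]
    weights = [weight for _, weight in members]
    return weighted_vote(labels, weights)


# Hypothetical stand-ins for trained base classifiers (assumed weights).
members = [
    (lambda s: "recyclable", 0.9),  # one member with a high weight
    (lambda s: "kitchen", 0.4),     # two members with low weights
    (lambda s: "kitchen", 0.4),
]
result = ensemble_predict(members, {"name": "glass bottle"})
replicate = bootstrap(["a", "b", "c", "d"], random.Random(0))
```

In this toy case the single high-weight member outvotes the two low-weight members, which is the difference between weighted voting and a plain majority vote. In a real system each member would be trained on its own bootstrap replicate of the training set.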
Long-term Garbage Classification based on Self-training
Liu Yaxuan, Pan Wanbin (School of Media and Design, Hangzhou Dianzi University)
Objective As consumption levels rise, daily garbage is growing in both quantity and variety. Classifying it correctly is important for protecting human health and keeping the environment clean and safe, and it requires a joint effort from everyone. With the popularity of the WWW and the development of information technology, looking up garbage by name on a mobile device has become a popular garbage classification method. However, this method usually works on a static data classification, which makes it difficult to cover all garbage and to extend to new kinds of garbage as they appear. To address this problem, this paper presents a long-term garbage classification method based on self-training for domestic garbage. Method The proposed method makes full use of machine learning: it updates its training set and retrains itself according to users' inputs and feedback, which are given through garbage image selection. Thus, the more users participate, the more accurate the classification becomes. The method consists of two main parts. (1) To give the method strong classification ability, a novel ensemble classifier is adopted that integrates KNNs and SVMs (as base classifiers) by Bagging, based on their independent votes and weights; a misclassification-oversampling technique is also combined with Bagging to improve the accuracy of these base classifiers. (2) A feedback mechanism based on image selection automatically updates the classifier's confidence and extends the garbage training set, which improves both classification accuracy and self-training ability. Result A prototype for classifying domestic garbage was developed to validate the effectiveness of the method.
A training set containing 233 garbage samples is used to train the ensemble classifier, and a test set of 151 garbage samples is used to evaluate its accuracy and robustness. The experiments demonstrate that the average classification accuracy of the ensemble classifier, close to 95%, is better than that of each base classifier. Moreover, the proportion of incorrect samples in the training set is gradually increased (≤30%); the ensemble classifier is retrained on each resulting training set and then evaluated on the same test set. The corresponding average accuracy analyses show that the ensemble classifier maintains a relatively high and stable classification accuracy (≥93%), and that the feedback mechanism effectively helps alleviate the negative influence of incorrect samples. Conclusion Garbage classification is closely related to people's healthy daily life and to environmental protection. However, long-term methods that remain effective as garbage keeps increasing in both number and kind are still rare, especially on mobile platforms. Hence, this work presents a new long-term classification method based on self-training for domestic garbage. The method offers accurate and robust domestic garbage classification as well as a self-learning ability, which are ensured by a novel ensemble classifier and a feedback mechanism.
However, the method still has some shortcomings that should be addressed: 1) garbage images are used mainly by the feedback mechanism, while the corresponding features are mainly described by text, since general and effective methods for extracting garbage features from images are still rare; 2) an automatic feedback mechanism should be studied to further raise the automation level of the whole method.
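The feedback mechanism summarized above, in which a user's image selection updates classifier confidence and extends the training set, might be sketched as follows. The abstract gives no update rule, so this sketch assumes a simple step toward 1 or 0; the class `FeedbackStore`, the step size `rate`, the initial confidence of 0.5, and the classifier names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Holds per-classifier confidence and the growing training set."""
    confidence: dict = field(default_factory=dict)    # classifier name -> [0, 1]
    training_set: list = field(default_factory=list)  # (features, label) pairs
    rate: float = 0.1                                 # assumed update step

    def record(self, features, votes, chosen_label):
        """votes: classifier name -> predicted label.
        chosen_label: the class the user picked via image selection."""
        for name, predicted in votes.items():
            target = 1.0 if predicted == chosen_label else 0.0
            old = self.confidence.get(name, 0.5)
            # Move the confidence a small step toward 1 (agreed) or 0 (disagreed).
            self.confidence[name] = old + self.rate * (target - old)
        # The user-confirmed sample becomes training data for later retraining.
        self.training_set.append((features, chosen_label))


store = FeedbackStore()
store.record(
    {"name": "banana peel"},
    {"knn": "kitchen", "svm": "other"},
    "kitchen",
)
```

After this single feedback event the classifier that agreed with the user gains confidence, the one that disagreed loses some, and the confirmed sample is available for the next retraining round.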