A High Growth-Rate Emerging Pattern for Data Classification in Microarray Databases
2013-05-06 10:16:56

Abstract—In the data classification problem for microarray datasets, we consider two biological datasets that represent two extremely different classes over the same set of tests. The classification process consists of two phases: (1) the training phase and (2) the testing phase. The purpose of the training phase is to find the representative Emerging Patterns (EPs) in each of the two datasets, where an EP is an itemset whose growth rate from one dataset to the other satisfies certain conditions. The growth rate captures the difference between the two datasets. The EJEP strategy considers only itemsets whose growth rates are infinite, on the claim that such itemsets may yield high accuracy. However, the EJEP strategy discards useful EPs whose growth rates are very high but not infinite, and real-world data always contains noise. The NEP strategy takes noise into account and achieves higher accuracy than the EJEP strategy. However, it may still miss some itemsets with high growth rates, which can lower accuracy. Therefore, in this paper, we propose a High Growth-rate EP (HGEP) strategy to improve the accuracy of the NEP strategy. In our performance study, the HGEP strategy achieves higher accuracy than the NEP strategy.
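
To make the notion of growth rate concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly used definition of growth rate as the ratio of an itemset's support in the target dataset to its support in the source dataset, taken as infinite when the itemset is absent from the source but present in the target. The abstract does not spell out this formula or the HGEP selection criteria, so the function names, the toy data, and the `min_rate` threshold are all hypothetical.

```python
import math

def support(itemset, dataset):
    """Fraction of samples in `dataset` that contain every item of `itemset`."""
    if not dataset:
        return 0.0
    return sum(1 for sample in dataset if itemset <= sample) / len(dataset)

def growth_rate(itemset, source, target):
    """Growth rate of `itemset` from `source` to `target` (assumed definition)."""
    s_src = support(itemset, source)
    s_tgt = support(itemset, target)
    if s_src == 0.0:
        # Absent from both datasets: rate 0. Present only in target: infinite.
        return 0.0 if s_tgt == 0.0 else math.inf
    return s_tgt / s_src

def is_high_growth(itemset, source, target, min_rate):
    """Hypothetical filter: keep itemsets whose growth rate reaches `min_rate`."""
    return growth_rate(itemset, source, target) >= min_rate

# Toy discretized expression data (made up for illustration only).
normal = [
    frozenset({"g1_high", "g2_low"}),
    frozenset({"g1_low", "g2_low"}),
    frozenset({"g1_low", "g2_high"}),
    frozenset({"g1_low", "g2_low"}),
]
tumor = [
    frozenset({"g1_high", "g2_high"}),
    frozenset({"g1_high", "g2_low"}),
    frozenset({"g1_high", "g2_high"}),
    frozenset({"g1_low", "g2_high"}),
]

finite_ep = frozenset({"g1_high"})              # appears in both classes
jumping_ep = frozenset({"g1_high", "g2_high"})  # appears only in `tumor`

rate_finite = growth_rate(finite_ep, normal, tumor)
rate_jumping = growth_rate(jumping_ep, normal, tumor)
```

In this toy data, `jumping_ep` never occurs in `normal`, so its growth rate is infinite, which is the only kind of itemset the abstract says EJEP keeps. `finite_ep` has a finite growth rate and would be dropped under that criterion, although a threshold such as the hypothetical `min_rate` could retain it.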

Index Terms—classification, data mining, emerging pattern, gene expression, microarray

Cite: Ye-In Chang, Zih-Siang Chen, and Tsung-Bin Yang, "A High Growth-Rate Emerging Pattern for Data Classification in Microarray Databases," Lecture Notes on Information Theory, Vol.1, No.1, pp. 6-10, March 2013. doi: 10.12720/lnit.1.1.6-10


