Studying the Reduction Techniques for Mining Engineering Datasets

Mustafa Ali Abuzaraida


Companies around the world often store huge datasets in databases. Their sheer size can make data analysis difficult, because the data are complex in terms of both the number of attributes and the number of cases. This problem can be addressed by reducing the dataset to a sufficient number of attributes and cases before mining it. In the data mining field, many techniques can be used to reduce the number of attributes and to remove similar cases. In this paper, three reduction techniques, namely Genetic Algorithm (GA), Principal Component Analysis (PCA), and Johnson, are tested in the engineering domain using five datasets obtained from the UCI machine learning archive. The study examines which reduction technique is most suitable for engineering datasets. In addition, the study ranks the three techniques based on percentage accuracy and number of selected attributes.
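To make the idea of attribute reduction concrete, the following is a minimal sketch of a greedy reduction in the style usually attributed to Johnson in the rough-set literature: it repeatedly picks the attribute that distinguishes the most pairs of cases with different decisions. The abstract does not say which implementation the study used, so the function name `johnson_reduct`, the tie-breaking rule, and the toy decision table are all assumptions made for illustration, not the paper's method or data.

```python
from itertools import combinations


def johnson_reduct(rows, decisions):
    """Greedy, Johnson-style attribute reduction on a small decision table.

    rows      -- list of equal-length tuples of attribute values (one per case)
    decisions -- list of class labels, one per case
    Returns the sorted indices of the selected attributes.
    """
    n_attrs = len(rows[0])

    # Build the discernibility sets: for every pair of cases with different
    # decisions, record the attributes whose values differ between them.
    clauses = []
    for (r1, d1), (r2, d2) in combinations(zip(rows, decisions), 2):
        if d1 != d2:
            diff = {a for a in range(n_attrs) if r1[a] != r2[a]}
            if diff:  # identical cases with different labels cannot be separated
                clauses.append(diff)

    selected = []
    while clauses:
        # Count how many remaining sets each attribute appears in.
        counts = [sum(a in c for c in clauses) for a in range(n_attrs)]
        # Pick the most frequent attribute (assumption: ties go to the lowest index).
        best = max(range(n_attrs), key=lambda a: counts[a])
        selected.append(best)
        # Drop every set that the chosen attribute already covers.
        clauses = [c for c in clauses if best not in c]
    return sorted(selected)


# Hypothetical toy table: three attributes, four cases, two decision classes.
toy_rows = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (0, 1, 0)]
toy_decisions = ["ok", "ok", "fail", "fail"]
toy_reduct = johnson_reduct(toy_rows, toy_decisions)
```

In this toy table the first attribute alone separates the two classes, so the sketch keeps one attribute out of three. A comparison like the one in the study would then train a classifier on the reduced attribute set and report accuracy alongside the number of attributes kept.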
