Article Details

A Study on the Existing Feature Extraction Methods on Collected Dataset by Visualizing the Features | Original Article

Aarti Kaushik*, Vijay Pal Singh, in Journal of Advances and Scholarly Researches in Allied Education | Multidisciplinary Academic Research

ABSTRACT:

This paper shows many irrelevant attributes may be present in data to be mined. So they need to be removed. Also many mining algorithms don’t perform well with large amounts of features or attributes. Therefore feature selection techniques needs to be applied before any kind of mining algorithm is applied. The main objectives of feature selection are to avoid over fitting and improve model performance and to provide faster and more cost-effective models. The selection of optimal features adds an extra layer of complexity in the modelling as instead of just finding optimal parameters for full set of features, first optimal feature subset is to be found and the model parameters are to be optimised. Attribute selection methods can be broadly divided into filter and wrapper approaches. In the filter approach the attribute selection method is independent of the data mining algorithm to be applied to the selected attributes and assess the relevance of features by looking only at the intrinsic properties of the data. In most cases a feature relevance score is calculated, and low scoring features are removed. The subset of features left after feature removal is presented as input to the classification algorithm. Advantages of filter techniques are that they easily scale to high dimensional datasets are computationally simple and fast, and as the filter approach is independent of the mining algorithm so feature selection needs to be performed only once, and then different classifiers can be evaluated.