An Overview of Data Mining
This paper provides an overview of data mining, introduces some of the popular algorithms used to process data, and describes how to apply the CRISP-DM data mining process, covering data collection, data pre-processing, tool selection, performance evaluation, and deployment. CRISP-DM (Cross-Industry Standard Process for Data Mining) is an industry-standard methodology for data mining and predictive analytics (Shearer, 2000) that aims to make such projects more efficient, better organized, more reproducible, more manageable, and more likely to yield business success. Finally, the paper presents data mining approaches that strive to preserve privacy.