Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD),[1] an interdisciplinary subfield of computer science,[2][3][4] is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.[2] The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.[2] Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[2]

The term is a misnomer, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction of data itself.[5] It is also a buzzword[6] and is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as to any application of computer decision support systems, including artificial intelligence, machine learning, and business intelligence. The popular book "Data mining: Practical machine learning tools and techniques with Java"[7] (which covers mostly machine learning material) was originally to be named just "Practical machine learning", and the term "data mining" was only added for marketing reasons.[8] Often the more general terms "(large-scale) data analysis" or "analytics" are more appropriate, or, when referring to actual methods, "artificial intelligence" and "machine learning".

