The UCI KDD Archive of Large Data Sets for Data Mining Research and Experimentation

Stephen D. Bay, Dennis Kibler, Michael J. Pazzani, and Padhraic Smyth

Department of Information and Computer Science
University of California, Irvine
Irvine, CA 92697, USA
{sbay, kibler, pazzani, smyth}@ics.uci.edu

Advances in data collection and storage have allowed organizations to create massive, complex, and heterogeneous databases, which have stymied traditional methods of data analysis. This has led to the development of new analytical tools that often combine techniques from a variety of fields, such as statistics, computer science, and mathematics, to extract meaningful knowledge from the data. To support research in this area, UC Irvine has created the UCI Knowledge Discovery in Databases (KDD) Archive (http://kdd.ics.uci.edu), a new online archive of large and complex data sets that encompasses a wide variety of data types, analysis tasks, and application areas. This article describes the objectives and philosophy of the UCI KDD Archive. We draw parallels with the development of the UCI Machine Learning Repository and its effect on the Machine Learning community.