The Independent Sign Bias: Gaining Insight from Multiple Linear Regression

Michael J. Pazzani and Stephen D. Bay
Department of Information and Computer Science
University of California, Irvine
Irvine, CA 92697, USA
{pazzani, sbay}@ics.uci.edu

Abstract

As electronic data becomes widely available, the need for tools that help people gain insight from data has arisen. A variety of techniques from statistics, machine learning, and neural networks have been applied to databases in the hope of mining knowledge from data. Multiple regression is one such method for modeling the relationship between a set of explanatory variables and a dependent variable by fitting a linear equation to observed data. Here, we investigate and discuss some factors that influence whether the resulting regression equation is a credible model of the data.

