Given the following:

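  1. i.i.d. random variables: $x_1, \dots, x_n \in \mathcal{X}$
  2. A target function: $f : \mathcal{X} \to \mathcal{Y}$, where $\mathcal{Y} = \mathbb{R}$ for regression and $\mathcal{Y} = \{-1, +1\}$ for classification
  3. A probability measure: $P$
  4. A loss function: $\ell : \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}_{\ge 0}$

Learning from data is to find a function $h$ that predicts the $y_i$'s and generalizes to unseen pairs $(x, y)$, in the sense that it minimizes a Risk functional or Generalization Error:

$$R(h) = \mathbb{E}\,[\ell(h(x), y)] = \int \ell(h(x), y)\, dP(x, y),$$

where $P(x, y)$ is the distribution function of the joint probability in the stochastic case of noisy labels/measurements. In the deterministic case, where the Bayes error $R^* = 0$, the expectation is the marginal over $x$.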
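The risk functional is an expectation, so it can be approximated by averaging the loss over an i.i.d. sample. The following is a minimal sketch of that idea, not a definitive implementation. The target `f(x) = 2x`, the Uniform(0, 1) input distribution, the squared loss, and the names `empirical_risk`, `h_good`, and `h_bad` are all assumptions made for illustration.

```python
import random


def squared_loss(prediction, target):
    """Squared-error loss, a common choice for regression."""
    return (prediction - target) ** 2


def empirical_risk(h, xs, ys, loss):
    """Average loss of h over a sample: a Monte Carlo estimate of the risk."""
    return sum(loss(h(x), y) for x, y in zip(xs, ys)) / len(xs)


# Assumed setup (hypothetical): deterministic target f(x) = 2x,
# inputs drawn i.i.d. from Uniform(0, 1), no label noise.
rng = random.Random(0)
f = lambda x: 2.0 * x
xs = [rng.random() for _ in range(10_000)]
ys = [f(x) for x in xs]

h_good = lambda x: 2.0 * x  # coincides with the target
h_bad = lambda x: x         # systematically underestimates the target

risk_good = empirical_risk(h_good, xs, ys, squared_loss)
risk_bad = empirical_risk(h_bad, xs, ys, squared_loss)

print(f"estimated risk of h_good: {risk_good:.4f}")
print(f"estimated risk of h_bad:  {risk_bad:.4f}")
```

In this noise-free setup, the hypothesis that coincides with the target has zero risk. For `h_bad` the loss is $x^2$, whose expectation under Uniform(0, 1) is $1/3$, so the estimate should land close to that value.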

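For classification, where $\mathcal{Y} = \{-1, +1\}$, the canonical generalization error is the misclassification rate or, equivalently, the expectation of the indicator of the margin-based loss:

$$R(h) = P\big(y\,h(x) \le 0\big) = \mathbb{E}\,\big[\mathbb{1}\{y\,h(x) \le 0\}\big].$$

The estimation error is the difference between the generalization error of a hypothesis $h$ and the approximation error, taken here as the best-in-class error over a hypothesis class $\mathcal{H}$. It measures how close a particular hypothesis can get to the best-in-class error:

$$R(h) - \inf_{h' \in \mathcal{H}} R(h')$$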
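The equivalence between the misclassification rate and the mean of the margin indicator can be checked on a small sample. This is a sketch under stated assumptions: labels lie in {-1, +1}, the hypothesis outputs a real-valued score whose sign is the predicted class, and no score is exactly zero (a tie would need its own convention). The scores, the labels, and the function names are made up for illustration.

```python
def misclassification_rate(scores, labels):
    """Fraction of points whose predicted sign differs from the label."""
    predictions = [1 if s > 0 else -1 for s in scores]
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)


def margin_indicator_rate(scores, labels):
    """Mean of the indicator 1{y * h(x) <= 0} over the sample."""
    return sum(1 for s, y in zip(scores, labels) if y * s <= 0) / len(labels)


# Hypothetical scores h(x_i) and labels y_i in {-1, +1}; no score is zero.
scores = [2.0, -0.5, 0.3, -1.2, 0.7]
labels = [1, 1, -1, -1, 1]

rate_from_predictions = misclassification_rate(scores, labels)
rate_from_margins = margin_indicator_rate(scores, labels)

print(rate_from_predictions, rate_from_margins)
```

Both functions count the second and third points as errors, since those are the only points where the label and the score disagree in sign, so both return the same rate.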