MachineLearningProbabilityTheoryStatistics Given a family of functions , the Rademacher complexity is a measure of the maximum correlation of to random noise, i.e. how well can fit random noise. Given a sample the (data-dependent) empirical Rademacher complexity is where are independent and is measurable. The empirical Rademacher complexity measures how well, on average, correlates with random noise on the data .

The (distribution dependent) Rademacher complexity is the expectation of the empirical Rademacher complexity for a given -dimensional marginal distribution :

  1. Bounds on families of losses: .

    • Classification:
    • Regression:
  2. Concentration inequalities: Let

  3. Generalization bounds: Follow immediately from 1) and 2)

    • Classification:
    • Regression:
  4. Upper bound by the Growth Function (): For \Phi(X_1, \cdots, X_n) := \sup_{f \in \mathcal{F}}( \overline{f(X_1, \cdots, X_n)} - \mathbb{E}[f(X)])\Phi\mathbb{E}_S[\Phi(X)]\frac{\delta}{2}$) and applying the union bound we arrive at the concentration bound for the empirical Rademacher complexity.