Machine Learning
A concept class $\mathcal{C}$ is **PAC-learnable** if there exists an algorithm $\mathcal{A}$ such that, given the setup
- Data distribution: $\mathcal{D}$ over the input space $\mathcal{X}$
- i.i.d. data: a sample $S = (x_1, \dots, x_m)$ with $x_i \sim \mathcal{D}$ i.i.d.
- Deterministic target concept: $c \in \mathcal{C}$, $c : \mathcal{X} \to \mathcal{Y}$ measurable
- Hypothesis set: $\mathcal{H}$

$\mathcal{A}$ returns a hypothesis $h_S \in \mathcal{H}$ that approximates $c$, in the sense that

$$\mathbb{P}_{S \sim \mathcal{D}^m}\left[ R(h_S) \leq \epsilon \right] \geq 1 - \delta$$

for every $\epsilon, \delta > 0$ and every $\mathcal{D}$, where $\mathcal{D}^m$ is the product measure of $m$ (identical) samples and

$$R(h) = \mathbb{P}_{x \sim \mathcal{D}}\left[ h(x) \neq c(x) \right]$$

is the generalization error. If the generalization error is instead measured w.r.t. a data distribution $\mathcal{D}$ that is a joint distribution over $\mathcal{X} \times \mathcal{Y}$, so that $R(h) = \mathbb{P}_{(x, y) \sim \mathcal{D}}\left[ h(x) \neq y \right]$, the same condition defines PAC-learnability in the stochastic setting. Given an error tolerance $\epsilon$ (i.e. "approximately correct") and confidence level $1 - \delta$ (i.e. "probably"), the algorithm returns such a hypothesis given a large enough sample $m = \mathrm{poly}(1/\epsilon, 1/\delta)$. As the number of samples is a lower bound on the computational complexity of $\mathcal{A}$, we call $\mathcal{C}$ efficiently PAC-learnable if $\mathcal{A}$ runs in $\mathrm{poly}(1/\epsilon, 1/\delta)$ time.
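A minimal numerical sketch of this definition, under assumptions chosen purely for illustration: the concept class of thresholds $c_\theta(x) = \mathbb{1}[x \geq \theta]$ on $[0, 1]$, $\mathcal{D} = \mathrm{Uniform}(0, 1)$, and a learner that returns the smallest positively-labelled sample point (a consistent hypothesis). Repeating trials estimates $\mathbb{P}_{S \sim \mathcal{D}^m}[R(h_S) \leq \epsilon]$, which should exceed $1 - \delta$ once $m$ is large enough.

```python
import random

# Illustrative sketch (not from the source): PAC-learning thresholds
# c_theta(x) = 1[x >= theta] on [0, 1] with D = Uniform(0, 1).
# theta, m, eps, and the trial count are all assumed values.

def learn_threshold(sample):
    """Return a consistent hypothesis: the smallest positively-labelled point."""
    positives = [x for x, y in sample if y == 1]
    return min(positives) if positives else 1.0

def trial(theta, m, rng):
    """Draw an i.i.d. sample of size m, learn, return the true error R(h_S)."""
    sample = [(x, int(x >= theta)) for x in (rng.random() for _ in range(m))]
    t = learn_threshold(sample)
    # 1[x >= t] and 1[x >= theta] disagree exactly on the interval between
    # t and theta, so under Uniform(0, 1) the generalization error is |t - theta|.
    return abs(t - theta)

def success_rate(theta=0.3, m=200, eps=0.05, trials=2000, seed=0):
    """Estimate P_{S ~ D^m}[R(h_S) <= eps] over repeated independent trials."""
    rng = random.Random(seed)
    good = sum(trial(theta, m, rng) <= eps for _ in range(trials))
    return good / trials

rate = success_rate()
print(rate)  # close to 1 for m = 200: "probably approximately correct"
```

Here $\mathbb{P}[R(h_S) > \epsilon]$ is the chance that no sample lands in $[\theta, \theta + \epsilon]$, i.e. $(1 - \epsilon)^m$, so the success rate approaches 1 exponentially fast in $m$.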
Finite hypothesis set
Consider a finite hypothesis set $\mathcal{H}$, $|\mathcal{H}| < \infty$.
- Consistent: $\hat{R}_S(h_S) = 0$, i.e. $h_S$ makes no errors on the training sample $S$.

If $\mathcal{A}$ returns, for the sample $S$, a consistent hypothesis $h_S$, then the following PAC-learning bound holds:

$$\mathbb{P}_{S \sim \mathcal{D}^m}\left[ \hat{R}_S(h_S) = 0 \,\wedge\, R(h_S) > \epsilon \right] \leq |\mathcal{H}|\, e^{-m\epsilon}$$

Giving generalization bound: with probability at least $1 - \delta$,

$$R(h_S) \leq \frac{1}{m}\left( \log|\mathcal{H}| + \log\frac{1}{\delta} \right)$$

and so PAC-learning holds with sample complexity:

$$m \geq \frac{1}{\epsilon}\left( \log|\mathcal{H}| + \log\frac{1}{\delta} \right)$$

- Inconsistent: $\hat{R}_S(h_S) > 0$, i.e. no hypothesis in $\mathcal{H}$ is consistent with $S$.

Where by Hoeffding's inequality, combined with a union bound over $\mathcal{H}$,

$$\mathbb{P}_{S \sim \mathcal{D}^m}\left[ \sup_{h \in \mathcal{H}} \left| R(h) - \hat{R}_S(h) \right| > \epsilon \right] \leq 2|\mathcal{H}|\, e^{-2m\epsilon^2}$$

Giving generalization bound: with probability at least $1 - \delta$, for all $h \in \mathcal{H}$,

$$R(h) \leq \hat{R}_S(h) + \sqrt{\frac{\log|\mathcal{H}| + \log\frac{2}{\delta}}{2m}}$$
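The two bounds above are easy to evaluate numerically. A small sketch, where the values of $|\mathcal{H}|$, $\epsilon$, $\delta$, and $m$ are arbitrary assumptions for illustration:

```python
import math

# Illustrative sketch: plugging assumed values of |H|, eps, delta, m
# into the finite-hypothesis-set bounds for the two cases.

def sample_complexity_consistent(H_size, eps, delta):
    """Consistent case: m >= (1/eps) * (log|H| + log(1/delta))."""
    return math.ceil((math.log(H_size) + math.log(1 / delta)) / eps)

def gap_inconsistent(H_size, delta, m):
    """Inconsistent case: uniform gap sqrt((log|H| + log(2/delta)) / (2m))."""
    return math.sqrt((math.log(H_size) + math.log(2 / delta)) / (2 * m))

# Assumed example: |H| = 2^10 (e.g. boolean functions over 10 fixed inputs).
m_needed = sample_complexity_consistent(H_size=2**10, eps=0.05, delta=0.01)
gap = gap_inconsistent(H_size=2**10, delta=0.01, m=10_000)
print(m_needed, gap)
```

Note the contrast the bounds encode: the consistent case needs $m$ scaling like $1/\epsilon$, while in the inconsistent case the gap shrinks only like $1/\sqrt{m}$, so halving the gap costs four times the data.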