Convex Analysis, Machine Learning, and Statistics

Functional Margin

Given a hypothesis $h$ for binary classification with labels $y \in \{-1, +1\}$, the margin, functional margin, or confidence margin of a datapoint $(x, y)$ is the quantity $$\rho_h(x, y) = y\, h(x).$$ When $y\, h(x) > 0$, then $\operatorname{sign}(h(x)) = y$ and the magnitude $|h(x)|$ can be interpreted as the confidence of the prediction. Many common classification losses can be written as a non-increasing function $\Phi$ of the margin, $\Phi(y\, h(x))$; furthermore, the $\rho$-margin loss is defined to penalize hypotheses for having a small margin: $$\Phi_\rho(t) = \min\left(1,\, \max\left(0,\, 1 - \tfrac{t}{\rho}\right)\right).$$ One can view this loss as an approximation to (in fact an upper bound on) the indicator loss $1_{y h(x) \le 0}$, and due to the fact that $\Phi_\rho$ is $\tfrac{1}{\rho}$-Lipschitz, one has the following generalization bound as a result of Talagrand's Contraction Lemma: with probability at least $1 - \delta$ over a sample $S$ of size $m$, for all $h \in H$, $$R(h) \le \widehat{R}_{S,\rho}(h) + \frac{2}{\rho}\,\mathfrak{R}_m(H) + \sqrt{\frac{\log(1/\delta)}{2m}},$$ where $\mathfrak{R}_m(H)$ is the Rademacher complexity of $H$ and $\widehat{R}_{S,\rho}(h)$ is the empirical margin risk. For multiclass classification with scoring function $h(x, y)$, the margin is $$\rho_h(x, y) = h(x, y) - \max_{y' \ne y} h(x, y'),$$ so that when $\rho_h(x, y) < 0$ the point $x$ is misclassified, and the empirical margin risk $\widehat{R}_{S,\rho}(h)$ denotes the fraction of data points with margin less than $\rho$. This gives the following generalization bound: with probability at least $1 - \delta$, for all $h \in H$, $$R(h) \le \widehat{R}_{S,\rho}(h) + \frac{4k}{\rho}\,\mathfrak{R}_m(\Pi_1(H)) + \sqrt{\frac{\log(1/\delta)}{2m}},$$ where $k$ is the number of classes and $\Pi_1(H) = \{x \mapsto h(x, y) : y \in \mathcal{Y},\, h \in H\}$.
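As a concrete illustration (a minimal sketch, not from the source; the function names are my own), the $\rho$-margin loss $\Phi_\rho$ and the empirical margin risk can be computed directly from the margins $t = y\, h(x)$:

```python
def margin_loss(t, rho):
    """rho-margin loss: 1 for t <= 0, linear on [0, rho], 0 for t >= rho."""
    return min(1.0, max(0.0, 1.0 - t / rho))

def empirical_margin_risk(margins, rho):
    """Fraction of data points whose margin y * h(x) is less than rho."""
    return sum(1 for t in margins if t < rho) / len(margins)

# Hypothetical binary margins t = y * h(x) for four points:
margins = [2.0, 0.5, -1.0, 1.5]
print([margin_loss(t, rho=1.0) for t in margins])  # [0.0, 0.5, 1.0, 0.0]
print(empirical_margin_risk(margins, rho=1.0))     # 0.5 (two of four margins < 1)
```

Note that the loss is 1 on the misclassified point ($t = -1$), 0 on points with margin at least $\rho$, and interpolates in between, which is exactly why it upper bounds the indicator loss.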

Geometric Margin

Given a linear classifier $h(x) = w \cdot x + b$, the geometric margin of a point $x$ is its Euclidean distance to the decision boundary, $$\rho_h(x) = \frac{|w \cdot x + b|}{\|w\|},$$ and given a set of linearly separable data points $S = (x_1, \ldots, x_m)$, the geometric margin for the sample is the minimum distance of the $x_i$'s from the decision boundary: $$\rho_h = \min_{1 \le i \le m} \frac{|w \cdot x_i + b|}{\|w\|}.$$ Note that in the linearly separable case the geometric margin equals the functional margin $y_i(w \cdot x_i + b)$ divided by $\|w\|$, so whenever $\|w\| \ge 1$ the functional margin for a linear classifier upper bounds the geometric margin.
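The definitions above can be sketched in a few lines of pure Python (the helper names here are my own, chosen for illustration):

```python
import math

def geometric_margin(w, b, x):
    """Euclidean distance of point x from the hyperplane w . x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w

def sample_geometric_margin(w, b, xs):
    """Geometric margin of a sample: minimum distance to the boundary."""
    return min(geometric_margin(w, b, x) for x in xs)

w, b = [3.0, 4.0], 0.0          # ||w|| = 5
xs = [[1.0, 0.0], [0.0, 2.0]]   # functional margins (taking y = +1): 3 and 8
print(sample_geometric_margin(w, b, xs))  # 0.6, i.e. 3 / ||w||
```

The example makes the relationship explicit: the closest point has functional margin 3, and dividing by $\|w\| = 5$ gives the geometric margin 0.6.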