learning theory

Questions

Explain what the VC dimension of a hypothesis space is and what it is used for. (6)

As background, we first define:

  • A dichotomy of a set $S$ is a partition of $S$ into two disjoint subsets
  • A set of instances $S$ is shattered by the hypothesis space $H$ if and only if for every dichotomy of $S$ there exists some hypothesis $h \in H$ consistent with that dichotomy

The Vapnik-Chervonenkis dimension, $VC(H)$, of hypothesis space $H$ defined over instance space $X$ is the size of the largest finite subset of $X$ shattered by $H$. If arbitrarily large finite subsets of $X$ can be shattered by $H$, then $VC(H) \equiv \infty$.
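To make shattering concrete, here is a minimal brute-force check in Python (the `shatters` helper, the grid of thresholds, and the test points are all illustrative assumptions, not from the source). Threshold classifiers on the real line can shatter any single point but no pair of points, so their VC dimension is 1.

```python
def shatters(hypotheses, points):
    # A finite set of points is shattered iff the hypotheses realize
    # all 2^|points| dichotomies (labelings in {0, 1}) of the points.
    dichotomies = {tuple(h(x) for x in points) for h in hypotheses}
    return len(dichotomies) == 2 ** len(points)

# Hypothetical finite grid of threshold classifiers h_t(x) = 1 iff x >= t.
thresholds = [t / 10 for t in range(-10, 21)]        # 31 hypotheses
H = [lambda x, t=t: int(x >= t) for t in thresholds]

print(shatters(H, [0.55]))        # True: a single point is shattered
print(shatters(H, [0.35, 0.75]))  # False: the labeling (1, 0) is not
                                  # realizable by any threshold, so VC(H) = 1
```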

The VC dimension $VC(H)$ is used to construct a probably approximately correct (PAC) bound on the true error given the training error of a hypothesis $h \in H$, even when $H$ is infinite (e.g. continuously parameterized). In particular the bound is: $$ L_{true}(h) \le L_{train}(h) + \sqrt{\frac{VC(H)(\ln \frac{2N}{VC(H)} +1)+ \ln \frac{4}{\delta}}{N}} $$ This means that with probability $1-\delta$ the gap between the true error $L_{true}$ and the training error $L_{train}$ of $h \in H$ is at most a quantity which depends on (a numerical sketch follows the list below):

  • $VC(H)$, the VC dimension of the hypothesis space
  • $N$, the number of training samples
  • $\delta$, the confidence parameter: the bound holds with probability $1-\delta$
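As a numerical illustration of the bound, a minimal Python sketch (the function name and the specific numbers are assumptions for illustration, not from the source):

```python
import math

def vc_bound(train_error, vc_dim, n, delta):
    """Upper bound on L_true(h) that holds with probability 1 - delta,
    per the VC generalization bound stated above."""
    gap = math.sqrt((vc_dim * (math.log(2 * n / vc_dim) + 1)
                     + math.log(4 / delta)) / n)
    return train_error + gap

# With illustrative numbers VC(H) = 10, N = 10,000, delta = 0.05,
# the true error is bounded by roughly 0.05 + 0.095 = 0.145.
print(vc_bound(train_error=0.05, vc_dim=10, n=10_000, delta=0.05))
```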

The lower bound on the number of training samples $N$ needed to guarantee error at most $\epsilon$ with probability $1-\delta$ is: $$ N \ge \frac{1}{\epsilon} \left(4 \log_2 \frac{2}{\delta} + 8\, VC(H) \log_2 \frac{13}{\epsilon}\right) $$
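A short sketch of the corresponding sample-size computation (the function name and input values are illustrative assumptions):

```python
import math

def sample_complexity(epsilon, delta, vc_dim):
    """Smallest N satisfying the sample-complexity bound above."""
    n = (4 * math.log2(2 / delta)
         + 8 * vc_dim * math.log2(13 / epsilon)) / epsilon
    return math.ceil(n)

# E.g. epsilon = 0.1, delta = 0.05, VC(H) = 10 requires N >= 5831 samples.
print(sample_complexity(epsilon=0.1, delta=0.05, vc_dim=10))
```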

Some other important properties of VC dimension are:

  • The VC dimension of a finite hypothesis space is bounded: $VC(H) \le \log_2(|H|)$, since shattering $d$ points requires at least $2^d$ distinct hypotheses
  • A concept class $C$ with $VC(C) = \infty$ is not PAC-learnable.