Abstract:
In computational learning theory, concepts are subsets of a set of instances, and a concept class is a set of concepts. In many computational learning models, the goal of a learning algorithm is to identify the target concept in a concept class from a small number of examples, i.e., labelled instances.
We study graph-theoretic representations of concept classes and their connections to a particular notion of learning complexity, namely teaching complexity. Teaching complexity measures the number of training examples needed in teaching models, in which a helpful teacher provides examples to the learner. One type of graph that we study in this thesis is the one-inclusion graph: its vertices are concepts, and there is an edge between two concepts if they disagree on exactly one instance. One-inclusion graphs have proven useful in analyzing several computational learning properties, e.g., sample compression sizes and expected mistake bounds of prediction algorithms.
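To make the definition concrete, the following minimal Python sketch builds the one-inclusion graph of a hypothetical toy class (four concepts over three instances; the class is an illustration, not an example from the thesis). Each concept is encoded as a 0/1 labelling of the instances, and an edge joins two concepts exactly when they disagree on a single instance.

```python
from itertools import combinations

# Hypothetical toy concept class over three instances x0, x1, x2.
# Each concept is a 0/1 labelling of the instances.
concepts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]

def disagreements(c, d):
    """Instances on which concepts c and d disagree."""
    return [i for i, (a, b) in enumerate(zip(c, d)) if a != b]

# One-inclusion graph: edge iff the concepts disagree on exactly one instance.
edges = [(c, d) for c, d in combinations(concepts, 2)
         if len(disagreements(c, d)) == 1]
```

For this class the graph is a path: (0,0,0)-(1,0,0)-(1,1,0)-(1,1,1).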
This thesis establishes strong connections between one-inclusion graphs and teaching complexity. We show that the teaching dimension of a concept, i.e., the minimum number of examples required for teaching that concept, is lower-bounded by the degree of the corresponding vertex in the one-inclusion graph. Furthermore, we prove that for concepts in shortest-path-closed classes, this lower bound is always attained.
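The degree bound can be checked by brute force on a small class. The sketch below (a hypothetical toy class, not taken from the thesis) computes each concept's teaching dimension, i.e., the size of a smallest set of labelled instances that singles the concept out of the class, and compares it with the vertex degree in the one-inclusion graph; for this path-shaped class the two coincide.

```python
from itertools import combinations

# Hypothetical toy concept class over three instances, encoded as 0/1 tuples.
concepts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]

def degree(c, cls):
    """Degree of c in the one-inclusion graph: neighbours at Hamming distance 1."""
    return sum(1 for d in cls
               if sum(a != b for a, b in zip(c, d)) == 1)

def teaching_dimension(c, cls):
    """Size of a smallest teaching set for c in cls (brute force)."""
    n = len(c)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            # Concepts consistent with c's labels on the instances in S.
            consistent = [d for d in cls if all(d[i] == c[i] for i in S)]
            if consistent == [c]:
                return k

# E.g., (1, 0, 0) needs two labelled instances, matching its degree of 2.
```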
We then prove that the average-case teaching complexity of any shortest-path-closed class is linearly upper-bounded by the Vapnik-Chervonenkis (VC) dimension of the class. As the VC dimension reflects the complexity of learning from randomly chosen examples, our theorem provides evidence that, for shortest-path-closed concept classes, the average difficulty of learning from a helpful teacher is no higher than that of learning from random examples. We also provide the first characterization of shortest-path-closure in terms of minimum teaching sets and the one-inclusion graph.
As one-inclusion graphs provide only partial distinguishing information between concepts, we introduce a new type of graph, the minimal Hamming graph, which provides complete distinguishing information between concepts. In contrast to the one-inclusion graph, the vertex degrees in the minimal Hamming graph provide non-trivial upper bounds on the teaching dimensions of concepts. Furthermore, we define locally orthogonal concept classes, in which the teaching dimension of a concept always meets this upper bound. We also show that locally orthogonal concept classes generalize shortest-path-closed classes. We prove that the minimal Hamming graphs of such classes are triangle-free and K_{2,3}-free. Some interesting sufficient conditions for local orthogonality are also studied.
We show that the graph-theoretic results on teachability are useful for studying sample compression schemes. A universal sample compression scheme, the peeling sample compression scheme, is presented. This compression scheme generalizes some state-of-the-art compression schemes. In addition, we show that the size of this compression scheme is related to teaching complexity if certain graph-theoretic criteria are fulfilled.
Description:
A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. X, 85 l.