### Abstract:

In Computational Learning Theory, a binary concept is often represented by a set
of pairs (x, l), where x varies over a set of instances known as a universe or instance
space, and l is a label, either "+" or "-", indicating whether or not x belongs to
the concept. This thesis will consider a type of machine learning task known as
supervised learning, where an algorithm M is fed with a set of labelled data for a
target concept belonging to a given class of concepts, and M produces a hypothesis
function that is used to accurately predict the correct labels of unseen instances.
The learning process typically involves two agents, a teacher and a learner. The
task of a teacher is to provide labelled examples for a target concept to a learner,
who in turn makes a hypothesis based on the seen data. The present thesis focusses
on the teacher's role; specifically, on the question: given any concept class 𝒞 and a
learning algorithm M, what is the minimum number of labelled examples needed
by M to exactly identify any given concept belonging to 𝒞? This quantity may be
conceived as a measure of the information complexity, or teaching complexity, of 𝒞.
Our work will study the teaching complexity of three related families of concept
classes: linear sets, the non-erasing pattern languages, and the erasing pattern
languages. Linear sets and pattern languages are mathematical objects that are
closely connected to automata theory and formal languages.
In this thesis, the teaching complexity of a concept class will be measured mainly
by two combinatorial parameters, the teaching dimension (TD) and the recursive
teaching dimension (RTD). The TD of a concept C with respect to a class 𝒞 containing
C is defined as the size of a smallest sample (called a teaching set for C with
respect to 𝒞) that is labelled consistently with C but not with any C′ in 𝒞 distinct
from C; the TD of 𝒞 is the worst-case TD over all C in 𝒞 (or infinity if the set
consisting of the teaching dimensions of all C in 𝒞 with respect to 𝒞 is unbounded).
Concerning the TD, we are chiefly interested in two questions with respect to any
given concept class 𝒞: first, is there a decidable characterization of the concepts in 𝒞
that have finite TD with respect to 𝒞; second, given a representation for concepts in
𝒞, how does the TD of a concept C with respect to 𝒞 vary with the size of the
representation of C?
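To make the definition of the TD concrete, the sketch below computes it by brute force for a finite concept class over a finite universe. The class and names are purely illustrative; the classes studied in the thesis (linear sets, pattern languages) are infinite, so no such enumeration applies there.

```python
# Brute-force teaching dimension (TD) for a finite concept class.
# A concept is modelled as a frozenset of instances; a teaching set for C is a
# smallest labelled sample consistent with C but with no other concept in the class.
from itertools import combinations

def consistent(concept, sample):
    """True iff every labelled example (x, l) in sample agrees with concept."""
    return all((x in concept) == l for x, l in sample)

def td_of_concept(concept, concept_class, universe):
    """Size of a smallest teaching set for `concept` w.r.t. `concept_class`."""
    others = [c for c in concept_class if c != concept]
    for k in range(len(universe) + 1):
        for xs in combinations(universe, k):
            sample = [(x, x in concept) for x in xs]
            if not any(consistent(c, sample) for c in others):
                return k

def td(concept_class, universe):
    """Worst-case TD over all concepts in the class."""
    return max(td_of_concept(c, concept_class, universe) for c in concept_class)

universe = [0, 1, 2]
# Illustrative class: the empty set plus all singletons over {0, 1, 2}.
cls = [frozenset()] + [frozenset({x}) for x in universe]
print(td(cls, universe))  # the empty set needs all three negative examples
```

Each singleton {x} is taught by the single positive example (x, +), but teaching the empty set requires a negative example for every instance, so the TD of this class is 3.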
The second main parameter studied in this work, the recursive teaching dimension
(RTD), arises from the recently introduced recursive teaching model, and it is
of particular interest in learning theory due to its links to other learning models as
well as to the Vapnik-Chervonenkis dimension (one of the most important parameters
in Statistical Learning Theory) in the context of finite concept classes. Our
main finding about the RTD in this thesis is that for many classes of linear sets
(resp. pattern languages), recursive teaching is significantly more sample efficient
than the classical teaching protocol.
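The sample-efficiency gap can be illustrated on a finite class, assuming the standard definition of the recursive teaching model (not restated in the abstract): repeatedly remove from the class every concept whose TD with respect to the remaining concepts is minimal; the RTD is the largest of these minima. The class and names below are illustrative only.

```python
# Self-contained sketch: recursive teaching dimension (RTD) for a finite class,
# under the standard layered definition of the recursive teaching model.
from itertools import combinations

def td_of_concept(concept, concept_class, universe):
    """Smallest labelled sample distinguishing `concept` from the rest of the
    class (brute force; finite classes only)."""
    others = [c for c in concept_class if c != concept]
    for k in range(len(universe) + 1):
        for xs in combinations(universe, k):
            sample = [(x, x in concept) for x in xs]
            if not any(all((x in c) == l for x, l in sample) for c in others):
                return k

def rtd(concept_class, universe):
    """Peel off the easiest-to-teach concepts layer by layer; the RTD is the
    worst TD needed across all layers."""
    remaining, worst = list(concept_class), 0
    while remaining:
        tds = {c: td_of_concept(c, remaining, universe) for c in remaining}
        d = min(tds.values())
        worst = max(worst, d)
        remaining = [c for c in remaining if tds[c] > d]
    return worst

universe = [0, 1, 2]
cls = [frozenset()] + [frozenset({x}) for x in universe]
# Classical TD of this class is 3, since the empty set needs all three
# negative examples; recursively, the singletons are taught first (one
# example each), after which the empty set is the only concept left.
print(rtd(cls, universe))
```

Here the RTD is 1 versus a classical TD of 3, a finite-class analogue of the kind of gap the thesis establishes for classes of linear sets and pattern languages.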

### Description:

A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science, University of Regina. vii, 194 p.