Complexity Parameters for Learning Multi-Label Concept Classes
In computational learning theory, one way to model a concept is as a member of the Cartesian product of the instances' label domains, where each instance may correspond to a binary or multi-valued domain. A concept class is a set of concepts, and the goal of a learning algorithm is to identify the target concept in a concept class from a small number of examples, i.e., labeled instances. This thesis studies multi-label concept classes and three important learning complexity parameters for such classes. The first parameter examined in this work is the Vapnik-Chervonenkis dimension (VCD) and its previously studied analogues for multi-label concept classes, such as the Graph dimension, Pollard's pseudo-dimension, and the Natarajan dimension.

This thesis also extends the study of sample compression schemes to multi-label concept classes. A sample compression scheme (SCS) for a concept class C compresses every set S of labeled examples for some concept in C to a subset of S, which is decompressed to some concept consistent with S. The size of the SCS is the cardinality of its largest compressed set, and the best possible size of an SCS is the second parameter studied in this thesis. This work formulates a sufficient condition, which we call the reduction property, for a notion of VC-dimension (VCD) to yield labeled compression schemes for maximum classes of VCD d in which the compression sets have size at most d. This compression scheme is in fact an extension of Floyd and Warmuth's binary scheme to the multi-label case. The same condition also yields a so-called tight SCS, which we define to generalize Kuzmin and Warmuth's unlabeled binary scheme to the multi-label case. The Graph dimension satisfies our sufficient condition, while neither Pollard's pseudo-dimension nor the Natarajan dimension does. Moreover, this thesis shows that any class of Graph dimension 1 has an SCS of size 1.
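As a toy illustration (not taken from the thesis), the VC-dimension of a small finite binary concept class can be computed by brute force: a set of instances is shattered if the class realizes every possible labeling of it, and the VCD is the size of a largest shattered set. The function names below are our own; this is a sketch for finite binary classes only, not the multi-label notions discussed above.

```python
from itertools import combinations

# Brute-force VC-dimension of a finite binary concept class.
# A concept is a tuple of {0, 1} labels over a fixed ordering of instances.

def shatters(concepts, subset):
    """True if the class realizes all 2^|subset| label patterns on subset."""
    patterns = {tuple(c[i] for i in subset) for c in concepts}
    return len(patterns) == 2 ** len(subset)

def vc_dimension(concepts):
    """Largest cardinality of an instance subset shattered by the class."""
    concepts = list(concepts)
    n = len(concepts[0])
    d = 0
    for k in range(1, n + 1):
        # Shattering is downward closed, so we can simply record the
        # largest k for which some size-k subset is shattered.
        if any(shatters(concepts, s) for s in combinations(range(n), k)):
            d = k
    return d

# The class of "at most one positive instance" over three instances
# shatters every singleton but no pair, so its VC-dimension is 1.
singletons = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(vc_dimension(singletons))  # -> 1
```

The exponential enumeration is only feasible for tiny classes, but it makes the definition concrete; the Graph and Natarajan dimensions can be checked in a similar brute-force fashion by replacing the shattering test.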
The third parameter studied in this thesis is the recursive teaching dimension (RTD), a complexity parameter of the recently introduced recursive teaching model, which is of interest in the study of binary concept classes because of its connection to the VCD. In the recursive teaching model, the teacher and learner agree to start the learning process with the easiest concept in the class, i.e., a concept with the minimum teaching dimension (the minimum number of labeled examples needed to distinguish a concept from all other concepts in the class). This concept is then removed from the concept class, a concept with the minimum teaching dimension in the remaining class is taught and removed next, and the process continues until no concepts are left in the class. The resulting sequence is called a recursive teaching plan, and the recursive teaching dimension of the class is the largest minimum teaching dimension encountered in the plan. This thesis establishes a further connection between RTD and VCD in both the multi-label and binary cases by providing an upper bound on the size of classes of a given RTD, analogous to Sauer's bound on the size of classes of a given VCD. Maximum (largest possible) classes of a given VCD are proven to be RTD-maximum as well. It is shown that for any VCD-maximum class C where the VCD notion fulfills the reduction property, there is a teaching plan for C in which the recursive teaching sets coincide with the compression sets resulting from a tight compression scheme. Methodologically, an algebraic approach turns out to be useful, for example to prove an interesting graph-theoretic property of VCD-maximum classes.
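The greedy procedure above can be sketched directly for small finite binary classes: compute each concept's teaching dimension within the remaining class, teach and remove a concept achieving the minimum, and track the largest minimum seen. This is a brute-force illustration with names of our choosing, not the thesis's formal treatment.

```python
from itertools import combinations

# Brute-force recursive teaching dimension (RTD) of a finite binary
# concept class, following the greedy plan described above.

def teaching_dimension(c, concepts):
    """Size of a smallest example set distinguishing c from every other concept."""
    others = [d for d in concepts if d != c]
    n = len(c)
    for k in range(n + 1):
        for s in combinations(range(n), k):
            # The examples {(i, c[i]) : i in s} identify c iff every other
            # concept disagrees with c on at least one index in s.
            if all(any(d[i] != c[i] for i in s) for d in others):
                return k
    return n

def recursive_teaching_dimension(concepts):
    remaining = list(concepts)
    rtd = 0
    while remaining:
        tds = [(teaching_dimension(c, remaining), c) for c in remaining]
        m, easiest = min(tds, key=lambda t: t[0])
        rtd = max(rtd, m)  # RTD is the largest minimum TD encountered
        remaining.remove(easiest)  # teach the easiest concept, then recurse
    return rtd

# For the full binary cube over two instances, every concept initially
# needs both examples to be identified, so the RTD is 2.
cube = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(recursive_teaching_dimension(cube))  # -> 2
```

Note how removing taught concepts can shrink later teaching sets: once one concept of the cube is removed, some remaining concepts are identifiable from a single example, which is exactly the effect the recursive model exploits.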