Sequential Approaches to Three-Way Data Analytics

Date
2019-04
Authors
Hu, Mengjun
Publisher
Faculty of Graduate Studies and Research, University of Regina
Abstract

Data analytics is a process to discover useful information in data, draw valuable conclusions, and help users make wise decisions. In most cases, the conclusions can be formally represented as decision rules in which the left-hand sides describe conditions and the right-hand sides give decisions. Ideally, given particular conditions and decisions, the decision rules are expected to indicate definite answers of yes or no. However, definite answers may not be easy to give due to incompleteness, uncertainty, and errors. In such situations, forcing a definite answer may result in significant mistakes. Instead of bipolar options, a theory of three-way decision allows a third in-the-middle option. This idea moves us from yes/no to yes/maybe/no, from positive/negative to positive/neutral/negative, and from white/black to white/grey/black. The in-the-middle option appropriately handles uncertainty, which makes decisions more flexible and reliable. Three-way data analytics applies thinking in threes to its two stages, namely, data preparation and data analysis. For data preparation, we may have three-way data and feature selection, three-way data visualization, and many others. For data analysis, thinking in threes can be applied on various occasions, such as three-way classification, three-way clustering, and three-way recommendation. This thesis investigates sequential approaches to three-way data analytics, particularly regarding classification-related problems. I start from a specific model of three-way classification with rough set theory, where structured approximations are proposed to learn three-way classification rules. Acceptance and rejection rules classify positive and negative instances of a class, respectively. Non-commitment rules allow users to make no definite decision in the case of insufficient information. The model is studied in a qualitative setting with both complete and incomplete information.
Moving from qualitative to quantitative, I investigate the properties of quantitative subsethood measures in formulating certain quantitative rough set models. Going beyond rough set theory, I propose a general framework of sequential three-way classification where non-commitment rules are subsequently refined. The framework is examined with four specific modes, which demonstrates its usefulness in modelling uncertainty under different circumstances. From complete information to incomplete and uncertain information, the proposed approaches provide meaningful, practical, and reliable ways to draw valuable conclusions from data.
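
As an illustration only, the idea of acceptance, rejection, and non-commitment with sequential refinement can be sketched in Python using the standard qualitative rough set regions (positive, negative, boundary). This is a minimal sketch under stated assumptions, not the thesis's own algorithm: the function names `three_way_regions` and `sequential_three_way`, the attribute names `a1` and `a2`, and the toy objects are all hypothetical, and the refinement strategy (adding attributes level by level and re-examining only the boundary) is one simple choice among many.

```python
from collections import defaultdict


def three_way_regions(objects, attributes, target):
    """Split objects into (accept, reject, non-commit) sets for `target`.

    objects:    dict mapping object id -> dict of attribute values
    attributes: attribute names used to form equivalence classes
    target:     set of object ids that belong to the class
    """
    # Group objects that are indistinguishable on the chosen attributes.
    blocks = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[a] for a in attributes)
        blocks[key].add(obj_id)

    pos, neg, bnd = set(), set(), set()
    for block in blocks.values():
        if block <= target:              # every member is in the class
            pos |= block                 # acceptance
        elif block.isdisjoint(target):   # no member is in the class
            neg |= block                 # rejection
        else:                            # mixed evidence
            bnd |= block                 # non-commitment
    return pos, neg, bnd


def sequential_three_way(objects, attribute_levels, target):
    """Refine the non-commitment region by adding attributes level by level."""
    pos, neg = set(), set()
    pending = dict(objects)   # objects without a definite decision so far
    used = []                 # attributes accumulated across levels
    for level in attribute_levels:
        used.extend(level)
        p, n, b = three_way_regions(pending, used, target)
        pos |= p
        neg |= n
        pending = {obj_id: pending[obj_id] for obj_id in b}
        if not pending:
            break
    return pos, neg, set(pending)


# Hypothetical toy data: five objects described by two attributes.
objects = {
    "o1": {"a1": 1, "a2": 1},
    "o2": {"a1": 1, "a2": 0},
    "o3": {"a1": 0, "a2": 0},
    "o4": {"a1": 0, "a2": 0},
    "o5": {"a1": 1, "a2": 1},
}
target = {"o1", "o3", "o4"}

pos, neg, bnd = sequential_three_way(objects, [["a1"], ["a2"]], target)
```

In this toy run, the first level (attribute `a1` alone) accepts `o3` and `o4` and defers the rest; the second level (adding `a2`) rejects `o2`, while `o1` and `o5` remain in the non-commitment region because they are indistinguishable on the available attributes yet differ in class membership.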

Description
A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science, University of Regina. xiv, 217 p.