Integrating Ontology-Based Information Extraction Systems and Spatial Modeling for Land Use Analysis and Simulation
Abstract
Information Extraction (IE) is the automatic identification and extraction of a predefined set of concepts relevant to a specific domain of knowledge, while ignoring irrelevant information. IE thereby converts unstructured text into structured data. IE is to some extent domain-specific, and the information to be extracted must be specified a priori. In the context of IE, an ontology is used to structure and represent the domain knowledge. The IE system extracts information with respect to the domain ontology and then populates the ontology with the extracted information. This process is referred to as Ontology-Based Information Extraction (OBIE).
Many jurisdictions apply land use suitability analysis (LUSA) for land use planning. LUSA in this context is used to assess the appropriateness of a specific area of land for a particular kind of use. Each jurisdiction has its own regulations and policies that are applied to assess land use suitability for that geographic area. The regulations provide the criteria for the factors to be included in the multi-criteria analysis applied to determine the suitability value for each location for a particular land use. Manually finding and extracting the criteria and their specific values can be tedious and time-consuming.
The objective of this work is to build an OBIE system for the domain of land use suitability analysis. The proposed system combines an ontology, domain-specific gazetteer lists, language processing tools and extraction rules based on regular expressions to automatically add semantic annotations to domain documents (such as regulations and bylaws) and then extract the criteria to be applied in the LUSA process. The knowledge engineering approach was followed to build the LUSA OBIE system. We built an ontology specific to LUSA criteria and created domain-specific gazetteer lists (with terminology specific to land use suitability) and extraction rules.
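As a rough illustration of how a gazetteer list and a regular-expression rule can work together, the following minimal sketch extracts distance criteria from a regulation-style sentence. The gazetteer entries, the ontology class names, the rule pattern and the sample sentence are all hypothetical and are not taken from the LUSA OBIE system; the ontology and the language processing tools mentioned above are not modelled here.

```python
import re

# Hypothetical gazetteer: surface terms mapped to assumed ontology class names.
GAZETTEER = {
    "floodplain": "Floodplain",
    "landfill": "Landfill",
    "railway": "Railway",
}

# Hypothetical extraction rule:
#   <number> <unit> (from|of) [a|an|the] <gazetteer term>
TERMS = "|".join(re.escape(term) for term in GAZETTEER)
RULE = re.compile(
    rf"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>m|km|metres|meters)\s+"
    rf"(?:from|of)\s+(?:an?\s+|the\s+)?(?P<term>{TERMS})",
    re.IGNORECASE,
)


def extract_criteria(text):
    """Return one annotation per rule match, linked to a gazetteer concept."""
    annotations = []
    for match in RULE.finditer(text):
        term = match.group("term").lower()
        annotations.append({
            "concept": GAZETTEER[term],           # assumed ontology class
            "value": float(match.group("value")),
            "unit": match.group("unit").lower(),
            "span": match.span(),                 # character offsets in text
        })
    return annotations


# Invented sample sentence in the style of a zoning bylaw.
sample = ("No residential lot shall be located within 500 m of a landfill "
          "or 30 metres from the railway.")
results = extract_criteria(sample)
for annotation in results:
    print(annotation)
```

Each annotation keeps the character span of the match, so the same result could be used either to annotate the source document or to create an instance in an ontology.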
Our proposed OBIE system covers a domain that has not so far been investigated by researchers, and therefore few web resources, such as ontologies and domain-specific lexicons, were available. This LUSA domain-specific OBIE system makes a significant start towards further investigation and development of resources and tools that will assist in land use suitability analysis and related GIS domains.
The OBIE system not only identifies the type of the extracted entities but also links them to their semantic descriptions in the ontology. In addition, the output of the OBIE system is constructed from elements of the ontology, ensuring that the knowledge is captured and represented with respect to the domain model.
The output of the LUSA OBIE system can be presented as an ontology populated with instances of the extracted criteria and property values, or as a set of documents semantically annotated with ontology knowledge. The populated LUSA ontology can also be exported to a database or a knowledge base, or saved as an XML file. The ontology or the knowledge base can then be queried semantically or used to perform automatic reasoning.
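To make the idea of a populated ontology that can be queried and saved as XML more concrete, here is a minimal sketch that uses only the Python standard library. It stores instances as subject-property-value triples, which is an assumption made for illustration; the instance names, property names and XML layout are hypothetical and do not reflect the actual LUSA ontology or its export format.

```python
import xml.etree.ElementTree as ET

# Hypothetical extracted instances, stored as (subject, property, value) triples.
triples = [
    ("criterion_1", "instanceOf", "DistanceCriterion"),
    ("criterion_1", "appliesTo", "Landfill"),
    ("criterion_1", "hasMinDistance", "500"),
    ("criterion_1", "hasUnit", "m"),
]


def query(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def to_xml(store):
    """Serialize the triples as XML, grouping properties under each instance."""
    root = ET.Element("ontology")
    by_subject = {}
    for s, p, o in store:
        by_subject.setdefault(s, []).append((p, o))
    for s, props in by_subject.items():
        instance = ET.SubElement(root, "instance", id=s)
        for p, o in props:
            ET.SubElement(instance, p).text = o
    return ET.tostring(root, encoding="unicode")


# Which concept does criterion_1 apply to?
print(query(triples, s="criterion_1", p="appliesTo"))
xml_text = to_xml(triples)
print(xml_text)
```

A dedicated ontology or RDF toolkit would normally replace the list of tuples, and would add the reasoning support that this sketch leaves out.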
The output from the LUSA OBIE system is applied here to help produce a land use suitability map for the City of Regina, Saskatchewan, to assist in the identification of suitable areas for residential development. The resulting maps can serve as input to simulation models, such as cellular automata (e.g., for the prediction of urban growth), or can be used in a decision-making process, e.g., to assess an application for a residential subdivision.
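The step from extracted criteria to a suitability map can be sketched as a simple raster overlay. The example below assumes a weighted linear combination of factor scores together with a binary constraint mask, which is one common form of multi-criteria analysis; the text does not state which method was used, and the factor names, weights and grid values are invented for illustration.

```python
def suitability_map(factors, weights, constraint):
    """Combine factor grids into one suitability grid.

    factors:    dict of name -> 2D list of scores in [0, 1]
    weights:    dict of name -> relative weight (normalized internally)
    constraint: 2D list of 0/1; cells marked 0 are excluded (score 0.0)
    """
    total_weight = sum(weights.values())
    rows, cols = len(constraint), len(constraint[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if constraint[r][c]:
                weighted = sum(weights[n] * factors[n][r][c] for n in factors)
                result[r][c] = weighted / total_weight
    return result


# Invented 2 x 2 example grids.
factors = {
    "slope": [[1.0, 0.5], [0.0, 1.0]],
    "road_access": [[1.0, 1.0], [0.5, 0.0]],
}
weights = {"slope": 1.0, "road_access": 3.0}
# Hypothetical constraint, e.g. a cell inside a required setback is excluded.
constraint = [[1, 1], [0, 1]]

result = suitability_map(factors, weights, constraint)
for row in result:
    print(row)
```

In this arrangement, extracted threshold values (such as a minimum distance) would determine the constraint mask, while the weights remain a planning decision supplied by the analyst.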