An Image Generation Methodology for Game Engines in Real-Time Using Generative Deep Learning Inference Frameworks

Date
2021-01
Authors
Tilson, Adam Richard
Publisher
Faculty of Graduate Studies and Research, University of Regina
Abstract

Modern video games require image assets for many uses, including representing the surface textures of 3D models such as environments, characters, and props. These images are commonly produced with photographic and artistic techniques, including hand-drawing and modifying real-world photographs. This labour-intensive task is typically undertaken by skilled technical artists. Recently, however, a family of unsupervised deep learning methods known as Deep Generative Models has demonstrated the ability, after sufficient training on an appropriate dataset, to create convincing novel counterfeit images that mimic that dataset. This thesis investigates using Deep Generative Models to create image assets for games in real time, directly in the game engine, on common hardware, and compares generation performance across different contexts. The methodology leverages machine learning as a novel alternative or complement to traditional image-development workflows, offering some advantages unique to machine learning models. Before these advantages can be exploited, the methodology must first be validated. This thesis validates it by demonstrating several procedures for deploying and accessing trained models in a game engine, using inference frameworks that run on the CPU via machine learning libraries, via low-level matrix math libraries, and on the GPU via compute shaders. The thesis outlines the usage, advantages, limitations, and trade-offs of each approach. Six inference frameworks are compared in terms of instantiation and generation time when generating counterfeit hand-written digits with various generative models.
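To make the deployment-and-timing comparison concrete, the following Python sketch shows the general shape of the procedure. It is not the thesis's own code: the model file name (generator.onnx), the latent size, and the single-output assumption are all hypothetical, and ONNX Runtime stands in for whichever inference framework is under test. It measures the two quantities the comparison reports, instantiation time and generation time.

import time
import numpy as np
import onnxruntime as ort

LATENT_DIM = 100  # assumed latent-vector size for an MNIST-style generator

# Instantiation: load the trained generator (hypothetical file name).
t0 = time.perf_counter()
session = ort.InferenceSession("generator.onnx")
instantiation_s = time.perf_counter() - t0

input_name = session.get_inputs()[0].name

# Generation: sample a latent vector and run one forward pass.
z = np.random.randn(1, LATENT_DIM).astype(np.float32)
t0 = time.perf_counter()
(image,) = session.run(None, {input_name: z})  # assumes a single output tensor
generation_s = time.perf_counter() - t0

print(f"instantiation: {instantiation_s * 1000:.1f} ms")
print(f"generation:    {generation_s * 1000:.1f} ms")
print("output shape:", image.shape)  # e.g. (1, 1, 28, 28) for 28x28 greyscale

Under this sketch's assumptions, 1 / generation_s gives the samples-per-second figure that the real-time criterion discussed below (more than ten samples per second) is measured against.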

Next, the thesis investigates a more complicated generation task, generating human faces, and compares performance across engines and platforms. Results show that the approach is realistic for real-time generation, by the criterion of producing more than ten samples per second, in different real-world contexts. Additionally, the generators are confirmed to produce repeatable, consistent results across contexts. Finally, the thesis investigates face generation via disentangled models, observing that generated faces change predictably as the latent space is traversed, allowing the model user to fine-tune the generated output. A sweep over the disentanglement hyper-parameter counts how many learned features can be modified independently and predictably, against the cost of reduced reproduction fidelity. The most disentangled model had nine predictably modifiable features.
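The latent-space traversal described above can be sketched in a few lines. The snippet below is illustrative only and assumes a disentangled face decoder exported to ONNX (hypothetical file face_decoder.onnx); the latent size and the index of the swept feature are also assumptions. Varying one latent coordinate while holding the rest fixed should alter one visual attribute if that dimension is disentangled.

import numpy as np
import onnxruntime as ort

LATENT_DIM = 32                     # assumed latent size of the face model
FEATURE = 7                         # hypothetical index of one disentangled feature
STEPS = np.linspace(-3.0, 3.0, 7)   # traverse a few standard deviations

session = ort.InferenceSession("face_decoder.onnx")  # hypothetical model file
input_name = session.get_inputs()[0].name

z = np.random.randn(1, LATENT_DIM).astype(np.float32)
faces = []
for value in STEPS:
    z_mod = z.copy()
    z_mod[0, FEATURE] = value       # sweep a single latent coordinate
    (face,) = session.run(None, {input_name: z_mod})
    faces.append(face)

# Viewing the resulting images side by side shows whether FEATURE controls
# one attribute predictably (disentangled) or several attributes at once.
print(len(faces), "images generated along one latent axis")

Repeating this inspection for every latent dimension, across models trained at different settings of the disentanglement hyper-parameter, is how a count such as "nine predictably modifiable features" can be obtained.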

Description
A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Applied Science in Software Systems Engineering, University of Regina. xii, 136 p.