Imperfect labels and imperfect classifiers

Aalborg University

Galadrielle Humblot-Renaux

Visual Analysis and Perception lab, Aalborg University, Denmark

Supervisors: Thomas B. Moeslund and Sergio Escalera

📌 Abstract

In supervised deep learning for classification, large sets of labeled examples are used to optimize a neural network’s weights. By penalizing predictions that do not align with the corresponding label, we incentivize the model to learn associations between an input’s contents and its correct class. If these learned associations are general enough, the model can then be used to automatically classify new inputs after training.

However, the data used during training is inherently incomplete, as it only represents a subset of the input space. It also may not be entirely reliable: some (input, label) pairs may be incorrect or uncertain. Meanwhile, the data seen during testing can be unpredictable as well, and doesn’t necessarily align with the model’s “comfort zone”. And what if the input contains a class that is unknown to the model altogether? Together, these factors can contribute to the model failing to correctly classify inputs at test time.

In this thesis, we focus on imperfections and ambiguity in classification, both from the perspective of data and from the perspective of model predictions. On the prediction side, we study the problem of detecting misclassifications and out-of-distribution (OOD) inputs, and propose simple but effective baselines based on ensembling. On the data side, we explore the potential of soft labeling when using cheaply annotated data, and measure multi-annotator disagreement in an underrepresented domain. We also investigate the intersection of label noise and OOD detection, which has received little attention despite its practical relevance.

All in all, this thesis presents novel labeling approaches for 2D and 3D scene understanding and fresh critical perspectives on OOD detection. Beyond general computer vision, this work also makes contributions to two specialized domains: benthic habitat mapping and power system stability assessment. We hope these contributions will encourage future applications of deep learning to consider imperfect labels and imperfect classification.

Funding

The PhD was supported by the Danish Data Science Academy, which is funded by the Novo Nordisk Foundation (NNF21SA0069429) and VILLUM FONDEN (40516).