A part of the project: Automated Sewer Inspection Robot

Multi-Task Classification of Sewer Pipe Defects and Properties using a
Cross-Task Graph Neural Network Decoder

Joakim Bruslund Haurum, Meysam Madadi, Sergio Escalera, and Thomas B. Moeslund
WACV 2022

Paper (ArXiv) | Paper (CVF) | Code | Models


This is the project page for the Cross-Task Graph Neural Network (CT-GNN) Decoder, a novel decoder-focused multi-task classification (MTC) approach, which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention. We classify the related sewer pipe defects and properties concurrently, achieving state-of-the-art performance on all four classification tasks in the Sewer-ML dataset and improving defect classification and water level classification by 5.3 and 8.0 percentage points, respectively. The CT-GNN also outperforms the single-task methods as well as other multi-task classification approaches, while introducing 50 times fewer parameters than previous model-focused approaches.

Cross-Task Graph Neural Network Decoder

Multi-task models are commonly structured as an encoder-decoder, where the tasks are predicted by separate task-specific heads. The CT-GNN Decoder is a decoder-focused MTC model, i.e. a CNN encoder extracts task-specific features, which are fed to a decoder in which information is shared across tasks. The CT-GNN consists of four parts: task-specific decoder heads, task-specific bottleneck layers, class-specific node embedding layers, and the essential cross-task GNN.

The Cross-Task GNN component effectively mixes class representations from different classes across tasks, based on an adjacency matrix determined a priori. The adjacency matrix is computed by first determining a conditional probability matrix across all considered classes in all tasks, and subsequently thresholding and (optionally) re-weighting it in order to remove spurious connections. We compare two different types of GNNs: Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT).
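The adjacency construction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the released implementation: the function names, the toy labels, the threshold tau, and the re-weighting parameter p are all hypothetical, and the re-weighting rule shown (each node keeps weight 1 - p and spreads p evenly over its remaining neighbours) is one common choice assumed here for illustration.

```python
from typing import List, Optional

Matrix = List[List[float]]


def conditional_probability(labels: List[List[int]]) -> Matrix:
    """Estimate P[i][j] = P(class j present | class i present).

    Each row of `labels` is a binary vector over the classes of ALL tasks
    concatenated, so co-occurrences across tasks are counted as well.
    """
    n = len(labels[0])
    count = [0] * n                      # samples in which class i occurs
    cooc = [[0] * n for _ in range(n)]   # samples in which i and j co-occur
    for row in labels:
        active = [k for k, v in enumerate(row) if v]
        for i in active:
            count[i] += 1
            for j in active:
                if i != j:
                    cooc[i][j] += 1
    return [
        [cooc[i][j] / count[i] if count[i] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def build_adjacency(prob: Matrix, tau: float, p: Optional[float] = None) -> Matrix:
    """Threshold the conditional probabilities and optionally re-weight.

    Entries below `tau` are dropped to remove spurious connections.
    If `p` is given (assumed scheme), each node keeps weight 1 - p and
    shares p evenly among its surviving neighbours; otherwise the result
    is a binary matrix with self-loops.
    """
    n = len(prob)
    adj = []
    for i in range(n):
        row = [1.0 if i != j and prob[i][j] >= tau else 0.0 for j in range(n)]
        degree = sum(row)
        if p is not None and degree > 0:
            row = [p * v / degree for v in row]
            row[i] = 1.0 - p
        else:
            row[i] = 1.0
        adj.append(row)
    return adj


# Toy labels over four hypothetical classes taken from different tasks.
toy_labels = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
prob = conditional_probability(toy_labels)
binary_adj = build_adjacency(prob, tau=0.5)
weighted_adj = build_adjacency(prob, tau=0.5, p=0.2)
```

Note that the conditional probability matrix is generally asymmetric, so the resulting graph is directed. A GCN layer would additionally apply a learned weight matrix and a non-linearity after mixing the node embeddings with this matrix, whereas a GAT layer learns attention weights over the retained edges.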


Using the four classification tasks from the Sewer-ML dataset (defects, pipe shape, pipe material, and water level), we compare the proposed CT-GNN decoder with a baseline hard-shared encoder-focused model, optimization-based methods, a soft-shared encoder-focused model, and single-task models where each task is predicted separately. All models use a ResNet-50 backbone. We find that the CT-GNN outperforms all other models as well as previous state-of-the-art methods on defect and water level classification, while also using 50 times fewer parameters than state-of-the-art encoder-focused MTC methods.

Model | Delta | F2-CIW | F1-Normal | MF1-Water | mF1-Water | MF1-Shape | mF1-Shape | MF1-Mat. | mF1-Mat.

From the per-class results we notice that the defect classes with the largest economic impact but few samples (OS, PB, and PS) improve over the hard-shared encoder-focused baseline. On the pipe shape and material tasks, there is a large increase in performance on rare classes such as the Brickwork and Unknown pipe materials and the Egg-shaped and Rectangular pipe shapes. Lastly, we see that the CT-GNN decoder has little effect on the water level task.


author = {Haurum, Joakim Bruslund and Madadi, Meysam and Escalera, Sergio and Moeslund, Thomas B.},
title = {Multi-Task Classification of Sewer Pipe Defects and Properties using a Cross-Task Graph Neural Network Decoder},
booktitle = {2022 IEEE Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2022}