A part of the project: Automated Sewer Inspection Robot
Multi-Task Classification of Sewer Pipe Defects and Properties using a
Cross-Task Graph Neural Network Decoder
Joakim Bruslund Haurum, Meysam Madadi, Sergio Escalera, and Thomas B. Moeslund
WACV 2022
Paper (ArXiv) | Paper (CVF) | Code | Models
Introduction
This is the project page for the Cross-Task Graph Neural Network (CT-GNN) Decoder, a novel decoder-focused multi-task classification (MTC) approach that refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori, based on the conditional probability between the task classes, or determined dynamically using self-attention. We jointly classify the related and concurrent sewer pipe defects and properties, achieving state-of-the-art performance on all four classification tasks in the Sewer-ML dataset and improving defect classification and water level classification by 5.3 and 8.0 percentage points, respectively. The CT-GNN also outperforms the single-task methods as well as other multi-task classification approaches, while introducing 50 times fewer parameters than previous model-focused approaches.
Cross-Task Graph Neural Network Decoder
Multi-task models are commonly built as an encoder-decoder structure, where the tasks are predicted by separate task-specific heads. The CT-GNN Decoder is a decoder-focused MTC model, i.e. a CNN encoder extracts task-specific features, which are fed to a decoder in which information is shared across tasks. The CT-GNN consists of four parts: task-specific decoder heads, task-specific bottleneck layers, class-specific node embedding layers, and the essential cross-task GNN.
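To make the data flow through the first three parts concrete, the following NumPy sketch traces tensor shapes from the encoder features to the class nodes that a cross-task GNN would consume. It is only an illustration under assumptions: the layer sizes, the class counts per task, the random weights, and the choice of plain linear layers are all hypothetical and are not taken from the paper or the released code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and class counts, chosen only for illustration.
BATCH, FEAT_DIM, BOTTLENECK_DIM, EMBED_DIM = 2, 16, 8, 4
TASK_CLASSES = {"defect": 3, "water": 2, "shape": 2, "material": 2}


def relu(x):
    return np.maximum(x, 0.0)


def init(*shape):
    """Small random weights standing in for learned parameters."""
    return rng.normal(scale=0.1, size=shape)


# Stand-in for the per-task features produced by the CNN encoder.
features = {task: rng.normal(size=(BATCH, FEAT_DIM)) for task in TASK_CLASSES}

initial_logits, node_blocks = {}, []
for task, n_cls in TASK_CLASSES.items():
    # 1) Task-specific head: the usual disjointed per-task prediction.
    initial_logits[task] = features[task] @ init(FEAT_DIM, n_cls)
    # 2) Task-specific bottleneck: compress the task features.
    z = relu(features[task] @ init(FEAT_DIM, BOTTLENECK_DIM))
    # 3) Class-specific node embeddings: one projection per class of the task.
    class_proj = init(n_cls, BOTTLENECK_DIM, EMBED_DIM)
    node_blocks.append(np.einsum("bk,ckd->bcd", z, class_proj))

# 4) The nodes of all tasks are stacked into one graph for the cross-task GNN.
nodes = np.concatenate(node_blocks, axis=1)
print(nodes.shape)  # (2, 9, 4): batch, total number of classes, embedding size
```

The point of the sketch is that every class of every task ends up as its own node, so a graph over these nodes can carry information between tasks.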
The Cross-Task GNN component effectively mixes the representations of classes from different tasks, based on an adjacency matrix determined a priori. The adjacency matrix is computed by first determining a conditional probability matrix across all considered classes in all tasks, and subsequently thresholding and (optionally) re-weighting it in order to remove spurious connections. We compare two types of GNNs: Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT).
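The construction described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the toy label matrix, the threshold `tau`, the neighbour weight `p`, and the function names are all hypothetical, and the re-weighting shown (neighbours share a total weight of `p`, the self-loop gets `1 - p`) is one common scheme that is assumed here rather than confirmed by this page.

```python
import numpy as np


def conditional_probability(labels):
    """Estimate P[i, j] = P(class j present | class i present).

    `labels` is a binary (num_samples, num_classes) matrix whose columns are
    the classes of all tasks concatenated side by side.
    """
    labels = labels.astype(np.float64)
    co_occurrence = labels.T @ labels        # [i, j] = #samples with i and j
    class_counts = np.diag(co_occurrence)    # [i]    = #samples with i
    return co_occurrence / np.maximum(class_counts, 1.0)[:, None]


def binary_adjacency(cond_prob, tau):
    """Keep only edges whose conditional probability reaches the threshold."""
    adj = (cond_prob >= tau).astype(np.float64)
    np.fill_diagonal(adj, 1.0)               # every class keeps a self-loop
    return adj


def reweight(adj, p):
    """Assumed scheme: neighbours share weight p, the self-loop gets 1 - p."""
    off_diag = adj.copy()
    np.fill_diagonal(off_diag, 0.0)
    row_sums = off_diag.sum(axis=1, keepdims=True)
    out = np.divide(p * off_diag, row_sums,
                    out=np.zeros_like(off_diag), where=row_sums > 0)
    np.fill_diagonal(out, 1.0 - p)
    return out


# Toy data: columns 0-1 belong to one task, columns 2-3 to another.
labels = np.array([
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
])
cond_prob = conditional_probability(labels)
adj_binary = binary_adjacency(cond_prob, tau=0.5)
adj_reweighted = reweight(adj_binary, p=0.2)
print(adj_binary)
print(adj_reweighted)
```

Note that the conditional probability matrix is generally not symmetric, so the resulting graph is directed: class 3 points to class 0 in the toy example, but not the other way around.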
[Figure panels: Conditional probability matrix; GCN binary adjacency matrix; GAT binary adjacency matrix; GAT reweighted adjacency matrix]
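For readers unfamiliar with the GCN variant, the sketch below shows a single, generic graph-convolution step over class nodes, in the standard form ReLU(Â H W) with a degree-normalised adjacency matrix Â. The small adjacency matrix, the identity node features, and the identity weight matrix are hypothetical values picked so the mixing is easy to read; the actual layer configuration of CT-GNN is not specified here.

```python
import numpy as np


def normalize_adjacency(adj):
    """Degree normalisation D^{-1/2} A D^{-1/2}.

    Assumes `adj` already contains self-loops, so every row sum is positive.
    """
    inv_sqrt_degree = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_degree[:, None] * inv_sqrt_degree[None, :]


def gcn_layer(nodes, adj_norm, weight):
    """One propagation step: each node aggregates its neighbours' features."""
    return np.maximum(adj_norm @ nodes @ weight, 0.0)


# Hypothetical binary adjacency over four class nodes (with self-loops).
adj = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0],
])

# Identity features and weights make the output equal the normalised graph,
# which shows exactly how much each node receives from every other node.
nodes = np.eye(4)
weight = np.eye(4)
mixed = gcn_layer(nodes, normalize_adjacency(adj), weight)
print(np.round(mixed, 3))
```

A GAT layer differs mainly in that the edge weights are not fixed by the normalisation but computed by an attention function over the connected node pairs.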
Results
Using the four classification tasks from the Sewer-ML dataset (defects, pipe shape, pipe material, and water level), we compare the proposed CT-GNN decoder with a baseline hard-shared encoder-focused model, optimization-based methods, a soft-shared encoder-focused model, and single-task models where each task is predicted separately. All models use a ResNet-50 backbone. We find that the CT-GNN outperforms all other models, as well as previous state-of-the-art methods on defect and water level classification, while also using 50 times fewer parameters than state-of-the-art encoder-focused MTC methods.
Model | Delta | F2-CIW | F1-Normal | MF1-Water | mF1-Water | MF1-Shape | mF1-Shape | MF1-Mat. | mF1-Mat. |
STL | +0.00 | 58.42 | 92.42 | 69.11 | 79.71 | 46.55 | 98.06 | 65.99 | 96.71 |
R50-MTL | +10.36 | 59.73 | 91.87 | 70.51 | 80.47 | 71.64 | 99.34 | 80.28 | 98.09 |
CT-GCN | +12.39 | 61.35 | 91.84 | 70.57 | 80.47 | 76.17 | 99.33 | 82.63 | 98.18 |
CT-GAT | +12.81 | 61.70 | 91.94 | 70.57 | 80.43 | 74.53 | 99.40 | 86.63 | 98.24 |
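The page does not define the Delta column. Assuming it is the mean relative improvement over the single-task (STL) baseline across the eight reported metrics, expressed in percent, the short script below reproduces the values in the table. The function and variable names are illustrative only.

```python
# Scores copied from the results table (higher is better for every metric).
METRICS = ["F2-CIW", "F1-Normal", "MF1-Water", "mF1-Water",
           "MF1-Shape", "mF1-Shape", "MF1-Mat.", "mF1-Mat."]
SCORES = {
    "STL":     [58.42, 92.42, 69.11, 79.71, 46.55, 98.06, 65.99, 96.71],
    "R50-MTL": [59.73, 91.87, 70.51, 80.47, 71.64, 99.34, 80.28, 98.09],
    "CT-GCN":  [61.35, 91.84, 70.57, 80.47, 76.17, 99.33, 82.63, 98.18],
    "CT-GAT":  [61.70, 91.94, 70.57, 80.43, 74.53, 99.40, 86.63, 98.24],
}


def delta(model, baseline="STL"):
    """Mean relative change versus the baseline, in percent (assumed definition)."""
    relative = [(m - b) / b for m, b in zip(SCORES[model], SCORES[baseline])]
    return 100.0 * sum(relative) / len(relative)


for name in SCORES:
    print(f"{name}: {delta(name):+.2f}")
# STL: +0.00
# R50-MTL: +10.36
# CT-GCN: +12.39
# CT-GAT: +12.81
```

Under this reading, most of the Delta comes from the macro-F1 scores on pipe shape and pipe material, where the single-task baseline is weakest.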
[Figure panels, per-class results: Defect task; Pipe material task; Pipe shape task; Water level task]
From the per-class results we notice that the defect classes with the largest economic impact but few samples (OS, PB, and PS) improve compared to the hard-shared encoder-focused baseline. On the pipe shape and material tasks there is a large increase in performance on rare classes, such as the Brickwork and Unknown pipe materials and the Egg-shaped and Rectangular pipe shapes. Lastly, we see that the CT-GNN decoder has little effect on the water level task.
Citation
@InProceedings{Haurum_2022_WACV,
    author    = {Haurum, Joakim Bruslund and Madadi, Meysam and Escalera, Sergio and Moeslund, Thomas B.},
    title     = {Multi-Task Classification of Sewer Pipe Defects and Properties using a Cross-Task Graph Neural Network Decoder},
    booktitle = {2022 IEEE Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2022}
}