Several terror incidents around the world have revealed the practical limitations of surveillance cameras in public places for forensics applications: the images captured by these cameras are of poor quality and of limited use for automated identification. This poor quality stems from poor imaging conditions, occlusion, varying lighting, motion blur, and, most importantly, the distance between the cameras and the objects in the scene. This large distance leaves only a very small image area for each object; a face, for example, may occupy just a few pixels, so details of facial regions cannot be seen. Facial images of such small sizes, degraded further by the other factors mentioned above, can hardly be used in computer vision tasks such as detection and recognition. Using such surveillance footage in forensics applications therefore requires extensive manual work, often on the scale of thousands of man-hours, to, e.g., trace suspects through networks of cameras or perform any kind of recognition.
The objective of this project is to bridge the gap between computer vision and forensics by improving the quality of facial images captured by surveillance cameras. We do so by extending existing image super-resolution methods to combine information from three different image modalities, namely the visible spectrum (RGB), depth, and thermal domains, and to exploit both spatial and temporal information.
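To illustrate the idea, the following is a minimal, hypothetical sketch (not the project's actual method) of how the three modalities and a temporal window of frames can be fused into a single input tensor for a super-resolution model; a simple nearest-neighbour upscaling stands in for the learned model, just to show the shape flow.

```python
import numpy as np

def fuse_modalities(rgb, depth, thermal):
    """Early fusion: stack RGB (H, W, 3), depth (H, W) and thermal (H, W)
    into a single (H, W, 5) tensor."""
    return np.concatenate([rgb, depth[..., None], thermal[..., None]], axis=-1)

def upscale_nearest(x, scale):
    """Placeholder for a learned super-resolution network:
    nearest-neighbour upsampling of the spatial axes."""
    return x.repeat(scale, axis=0).repeat(scale, axis=1)

# A temporal window of T low-resolution frames, each with all three modalities.
T, H, W, scale = 5, 16, 16, 4
frames = [fuse_modalities(np.random.rand(H, W, 3),
                          np.random.rand(H, W),
                          np.random.rand(H, W)) for _ in range(T)]
stack = np.stack(frames)                    # (T, H, W, 5) spatiotemporal input
sr = upscale_nearest(stack[T // 2], scale)  # (H*scale, W*scale, 5) output
```

In an actual spatiotemporal method, the full `(T, H, W, 5)` stack would be consumed by a model that aggregates evidence across frames, rather than upscaling a single fused frame.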
Reveal More Details: Spatiotemporal RGB-D-T Super-Resolution is funded by Danmarks Frie Forskningsfond (Independent Research Fund Denmark).