The ever-growing volume of digital imagery in modern society has made computer vision an increasingly important field of research. In particular, image classification systems are becoming critical in meeting the pressing demand for extracting and recognising information from massive archives of image data. Existing work on image classification, however, faces severe challenges when much of the image and video data is of low resolution. This is because the feature models commonly used in the literature, which are often heuristics tuned by experience for a particular system, lack flexibility and robustness.
Processing and classifying low-resolution images poses a unique set of challenges. Firstly, low-resolution images usually do not provide sufficient information because of the small number of pixels, and most of the texture detail has been discarded during data compression. Secondly, low-resolution images often carry distortions and image artifacts. It is therefore difficult to perform reliable feature detection and extraction on pixelated textures and shapes.
To address these challenges, we investigate the statistical properties of low-resolution images and design efficient tools for low-resolution image classification. The proposed approaches utilise these statistical properties and learn receptive field models, as in biological vision systems. We also investigate the possibility of improving classification performance on large quantities of low-resolution data. Overall, we provide a set of statistical tools applicable to image classification in the following domains.
The first domain is person re-identification in low-resolution CCTV systems under uncontrolled surveillance conditions. Whereas imagery captured under controlled conditions may contain sufficient biometric features for identification (e.g., irises, faces), surveillance video in uncontrolled scenarios often captures pedestrians from a significant distance, so the cropped person images are usually of very low resolution. In this work, we first investigate the non-Gaussian statistical property of low-resolution images. We then propose to exploit this property for person re-identification via statistical approximations, which is shown to yield a simple yet robust feature for real-time surveillance applications.
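As an illustration of what a non-Gaussian statistical property means in practice, the sketch below measures excess kurtosis, one common indicator of non-Gaussianity. This is an assumption made for illustration: the abstract does not state which statistic or approximation the thesis uses, and the synthetic samples here merely stand in for image-derived values such as filter responses. The function name `excess_kurtosis` is hypothetical.

```python
import random


def excess_kurtosis(values):
    """Fourth standardised moment minus 3; close to 0 for Gaussian data."""
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    m2 = sum(c * c for c in centred) / n   # second central moment (variance)
    m4 = sum(c ** 4 for c in centred) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0


rng = random.Random(0)  # fixed seed so the run is repeatable

# Gaussian reference samples.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Heavy-tailed (Laplace) samples: the difference of two independent
# exponential variables follows a Laplace distribution.
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(20000)]

k_gauss = excess_kurtosis(gaussian)
k_laplace = excess_kurtosis(laplace)

print(f"Gaussian samples: excess kurtosis = {k_gauss:.2f}")
print(f"Laplace samples:  excess kurtosis = {k_laplace:.2f}")
```

A value well above zero indicates a sparse, heavy-tailed distribution, which is the kind of departure from Gaussianity that statistical feature models can exploit.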
The second domain is the robust classification of low-resolution microscopy images in an automatic diagnostic system. We investigate pathology tests on Human Epithelial Type 2 (HEp-2) cells using Indirect Immunofluorescence (IIF). The screening process for Anti-Nuclear Antibody (ANA) is similar to a surveillance scenario, so advances in general object classification can be adapted to improve the accuracy of HEp-2 cell classification. However, existing cell classification systems suffer from numerous shortcomings due to their reliance on expert-designed features. To avoid these shortcomings, we propose a novel statistical learning process based on Independent Component Analysis (ICA), which is inspired by the receptive field model in biological vision. The ICA framework takes into consideration the properties of low-resolution images discussed in the person re-identification domain, as it is a statistical tool that computes independent components from non-Gaussian data. Furthermore, the particular statistical properties of HEp-2 cells are addressed by utilising randomly generated spontaneous activity patterns. These patterns are inspired by the spontaneous neural activity found in newborn animals, and are employed for classification tasks for the first time in this work.
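The way ICA computes independent components from non-Gaussian data can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the learning process proposed in the thesis: it uses the FastICA fixed-point algorithm with a tanh nonlinearity (the abstract does not name a specific ICA algorithm), and two synthetic Laplace-distributed signals stand in for image patches. The names `whiten`, `fast_ica` and the mixing matrix are hypothetical.

```python
import numpy as np


def whiten(X):
    """Centre the rows of X and decorrelate them to unit variance."""
    X = X - X.mean(axis=1, keepdims=True)
    cov = X @ X.T / X.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitening = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return whitening @ X


def fast_ica(Z, n_iter=200, seed=0):
    """Symmetric FastICA on whitened data Z (components x samples)."""
    rng = np.random.default_rng(seed)
    n_components, n_samples = Z.shape
    # Start from a random orthogonal unmixing matrix.
    W, _ = np.linalg.qr(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        Y = W @ Z
        g = np.tanh(Y)          # nonlinearity
        g_prime = 1.0 - g ** 2  # its derivative
        # Fixed-point update for every row of W at once.
        W_new = g @ Z.T / n_samples - np.diag(g_prime.mean(axis=1)) @ W
        # Symmetric decorrelation: W <- (W W^T)^(-1/2) W, computed via SVD.
        u, _, vt = np.linalg.svd(W_new)
        W = u @ vt
    return W


rng = np.random.default_rng(42)
sources = rng.laplace(size=(2, 5000))         # independent non-Gaussian signals
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])   # illustrative mixing matrix
observed = mixing @ sources                   # what we actually get to see

Z = whiten(observed)
W = fast_ica(Z)
recovered = W @ Z

# Each recovered component should match one source up to sign and order.
match = np.abs(np.corrcoef(recovered, sources)[:2, 2:])
print(np.round(match, 2))
```

ICA recovers components only up to sign, scale and ordering, which is why the check compares absolute correlations rather than raw values. When the same procedure is applied to patches of natural images, the learned components are often compared to receptive fields in biological vision.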
The third domain is the robust classification of a broader range of people, objects and locations in a large, low-resolution web video database. This problem extends the applications of person re-identification and medical image classification. A novel approach based upon Independent Subspace Analysis, which improves on Independent Component Analysis, is proposed to handle a video database of more than a million frames. Unlike most other approaches, which only handle specific features, the proposed approach focuses on the essential statistical properties of images and has shown better robustness over a wide range of object categories.
The fourth domain is the effective evaluation of image quality in a given video stream. This is critical when the datasets are very large and contain much redundant information. In this work, we propose a novel statistical evaluation of surveillance and online video streams. As part of the statistical framework, the proposed key frame selection method can be applied effectively alongside the statistical methods for classification.
The methods proposed in this thesis have been designed to work effectively in uncontrolled environments, with the aim of real-time performance. Experiments and comparative evaluation on benchmark datasets suggest that the proposed algorithms perform better than other well-known methods in the literature.