The neocognitron as a system for handwritten character recognition : limitations and improvements

Lovell, David R. (1994). The neocognitron as a system for handwritten character recognition : limitations and improvements PhD Thesis, School of Computer Science and Electrical Engineering, The University of Queensland. doi:10.14264/uql.2016.298

Attached Files
Name Description MIMEType Size Downloads
THE8746.pdf Thesis (open access) application/pdf 11.95MB 0

Author Lovell, David R.
Thesis Title The neocognitron as a system for handwritten character recognition : limitations and improvements
School, Centre or Institute School of Computer Science and Electrical Engineering
Institution The University of Queensland
DOI 10.14264/uql.2016.298
Publication date 1994-03-14
Thesis type PhD Thesis
Supervisor Tom Downs
Ah Chung Tsoi
Total pages 306
Collection year 1994
Language eng
Subjects 280000 Information, Computing and Communication Sciences
Formatted abstract
This thesis is about the neocognitron, a neural network that was proposed by Fukushima in 1979. Inspired by Hubel and Wiesel's serial model of processing in the visual cortex, the neocognitron was initially intended as a self-organizing model of vision; however, we are concerned with the supervised version of the network, put forward by Fukushima in 1983. Through "training with a teacher", Fukushima hoped to obtain a character recognition system that was tolerant of shifts and deformations in input images. Until now, though, it has not been clear whether Fukushima's approach has resulted in a network that can rival the performance of other recognition systems.

In the first three chapters of this thesis, the biological basis, operational principles and mathematical implementation of the supervised neocognitron are presented in detail. At the end of this thorough introduction, we consider a number of important issues that have not previously been addressed. How should S-cell selectivity and other parameters be chosen so as to maximize the network's performance? How sensitive is the network's classification ability to the supervisor's choice of training patterns? Can the neocognitron achieve state-of-the-art recognition rates and, if not, what is preventing it from doing so? 

Chapter 4 looks at the Optimal Closed-Form Training (OCFT) algorithm, a method for adjusting S-cell selectivity, suggested by Hildebrandt in 1991. Experiments reveal flaws in the assumptions behind OCFT and provide motivation for the development and testing (in Chapter 5) of three new algorithms for selectivity adjustment: SOFT, SLOG and SHOP. Of these methods, SHOP is shown to be the most effective, determining appropriate selectivity values through the use of a validation set of handwritten characters. 

SHOP serves as a method for probing the behaviour of the neocognitron and is used to investigate the effect of cell masks, skeletonization of input data and choice of training patterns on the network's performance. Even though SHOP is the best selectivity adjustment algorithm to be described to date, the system's peak correct recognition rate (for isolated ZIP code digits from the CEDAR database) is around 75% (with 75% reliability) after SHOP training. It is clear that the neocognitron, as originally described by Fukushima, is unable to match the performance of today's most accurate digit recognition systems, which typically achieve 90% correct recognition with near 100% reliability.

Having observed the neocognitron's failure to exploit the distinguishing features of different kinds of digits in its classification of images, we propose modifications in Chapter 6 to enhance the network's ability in this regard. Using this new architecture, a correct classification rate of 84.62% (with 96.36% reliability) was obtained on CEDAR ZIP codes, a substantial improvement but still a level of performance that is somewhat less than state-of-the-art recognition rates. Chapter 6 concludes with a critical review of the hierarchical feature extraction paradigm.

The final chapter summarizes the material presented in this thesis and draws the significant findings together in a series of conclusions. In addition to the investigation of the neocognitron, this thesis also contains a derivation of statistical bounds on the errors that arise in multilayer feedforward networks as a result of weight perturbation (Appendix E).
Keyword Optical character recognition devices
Neural networks (Computer science)

Document type: Thesis
Collections: UQ Theses (RHD) - Official
UQ Theses (RHD) - Open Access
Created: Wed, 29 Jul 2015, 16:17:45 EST by Mr Andrew Martlew on behalf of Learning and Research Services (UQ Library)