The Effects of Lateral Inhibition
on Pattern Recognition in a Neural Network

A Report produced as the result
of independent study

by Jim Lesko

2 June 1997


The purpose of this project was to determine whether a biologically motivated visual filtering system, in this case that of Limulus polyphemus, has any effect on pattern recognition in an artificial neural network. The first part of the project consisted of creating and refining a model of the Limulus eye; the second consisted of creating suitable data and training a neural net on that data. Some difficulties arose during the project, the biggest being that the original neural net code simply could not handle the volume of data that needed processing. Different software was found, namely NevProp, a package produced at the University of Nevada. The Limulus-style filter had an effect on the accuracy reached by each of the three primary training sets: in two cases the effect was to increase accuracy, while in the third, accuracy was decreased. Several tests were also run to test the generalization capabilities of NevProp.

The very first step in modeling the Limulus eye was to take pre-existing software and change it from a 1D model to a 2D model. A 20 x 20 matrix of "cells" was chosen for the "retina" -- an arbitrary choice, but one that allowed flexibility while keeping the total number of cells at a reasonable level. The model acted on its inputs through lateral inhibition: every cell in the matrix is connected to every other cell and inhibits their firing rates. The effect of inhibition in the Limulus eye has been shown to follow a decaying exponential, and that is what this model used. The model allows the user to modify the maximum inhibition and the length constant (the distance at which the inhibition is at 1/3 its maximum) of inhibition. In addition, the model allows the user to set contrast levels, noise rate and maximum level, and offset of the stimulus. For this project, the more realistic delta rule was rejected in favor of a one-step model. Differences do exist between the two sets of results, but the delta rule model takes roughly ten times longer to reach its results.
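The one-step model described above can be sketched in a few lines. This is an illustrative reconstruction, not the original software: it assumes all-to-all inhibition weighted by a decaying exponential of inter-cell distance, with each cell's output reduced in a single pass by the raw input rates of the others (the one-step simplification) and clipped to the 0-100 firing range. The function and parameter names are hypothetical.

```python
import numpy as np

def lateral_inhibition(stim, k_max=0.3, length_const=2.0):
    """One-step lateral inhibition over a 2-D 'retina'.

    Each cell's firing rate is reduced by every other cell's input rate,
    weighted by a decaying exponential of the distance between them.
    (A sketch; the original model defines the length constant as the
    distance at which inhibition falls to 1/3 of maximum, which an
    e-folding exponential only approximates.)
    """
    n, m = stim.shape
    ys, xs = np.mgrid[0:n, 0:m]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # pairwise Euclidean distances between all cells
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = k_max * np.exp(-d / length_const)
    np.fill_diagonal(w, 0.0)          # a cell does not inhibit itself
    flat = stim.ravel().astype(float)
    out = flat - w @ flat             # one-step: inhibition from raw inputs
    return np.clip(out, 0.0, 100.0).reshape(n, m)

# Example: a 100-rate vertical stroke on a 0 background,
# analogous to the high-contrast stimuli
retina = np.zeros((20, 20))
retina[5:15, 10] = 100.0
filtered = lateral_inhibition(retina)
```

With strong inhibition, the interior of a uniformly firing region is suppressed more than its edges, which is consistent with the report's observation that a solid square can come out looking like an empty one.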

Eleven stimuli were chosen: 8 letters and 3 shapes. The letters used were the first 8 of the alphabet; coincidentally, these 8 letters form 4 orthographically similar pairs (A/H, B/D, C/G, E/F). A triangle and two different squares were used to round out the stimuli. The strokes of each letter were only 1 pixel wide, and so the thicker shapes were added to increase the generality of the simulation.

The parameters of the Limulus model were as follows: Each cell could fire at a rate between 0 and 100 (inclusive). For the data marked Filtered (1) or just Filtered, a maximum inhibition of .3 and a length constant of 2 were used; for Filtered (2) data, a maximum inhibition of .2 was used in conjunction with a length constant of 4. High contrast data was assumed to project the stimulus at a rate of 100 against a background of 0 -- here, the equivalent of a black letter on white. Low contrast data had a stimulus of 100 on a background of 50 (black on gray). Medium contrast used a 100-level stimulus against a background of 20 (black on light gray). Noisy data was calculated as follows: low noise consisted of each cell having a 50% chance of having a firing rate of 0-50 (in 6 discrete steps; noise was non-continuous); high noise was created by giving each cell a 100% chance of having a firing rate of 0-100 (again, non-continuous). The stimulus was set at 100 for each of the noisy simulations. It should be noted that the noise was completely (pseudo-) random for each stimulus; no stimulus had exactly the same noise. The offset datasets simply contained 100-rate stimuli against a background of 0, but moved towards a corner by 3-6 cells in the horizontal and vertical. Most, but not all, stimuli in the offset group were moved by 4 along each axis.
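The noise conditions above can be sketched as follows. This is an assumed reconstruction: the report does not say whether noise could overwrite stimulus cells (here it cannot), nor exactly how many discrete levels the high-noise condition used (here 11 steps of 10 are assumed to parallel the low-noise condition's 6 steps). All names are hypothetical.

```python
import numpy as np

def add_noise(stim, p=0.5, max_rate=50, steps=6, rng=None):
    """Overlay discrete background noise on a stimulus matrix.

    Each non-stimulus cell independently has probability `p` of being set
    to one of `steps` evenly spaced rates in [0, max_rate]. Stimulus cells
    (rate 100) are left untouched -- an assumption, not stated in the report.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = stim.astype(float).copy()
    levels = np.linspace(0, max_rate, steps)       # e.g. 0, 10, ..., 50
    noisy = (rng.random(stim.shape) < p) & (stim < 100)
    out[noisy] = rng.choice(levels, size=noisy.sum())
    return out

# Low-noise condition: 50% chance of a 0-50 rate in 6 discrete steps
low = add_noise(np.zeros((20, 20)), p=0.5, max_rate=50, steps=6)
# High-noise condition: every cell gets a 0-100 rate (11 steps assumed)
high = add_noise(np.zeros((20, 20)), p=1.0, max_rate=100, steps=11)
```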

Eight different simulations were run on each of the eleven stimuli. Contrast was the primary factor that was to be tested; unfiltered matrices were produced for high, medium, and low contrast situations, while the filtered matrices were produced for high and low contrast situations. Noise is also an important factor in vision, and both low and high noise situations were created. The last four sets of stimuli were offset by varying the distance of each stimulus towards each of the corners of the "eye." Two sets of filters were applied to the low contrast data to supply alternate results.

Once individual matrices were created for each of the above situations, they were grouped together in a format usable by NevProp. Individual tests were then created by copying groups into test (".net") files. Three primary test sets were produced: Set 1 trained on high contrast data and tested on low contrast, Set 2 trained on high contrast and tested on low noise, and Set 3 trained on high contrast and tested on high noise. A generalization set was produced, with 2 of the tests training on centered high contrast data and testing on offset stimuli and the third training on a random set of offset and centered stimuli and testing on similarly positioned stimuli not in the training set. Other data were produced by training on a variety of contrasts (and sometimes low noise) and testing on low contrast or noisy situations.

NevProp uses quickprop and gradient descent with momentum to train networks. Simulations were run without quickprop, and no usable results were obtained over 2000+ epochs of training. All data in this report were obtained using the hybrid method of training. NevProp tests while it trains; during each epoch, results are produced, backpropagation is applied for the training set, and then the error for the test set is calculated. NevProp stops training (and the simulation) after a number of epochs have passed with no measurable gain in accuracy. The actual configuration of the network was as follows: 400 cells accepted the input of the "eye." A hidden layer of 120 cells was used to allow good generalization in weight space; the output layer consisted of 11 cells. The output was to consist of grandmother cell identification of the stimulus. (See Configuration.)
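The 400-120-11 architecture can be sketched as a plain two-layer sigmoid network trained by backpropagation with momentum. This is not NevProp itself -- the quickprop step, stopping criterion, and train/test interleaving are omitted -- just a minimal illustration of the configuration described above, with "grandmother cell" outputs targeted one-hot per stimulus class. All names and hyperparameters here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MLP:
    """400-input, 120-hidden, 11-output sigmoid net; gradient descent
    with momentum on summed squared error (quickprop omitted)."""
    def __init__(self, n_in=400, n_hid=120, n_out=11, lr=0.05, momentum=0.9):
        self.W1 = rng.normal(0, 0.1, (n_in, n_hid))
        self.W2 = rng.normal(0, 0.1, (n_hid, n_out))
        self.lr, self.mu = lr, momentum
        self.vW1 = np.zeros_like(self.W1)
        self.vW2 = np.zeros_like(self.W2)

    def forward(self, x):
        self.h = sigmoid(x @ self.W1)
        self.y = sigmoid(self.h @ self.W2)
        return self.y

    def train_step(self, x, target):
        y = self.forward(x)
        # backpropagate squared error through both sigmoid layers
        d_out = (y - target) * y * (1 - y)
        d_hid = (d_out @ self.W2.T) * self.h * (1 - self.h)
        self.vW2 = self.mu * self.vW2 - self.lr * self.h.T @ d_out
        self.vW1 = self.mu * self.vW1 - self.lr * x.T @ d_hid
        self.W2 += self.vW2
        self.W1 += self.vW1
        return float(((y - target) ** 2).sum())

# Inputs: flattened 20x20 retinas scaled to [0, 1]; targets: one-hot rows.
net = MLP()
x = rng.random((11, 400))            # random stand-in for the 11 stimuli
t = np.eye(11)
errs = [net.train_step(x, t) for _ in range(300)]
```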

The three primary sets of data and the generalization set were all run with the same seed for the weights to avoid possible variations in the final results arising from varying initial conditions. All sets were set to run for a maximum of 750 epochs; most were stopped at about the 250th epoch by NevProp. Most showed positive results by the 20th epoch, and accurate (~75%) results by the 50th. By the 100th epoch most sets had reached their final results, although accuracy would occasionally fluctuate. It should be noted that during the early training epochs, the testing accuracy would occasionally be well above the accuracy for the training set. Whether this is due purely to noise (or blind luck) is not known at this time. The results are summarized in Table 1.

Roughly 8 hours were spent running NevProp, and the results were interesting. As has been seen in most neural nets, getting to about 90% accuracy is quite easy; getting to 100% is nearly impossible. Three of the eight sets were within 1% of 90% accuracy; three fell roughly 7% short of 90%; the remaining two sets were above 90% accuracy. The generalization data set clearly shows that this neural network instantiation is quite capable of extrapolation based on the training set. The most random generalization set, "Mix'n'Match," consisted of 55 different stimuli: each shape was presented in the center and in each corner. 44 were chosen as training data, and 11 as testing data. Accuracy was roughly 80% on this set. If the network had "guessed," an accuracy of about 9% (1 in 11) would be expected, and so its performance with "Mix'n'Match" is taken as evidence that generalization occurred.

Finding the effects of lateral inhibition on low contrast data was the primary goal of this experiment, and Set 1 was created with this situation in mind (Chart 1). The network was trained to 100% accuracy with the high-contrast set and tested against the low-contrast set. The Filtered (2) set was slightly less accurate than the Filtered (1) set, but both did more than 6% better than the Unfiltered set. The filter seems to work well for bringing out a stimulus that lies on a poorly contrasted background.

Testing on noisy sets was then performed. Again, the network was trained on highly contrasted (and centered) data. Set 2 (Chart 2) consisted of tests using the low noise situation; here the Unfiltered test performed better than the Filtered test. This is likely due to the fact that inhibition works on all the cells, and with only scattered noise the primary effect of inhibition was on the stimulus itself. The ultimate effect of the inhibition is to lower the contrast of the stimulus against the background, so the Unfiltered set performs roughly 4% better.

Set 3 was the high noise test set (Chart 3). Two Unfiltered sets were used; since noise was completely random, it was thought that wildly different results might occur based solely on the noise level. While in extreme cases the randomness may have undue influence, the two sets (admittedly a small number for statistical purposes) showed only a small (.3%) difference in accuracy. The Filtered set did over 7% better than the Unfiltered sets in this test. Every cell was firing, and the spatial locality of the stimulus coupled with its consistent strength relative to the background resulted in an effect similar to that found in the low-contrast situation.

One test was run without comparison to Filtered data; when tested against medium contrast stimuli, the neural net trained on high contrast data scored over 93% accuracy. This result is 10% better than the results of the test on the low contrast data.

The generalization tests had no Filtered component; they were a more "traditional" neural net test. As explained previously, the stimuli were moved by a small but significant number of cells both horizontally and vertically in a high-contrast situation. The accuracy under these conditions was around 84% (+/- 5%) for each of the tests (Chart 4).

Before the above sets were run, a number of test sets were created to better understand both the network and the testing process. These sets had random seeds for the weights, resulting in a situation where comparison across tests becomes less reliable. Still, some useful data resulted. When trained on high, medium, and low contrast data, the network achieved 100% accuracy with a low noise test set. When trained on all three contrast sets in addition to a low noise set, and tested with offset data, an accuracy of 82% was reached (Table 2). Training with a noisy set in addition to the evenly contrasted sets seems to have little effect on the accuracy (Table 3).

The model used in this project could be expanded to become even more like a biological system; for one, it could run continuously on data, with recurrent links between the inhibition level and the recognition level -- this would make it much like the Interactive Activation Model used by McClelland and Rumelhart (Psychological Review, Volume 88, Number 5, Sept. 1981). A forward projection of the stimulus would activate a feature detection level, which would in turn inhibit perception of cells not part of features while passing along information to a letter (or shape) recognition level. Training such a system would present difficulties, and there is likely to be no easy way to choose the correct network configuration.

This was not a comprehensive test of biologically inspired filtration on pattern recognition. An understanding of the effects of Limulus-style filtering was the goal, and the data show that there is indeed an effect, although not always a positive one. The effect of strong lateral inhibition on the solid square often results in its appearing as an empty square; this may have caused some loss of accuracy. Other simulations that have been run with the model used for filtering the data show that a more complex method of inhibition of the type used by mammals ("center-surround") produces a less cluttered picture than simple lateral inhibition, and could produce even better results than those presented here. The constraints placed on accuracy by the network configuration are not known; of course, even the experts concede that at this point, choosing the proper configuration is an art rather than a science. This project shows that biological models often will have an effect when applied to artificial neural networks. There are limitations to what neural networks can do reliably, and it is important to know what factors limit their performance.



